New Arm codes
Colin Ferris (399) 1822 posts |
MOVW & MOVT Two new Arm codes used together to load a 32bit constant. |
Stuart Swales (8827) 1367 posts |
New?! ;-) From the ARM blog: “ARMv7-M was the first profile of the ARMv7 architecture to result in physical products, with the release of the Cortex-M3 in 2004. ARMv7-A was launched just a year later with the release of the Cortex-A8 processor in 2005.” And ARMv6-T2 even earlier (2003) for Thumb code. Remember the myriad archaic stuff still in use that will barf if you use these destructions. |
Rick Murray (539) 13908 posts |
Around here, yeah. ;) When Vince tried my Tea app for a Riscository article, it choked under RPCEmu because my MakeFiles usually target a later architecture (usually ARMv6 with instructions scheduled for an A8, IIRC). (another reason why emulating a near thirty year old machine isn’t great…) |
Simon Willcocks (1499) 540 posts |
I think you can reasonably call it 30 years old now, it’s only a month off it! |
Cameron Cawley (3514) 159 posts |
Which compile flags were you using for this? My understanding is that MOVW and MOVT are specific to ARMv6T2 and aren’t on the Raspberry Pi 1 or 0, so if the compiler emits those instructions when just ARMv6 is specified, that sounds like a bug. |
Colin Ferris (399) 1822 posts |
Pity the ACE emulator of extra ARM codes doesn’t catch these two – not much point adding new codes if older hardware can’t be used. Bit like going to 64bit – if there is no available software. |
Stuart Swales (8827) 1367 posts |
Bit hard when they used to be valid (albeit not useful) instructions and hence didn’t take the undefined instruction trap. Possibly they do on ARMv4. [Edit: See https://www.riscosopen.org/forum/forums/1/topics/17498#posts-135498, because the ACE emulator DOES support them.]
Disagree here – if software can be usefully speeded up by using ARMv7 instructions (rather than trivially, as would be the case for 99.9% uses of MOVW/MOVT instead of LDR) then it can be useful to release a separate compiled-for-ARMv7 version for those who’d benefit (i.e. everyone with a post-Pi 1 system). |
Colin Ferris (399) 1822 posts |
If ACE works ok – Rick could have advised his user with one of the emulators – allowing Rick to use the Compiler settings of choice. |
Rick Murray (539) 13908 posts |
Might it have been ARMv7? The compile options for the earlier command line fsg program are
That’s not really how most of the rest of the ARM world operates. You usually either get an OS pre-built for the device (iOS, Android, Raspbian, etc) and most software is either built from source or, in the case of things like Android, is supplied as a sort of P-code that is given final compilation during installation. It’s just us on RISC OS that get upset if a program written in 1997 doesn’t work properly.
It’s worth looking at what those two actually do. One loads a 16 bit value into the lower 16 bits of a register (caveat: clears the upper 16 bits), the other loads a 16 bit value into the upper 16 bits of a register (not touching the lower 16 bits). Why on earth would you want to do such a thing? Well, there is no way to MOV a 32 bit value into a register, because you can’t specify the value as a full 32 bits in an instruction whose encoding is that same 32 bits. This is typically resolved by either MOVing and ORing values into the register, or by pointing at a word in memory and fetching it into the register (like an LDR or ADR/LDR combo).

With multiple instructions, some pathological cases can require up to four instructions. The worst case way of loading any value would be something like:

            MOV     R0, #&xx000000
            ORR     R0, R0, #&00xx0000
            ORR     R0, R0, #&0000xx00
            ORR     R0, R0, #&000000xx

With loading from memory, you have either this:

            LDR     R0,=&xxxxxxxx

or:

            LDR     R0, myword
            ...
    myword  DCD     &xxxxxxxx

Or even…

            ADR     R0, myword      ; or fake up an ADRW if far away
            LDR     R0, [R0]
            ...
    myword  DCD     &xxxxxxxx

These are all broadly equivalent… The first one says “let the assembler figure it out” and it’ll construct a pair of LDR and the associated data word, exactly like the second example. The third… is really only necessary if the address is far away, otherwise use one of the first two ways. But this comes with caveats. The first is that the data to load must be within 4K either side of the instruction (as it’s technically pre-indexed from PC), or you’ll have to construct an ADRW like the third example.

By using MOVW and MOVT, you have a guaranteed two instruction way to load any value into a register without requiring any memory access.
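As a rough illustration of why MOV alone can't do it: a classic ARM data-processing immediate is an 8 bit value rotated right by an even amount. The Python below is only a sketch of that rule plus the naive byte-at-a-time MOV/ORR split described above; the function names are made up for this example, and a real assembler will often find a shorter sequence than this split does.

```python
def rol32(value, amount):
    """Rotate a 32 bit value left by amount bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def is_arm_immediate(value):
    """True if value is an 8 bit constant rotated right by an even amount."""
    # Undoing the rotation (rotating left) must leave something that fits in 8 bits.
    return any(rol32(value, rot) <= 0xFF for rot in range(0, 32, 2))


def mov_orr_sequence(value):
    """Naive split into one instruction per non-zero byte (hypothetical helper)."""
    value &= 0xFFFFFFFF
    chunks = [value & (0xFF << shift) for shift in (24, 16, 8, 0)]
    chunks = [c for c in chunks if c] or [0]
    lines = [f"MOV R0, #&{chunks[0]:08X}"]
    lines += [f"ORR R0, R0, #&{c:08X}" for c in chunks[1:]]
    return lines


if __name__ == "__main__":
    print(is_arm_immediate(0xFF000000))   # fits a single MOV
    print(is_arm_immediate(0x12345678))   # does not
    print("\n".join(mov_orr_sequence(0x12345678)))
```

Each byte-aligned chunk is always a valid immediate, which is why four instructions is the ceiling for this approach.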
            AREA    |Asm$Code|, CODE, A32bit
            ENTRY

            MOVW    R0, #&5678      ; clears upper 16 bits so do this FIRST
            MOVT    R0, #&1234      ; doesn't touch lower 16 bits
            ADR     R1, buffer
            MOV     R2, #12
            SWI     &D4             ; OS_ConvertHex8
            SWI     &02             ; OS_Write0
            SWI     &03             ; OS_NewLine
            MOV     PC, R14

    buffer  DCD     0
            DCD     0
            DCD     0

            END

You’ll need to specify something like

The problem here isn’t that these instructions exist, the problem is that our current emulator solution is a really old machine.
What a pain in the arse. Better to just recompile for ARMv3 and accept that it’ll be marginally less efficient on newer machines… but, then, we’re talking an OS that’s only capable of using one of the four available cores, so swings and roundabouts. |
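For anyone who wants to poke at the MOVW/MOVT behaviour described in the post above without an ARMv7 machine to hand, here is a minimal Python model of the two instructions as that post describes them. The function names are invented for illustration; it models only the register value, not flags or encodings.

```python
def movw(imm16):
    """Model of MOVW: low 16 bits set to imm16, upper 16 bits cleared."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must fit in 16 bits")
    return imm16


def movt(reg, imm16):
    """Model of MOVT: upper 16 bits set to imm16, low 16 bits of reg kept."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must fit in 16 bits")
    return (reg & 0x0000FFFF) | (imm16 << 16)


def load_constant(value):
    """Build any 32 bit constant in two steps, MOVW first and MOVT second."""
    reg = movw(value & 0xFFFF)
    return movt(reg, (value >> 16) & 0xFFFF)


if __name__ == "__main__":
    print(hex(load_constant(0x12345678)))
    # Wrong order: the MOVW wipes out what the MOVT just wrote.
    print(hex(movw(0x5678)), "after", hex(movt(0, 0x1234)))
```

The second print shows why the order matters: doing the MOVW last leaves only the low half in the register.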
David J. Ruck (33) 1649 posts |
I never got around to adding all the ARMv7 and later instructions to ARMalyser, as it could potentially make it worse. To cope with many old executables constructed from convoluted handwritten assembler, or even worse the tangle produced by early versions of gcc (Norcroft was beautiful by comparison), once ARMalyser has followed all the defined entry points and static branches, it then has to guess if what is left is actually data or some code which may be reached dynamically. This is much easier with a small instruction set (ARMv5 and earlier) where large chunks of the instruction space aren’t valid or likely e.g. things like sequences of NV instructions – but ARMv7 repurposes those former NV instructions to valid new instructions. MOVW and MOVT take up a massive chunk of the instruction space and if I put them in, it would think everything was code! I could have only enabled ARMv7 optionally, but the tool was designed to help port 26 bit, and mixing 26 bit and ARMv7 instructions in the same program is unlikely. |
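To make the "chunk of the instruction space" point concrete, here is a hedged Python sketch of the A32 encodings of MOVW and MOVT as given in the ARM Architecture Reference Manual: a 4 bit condition, a fixed 8 bit opcode field, then imm4, Rd and imm12. The helper names are hypothetical and this is not how ARMalyser works internally.

```python
OPCODE_MASK = 0x0FF00000   # bits 27..20 identify the instruction
MOVW_BITS = 0x03000000
MOVT_BITS = 0x03400000


def encode(opbits, rd, imm16, cond=0xE):
    """Assemble an A32 MOVW/MOVT word; cond defaults to AL (always)."""
    if not (0 <= rd <= 15 and 0 <= imm16 <= 0xFFFF and 0 <= cond <= 0xE):
        raise ValueError("operand out of range")
    imm4, imm12 = imm16 >> 12, imm16 & 0xFFF
    return (cond << 28) | opbits | (imm4 << 16) | (rd << 12) | imm12


def classify(word):
    """Return 'MOVW', 'MOVT' or None for a 32 bit word."""
    if (word >> 28) == 0xF:          # condition 1111 is a different space
        return None
    op = word & OPCODE_MASK
    if op == MOVW_BITS:
        return "MOVW"
    if op == MOVT_BITS:
        return "MOVT"
    return None                      # note: Rd = PC is not filtered out here


if __name__ == "__main__":
    print(hex(encode(MOVW_BITS, 0, 0x5678)))
    print(hex(encode(MOVT_BITS, 0, 0x1234)))
    print(classify(0xE3050678), classify(0xE1A00000))
```

Only the condition and eight opcode bits are checked, so the remaining 20 bits (Rd plus the 16 bit immediate) can be anything. That is what makes a stray data word so easy to mistake for one of these instructions.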
Ralph Barrett (1603) 155 posts |
Keep ARMalyser ‘as is’ IMHO. A very valuable tool for those who try and port 26bit RISC OS code. Without ARMalyser, I’d have probably never attempted to port !ARM_Debug. So thanks Druck. RISC OS 5 needs as many good tools and utilities as possible, in my opinion… Ralph |
tymaja (278) 178 posts |
ARMv8.4 (in AArch64) introduces an interesting instruction – RMIF, the functionality of which was missing in earlier versions of the 64-bit architecture. It rotates a register right into flags – a variable size rotate right, and you can decide any combination of the NZCV flags it rotates into. It seems like the RRX shift type was missed from AArch32! (AArch64 has also had dedicated memory move commands added as well … reinventing the AArch32 wheel?) |
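A small Python model of RMIF as I understand the architecture manual's description: rotate the 64 bit register right by the shift amount, then copy the low four bits of the result into N, Z, C and V wherever the corresponding mask bit is set. The function name and the packing of the flags as a 4 bit integer (N in bit 3, V in bit 0) are just conventions for this sketch.

```python
MASK64 = (1 << 64) - 1


def rmif(xn, shift, mask, nzcv):
    """Return the new NZCV (N=bit 3 .. V=bit 0) after a modelled RMIF."""
    if not (0 <= shift <= 63 and 0 <= mask <= 0xF and 0 <= nzcv <= 0xF):
        raise ValueError("operand out of range")
    xn &= MASK64
    rotated = ((xn >> shift) | (xn << (64 - shift))) & MASK64
    low4 = rotated & 0xF
    # Flags selected by mask come from the rotated value, the rest are kept.
    return (nzcv & ~mask & 0xF) | (low4 & mask)


if __name__ == "__main__":
    # Put bit 63 of the register into N only, leaving Z, C and V alone.
    print(bin(rmif(0x8000000000000000, 60, 0b1000, 0b0000)))
```

With a mask of 0b0010 and a suitable shift this drops any chosen bit of the register into the carry flag.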