Inline assembler
Rick Murray (539) 13840 posts |
Sorry. I’m loving the inline assembler (in the C compiler). Think about it. There are times when you just want to do something simple, but making simple API calls with C is not always as easy as it looks. Let’s say… turning off the Hourglass. The quickest (in terms of resources) way, in C, used to be to create a little bit of assembler:
and then:
This would do it via a branch to my function and a branch back. Not bad. The quickest and simplest pure C way would be:
One line of C. Simple? Well, so long as when calling a SWI you don’t mind three instructions to sort out the SWI number, loading R0-R9 (whether or not you need them), calling a different SWI to invoke the desired SWI, saving R0-R9, and a few instructions to deal with error state. It’s a one-size-fits-all so it won’t be especially efficient. Another option might be:
(I think? Never used _swix()). This is a friendlier-looking routine: you tell it the input and output registers with _IN(0), _OUT(0) and such. Just…don’t…look…at…the…code… Now? Now there’s a better way. It blows away everything else.
The first brace lists registers to pass to the SWI, the second lists registers we want back from the SWI, the third lists those that are corrupted by the SWI. But, wait, it’s better. We aren’t actually writing in pure ARM code. We are talking to the back end code generation part of the compiler, so it is possible to do cool stuff like:
The result? Your instructions might have been ‘optimised’ and rearranged slightly, but the compiler will output code to meet your wishes. In this case, I’m finding a lot of my calls to _kernel_swi() can be replaced by short pieces of assembler to do the exact same thing directly in the program. So – a big thumb up from me for this functionality. ☺ |
Colin (478) 2433 posts |
nah swix is better. |
GavinWraith (26) 1563 posts |
I have not used the inline assembler of the Norcroft compiler, but I have just been scratching the surface of the one in GCC. The trouble, it seems to me, is that compilers are now so darn smart at optimizing that what you think is going into that code may turn out actually to be quite different. Perhaps it is because CPUs are now too complicated for humans to have an adequate picture of what they do, what order they do it in, and how fast. |
Jeff Doggett (257) 234 posts |
_kernel_swi always calls the X form of the SWI. |
Rick Murray (539) 13840 posts |
;-) |
Rick Murray (539) 13840 posts |
Colin – take a look at the code for swix and you’ll see what I mean. Gavin – that’s why I said it isn’t pure ARM code. You need to understand that the code generation may alter things. |
Colin (478) 2433 posts |
My days of worrying about code speed have long passed. If I’m writing in C I don’t want to be bothered with which registers are corrupted, and I don’t want to fill registers with MOVs before calling a SWI (I don’t like _kernel_swi for the same reason). swix(HourglassOff) is nicer: you can call the whole SWI on one line and don’t need extra lines of code to set up more complicated SWIs. What is the hurry in calling a SWI? SWIs take ages anyway, so a bit longer is unlikely to make much difference. |
Chris Mahoney (1684) 2165 posts |
Theoretically swix is more future-proof; a hypothetical ARM revision may change the way that SWIs are called. Update the DDE, recompile, and you’re set. No code changes required. In practice, probably not an issue :) |
Colin (478) 2433 posts |
You could probably modify the compiler to compile the inline asm to suit the new target. |
Rick Murray (539) 13840 posts |
Assuming the API doesn’t change drastically and require stuff to be rewritten. Case in point: the register block for _kernel_swi is int, not even long, and not unsigned or anything either. In practice it makes little difference on the ARM, except that there’s a lot of casting from functions that return correctly typed values. Now, if the processor word size becomes 64 bit, say, what does that do to int, long, and long long? Whatever option you pick, stuff is going to need to be rewritten to cope… |
Steffen Huber (91) 1953 posts |
Just use OSLib. Type safety is much more important than anything else during development and maintenance. If you happen to have a performance problem and it is really the SWI call overhead and not the SWI itself (and chances are very slim), convert it to Rick’s method and plaster it with comments on why you do so. |
Rick Murray (539) 13840 posts |
Colin: While it is somewhat comical to worry about a few instructions on a multitasking single-processor machine that may also be clobbered by thousands of USB interrupts per second… I am not sure I want to call a SWI using a SWI mechanism that, alone, is half the size of my function. [1] Plus…
Steffen: I’m not sure there is much between DeskLib, OSLib, and RISC_OSLib in respect of how to use them; however, given deficiencies [2] in the Debugger disassembly, I find the code is a lot easier to follow if short API calls (OS_ReadMonotonicTime, Hourglass_SomethingOrOther, etc) are inlined directly into the code. That way, when looking at the executable, I can see right away what is going on instead of a function that contains a load of branches to code elsewhere.
[1] I grew up when a single kilobyte meant a lot and processors were…not so nippy. I don’t go OTT on saving every single byte and cycle; but if one gives up caring about the efficiency of their code, the logical end result is something like Windows. Just look at where the Linux kernel is going… ☺
[2] An earlier Disassembler, Darren Salt’s I think, was able to look to see if branch points were to the start of an APCS labelled function and was able to detect the CLib jump table, so branches from code were appropriately annotated with a comment indicating where that branch was going – such as |
Steffen Huber (91) 1953 posts |
DeskLib and RISC_OSLib are high level libs, while OSLib is just a typesafe wrapper around SWIs. So a completely different approach. You can even use OSLib from assembler. |
Chris Mahoney (1684) 2165 posts |
Now look in swis.h:
#define XOS_Bit (1U << 17) /*deprecated: use _swi() or _swix()*/
Time to ;-) back at you, I think!
Jeff Doggett (257) 234 posts |
The point that I was trying to make about _kernel_swi using the X form is that, in the trivial example given of replacing it for Hourglass_Off, if the user has unplugged the Hourglass module then the program will now crash out rather than continuing. |
Rick Murray (539) 13840 posts |
Isn’t that always the case with C? It is best to call X-form SWIs and either deal with an error or ignore it (as the situation requires). |
Jeff Doggett (257) 234 posts |
Exactly. But the trivial example above broke that – an easy mistake to make – which a lot of BASIC programmers do. |
Rick Murray (539) 13840 posts |
Well, Jeff, there is maybe a sense of “you deserve what you get” if one decides to arbitrarily unplug random parts of the core OS. Unplug Hourglass, reboot, and… I wonder how many other things would fail if a core part of RISC OS “disappeared”? ;-) |