ARCCOS in ARM assembler?
David Miller (1712) 34 posts |
I’m in the process of attempting to port the old Arachne Molecular Modeller (last upgraded for compatibility with the StrongARM RiscPC) to RISC OS 5 on the Raspberry Pi. However, I’m encountering a few self-inflicted problems along the way, mainly in relation to a lost ‘library.h’ file. My latest problem is that I appear to have had a macro named ARCCOS which returned the inverse cosine of one floating point register in another – in other words, something pretty much equivalent to the C library function acos(). Does anyone know of or have an acos routine that I could use (directly from a relocatable module written in assembler)? Alternatively, I’m guessing that I could use the C library function – but I’m not following APCS conventions for stack frames so this would take a bit of setting up. More seriously, I don’t know how happy the C library would be to be called without being initialised first (given that I’m a relocatable module SWI hosted within a BASIC program). Any suggestions would be very welcome! Thanks, David |
Kuemmel (439) 384 posts |
Dear David, here is a nice and fast NEON library that includes ARCCOS, derived from ARCSIN. You could use that somewhere in your code. I have no experience with inline assembler in C, but as far as I know it should basically work. Before you use the NEON extension you have to initialise it. You can find BASIC assembler code for that within my stuff |
David Miller (1712) 34 posts |
Thanks, Kuemmel – I will take a look… |
Jeffrey Lee (213) 6048 posts |
Kuemmel: NEON isn’t very useful if he wants his software to run on the Pi :) David: If you’re sticking with FPA floating point, then there’s an arccos instruction, “ACS”, which you can use directly (plus ASN and ATN for arcsin and arctan). However, these will obviously go via the floating point emulator, so may be slower than whatever approach you were using in the past. If you’re aiming to convert the code to use VFP then there aren’t any built-in trig instructions for you to use – you’ll have to implement your own version (or find someone else’s to use!) |
Kuemmel (439) 384 posts |
…oops…sorry David, I missed the Pi thing. Maybe you could take the formula behind the approximation of ARCSIN/ARCCOS and code a VFP version based on that (the formula is also in the library link I gave you, as a C version ahead of the NEON one). As Jeffrey said, FPEmu is an alternative, but quite a slow one… |
David Miller (1712) 34 posts |
Jeffrey, thanks – that was exactly the answer I was looking for (and, bizarrely, the one that I worked out for myself a few hours ago and was just returning here to post!). I had completely forgotten that there was an ACS instruction. The only mystery that remains is why my old code uses what appears to be an ARCCOS macro rather than using ACS directly. (I’m sure it wasn’t performance – I used FPE floating-point everywhere.) Does RISC OS make use of the hardware floating point support in the Pi? If so, I would expect significant performance improvements… |