RISC OS Open: Forum: ARCCOS in ARM assembler?

Nov 18, 2012 11:37pm

David Miller (1712) 34 posts

I’m in the process of attempting to port the old Arachne Molecular Modeller (last upgraded for compatibility with the StrongARM RiscPC) to RISC OS 5 on the Raspberry Pi. However, I’m encountering a few self-inflicted problems along the way, mainly in relation to a lost ‘library.h’ file.

My latest problem is that I appear to have had a macro named ARCCOS which returned the inverse cosine of one floating point register in another – in other words, something pretty much equivalent to the C library function acos().

Does anyone know of or have an acos routine that I could use (directly from a relocatable module written in assembler)?

Alternatively, I’m guessing that I could use the C library function – but I’m not following APCS conventions for stack frames so this would take a bit of setting up. More seriously, I don’t know how happy the C library would be to be called without being initialised first (given that I’m a relocatable module SWI hosted within a BASIC program).

Any suggestions would be very welcome!

Thanks,

David

Nov 20, 2012 10:33pm

Kuemmel (439) 384 posts

Dear David,

Here is a nice, fast NEON library that includes ARCCOS, derived from ARCSIN.

You could implement that somewhere in your code. I have no experience with inline assembler in C, but as far as I know it should basically work. Before you use the NEON extension you have to initialise it.

You can find BASIC assembler code for that within my stuff.

Nov 23, 2012 9:19pm

David Miller (1712) 34 posts

Thanks, Kuemmel – I will take a look…

Nov 27, 2012 9:03pm

Jeffrey Lee (213) 6048 posts

Kuemmel: NEON isn’t very useful if he wants his software to run on the Pi :)

David: If you’re sticking with FPA floating point, then there’s an arccos instruction, “ACS”, which you can use directly (plus ASN and ATN for arcsin and arctan). However, these will obviously go via the floating point emulator, so they may be slower than whatever approach you were using in the past.

If you’re aiming to convert the code to use VFP then there aren’t any built-in trig instructions for you to use – you’ll have to implement your own version (or find someone else’s to use!)

Nov 27, 2012 9:19pm

Kuemmel (439) 384 posts

…oops…sorry David, I missed the Pi thing.

Maybe you could take the formula behind the approximation of ARCSIN/ARCCOS and code a VFP version based on that (the formula is also in the library link I gave you, as a C version in front of the NEON one). As Jeffrey said, the FPEmulator is an alternative, but quite a slow one…

Nov 28, 2012 6:01pm

David Miller (1712) 34 posts

Jeffrey, thanks – that was exactly the answer I was looking for (and, bizarrely, the one that I worked out for myself a few hours ago and was just returning here to post!). I had completely forgotten that there was an ACS instruction. The only mystery that remains is why my old code uses what appears to be an ARCCOS macro rather than using ACS directly. (I’m sure it wasn’t performance – I used FPE floating-point everywhere.)

Does RISC OS make use of the hardware floating point support in the Pi? If so, I would expect significant performance improvements…
