RISC OS Open: Forum: FP support

Sep 24, 2021 6:38pm

David J. Ruck (33) 1635 posts

Circled back round this week, took a deep breath and managed to get PipeDream running OK in -apcs /softfp mode… …I obtained about a factor of five to six speed improvement when inverting a large matrix

Great stuff, any chance of the same thing for Fireworkz?

Sep 24, 2021 7:50pm

Stuart Swales (8827) 1357 posts

Great stuff, any chance of the same thing for Fireworkz?

Quite likely, I would think ;-)

Sep 24, 2021 8:08pm

Paolo Fabio Zaino (28) 1882 posts

@ Stuart

I’ve got my hands full ATM, but if you want I can give it a try this weekend while testing other stuff. If so link to pull down the source for build (or binary for test only) pleaseeeee :)

P.S. Awesome work!

Sep 24, 2021 8:54pm

Stuart Swales (8827) 1357 posts

Have a whirl with this source tarball. Interested in comments at present as to whether to polish further – should we go to Code Review?

Note that I haven’t yet implemented fetestexcept() and friends for the VFP world (expect NaNs and Infs rather than SIGFPE barfs).
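Until fetestexcept() and friends exist for the VFP path, one way to cope is to test results explicitly. The following is only a minimal sketch, assuming IEEE 754 doubles and no fast-math style compiler options; the names fp_result_is_finite and checked_divide are made up for illustration and are not part of apcs_softpcs.

```c
#include <float.h>

/* Hypothetical helper, not part of apcs_softpcs.
 * Returns 1 if x is an ordinary finite value, 0 if it is a NaN or an Inf.
 * Assumes IEEE 754 semantics: a NaN compares unequal to itself, and
 * +/-Inf lie outside the range [-DBL_MAX, DBL_MAX]. */
int fp_result_is_finite(double x)
{
    if (x != x)
        return 0;                           /* NaN */
    return (x <= DBL_MAX) && (x >= -DBL_MAX);
}

/* Example use: a division that may quietly yield an Inf or NaN
 * instead of raising SIGFPE.
 * Returns 1 and stores the quotient on success, 0 otherwise. */
int checked_divide(double num, double den, double *out)
{
    double q = num / den;
    if (!fp_result_is_finite(q))
        return 0;
    *out = q;
    return 1;
}
```

The check costs a couple of comparisons per result, so it probably belongs at the edges of a calculation rather than after every operator.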

http://croftnuisk.co.uk/coltsoft-downloads/other/apcs_softpcs_20210924.zip

In the end I had to change PipeDream very little to use this library; I didn’t get C99 double complex with Norcroft /softfp working so had to revert to my own old implementation, and change one trivial inline function to non-inline to stop a compiler barf.

[Edit: I forgot about adding -DAPCS_SOFTPCS as well as using -apcs /softfp as the compiler doesn’t seem to define anything useful]
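Because the compiler defines nothing of its own for /softfp, any source that needs to know which build it is in has to key off the macro passed on the command line. A small sketch of that idea follows; only the APCS_SOFTPCS macro comes from the post above, and fp_build_mode is an invented name.

```c
/* Build with:  -apcs /softfp -DAPCS_SOFTPCS   for the softpcs build,
 * or with neither option for an ordinary FPA build.
 * fp_build_mode() is a hypothetical helper, for illustration only. */
const char *fp_build_mode(void)
{
#ifdef APCS_SOFTPCS
    return "softpcs";
#else
    return "fpa";
#endif
}
```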

Sep 25, 2021 9:33am

Chris Gransden (337) 1207 posts

I did a quick test with the flops.c benchmark.



fpa (-Otime)

   MFLOPS(1)       =    18.6673
   MFLOPS(2)       =    17.2189
   MFLOPS(3)       =    18.5478
   MFLOPS(4)       =    19.3437
   
softpcs (-Otime)

   MFLOPS(1)       =   189.3599
   MFLOPS(2)       =   158.6173
   MFLOPS(3)       =   154.6956
   MFLOPS(4)       =   153.1121

gcc 4.7.4 vfp (-mfpu=vfp -O3)

   MFLOPS(1)       =  1828.5714
   MFLOPS(2)       =  1120.2533
   MFLOPS(3)       =  1607.2243
   MFLOPS(4)       =  1943.7630
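To put numbers on the gaps, the ratios can be worked out from the figures just posted. This is a sketch only: the arrays copy the MFLOPS values above, while the function and array names are invented here.

```c
/* How many times faster `fast` is than `slow`, given two MFLOPS figures. */
double speedup(double slow_mflops, double fast_mflops)
{
    return fast_mflops / slow_mflops;
}

/* MFLOPS(1)..MFLOPS(4) as posted above. */
const double fpa_mflops[4]     = {   18.6673,   17.2189,   18.5478,   19.3437 };
const double softpcs_mflops[4] = {  189.3599,  158.6173,  154.6956,  153.1121 };
const double gcc_vfp_mflops[4] = { 1828.5714, 1120.2533, 1607.2243, 1943.7630 };

/* Mean of the four per-kernel speedups. */
double mean_speedup(const double slow[4], const double fast[4])
{
    double sum = 0.0;
    int i;
    for (i = 0; i < 4; i++)
        sum += speedup(slow[i], fast[i]);
    return sum / 4.0;
}
```

On these figures FPA to softpcs comes out at roughly 8x to 10x per kernel, and softpcs to gcc VFP at roughly 7x to 13x.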

Sep 25, 2021 9:49am

Stuart Swales (8827) 1357 posts

Thanks Chris! 10x, but could be 10x better, eh. As I mentioned somewhere else, I see it as a useful stepping-stone towards using some of the potential performance offered by new hardware without abandoning the old. I’m sure I’m not the only person who is pretty much tied into continuing to use Norcroft for RISC OS targets given various pragmas and globs of assembler.

Chris: I tried with flops.c from the interweb to see what ops that used and get vastly different results to yours – could I have a copy please? Ta.

Sep 25, 2021 11:11am

Chris Gransden (337) 1207 posts

I’ve just sent it. I’ll see if I can find and build something that is more of a real world test.

Sep 25, 2021 11:35am

Stuart Swales (8827) 1357 posts

Thanks – results now believably closer. Mine are somewhat lower due to older HW (ARMX6@1GHz) but the gap between Norcroft -Otime with apcs_softpcs and gcc -O2 -mfpu=vfp (4.7.4) is also lower, about a factor of three to four, not ten. I do see a factor of ten still between Norcroft FPA and Norcroft with apcs_softpcs.

Sep 25, 2021 11:49am

David Pitt (3386) 1248 posts

Some results using this flops.c built with GCC4.7.4 :-

*gcc flops.c -o flops -mfpu=vfp

On the 1.5GHz Titanium :-

   FLOPS C Program (Double Precision), V2.0 18 Dec 1992
   Iterations      =  512000000
   NullTime (usec) =     0.0014
   MFLOPS(1)       =   291.6420
   MFLOPS(2)       =   295.3921
   MFLOPS(3)       =   343.1982
   MFLOPS(4)       =   347.2868

On the RPi400 at 2.4GHz :-

   Iterations      =  512000000
   NullTime (usec) =     0.0009
   MFLOPS(1)       =   697.6939
   MFLOPS(2)       =   811.5966
   MFLOPS(3)       =   886.8430
   MFLOPS(4)       =   898.4188

Sep 25, 2021 12:09pm

Stuart Swales (8827) 1357 posts

David: Try with -O2, might get closer to Chris’ results.

Sep 25, 2021 1:50pm

Chris Gransden (337) 1207 posts

I used -O3.

While trying to link something else I get an undefined symbol for __apcs_softpcs__lrintf.

Sep 25, 2021 1:53pm

Stuart Swales (8827) 1357 posts

__apcs_softpcs__lrintf

Ah, overzealous bit of macro-ing in apcs_softpcs.h! Thanks. lrint and llrint (and friends) didn’t need wrapping, and might not benefit much from VFP-ing as their implementation in the C library is pure ARM w/o FPA.

[Edit: the above is true for lrint/lrintl/llrint/llrintl but NOT for lrintf/llrintf when compiled with -apcs /softfp. That’s down to the normal calling standard passing f.p. always as double in ARM register pairs (unless it has the __caller_narrow qualifier), but /softfp passes floats as single ARM registers… Can tell I never use float, just double, can’t you? So all the C library functions with non-__caller_narrow’ed float parameters will need wrapping appropriately. Bloody underlines.]
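The float versus double mismatch can be pictured in plain C. This is a sketch under assumptions: 32-bit IEEE float, 64-bit IEEE double, and invented function names throughout. It does not show which word of a double lands in which ARM register, and truncate_to_long is only a stand-in for a library routine that expects a double.

```c
#include <stdint.h>
#include <string.h>

/* Under /softfp a float argument travels as one 32-bit word. */
uint32_t float_as_word(float f)
{
    uint32_t w;
    memcpy(&w, &f, sizeof w);
    return w;
}

/* A double needs two 32-bit words (a register pair). */
void double_as_words(double d, uint32_t *hi, uint32_t *lo)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    *hi = (uint32_t)(bits >> 32);
    *lo = (uint32_t)(bits & 0xFFFFFFFFu);
}

/* Stand-in for a library routine that expects its argument as a double. */
long truncate_to_long(double d)
{
    return (long)d;
}

typedef long (*double_fn)(double);

/* Hypothetical wrapper shape: take the float as a raw word, rebuild it,
 * widen it to double, then call the routine that wants a double. */
long call_with_widened_float(uint32_t float_word, double_fn fn)
{
    float f;
    memcpy(&f, &float_word, sizeof f);
    return fn((double)f);
}
```

Skipping the widening step would hand the callee one word where it expects two, which is why each float-taking library function needs its own wrapper.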

Sep 25, 2021 2:11pm

Rick Murray (539) 13840 posts

continuing to use Norcroft for RISC OS targets

Norcroft really needs to move away from emitting FPA instructions, and these examples are (yet another) demonstration why.

I noticed a few versions ago it has some options for the FPU type. I don’t think they do anything, but maybe it’s planned?

Fingers crossed!

Sep 25, 2021 2:14pm

David Pitt (3386) 1248 posts

David: Try with -O2, might get closer to Chris’ results.
I used -O3.

Thanks both, O2 good O3 better. (RPi400 2400MHz)

*gcc flops.c -o flops -mfpu=vfp -O2
*flops

   FLOPS C Program (Double Precision), V2.0 18 Dec 1992
   Iterations      =  512000000
   NullTime (usec) =     0.0000
   MFLOPS(1)       =  1669.2163
   MFLOPS(2)       =  1127.8841
   MFLOPS(3)       =  1596.2417
   MFLOPS(4)       =  1860.7029

*
*gcc flops.c -o flops -mfpu=vfp -O3
*flops

   FLOPS C Program (Double Precision), V2.0 18 Dec 1992
   Iterations      =  512000000
   NullTime (usec) =     0.0000
   MFLOPS(1)       =  1889.5671
   MFLOPS(2)       =  1158.4400
   MFLOPS(3)       =  1661.5248
   MFLOPS(4)       =  2009.1419

*

Sep 25, 2021 2:18pm

Stuart Swales (8827) 1357 posts

Norcroft really needs to move away from emitting FPA instructions

Not just the compiler, Rick. The C library is chokka with FPA assembler. For instance the lrintf code could be sped-up usefully by having a VFP code branch as well as the existing FPA branch just to do the callee-narrowing prior to the common ARM bit.

But how much are we prepared to annoy people who for good reasons are still running old hardware (“it works for me in my setup”) or emulators (“don’t have hardware at work, just RPCEmu on the laptop”)? I wouldn’t bother to release a VFP-only version of PipeDream or Fireworkz as the performance gains in those applications wouldn’t be worth it for 99% of users, whereas something that usefully boosts performance for anyone with modern-ish hardware without unduly penalising the other users looks like a win to me. Anyone who needs to run at highest performance needs to grab code and compile it to suit their needs.

Sep 25, 2021 2:49pm

Chris Gransden (337) 1207 posts

Commenting out lrintf in the header got it to link.

Here’s the results for twolame converting a wav file to mp2 on a RPi CM4 @2.4GHz.

fpa

374.7 secs

softpcs (APCS_SOFTPCS_RUNTIME_SWITCH: TRUE)

16.18 secs

softpcs (APCS_SOFTPCS_RUNTIME_SWITCH: FALSE)

15.02 secs

gcc 4.7.4 vfp

3.89 secs

Sep 25, 2021 2:57pm

Stuart Swales (8827) 1357 posts

Wow! That’s a win.

Edit: Note that you can assemble the library with APCS_SOFTPCS_RUNTIME_SWITCH set to {FALSE} for more performance when you know the target will have VFP. That setting eliminates a LDR/LDR/TEQ/BEQ for each basic f.p. operator. e.g. on my 1.0GHz ARMX6:

*flops-vfpf [no run-time switch, so VFP required for basic operations, VFP with FPA fallback for library functions]

   Iterations      =  128000000
   NullTime (usec) =     0.0020
   MFLOPS(1)       =    73.4334
   MFLOPS(2)       =    73.0166
   MFLOPS(3)       =    78.5507
   MFLOPS(4)       =    82.1033

*flops-vfps [run-time switch for VFP/FPA for everything]

   Iterations      =  128000000
   NullTime (usec) =     0.0020
   MFLOPS(1)       =    60.0289
   MFLOPS(2)       =    61.3677
   MFLOPS(3)       =    65.3769
   MFLOPS(4)       =    67.8630

This benchmark is just really exercising the basic f.p. operators.

I forgot to mention, these figures are WITHOUT -Otime for flops.c as that degraded it slightly… Several of the individual methods seem to run faster when flops.c is compiled with -arch 2 -cpu 3 as it uses LDM (and STM) rather than two LDRs to move the double precision values around.
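In C terms the run-time switch amounts to a flag test in front of every basic operation. The real library is assembler, so treat this as a rough sketch: RUNTIME_SWITCH mirrors APCS_SOFTPCS_RUNTIME_SWITCH, and every other name is invented. Both back ends just compute a + b here, and the counters only exist to show which path ran.

```c
/* Set to 0 to mimic assembling with APCS_SOFTPCS_RUNTIME_SWITCH {FALSE}. */
#ifndef RUNTIME_SWITCH
#define RUNTIME_SWITCH 1
#endif

static int vfp_ready = 0;               /* hypothetical flag, set at start-up */
static unsigned long fpa_calls = 0;
static unsigned long vfp_calls = 0;

/* Stand-ins for the two back ends. */
static double add_fpa(double a, double b) { fpa_calls++; return a + b; }
static double add_vfp(double a, double b) { vfp_calls++; return a + b; }

void fp_init(int have_vfp) { vfp_ready = have_vfp; }

double fp_add(double a, double b)
{
#if RUNTIME_SWITCH
    if (!vfp_ready)                     /* the per-operation load, test, branch */
        return add_fpa(a, b);
#endif
    return add_vfp(a, b);
}

unsigned long fp_fpa_calls(void) { return fpa_calls; }
unsigned long fp_vfp_calls(void) { return vfp_calls; }
```

With RUNTIME_SWITCH set to 0 the test disappears from the code entirely, which corresponds to the flops-vfpf figures above (about 73 to 82 MFLOPS against 60 to 68).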

Sep 25, 2021 3:46pm

Rick Murray (539) 13840 posts

The C library is chokka with FPA assembler.

I know. It will need some stuff duplicated, but then maybe a little bit of smarts will be able to set up the Stubs jump table accordingly if called using a new SWI (LibInitAPCS32VFP or something?) in order that older FPA software also works as expected.

But how much are we prepared to annoy people who for good reasons are still running old hardware

My thoughts on this are that it isn’t really an annoyance as such. Everything they own and everything they use won’t mysteriously cease to function. The only difference is that upgrades and new releases of some things won’t work.

Firstly, there is precedent from Acorn (think of all the RiscPC extended stuff that was never officially made available for older machines (why do you think Dummy Dynamic Areas was created?)).

Secondly, there is precedent: take a look at https://www.riscosports.co.uk/vfp/ and note that it isn’t aimed at anything pre-5.2x with VFP.

Thirdly, this should be a question for each individual author. Some bend over backwards to use StubsG to support “damn near everything”, while others figure after over a quarter of a century, the RiscPC has had a good run, but it shouldn’t be a millstone preventing future progress.
My own personal view here is that I write stuff for contemporary machines. If it works on older ones, great. If not, oh well. [I don’t go out of my way to be incompatible, but RiscPC/RO3.7 is not part of my testing regime ¹; I don’t even own a copy of 4.x.]

“Because ancient machines” is a pretty lousy excuse for not having the DDE compiler support hardware maths, and that sort of logic might push people less lazy than me to GCC. I wouldn’t move, as I suck at maths so my code doesn’t tend to be maths heavy, but Chris has provided yet another example of the limitations of emulated FP. I mean, literally, six odd minutes (FPA) versus a mite under four seconds (VFP). I’m not entirely certain what softpcs actually is, but even that hands FPA its arse on a plate, running in at sixteen seconds. Which is way closer to four than it is to six freaking minutes!

As such, softpcs would seem an acceptable alternative (how might I use this in my programs? (Norcroft compiler)), as, really, it’s FPA that’s not fit for purpose…

¹ It used to be, but these days I try to avoid turning on the power hungry Windows box.

Sep 25, 2021 3:52pm

Stuart Swales (8827) 1357 posts

RE: apcs_softpcs – see my post from 19 hours ago (I have no idea how to paste links to individual posts here)

Sep 25, 2021 4:02pm

David Pitt (3386) 1248 posts

see my post from 19 hours ago (I have no idea how to paste links to individual posts here)

At the required post click on the “19 hours” link; that is the link required, so copy it from the URL bar. I do this in a second browser window to avoid losing my place.

Sep 25, 2021 4:13pm

Chris Gransden (337) 1207 posts

Note that you can assemble the library with APCS_SOFTPCS_RUNTIME_SWITCH set to {FALSE}

Down from 16.18 secs to 15.02 secs.

Sep 25, 2021 4:23pm

Stuart Swales (8827) 1357 posts

Down from 16.18 secs to 15.02 secs.

I did wonder about having the first instruction of each function being B FPA-equivalent-function and patch that with a NOP when the softpcs VFP system was initialised (it would need to not be in a READONLY area then). Or less hacky, a more Stubs-like arrangement where it patches in a function table of addresses at run-time to avoid modifying the code. Current implementation was just a get-you-going one to fail-safe to the FPA function set if it was never initialised.
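The Stubs-like variant could look something like the sketch below: a writable table of function pointers that starts out pointing at the FPA set and is repointed once at initialisation, so the code itself is never modified. All names are invented and the four routines are plain C stand-ins, not the library's own.

```c
typedef double (*binop_fn)(double, double);

/* Stand-ins: in the real library these would be the FPA and VFP routines. */
static double fpa_add(double a, double b) { return a + b; }
static double fpa_mul(double a, double b) { return a * b; }
static double vfp_add(double a, double b) { return a + b; }
static double vfp_mul(double a, double b) { return a * b; }

/* Writable table. It starts out on the FPA set, so it stays fail-safe
 * if softpcs_table_init() is never called. */
static struct { binop_fn add; binop_fn mul; } fp_table = { fpa_add, fpa_mul };
static int fp_table_is_vfp = 0;

/* Called once at start-up; have_vfp would come from a hardware probe. */
void softpcs_table_init(int have_vfp)
{
    if (have_vfp) {
        fp_table.add = vfp_add;
        fp_table.mul = vfp_mul;
        fp_table_is_vfp = 1;
    }
}

/* Callers always go through the table: one indirect call, no flag test. */
double fp_table_add(double a, double b) { return fp_table.add(a, b); }
double fp_table_mul(double a, double b) { return fp_table.mul(a, b); }

/* For demonstration: reports which set is currently installed. */
int fp_table_uses_vfp(void) { return fp_table_is_vfp; }
```

This swaps the per-operation flag test for an indirect call. Whether that is actually cheaper on a given core would need measuring.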

@David: Thanks – I can only see links to topics and posts but have now found the post id to use by HTML inspection. Let’s see if I can do it:

https://www.riscosopen.org/forum/forums/2/topics/3457?page=1#posts-45080

was what inspired me to do this.

Sep 25, 2021 5:19pm

David Pitt (3386) 1248 posts

I can only see links to topics and posts but have now found the post id to use by HTML inspection

It only works from within the topics but not from “Recent post”.

While contemplating this message, look up at the one above: the time, above the name, is a link to that post including the #posts-number tag. It even works in NetSurf.

Sep 25, 2021 5:26pm

Rick Murray (539) 13840 posts

Mmm, just read bits of it on my phone. It looks like the sort of FP support that was provided with TurboC way back when – use hardware if available, else emulate. It’s a good compromise.

Thanks. ;-)

Sep 25, 2021 5:28pm

Rick Murray (539) 13840 posts

I can only see links to topics and posts but have now found the post id to use by HTML inspection.

It’s hiding.

Don’t use Recent Posts, go into the actual thread.

Then look at the posting time above the user’s icon. There’s your link.

[alternate: Firefox, install the Display #Anchors add-on]

