RISC OS Open: Forum: Crash when dividing by 0 with signals disabled

Aug 17, 2021 9:06pm

Charles Ferguson (8243) 427 posts

Hiya,

In the xml/xslt tools I produced recently I have code that generates a NaN and Infinities by turning off signals and then doing a 0/0 and 1/0 respectively. This is a pretty normal way to generate these values. It always worked for me.

However, that was because I was using RISC OS 4.

On RISC OS 3.7 and RISC OS 5, it crashes badly.

Details can be found here:

https://github.com/gerph/riscos-nantest
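
A minimal sketch of the technique described in this post, for anyone who has not seen it. This is not the code from the repository: `make_specials` is a hypothetical name, and the `volatile` operand is an assumption added so that the divides happen at run time rather than being folded by the compiler. Whether `SIG_IGN` really stops the trap depends on the C library in use.

```c
#include <signal.h>

/* Hypothetical helper, not taken from riscos-nantest: fills in a NaN and
 * a +Inf by dividing by zero while SIGFPE is ignored, then restores the
 * previous handler. */
static void make_specials(double *nan_out, double *inf_out)
{
    volatile double zero = 0.0;   /* volatile: force run-time divides */
    void (*old_handler)(int) = signal(SIGFPE, SIG_IGN);

    *nan_out = zero / zero;       /* 0/0 is an invalid operation -> NaN */
    *inf_out = 1.0 / zero;        /* 1/0 is a divide by zero -> +Inf */

    if (old_handler != SIG_ERR)
        signal(SIGFPE, old_handler);
}
```

On a system where floating point traps are disabled by default, the `signal()` call makes no difference to the result; it only matters where the divide would otherwise raise SIGFPE.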

Aug 18, 2021 12:36pm

Julie Stamp (8365) 474 posts

I don’t know much about floating point numbers, but here’s what I get on my Raspberry Pi:

*cc c.nantest
Norcroft RISC OS ARM C vsn 5.83 [01 Jun 2020]
"c.nantest", line 21: Warning: floating point constant invalid operation: '/'
"c.nantest", line 23: Warning: actual type 'unsigned long' mismatches format '%08x'
"c.nantest", line 23: Warning: actual type 'unsigned long' mismatches format '%08x'
"c.nantest", line 38: Warning: floating point constant division by zero: '/'
"c.nantest", line 40: Warning: actual type 'unsigned long' mismatches format '%08x'
"c.nantest", line 40: Warning: actual type 'unsigned long' mismatches format '%08x'
c.nantest: 6 warnings, 0 errors, 0 serious errors

*nantest
NaN test
nan = nan
nan = 7ff80000/e0000000
same = 0, different = 1
INF test
inf = inf
inf = 7ff00000/00000000
same = 1, different = 0

RISC OS 5.29 (01 Aug 2021)
==> Help on keyword SharedCLibrary
Module is: C Library       6.09 (06 Feb 2021)
==> Help on keyword FPEmulator
Module is: FPEmulator      4.38 (03 Jul 2021) (1.13CELM)

I get the same result on RPCEmu (except there’s no CEL in the FPEmulator version code).

Aug 18, 2021 3:11pm

Stuart Swales (8827) 1357 posts

Working OK for me too on ARMX6, RISC OS 5.29 (02 Nov 2020), SCL 6.08, FPE 4.37; compiled with Norcroft 5.86 [10 Feb 2021].

Has gcc got the optimisation for comparisons with f.p. constant values wrong when targeting arm-riscos?

[Aside to others: when handling f.p. values that may have NaN/Inf, usually best to use the comparison macros in math.h]
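
As an illustration of that aside, here is a small sketch using the C99 classification and comparison macros from `math.h`. The helper names `safe_less` and `describe` are hypothetical; only the macros themselves (`isless`, `isnan`, `isinf`) come from the standard.

```c
#include <math.h>

/* Hypothetical helper: returns 1 if a < b and 0 otherwise. Unlike the
 * plain < operator, isless() does not raise an invalid operation
 * exception when either operand is a quiet NaN. */
static int safe_less(double a, double b)
{
    return isless(a, b) ? 1 : 0;
}

/* Hypothetical helper: classifies a value without comparing it with
 * itself, so it does not depend on how the compiler treats d == d. */
static const char *describe(double d)
{
    if (isnan(d))
        return "nan";
    if (isinf(d))
        return d > 0 ? "+inf" : "-inf";
    return "finite";
}
```

The related macros `isgreater`, `islessequal`, `isgreaterequal`, `islessgreater` and `isunordered` follow the same pattern.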

Aug 18, 2021 5:12pm

Charles Ferguson (8243) 427 posts

Ok, so that means that when Julie Stamp and Stuart Swales compile with their compilers and Stubs on their machines there’s no problem.

What about when you run the binary that was supplied?
Eliminating other compilers and stubs from the problem is helpful in reducing the scope though.

User reporting the issue was running it on RPCEmu on RISC OS 5.29. My tests were on RPCEmu using the RISC OS 5 from the easy start bundle (RISC OS 5.27 (19 Mar 2020)).

Stuart:

Has gcc got the optimisation for comparisons with f.p. constant values wrong when targeting arm-riscos?

gcc doesn’t come into it. This is compiled with Norcroft 5.18+a bit and stubsg.

Aug 18, 2021 5:41pm

Steve Pampling (1551) 8172 posts

Norcroft 5.18+a bit and stubsg

Perhaps if you compile with something other than Stubsg and maybe rule in (or rule out) the age of the Norcroft package since Stuart has no visibility of an issue when using Norcroft 5.86

Aug 18, 2021 6:20pm

Stuart Swales (8827) 1357 posts

It appears to be ‘faulting’ trying to continue from the FP exception somewhere in the guts of the SCL.

Gerph’s binary fails on my system at the DVFD f0,f0,#0.

I can reproduce the fault using my newer-than-thou compiler if I break out the divide by zero to a separate routine (newer Norcroft otherwise just computes the NaN (and the two booleans) at compile time).

If I then replace the SIGFPE handler with one which prints “SIGFPE” and continues, then on the FP exception the SCL does call that handler, but still postmortems … quite deliberately.

See https://gitlab.riscosopen.org/RiscOS/Sources/Lib/RISC_OSLib/-/blob/master/clib/s/cl_body#L353

I think here you’d have to wrap the divide by zero in (C99) feholdexcept & feclearexcept/feupdateenv to try to prevent the FPEmulator raising the exception.

Aug 18, 2021 6:46pm

Charles Ferguson (8243) 427 posts

It’s somewhat odd that it’s failing for the user when it worked for you – and I can only assume it’s some factor of the compiler or stubs that’s affecting it there. That said, I’ve tried the RISC OS 5 CLib on Pyromaniac with a full trace, and it failed with an error, reporting an invalid operation, but which didn’t crash in the way I described earlier. I’ve included the findings on the repo’s README.md, together with the full log of what the RO 5 CLib did.

Aug 18, 2021 6:57pm

Stuart Swales (8827) 1357 posts

Bizarrely my suggested feclearexcept(FE_DIVBYZERO) didn’t clear the /0 exception which is still then raised by feupdateenv(&env).

If I change it to feclearexcept(FE_ALL_EXCEPT), that does the trick – it will happily divide by zero yielding NaN/Inf.


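#include <fenv.h>

/* Divide by zero with FP traps held off, returning the NaN/Inf result. */
double dodivbyzero(double d)
{
    fenv_t env;
    feholdexcept(&env);            /* save the FP environment and stop exceptions trapping */
    //puts("After feholdexcept");
    d = d / 0.0;
    //puts("Back from divide");
    feclearexcept(FE_ALL_EXCEPT);  /* FE_DIVBYZERO alone leaves the invalid operation flag set */
    feupdateenv(&env);             /* restore the environment; no flags are left to re-raise */
    //puts("After feupdateenv");
    return d;
}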

[Edit: That’s because 0/0 on the FPA gives an Invalid Operation exception (FE_INVALID), not divide by zero (FE_DIVBYZERO).]

Aug 18, 2021 7:04pm

Charles Ferguson (8243) 427 posts

I can reproduce the fault using my newer-than-thou compiler if I break out the divide by zero to a separate routine (newer Norcroft otherwise just computes the NaN (and the two booleans) at compile time).

Nice that it optimises that into a NaN… BUT the side effect should still have been there! There should be an equivalent of a `__rt_divtest` call to raise the side effects of the optimised away calculation.

The optimised away code and the lack of the side effect then explains why Julie and Stuart got versions that worked. Yay! One mystery solved.

If I then replace the SIGFPE handler with one which prints “SIGFPE” and continues, then on the FP exception the SCL does call that handler, but still postmortems…

Which I guess implies that it’s not honouring the SIG_IGN – which is what it looks like from the trace I’ve just done of the code. The SCL never even tries to disable the signals from being generated.

Looking through git logs, in the ancient sources the signal handler disabling exceptions was addressed in 2001 in my version of CLib… Coo, over 20 years ago. Now I feel old.

commit 1000c2c3e4994ce9dff991b7438dae691ce4a5b9
Author: justin <>
Date:   Sun Aug 5 02:23:44 2001 +0000

    Summary:
      Added support for SIGFPE ignoring.
    Detail:
      * At present, if a SIGFPE happens we call our handler. It is run and on
        return we produce a postmortem request. This isn't useful. As a first
        stage to allowing a handler on SIGFPE, we allow SIG_IGN to disable
        all the exceptions associated with the FPE. This means that if we have
        code that does :
          signal(SIGFPE,SIG_IGN);
        subsequently, all FPE exceptions will be ignored (where previously a
        postmortem would have been produced).
          signal(SIGFPE,&handler);
        will restore the signal state and allow us to report exceptions in the
        normal manner.
    Admin:
      Tested with Galaxy doing an explicit overflow on Virginia; seems to work
      and continues as you might expect. In theory, this fixes the Galaxy
      crashes and should allow other users to perform their normal
      signal(SIGFPE,...) operations as they would under unix.
    Tag:
      RISC_OSLib-4_92

Aug 18, 2021 7:06pm

Charles Ferguson (8243) 427 posts

For reference, in the actual change, I’ve reduced this to just setting the value of NaN and the infinities directly by encoding the 64bit values:

https://github.com/gerph/libxml2/commit/b1909005f496e0d08664501ec35c9c7649de5ab4
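
A rough sketch of what encoding the values directly can look like. This is not the code from the linked commit: the function names are hypothetical, and it assumes IEEE 754 binary64 doubles stored with the same byte order as `uint64_t`. FPA-format doubles on RISC OS store the high word first, so a real RISC OS build would have to allow for that.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 64-bit pattern as a double.
 * Assumes IEEE 754 binary64 with the same byte order as uint64_t. */
static double double_from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);   /* memcpy avoids aliasing problems */
    return d;
}

/* Exponent all ones, top mantissa bit set: a quiet NaN. */
static double quiet_nan(void)
{
    return double_from_bits(UINT64_C(0x7ff8000000000000));
}

/* Exponent all ones, mantissa zero: infinity; the top bit is the sign. */
static double positive_inf(void)
{
    return double_from_bits(UINT64_C(0x7ff0000000000000));
}

static double negative_inf(void)
{
    return double_from_bits(UINT64_C(0xfff0000000000000));
}
```

Because no arithmetic is performed, this approach cannot raise a floating point exception, whatever the C library does with SIGFPE.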

Aug 18, 2021 7:12pm

Charles Ferguson (8243) 427 posts

I have updated the write up at https://github.com/gerph/riscos-nantest to describe the findings using the CLib on Pyromaniac.
Amusingly, if I use the RISC OS 5 CLib from the harddisc image on Pyromaniac I get the Invalid operation error (because it doesn’t actually ignore the signal) but it doesn’t crash. That’s weird?

Aug 18, 2021 7:15pm

Stuart Swales (8827) 1357 posts

Presumably it borks on vanilla RISC OS 4.02 as well?

Aug 18, 2021 7:36pm

Charles Ferguson (8243) 427 posts

Perhaps if you compile with something other than Stubsg and maybe rule in (or rule out) the age of the Norcroft package since Stuart has no visibility of an issue when using Norcroft 5.86

I was going to say ‘but I know it can’t be the problem, ‘cos I wrote it and it can’t affect this’, but that’s silly – I wrote it, so I know it could be a problem. So let’s try it with a 32bit stubs. Admittedly a 32bit stubs that I updated, but meh… it’s one less thing.

charles@laputa ~/projects/RO/nantest (master)> riscos-amu BUILD32=1
riscos-cc   -c  -Wc -fa   -IC: -za1 -apcs 3/32/fpe2/swst/fp -D__CONFIG=32 -o o32/nantest c/nantest
Norcroft RISC OS ARM C vsn 5.18 (JRF:5.18.119)  [Nov 13 2020]
"c/nantest", line 21: Warning: floating point constant overflow: '/'
"c/nantest", line 23: Warning: actual type 'long' mismatches format '%08x'
"c/nantest", line 23: Warning: actual type 'long' mismatches format '%08x'
"c/nantest", line 38: Warning: floating point constant overflow: '/'
"c/nantest", line 40: Warning: actual type 'long' mismatches format '%08x'
"c/nantest", line 40: Warning: actual type 'long' mismatches format '%08x'
c/nantest: 6 warnings, 0 errors, 0 serious errors
riscos-link -rescan -C++ -aif -o aif32.nantest-stubsg o32.nantest C:o.stubsGS
nantest-stubsg: All built {Disc}
charles@laputa ~/projects/RO/nantest (master)> riscos-amu BUILD32=1 -f MakefileStubs,fe1
riscos-link -rescan -C++ -aif -o aif32.nantest-stubs32 o32.nantest <Lib$Dir>.CLib.o.stubs-32
nantest-stubs32: All built {Disc}
charles@laputa ~/projects/RO/nantest (master)> pyrodev --common --command aif32.nantest-stubs32
NaN test
nan = nan
nan = 7ff80000/e0000000
same = 1, different = 0
INF test
inf = inf
inf = 7ff00000/00000000
same = 1, different = 0
charles@laputa ~/projects/RO/nantest (master)>

Tested, new aif posted (and the old one renamed to nantest-stubsg). Same problem, so it’s not stubsg that causes the problem. Or it is stubsg that causes the problem AND it’s in my stubs as well. And Stuart has repro’d with his own build too, so I’m pretty sure I’m in the clear here.

Aug 18, 2021 7:53pm

Charles Ferguson (8243) 427 posts

Presumably it borks on vanilla RISC OS 4.02 as well?

Going by that code comment, I’d assume so.

Yup, just tested it and I get an exception at 3ffffffc. So basically this is something that I’ve fixed in Select. Yay.

Aug 18, 2021 8:00pm

Stuart Swales (8827) 1357 posts

So basically this is something that I’ve fixed in Select

Was the fix there to turn off FP exceptions at source when you do signal(SIGFPE, SIG_IGN), or let them run through the exception handling and then handle SIG_IGN being raised without calling postmortem()?

the side effect should still have been there

You do get a nice compiler warning about invalid operations ;-)

Aug 18, 2021 8:16pm

Julie Stamp (8365) 474 posts

I don’t believe that the LDMIB is aborting at all. The value a2 = &FFFFFFFD = SIG_IGN is exactly what you’d expect: the LDMIB is loading registers from the register dump, and that was the value of a2 when the DVFD F0, F0, #0 was executed because it was the second argument to the signal() immediately before.

I don’t follow the relevant piece of _postmortem, but I see it does check whether it’s just come from RaiseIt, so presumably in this case it lets that show up in the back trace.

Thanks for letting us know about the not ignoring signals.

Aug 18, 2021 8:18pm

Stuart Swales (8827) 1357 posts

It looks like the reason that you get same=1, different=0 when running on RISC OS 4 is that your compiler has optimised the d==d and d!=d expressions to 1 and 0 – there is no CMFD subsequent to the DVFD.
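
For comparison, this is the behaviour an IEEE 754 self-comparison should show when it is actually carried out at run time. The code is a hypothetical reconstruction of the kind of check the test prints, not the source from the repository, and the `volatile` copy is an assumption added to stop the compiler folding the comparisons.

```c
/* Hypothetical reconstruction of the same/different check. */
struct self_compare {
    int same;       /* result of d == d */
    int different;  /* result of d != d */
};

static struct self_compare compare_with_self(double value)
{
    volatile double d = value;  /* volatile: each comparison reloads d */
    struct self_compare r;

    r.same = (d == d);          /* false only for a NaN */
    r.different = (d != d);     /* true only for a NaN */
    return r;
}
```

With a NaN this should give same = 0, different = 1, as in Julie's output earlier in the thread; with an infinity it should give same = 1, different = 0.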

Aug 18, 2021 10:01pm

Charles Ferguson (8243) 427 posts

Stuart:

Was the fix there to turn off FP exceptions at source when you do signal(SIGFPE, SIG_IGN), or let them run through the exception handling and then handle SIG_IGN being raised without calling postmortem()?

Yeah, if you don’t raise the exceptions you avoid going through a lot of extra hoops – and of course you /can’t/ have exceptions happen if you’re in FP code in SVC mode, so allowing them and hoping that the environment handler will catch them and recover wasn’t a solution. I’m not sure that that was actually in my mind at the time, but that’s certainly a reason why you shouldn’t try to leave it to the environment handlers to deal with it.

Julie:

I don’t believe that the LDMIB is aborting at all. The value a2 = &FFFFFFFD = SIG_IGN is exactly what you’d expect: the LDMIB is loading registers from the register dump, and that was the value of a2 when the DVFD F0, F0, #0 was executed because it was the second argument to the signal() immediately before.

Oh, that’s interesting. I had assumed that that was the case when it crashed.
Oh, I think I get it now… yes, the address that’s failing is the value of R14, which has been set to &fc1367f4 because of the BL. Yup, thanks for the explanation – the LDMIB using R1 is a red herring.

Stuart:

It looks like the reason that you get same=1, different=0 when running on RISC OS 4 is that your compiler has optimised the d==d and d!=d expressions to 1 and 0 – there is no CMFD subsequent to the DVFD.

Tut… I had intended to check what the comparison code did, but was dealing with the actual crash first. That I might be able to fix in the compiler. Maybe. My compiler-fu is not great for features. Making it work on 64 bit systems – easy… changing behaviour – scary.
