Newshound and Eternal September
Chris Hughes (2123) 336 posts |
I did not get errors like this with standard version of NH Runimage. |
Steve Pampling (1551) 8173 posts |
and C/C++ book (the one supplied with the DDE), page 32 (or 43 in the PDF). |
Colin Ferris (399) 1818 posts |
Is it at all possible for RO build of the CLib to have Function / Proc names included? Gerf made a Prog that would back pedal from an error address to print out the first function / Proc name it came to. And if a Prog had terminated – use on the Run image file. |
Dave Higton (1515) 3535 posts |
CMFE F0, #2 comes from the source line: if (difftime(time(NULL),newsserver->cur_time)>2) which is innocuous enough. Neither time(NULL) nor newsserver->cur_time is a floating point number – they are both time_t, which is defined in CLib.h.time as unsigned int. So the floating point value doesn’t exist until difftime() is executed. It does remind me that, a few years back, I was looking at JavaScript in NetSurf to do time operations, and I couldn’t get a consistent value of system time. It’s too long ago to remember the details, but I did conclude that it wasn’t the JavaScript that was going wrong, it was getting the system time that was failing – intermittently, which made it impossible to put up a test case. I wonder if we’re seeing something related here? |
Dave Higton (1515) 3535 posts |
How to work out what is in F0? There isn’t much code in front of the comparison between F0 and 2:

    MOV R0, #0
    BL &0001A37C
    LDR R1,[R6,#2524]
    BL &0001A374
    CMFE F0,#2

(The BL addresses vary according to the build, of course.) Presumably, the first two instructions are doing time(NULL), the third is getting newsserver->cur_time, and the fourth is doing… what? There has to be a subtraction and a conversion to double. The two BL addresses must be to the Shared C Library, as there’s just a large area of zeroes where they point to. There must be a list of Shared C Library function entry points somewhere, which would make it possible to work out what functions are being called. I’m on SharedCLibrary 6.08 (05 Sep 2020) – what’s your version, Chris? |
Chris Hughes (2123) 336 posts |
The same one as you. For reference the latest error gave: |
Colin Ferris (399) 1818 posts |
Have you tried running your RunImage through Druck’s Armalyser? |
Martin Avison (27) 1495 posts |
I can now confirm that – had one today. Only other clues were a PM report which I did not have time to note – but next time I will! |
Rick Murray (539) 13857 posts |
Yes, difftime is defined in the source as:

    double difftime(time_t time1, time_t time0)
    {
        return (double)time1 - (double)time0;
    }

Given that time_t is an unsigned int (under RISC OS and POSIX), and this is a simple subtraction, why on earth is it promoting everything to double? Especially given that that’s the only time double appears in the entire time source (...Sources.Lib.RISC_OSLib.c.time).
Yes, that makes sense.
Well, time() is &1A37C. This unknown call is &1A374, which is 8 less. In other words, two functions before time(). The answer is… kind of obvious. ;-)

    Entry clock, imported, , , 0
    Entry difftime, imported, , , 2
    Entry mktime, imported, , , 1
    Entry time, imported, , , 1
    Entry asctime, imported, , , 1

There’s time(), and two before it is… difftime()! ;-) It will have left the result in F0, so the final instruction is comparing that with two. |
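The address arithmetic in that post can be sketched in C. This is only an illustration of the reasoning: the 4-byte entry size is inferred from "8 less means two functions before", and `stub_name`, `TIME_ENTRIES` and `STUB_ENTRY_SIZE` are hypothetical names invented for the sketch, not anything taken from the Shared C Library sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: each stub entry occupies one 4-byte word. */
#define STUB_ENTRY_SIZE 4

/* The five entries quoted in the post, in the order they were listed. */
static const char *const TIME_ENTRIES[] = {
    "clock", "difftime", "mktime", "time", "asctime"
};
#define TIME_ENTRY_COUNT (sizeof TIME_ENTRIES / sizeof TIME_ENTRIES[0])

/* Given the address and list index of one known entry, return the name of
   the entry at target_addr, or NULL if it is misaligned or off the list. */
const char *stub_name(uint32_t known_addr, size_t known_index,
                      uint32_t target_addr)
{
    int64_t delta = (int64_t)target_addr - (int64_t)known_addr;
    if (delta % STUB_ENTRY_SIZE != 0)
        return NULL;

    int64_t index = (int64_t)known_index + delta / STUB_ENTRY_SIZE;
    if (index < 0 || index >= (int64_t)TIME_ENTRY_COUNT)
        return NULL;

    return TIME_ENTRIES[index];
}
```

With time() known to be at &1A37C (index 3 in the list above), calling `stub_name(0x1A37C, 3, 0x1A374)` steps back two entries and lands on difftime.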
Martin Avison (27) 1495 posts |
And when restarting NH and doing the failed fetch again, I got (with details this time!)…
(Oh why do pre code tags sometimes double space?!) |
Rick Murray (539) 13857 posts |
What are you using to read the code? Zap (rick-06 or later) is capable of recognising programs that use CLib and annotating with function names. Here’s a short example (I’ve removed the opcode data to save space):

    00009080 : BL &0001FCAC ; call: strstr
    00009084 : CMP R0,#0
    00009088 : MOVNE R1,R8
    0000908C : ADDNE R0,R13,#&028C ; =652
    00009090 : MOVNE R2,#&40 ; ="@"
    00009094 : BLNE &0001FC7C ; call: strncpy
    00009098 : ADD R0,R13,#&0C ; =12
    0000909C : ADR R1,&000091E0 ; -> string: "notice"
    000090A0 : BL &0001FCAC ; call: strstr
    000090A4 : CMP R0,#0
    000090A8 : BEQ &000090D0
    000090AC : MOV R1,R7
    000090B0 : ADD R0,R13,#&020C ; =524
    000090B4 : MOV R2,#&7F ; =127
    000090B8 : BL &0001FC7C ; call: strncpy
    000090BC : ADD R0,R13,#&020C ; =524
    000090C0 : BL &0001FCBC ; call: strlen
    000090C4 : ADD R1,R13,#&020C ; =524 |
Rick Murray (539) 13857 posts |
That code is difftime. Running the compiler in assembler output mode generates this:

    difftime
    000000 e2202102 EOR a3,a1,#&80000000
    000004 e2211102 EOR a2,a2,#&80000000
    000008 ee022190 FLTD f2,a3
    00000c ed9f8104 LDFD f0,[pc, #L000024-.-8]
    000010 ee031190 FLTD f3,a2
    000014 ee022180 ADFD f2,f2,f0
    000018 ee033180 ADFD f3,f3,f0
    00001c ee220183 SUFD f0,f2,f3
    000020 e1a0f00e MOV pc,lr
    L000024
    000024 41e00000 DCFD 2147483648.0
    000028 00000000

How such a simple looking function turns into that….makes my head hurt. |
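One plausible reading of that listing, offered as a sketch rather than a statement of what the compiler authors intended: FLTD converts a signed integer, so an unsigned value is first biased by flipping its top bit (the EOR with &80000000), converted as signed, and then corrected by adding 2147483648.0 (the ADFD with the DCFD constant). The C below mirrors that sequence; the function names are hypothetical.

```c
#include <stdint.h>

/* Convert an unsigned 32-bit value to double using only a signed
   int-to-double conversion, following the EOR / FLTD / ADFD pattern. */
double u32_to_double_via_signed(uint32_t u)
{
    /* Flipping the top bit maps 0..2^32-1 onto -2^31..2^31-1.
       Assumption: two's complement, so the cast below is a plain
       reinterpretation of the bits (true on ARM). */
    int32_t biased = (int32_t)(u ^ 0x80000000u);

    /* Signed conversion, then add 2^31 back to undo the bias. */
    return (double)biased + 2147483648.0;
}

/* The whole of difftime expressed with that helper: two conversions
   followed by one floating point subtraction. */
double difftime_sketch(uint32_t time1, uint32_t time0)
{
    return u32_to_double_via_signed(time1) - u32_to_double_via_signed(time0);
}
```

Every 32-bit integer is exactly representable in a double, so both the conversion and the subtraction are exact; for example `difftime_sketch(7, 10)` gives -3.0 rather than a huge wrapped-around unsigned value.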
Dave Higton (1515) 3535 posts |
Thanks, all. I discovered this lot at the same time as you! Except the bit about Zap – I’ve been using rick-07 for ages, but the comment option (which I didn’t know existed) was turned off. So, thanks for that!
Because that’s what difftime() is defined to return, possibly for compatibility with other platforms where time_t might be something other than an integer count of seconds? Dunno. I have to admit that I don’t even begin to understand Martin Avison’s dump above. So: difftime subtracts two 32-bit integers and returns the result as a double. What combination of input values can cause an invalid operation? |
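For comparison, the "more than two seconds have passed" test can be written with no floating point at all. This is a minimal sketch that assumes time_t really is an unsigned 32-bit count of seconds, as described above; `more_than_seconds_elapsed` is a hypothetical helper, not something from the NewsHound source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Integer-only equivalent of difftime(now, then) > limit. */
bool more_than_seconds_elapsed(uint32_t now, uint32_t then, uint32_t limit)
{
    /* difftime() would return zero or a negative number here, which can
       never exceed the limit. Checking first also stops the unsigned
       subtraction below from wrapping round to a huge value. */
    if (now <= then)
        return false;

    return (now - then) > limit;
}
```

The explicit `now <= then` check is the one design point: a bare unsigned `now - then > 2` would report a large elapsed time whenever the stored time is ahead of the clock, whereas difftime() would not.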
Martin Avison (27) 1495 posts |
While I was painting this morning NH had…
There was nothing in *Where and no OS dump, but NH had vanished from the iconbar! I had a closer look at NH Syslog, and noticed that normally there is something like
but when it has a problem that run obviously ends with just the DATE request – the next entries are when NH is restarted. |
Martin Avison (27) 1495 posts |
Some thoughts about the FP errors – DaveH has said he has done nothing in that area … but anyway I am having problems with v1.52-32. However, the incidence seems to have increased – I cannot remember seeing such problems in the years I have been using NH to fetch every 15 minutes. But they seem associated with date/time calculations, and one thing which has changed is the date/time, and I wonder if it has increased to a value which is causing more of these problems? |
Rick Murray (539) 13857 posts |
Is it a problem with parsing the date? Have date strings changed at all (other than switching to summer time)? Is something perhaps using a timezone name rather than hour offset? |
Rick Murray (539) 13857 posts |
Divide by zero isn’t an exception. It’s a maths error that could be trapped using signal(), but given the indeterminate state afterwards, it’s usually better to sanitise values to avoid the problem in the first place; e.g. if you’re dividing something by something else that you are not absolutely sure cannot be zero, check it before the division. And if you are sure the value cannot be zero…check it anyway. What you see looks like a generic handler that basically does “oh crap, this is bad, let’s just spew some nerdy gibberish and then abort”. |
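The "check it before the division" advice might look like this for an integer percentage counter. It is a hedged sketch: `percent_done` and the choice to report 0 for an empty total are illustrative assumptions, not taken from any of the programs discussed.

```c
/* Percentage of 'done' out of 'total', clamped to the range 0..100. */
unsigned int percent_done(unsigned long done, unsigned long total)
{
    /* Guard the divisor even if it "cannot" be zero. */
    if (total == 0)
        return 0;

    /* Clamp in case the two arguments were passed the wrong way round. */
    if (done >= total)
        return 100;

    /* Widen before multiplying so done * 100 cannot overflow. */
    return (unsigned int)(((unsigned long long)done * 100u) / total);
}
```

For instance, `percent_done(50, 200)` yields 25, while `percent_done(5, 0)` returns 0 instead of reaching the division.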
Dave Higton (1515) 3535 posts |
Builds since yesterday have function names.
Where I’ve been able to find floating point division used, I’ve observed that a check for zero was already in place, along with appropriate correction. Re. DATE – the NewsHound Help file states that DATE isn’t part of the NNTP RFCs, and is not supported by all servers. I can’t add any personal insight into that. NewsHound can be configured to ignore the error if it requests DATE but doesn’t get it, and supposedly gets the time from the computer’s clock. Again, not tested because the servers I use do support DATE and it has never gone wrong for me. Since I’ve found and fixed some more bugs since the test version that went out, I’m now convinced that the best way forward is to try the current version rather than do any more reverse engineering on the obviously defective version. I’ll send one out to people who’re having trouble. |
Rick Murray (539) 13857 posts |
I have a feeling (having run into this myself with a faulty integer percentage counter (got the amount and size back to front)), that this may be any division by zero, not just FP ops. |
Dave Higton (1515) 3535 posts |
If so, we’d all be seeing the same problems. |
Martin Avison (27) 1495 posts |
Seems to be a bad day for me: After returning from a walk, I found an error box with I see difftime, which was mentioned earlier. I could send the dump if anyone interested, or even post it all here.
Is that the CLib in ROOL ROM builds, or yours?
That would be fine, thanks. |
Rick Murray (539) 13857 posts |
Hmm… CLib entry #9: x$divtest (internal function) |
Dave Higton (1515) 3535 posts |
My builds of NewsHound. |
Dave Higton (1515) 3535 posts |
I hadn’t twigged until Martin emailed me just now, but he’s having these problems with a NewsHound RunImage from over 10 years ago! Long before I started interfering with it. So the problems are not new. That puts a different slant on possible solutions. |
David J. Ruck (33) 1636 posts |
If it’s from 10 years ago there could be all sorts of dogginess on recent machines. I’ve found a lot of software which worked up to the Pi4 and then had issues, many due to bodged 32 bit conversions, i.e. put a 32 bit header on the file, and just leave all the 26 bit only code as is. |