Improving errors
Jeffrey Lee (213) 6048 posts |
Prompted by this thread I thought it would be worthwhile to try and jot down a list of a few potential ways we could improve the way errors are handled by the OS + apps.
The last two are going to be the tricky ones. We could bodge in some support by changing the way that MessageTrans works (e.g. have MsgTrans_ErrorLookup generate a second, hidden buffer which contains the full-length error message, and then add a new SWI which will allow code which reports the error text to map from the public error pointer to the hidden internal buffer). But for a non-bodge solution we’ll probably need to introduce an entirely new method of managing errors within the OS. |
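To make the "hidden buffer" idea above a little more concrete, here is a minimal sketch, modelled in Python purely for brevity. It is not how MessageTrans works today. `HiddenErrorStore`, `error_lookup` and `full_text_for` are hypothetical names standing in for a modified MsgTrans_ErrorLookup and the proposed new SWI, and the 251-character limit assumes the usual 256-byte error block (4-byte error number plus a zero-terminated message).

```python
from dataclasses import dataclass

# Assumption: 256-byte error block = 4-byte number + text + 1-byte terminator.
MAX_TEXT = 251


@dataclass(eq=False)  # eq=False keeps identity hashing, mimicking lookup by pointer
class ErrorBlock:
    number: int
    text: str  # the public, possibly truncated, message


class HiddenErrorStore:
    """Hypothetical model of the bodge: public block plus hidden full text."""

    def __init__(self):
        self._full_text = {}  # public ErrorBlock (by identity) -> full message

    def error_lookup(self, number, full_text):
        # Stand-in for a modified MsgTrans_ErrorLookup: the caller gets the
        # usual truncated block, the full text is kept on the side.
        block = ErrorBlock(number, full_text[:MAX_TEXT])
        self._full_text[block] = full_text
        return block

    def full_text_for(self, block):
        # Stand-in for the proposed new SWI: map the public pointer back to
        # the hidden buffer, falling back to the public text if unknown.
        return self._full_text.get(block, block.text)
```

Code that only knows about the public block keeps working unchanged; only the code that reports the error needs to call the extra lookup. A real implementation would also need some policy for discarding hidden buffers, which this sketch ignores.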
Martin Avison (27) 1494 posts |
Agreed. Many of the OS error messages were conceived in a time when programs ran one at a time – very single tasking – so it was obvious what program caused the error, and probably which action and object. In the multi-tasking Wimp world this is no longer true at all – particularly with the layered nature of programs. Perhaps any other prime example cases could be logged in this thread? If the management of errors is affected, maybe it should include non-blocking Wimp error boxes?! [steps back smartly] |
Chris Johnson (125) 825 posts |
A consistent way of logging errors would be of great benefit – how often have we had to try to copy the more relevant parts of a vdu window? |
Chris Hall (132) 3558 posts |
I think an error generated whilst a particular line of an Obey file is being run (at the highest level) should report the file in which the line that caused the error resides. For example: CLib out of date at line xx of file !Boot.somewhere.obeyfile |
Steve Pampling (1551) 8172 posts |
Thank you for putting such a comfortable saddle on my hobby horse :) All, and I mean all, system errors should be logged both to a display message (multitasking please) and to a text file on a convenient file system. All application errors should do the same by default, but a setting in the !Run file could be allowed to override this to produce a blocking error if the current system-wide settings allow it. I’m working on the basis that once in a blue moon there might be something that needs to stop the machine dead. I’m thinking of something based around SysLog, though a newer version out in Linux land might be a better base than the current v0.20 that we make do with in RO5.x installs. Oh, and the prime point for starting with decent logging is at the beginning: how many times have we had conversations on the subject of which particular bit of the boot sequence died, tripped up, etc.? New users arrive, get bewildered and leave. Migrating users (upgrades from RO4.02 being commonest) discover nice things like the “missing sprites” error: great, there’s a missing file, but what exactly complained in that way? I shall release the reins now… |
Rick Murray (539) 13850 posts |
Jeffrey: I think the advanced error handling should be a part of the Debugger module. In that way, it may be possible to use the backtrace code to unwind C programs from the error point…?
No. That should be a separate API. Think about it – if an application is “blocked” due to an error being shown, what happens if a user closes a window or drags something across an open window? RISC OS is not compositing, it needs application help. It may be possible to have a minimal API registered with the Wimp as a sort of callback (sorry, precludes BASIC) to perform some polling behaviours when an error box is visible, but the only other way to get this to work is to actually write it into the program with a minimalist polling loop around the error window being shown. Steve – as I run a server, I wrote a small module to automatically deal with error messages. As I was doing it, I added a tweak where it can record the messages to DADebug as well. I’ve found that useful.
Because of could-do-better error messages? Or because our paradigm is vastly different to how other systems behave. Some that I’ve encountered the most:
It’s easy for us, many of us grew up with RISC OS since the days before it even had an outline font renderer, or apps in ROM… But for a newbie? :-)
Rarely. The times I manage to stiff the machine it’s usually one of RISC OS’ big weaknesses, a stack imbalance in module code. Mess up the SVC stack, it’s curtains. Modules probably ought to mostly run in SYS mode but that didn’t exist way back when. |
Steve Drain (222) 1620 posts |
Could someone wiser confirm that most errors can be, and ought to be, handled by an app while multi-tasking? The default of passing an error back to the calling program, the Wimp generally, is what results in a lot of single-tasking error boxes. Using the Toolbox I like to present non-fatal errors from BASIC in a modified QuitDialogue, giving the user the error message and the option to quit or proceed. Is that sensible? |
Rick Murray (539) 13850 posts |
Xxxxx Can be, but it can be a lot more work – perhaps involving a complicated set of interlocks to prevent certain actions from occurring while the multitasking error message is displayed. To be honest, I often cheap out and just toss the error block at Wimp_ReportError… [edit: added Xs to get around Textile stupidity] |
Martin Avison (27) 1494 posts |
There are frequently references to SysLog … is it about time that it was included somewhere in RO5? I am still using the DoggySoft v0.20 dated 2003, and I have no idea if it is still the best one to use. |
Jeffrey Lee (213) 6048 posts |
Possibly. There are a few aspects of it that I don’t like, but it is useful, and has been around long enough that a few different apps support it as standard. ROL also include a version in their ROMs. I think the main improvements I’d want to see would be:
It may also make sense to add a plug-in API for handling the messages; that way any number of different types of backend could be supported (e.g. the ability to forward to DADebug, or other bespoke debug interfaces). |
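As a rough illustration of the plug-in idea above, the sketch below models a log dispatcher that forwards each message to any number of registered backends. It is written in Python only to keep it short; `LogDispatcher`, `register` and `log` are hypothetical names, not part of SysLog or DADebug, and the `(source, priority, message)` shape of a log entry is an assumption.

```python
class LogDispatcher:
    """Hypothetical plug-in point: each backend is a callable taking
    (source, priority, message)."""

    def __init__(self):
        self._backends = []

    def register(self, backend):
        # A backend could forward to a file, DADebug, a serial port, etc.
        self._backends.append(backend)

    def log(self, source, priority, message):
        # Offer the message to every backend; return how many accepted it.
        delivered = 0
        for backend in self._backends:
            try:
                backend(source, priority, message)
                delivered += 1
            except Exception:
                # One failing backend must not stop the others from logging.
                pass
        return delivered
```

For example, registering `lambda s, p, m: lines.append(f"{s}: {m}")` as a backend would collect formatted lines in a list called `lines`. The point of the design is that the core logger never needs to know which debug interfaces exist.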
Jon Abbott (1421) 2651 posts |
Definitely need non-blocking error dialogues. Those damned things plague my life…having to write down error messages so I can then open a command prompt to examine why something crashed – it’s infuriating.
I believe we’re talking about dialogues blocking the whole Wimp, not blocking one app – that app can be terminated if it’s reported a non-recoverable error and does not need its windows redrawn. |
Ron (2686) 63 posts |
If an app has reported an error via the normal error box, shouldn’t the Wimp then just exclude that app from its polling loop until the box has been “closed”? It would only have to check for a key press or mouse click in the error box.
Thoughts:
Error boxes pushed to the top of the window stack?
An error box manager listing all current apps/error boxes (or clickable via the Switcher icon).
Icon bar app icons highlighted or flashing (cough/mumble, a bit like Windows) when there is an error box open.
That could maybe also be applied to other query boxes where an app has stalled.
Ron |
Steve Pampling (1551) 8172 posts |
Jeffrey:
Don’t like aspects of SysLog 0.20, don’t like aspects of RO implementations of SysLog, or don’t like SysLog in general? Martin:
On RO5.x you don’t really have a choice unless you go get some source and port a newer version. |
Jeffrey Lee (213) 6048 posts |
Jeffrey: All of the above.
|
Steve Pampling (1551) 8172 posts |
I suppose there’s two aspects:
1 Which reminds me I need to buy stuff on the way in to work tomorrow. |
Rick Murray (539) 13850 posts |
This post of yours reminded me to dust off something I’ve had kicking around and fix it up. It’s really simple. It will pick up on Service_Error and will log the error message to DADebug (but could be modified to, perhaps, push the message to serial port or whatever).
Caveat: It does NOT trap a lot of Wimp_ReportError messages. It depends on whether it’s because an actual error happened or the program just called Wimp_ReportError with some sort of message.
Example:
Error: "Application is not 32-bit compatible".
Error: "Application is not 32-bit compatible".
Error: "Unknown or missing variable".
Error: "Internal error: abort on data transfer at &FC1D5880".
In the latter case, the on-screen message was “Application may have gone wrong. Click Continue to try to resume or Quit to stop Application.”
|
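For readers unfamiliar with what such a logger has to do, here is a minimal sketch of the data handling only, modelled in Python. It is not the module described in the post above. It assumes the standard error block layout (a 32-bit little-endian error number followed by a zero-terminated message) and an `Error: "..."` line format modelled on the examples above; `parse_error_block` and `format_log_line` are hypothetical helper names, and the hook into Service_Error and the hand-off to DADebug are left out entirely.

```python
import struct


def parse_error_block(block: bytes):
    """Split a raw error block into (error number, message text)."""
    (number,) = struct.unpack_from("<I", block, 0)  # first word: error number
    end = block.find(b"\0", 4)                      # message is zero-terminated
    if end == -1:
        end = len(block)                            # tolerate a missing terminator
    text = block[4:end].decode("latin-1")
    return number, text


def format_log_line(block: bytes) -> str:
    """Build the line that would be handed to the logging backend."""
    _, text = parse_error_block(block)
    return f'Error: "{text}"'
```

Anything after the terminator is ignored, so the same helper works whether the caller passes the whole 256-byte buffer or just the used part of it.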
Jeffrey Lee (213) 6048 posts |
LDMNEFD R13!, {R0-R1, PC} ; There's no mechanism for detecting DADebug being loaded. Just re-load this module...
END
Obvious bug is obvious. |
Rick Murray (539) 13850 posts |
:-) Perfectly highlighting the danger of using copy-paste instead of actually typing stuff… [fixed] |