Recovery after "FileCore in use" errors
Jeffrey Lee (213) 6048 posts |
A couple of days ago I was searching through some files using ‘grep’, only to have it crash and break FileCore (i.e. “FileCore in use” errors whenever I tried to access any filesystems). Since FileCore is a pretty important module it’s a bit annoying that it can get left in a broken state like that, especially since it makes it difficult to debug whatever caused the crash (no way of saving memory dumps, etc. – although I guess I could have tried saving to a network share). So I’m wondering if there’s anything that can be done to recover FileCore following a crash.
In a way this issue also touches upon the way that RISC OS behaves whenever unhandled CPU exceptions occur – if an IRQ handler crashes then by flattening the entire IRQ/SVC stack the OS will unceremoniously kill off any foreground tasks (and probably whatever user application happened to be active at the time). So it would be nice if the OS were to save the full CPU context whenever an IRQ occurs, so that it has a known-good location to resume execution from, protecting foreground tasks from any badly behaved background operations. AIUI this is the approach ROL took with their background error system. It’ll slow things down a teensy bit, but I’d say it’s pretty vital for making the OS more crash-proof, or for when we start work on making the OS multithreaded, etc. |
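To make the idea concrete, here is a toy model of "snapshot the foreground context before running an IRQ handler, and resume from the snapshot if the handler crashes". It is only a sketch in Python, not RISC OS code: `ToyCPU`, `dispatch_irq` and the dictionary standing in for registers and the SVC stack are all invented for illustration, and a real implementation would live in the kernel's IRQ dispatcher and deal with banked registers, stacks and so on.

```python
import copy


class ToyCPU:
    """Toy model: 'context' stands in for the registers and SVC stack contents."""

    def __init__(self):
        self.context = {"pc": 0, "svc_stack": []}
        self.failed_handlers = []  # crash log, kept for later debugging

    def dispatch_irq(self, handler):
        # Snapshot the full foreground context before the handler runs.
        saved = copy.deepcopy(self.context)
        try:
            handler(self.context)
        except Exception as exc:
            # The handler "crashed": record it instead of flattening the stacks.
            self.failed_handlers.append((handler.__name__, repr(exc)))
        finally:
            # Either way, resume the foreground from the known-good snapshot.
            self.context = saved


def good_handler(ctx):
    # A well-behaved handler that uses some stack and returns normally.
    ctx["svc_stack"].append("scratch")


def bad_handler(ctx):
    # A badly behaved handler that corrupts state and then faults.
    ctx["svc_stack"].append("half-finished work")
    ctx["pc"] = 0xDEAD
    raise RuntimeError("simulated abort in IRQ handler")


cpu = ToyCPU()
cpu.context["pc"] = 0x8000
cpu.context["svc_stack"].append("foreground SWI frame")

cpu.dispatch_irq(good_handler)
cpu.dispatch_irq(bad_handler)
```

After both dispatches the foreground context is exactly what it was before the interrupts, and the crash is recorded in `failed_handlers` rather than taking the foreground task down with it. The cost is the copy on every dispatch, which is the "teensy bit" of slowdown mentioned above.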
Alan Peters (515) 51 posts |
The point about saving the status to recover from errors in IRQs is a good one. It is important that errors don’t crash the entire operating system, as is often the case at the moment. It’s perhaps one of those things where there isn’t a perfect solution – it would be good to have one that works most of the time. I doubt the speed will be an issue, as most users of RISC OS 5 will have modern hardware. I assume that preserving the full CPU state would include the VFP / coprocessor state, so there is a fair bit to store. As a comparison, and despite huge amounts of effort from M$, Windows 7 still likes a BSOD where a faulty IRQ handler (driver) is involved. It just does it a lot less than it used to. |
Alan Peters (515) 51 posts |
Excuse the double post – not entirely sure how I managed that. I don’t have permission to delete the second one for some reason! [admin] Deleted your duplicate post. |
Jeffrey Lee (213) 6048 posts |
Yeah, storing the VFP state does add quite a bit. But since most IRQ handlers won’t be interested in VFP, all that the VFPSupport module will do is disable access to the coprocessor and put off saving/restoring the register contents until something that wants to use VFP comes along. FPEmulator also has similar functionality, but it was only added “recently”, so pretty much all existing code continues to manage the FPA context manually. This isn’t particularly great for performance, considering that for most machines it’ll end up invoking the undefined instruction handler – much slower than a simple SWI/function call. At one point I did have a quick go at updating the Wimp to use FPEmulator contexts, but there was a bug somewhere and I haven’t found the time to investigate it yet. |
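The lazy scheme described above (disable access on a context switch, and only save/restore registers when something actually touches VFP) can be modelled roughly as follows. This is a Python sketch with made-up names (`LazyVFP`, `switch_to`, `swaps`), not the VFPSupport API; the `_trap` method stands in for whatever exception the real hardware raises when a disabled coprocessor is accessed.

```python
class LazyVFP:
    """Toy model of lazy VFP context switching (names are hypothetical)."""

    NUM_REGS = 32

    def __init__(self):
        self.regs = [0.0] * self.NUM_REGS  # the single physical register bank
        self.owner = None     # context whose values are currently live in regs
        self.current = None   # context that is currently running
        self.enabled = False  # models the coprocessor access-enable bit
        self.saved = {}       # context name -> saved register contents
        self.swaps = 0        # how many times the register bank was swapped

    def switch_to(self, ctx):
        # Cheap path: no registers are copied, access is merely enabled
        # or disabled depending on who owns the live register contents.
        self.current = ctx
        self.enabled = (self.owner == ctx)

    def _trap(self):
        # Models the trap taken on the first VFP access while disabled:
        # only now are the old owner's registers saved and ours restored.
        if self.owner is not None:
            self.saved[self.owner] = list(self.regs)
        self.regs = list(self.saved.get(self.current, [0.0] * self.NUM_REGS))
        self.owner = self.current
        self.enabled = True
        self.swaps += 1

    def write(self, index, value):
        if not self.enabled:
            self._trap()
        self.regs[index] = value

    def read(self, index):
        if not self.enabled:
            self._trap()
        return self.regs[index]


vfp = LazyVFP()
vfp.switch_to("app")
vfp.write(0, 1.5)

# 100 interrupts whose handlers never touch VFP: no register traffic at all.
for _ in range(100):
    vfp.switch_to("irq")
    vfp.switch_to("app")
value_after_irqs = vfp.read(0)
swaps_after_irqs = vfp.swaps

# A second context that does use VFP forces a real save/restore.
vfp.switch_to("other")
vfp.write(0, 9.0)
vfp.switch_to("app")
value_after_other = vfp.read(0)
```

In this model the hundred VFP-free interrupts cost nothing beyond flipping the enable flag, and the register bank is only swapped when a different context actually uses it.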
Jeffrey Lee (213) 6048 posts |
Having just had another “FileCore in use” error (thanks, objasm!), I can confirm that the following doesn’t work:
|