OS_ValidateAddress shortcomings
Jeffrey Lee (213) 6048 posts |
I’ve been looking at ticket #279, and in particular trying to add some sanity checks to the kernel/OS so that it can try and avoid getting stuck in an infinite abort loop if you do something silly like give it a duff error handler pointer. However the current implementation of OS_ValidateAddress doesn’t seem like it’s up to the job:
So the solution to me seems to be to implement something better, e.g. based around ROL’s OS_Memory 24 call. But since RISC OS 5 hasn’t (yet) gone down the route of turning all memory regions into dynamic areas, we’ll still need some extra code in there to manually check all the non-DA regions.

I’m also wondering how (or if) the call should fit in with Service_ValidateAddress. I’m guessing the original intention of that call was to allow podule drivers to flag memory attached to their podule as being valid. I can see us needing a similar call to deal with IO memory; although 90% of IO memory will be IO registers or invalid, there might be 10% or so which is a valid target for some operations (e.g. VRAM on Iyonix/Pi). So maybe we’ll need a service call to allow reporting of that information. But it doesn’t look like ROL’s OS_Memory 24 makes a distinction between IO registers and memory (regular RAM/ROM or IO RAM/ROM). And neither is Service_ValidateAddress a very good fit, as it won’t deal with the situation where half the range is IO registers and half the range is IO memory. Plus I don’t really like the idea of involving service calls at all – if the kernel is in the middle of recovering from an error and is trying to work out whether writing to a particular address is going to make it go bang, it should probably be doing the least amount of work possible instead of calling out into other modules which may also be broken (disregarding the fact that if the module chain is broken, the chances of the OS being left in a recoverable state are very slim anyway… but we can always hope for the day when the OS is able to recover from such situations).

Apart from adding some protection to the error handler/buffer, I’m also hoping to add some protection to the abort register dump area. This one’s a bit trickier; I suspect the best course of action is for the kernel to create a temporary register dump on the stack, go through the majority of the recovery process (disabling FIQs, resetting the SVC stack, etc.) and then check to see if the buffer specified by the current environment is any good or not.

Anyone have any thoughts on the matter? |
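(For reference, this is roughly how C code calls the existing OS_ValidateAddress via the DDE/GCCSDK swis.h veneer – a minimal sketch, not from the thread; the helper name is mine. The entry/exit conditions follow the PRM: R0/R1 are the start/end of the range and the C flag is set on exit if any of it is invalid. The only question it can answer is “is this logical RAM?”, which is the limitation being discussed above.)

```c
#include <stdbool.h>
#include "swis.h"

/* Returns true if the range [start, end] is considered valid logical RAM
 * by the legacy OS_ValidateAddress call. */
static bool range_is_logical_ram(const void *start, const void *end)
{
    _kernel_oserror *e;
    int flags;

    /* On exit the C flag is set if any part of the range is invalid. */
    e = _swix(OS_ValidateAddress, _INR(0,1)|_OUT(_FLAGS),
              (int)start, (int)end, &flags);

    return (e == NULL) && ((flags & _C) == 0);
}
```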
Martin Avison (27) 1494 posts |
I have run into problems in the past, just trying to establish whether a range of memory addresses was valid to read or not. What I thought would be simple with OS_ValidateAddress certainly did not give the whole picture, but I cannot remember the details. So I would certainly vote for something similar to OS_Memory 24. |
Jon Abbott (1421) 2651 posts |
I always thought OS_ValidateAddress was supposed to do what it says… validate an address, be it IO, RAM, ROM or any other flavour of readable/writable memory. Supporting just logical RAM seems somewhat illogical to me. An issue I currently have in ADFFS is validating whether memory is valid from an Abort handler; it currently falls flat on its arse because there’s no way to validate ROM. If you can fix that, either by changing the behaviour to support all memory types or by extending it so it’s possible to validate ROM, that would be a step forward. |
Sprow (202) 1158 posts |
PRM1-388 first line on the page “Check that a range of addresses are in logical RAM” looks pretty unequivocal. The snag here is that code can exist in places other than logical RAM. In the confines of just the kernel, I wonder how cheap it would be to set a flag, try the access anyway, and catch it in the abort handler – checking the flag and diverting back if it is set to conclude that it’s unsafe? It’s a bit like lazy task swapping, just lazy execute validation. That way it’s entirely defined by the MMU tables & possibly quicker than trying to do the reverse mapping? Service_ValidateAddress looks to pair with OS_ValidateAddress, so I think a new service call/SWI pair is in order to query executable regions, though I’d probably exclude IO regions. |
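(Sketched out in C, the probe idea might look something like this – purely illustrative, and it assumes the data abort handler is taught to check the flag, record the fault and resume past the offending instruction. That handler-side change is the hard part and isn’t shown; it is also where the extra overhead discussed in the next post comes from.)

```c
/* Hypothetical sketch of "try the access and catch the abort".
 * Assumes the data abort handler checks probe_active and, if set, records
 * the fault in probe_faulted and skips the faulting load rather than taking
 * the normal abort path. */
static volatile int probe_active;
static volatile int probe_faulted;

static int address_is_readable(const volatile unsigned char *addr)
{
    probe_faulted = 0;
    probe_active  = 1;
    (void)*addr;            /* may data-abort; handler skips it while probing */
    probe_active  = 0;
    return !probe_faulted;
}
```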
Jeffrey Lee (213) 6048 posts |
I think the call was intended as a way for SWIs which perform block read/writes from memory (filesystem stack, DMA manager, etc.) to make sure that the area they’ve been given is a safe place to access before they get halfway through and fail horribly. Or even worse, fail silently, e.g. accidentally DMA’ing to/from a bunch of IO registers.
That would add extra overhead to abort handling, which wouldn’t be ideal. And it wouldn’t catch all the situations I’m interested in – e.g. it wouldn’t prevent the kernel scribbling all over IO registers, and it wouldn’t catch attempts to execute code from DAs/regions which are meant to be nonexecutable. Unless you also had code in the prefetch abort handler, which would add even more abort handling overhead. |
Jon Abbott (1421) 2651 posts |
We possibly need a multi-purpose replacement then: one call that returns only logical RAM for backward compatibility, an OS_Memory 24-alike which handles all memory types, and possibly another for just executable memory. In which case, possibly leave OS_ValidateAddress as is, implement OS_Memory 24, and either add another OS_Memory reason code for executable memory or extend OS_Memory 24 (if that’s possible) so you can specify the memory types you want to validate. However, the problem we then face is the service call, which for backward compatibility sounds like it should only return logical RAM.
Does the ROL implementation stick to just RAM, with OS_Memory 24 used to cover all memory types, or does OS_ValidateAddress simply pass the call on to OS_Memory 24? |
Jeffrey Lee (213) 6048 posts |
OS_Memory 24 returns a bunch of flags describing the region, so I don’t think we need lots of separate calls. Just one multi-purpose one, and then it’s the caller’s responsibility to check the flags on exit.

“PRM1-388 first line on the page ‘Check that a range of addresses are in logical RAM’ looks pretty unequivocal.”

ROL’s OS_ValidateAddress passes the call onto OS_Memory 24. For compatibility with the old OS_ValidateAddress, I think all it has to do is check the returned flags to see if the area is completely read and write accessible in privileged modes (bits 2 & 3 set), and has no physically mapped areas (bit 12 clear). Or, if it is physically mapped, it must be fully “IO RAM”. This should make it accept memory with the same attributes as OS_ValidateAddress did, although there may still be some discrepancies (ROL’s docs aren’t clear on what, if anything, they’ve done with Service_ValidateAddress – potentially they’ve ditched it entirely since they’re using dynamic areas for mapping physical memory. Plus there will be areas which OS_Memory 24 will be checking which OS_ValidateAddress never did, e.g. kernel buffers, CAM/page tables, etc.)

As far as making a ROOL version of OS_Memory 24 goes, I think the only things we’d need to do are add a flag pair for returning nonexecutability information, and a flag pair for returning “IO RAM” information. ROL’s physically mapped areas are identical to ROOL’s IO memory pool (OS_Memory 13), in that they’re a mix of “IO registers” and “IO RAM”, so for all IO regions we should have the physically mapped flag set, and then use the service call to allow “IO RAM” regions to be identified (which for compatibility will end up having both bits set, even if the address range is fully within an IO RAM region).

As far as lookup speed goes, I’m thinking that the OS should really keep a lookup table for indexing dynamic areas by address. E.g. keep all the DAs in a linked list sorted by address, and then use a LUT (256 entry? 4K entry?) for fast indexing into the right part of the linked list. In fact the OS may already be doing this – I haven’t looked in too much detail at the Ursula DA management optimisations. That would help both the performance of this call and anything else where fast address-to-DA lookups are required (e.g. if/when we add support for per-DA abort handlers). |
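(A sketch of the compatibility mapping described above: treat a range as valid in the old OS_ValidateAddress sense if OS_Memory 24 reports it completely readable and writable in privileged modes with no physically mapped pages. The bit numbers are the ones quoted in the post, the R1 = base / R2 = length register interface is assumed from ROL’s documentation, the helper name is mine, and the extra “fully IO RAM” case is left out for brevity.)

```c
#include <stdbool.h>
#include "swis.h"

#ifndef OS_Memory
#define OS_Memory 0x68
#endif

#define CP24_PRIV_READ   (1u << 2)   /* completely readable in privileged modes */
#define CP24_PRIV_WRITE  (1u << 3)   /* completely writable in privileged modes */
#define CP24_PHYS_MAPPED (1u << 12)  /* range contains physically mapped (IO) space */

/* Old-style OS_ValidateAddress semantics built on top of OS_Memory 24. */
static bool old_style_valid(unsigned int base, unsigned int length)
{
    unsigned int flags;

    if (_swix(OS_Memory, _INR(0,2)|_OUT(1), 24, base, length, &flags) != NULL)
        return false;                /* the call itself failed */

    return (flags & CP24_PRIV_READ)  != 0 &&
           (flags & CP24_PRIV_WRITE) != 0 &&
           (flags & CP24_PHYS_MAPPED) == 0;
}
```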
Jon Abbott (1421) 2651 posts |
Do you need to do anything special for lazy page mapping? Abort handlers might want to know if the page they’re about to access is actually mapped or not. I fell foul of this particular issue in my Abort handler. |
WPB (1391) 352 posts |
Having had a quick scan, I don’t think there’s any more useful information in Gerph’s rambles, but I’ll link to it just in case it helps. You’ve probably read it already anyway! |
Jeffrey Lee (213) 6048 posts |
ROL’s OS_Memory 24 does have a flag for indicating whether DAs have custom abort handlers, so it would make sense if application space had that flag set to indicate that lazy task swapping is in use. Other than that I don’t think we need anything special, or at least not yet. E.g. if/when we implement abortable DAs we’d probably want to make the kernel check that the abort handler for a particular DA isn’t itself located within an abortable DA, to try and avoid a circular chain of aborts. |
Jon Abbott (1421) 2651 posts |
The RISC OS Abort handler would probably need additions for this. Going by the mess lazy task swapping made of my Abort handler, it looks like the last Abort handler to register owns the hardware vector, which means every Abort handler currently needs to be re-entrant and check whether the Abort is actually in its memory range and an Abort type it deals with. RISC OS would need to own the hardware vector and be aware of the memory ranges and Abort types that each Abort handler can deal with. God knows how you could make that backward compatible, as all and sundry tap into the vectors.

In the case of CLib, I had to give it a false range so it didn’t mis-report a write Abort as a problem, and left my Abort handler to either deal with it or pass it on down the Abort chain. Not ideal, as any valid aborts are no longer dealt with by CLib but by the OS, since CLib is further up the chain and you can only pass down the chain, if you understand what I mean. The consequence being that any Abort is reported by the OS, and CLib remains as the last owner of the Abort handler, leaving my Abort handler orphaned in the middle when it tries to clean up and deregister. I’m digressing somewhat off-topic, but it would be useful if hardware vector handlers could unregister when they’re in the middle of the chain.

Back on topic, I had a look at OS_Memory 24 and the one thing it doesn’t include is a flag for “executable”. This would be useful in a crash scenario, where an Abort handler might want to produce a dump report, which leads me on to the next point: how does OS_Memory 24 interact with Service_ValidateAddress? At the minute the service doesn’t return any flags – could we extend this so service claimants can return the memory type? Would the service not also need extending to support ranges that overlap several handlers? Somehow allow the handler to return a list of ranges and types, or perhaps issue a new service call after a module is loaded or an app is run, asking for memory ranges and types, and leave the OS to deal with the request on behalf of the handler? |
Jeffrey Lee (213) 6048 posts |
Yes, that’s one of the things I’m planning on adding (although it’ll be a ‘nonexecutable’ flag rather than ‘executable’, for compatibility with ROL’s version).
That’s one of the things we still need to work out! My theory is that Service_ValidateAddress was only added in order to allow podules to report that any podule-attached RAM is a valid and safe memory location. For everything else, the kernel should be able to work out the memory flags itself, either by checking dynamic areas or by checking the page tables. So for OS_Memory 24, this means the following:
There are some shortcomings with this approach (e.g. it doesn’t deal well with memory ranges which cross from one region into another), but it should be roughly equivalent to the current behaviour. Also note that – at least as far as CVS is concerned – I can’t seem to find any code which implements Service_ValidateAddress. So I’d classify this as the kind of problem that’s probably left as-is until we better understand the requirements of the system. For example, instead of a better Service_ValidateAddress we may decide that it’s better to add some extra OS_Memory calls so that the kernel can be explicitly notified where the “IO RAM” regions are instead of having to issue service calls all the time. |
Jeffrey Lee (213) 6048 posts |
Ah, spoke too soon – the Wimp actually implements it! It looks like (pre-RISC OS 3.5) it was used in order to flag the free pool as being valid, because the free pool actually shared application space with the current Wimp task. I.e. rather than mapping the pages out, the Wimp merely shuffled the mappings around so that the current task was at the bottom of application space and the free pool (marked as inaccessible in user mode, and above the application space limit known to the kernel) sat after it. |
Jon Abbott (1421) 2651 posts |
RO doesn’t currently check the page tables, does it? I had to add a Service_ValidateAddress handler to validate the memory ADFFS maps. It certainly makes sense to leave all the memory validation to RO and try to restrict the service call to podules etc. One question: are lazily mapped pages reported as valid memory even if they’re not mapped? An OS_Memory call to force lazy areas to be mapped would be useful. And what about screen caching? If it does get implemented in the same way as RO4, by generating page faults on a write, should it be reported as valid? OS_Memory 24 probably covers most of this, I suspect.
I’d leave the service as is for backward compatibility; extra OS_Memory calls seem the safer method.
A feature I exploited to back port OS_DynamicArea to RO3.1 :) |
Jeffrey Lee (213) 6048 posts |
While checking the RISC OS 3 PRM I noticed that Service_ValidateAddress is apparently for internal use only. So it looks like we should be free to cut it out of OS_Memory 24 completely and come up with our own solution once the time comes. |
Jeffrey Lee (213) 6048 posts |
I think I’ve now got a fully working OS_Memory 24 implementation. I just need to do one more round of testing (check the results against the page tables to see if there are any gaps/inconsistencies) and then I can start updating things to use it.
Luckily an address-sorted linked list is already in use by the DA code, so it wasn’t too much work to also add an address-to-DA lookup table. |
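(Purely as an illustration of the sort of structure being described – not the kernel’s actual code or data layout – an address-to-DA lookup might combine the address-sorted list with a coarse table like this, so a query only walks the handful of areas near the address.)

```c
#include <stddef.h>
#include <stdint.h>

typedef struct da_node {
    struct da_node *next;     /* next DA in ascending address order */
    uint32_t        base;     /* base address of the area */
    uint32_t        size;     /* size of the area in bytes */
} da_node;

#define LUT_SHIFT 24                      /* 4GB / 256 entries = 16MB slots */
#define LUT_SIZE  (1u << (32 - LUT_SHIFT))

/* Each entry points at the first node of the sorted list whose area ends
 * after the slot's base address; rebuilt whenever DAs are created/resized. */
static da_node *da_lut[LUT_SIZE];

/* Find the dynamic area containing addr, or NULL if none does. */
static da_node *da_from_address(uint32_t addr)
{
    for (da_node *n = da_lut[addr >> LUT_SHIFT]; n != NULL; n = n->next) {
        if (n->base > addr)
            return NULL;                  /* list is sorted; we've gone past */
        if (addr - n->base < n->size)
            return n;
    }
    return NULL;
}
```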
Jon Abbott (1421) 2651 posts |
One test I’d like to see work under ADFFS (with the Service_ValidateAddress code removed) is:

*RemapVideoMemory 13 160

The other issue I came across was using OS_File to load a file into a DA, but it sounds like you now have that one covered. My workaround was to use OS_GBPB, which didn’t have the issue – so it may not be validating the memory it’s writing to. |
Jeffrey Lee (213) 6048 posts |
That would be a bit tricky, since you are changing the page tables without the OS’s knowledge. There are a few solutions I can think of:
I’ve just had a quick play with Aemulor, and that seems to work by creating valid DAs for the areas which it maps into application space, so I think it should work OK with my current OS_Memory 24 implementation (I’ll be doing some extra checks to make sure). However it doesn’t use the technique ADFFS uses to clone screen memory, so its requirements are a bit simpler than yours. Any thoughts on what kind of changes would work best for you? |
Jon Abbott (1421) 2651 posts |
When it comes to running games, ADFFS is very hands-off and leaves as much as possible running natively. For the RO3.1 screen memory, it copies the first 480KB of L2PT entries from DA2 to <2000000 and >2000000 to mirror an Arc. I’m not sure if that’s possible or not with current SWIs.

The JIT, however, is frequently changing the access permissions on L2PT entries to track self-modifying code. Although that could be done with current SWIs, it would severely impact performance, swapping in/out of Und/Abort mode to call memory-related SWIs. Is the SWI dispatcher capable of working if called in Und/Abort?

The reason I originally dropped OS_SetMemMapEntries calls and went to direct L2PT manipulation was the flag corruption I was seeing, problems tertiary-mapping memory pages from DA2, and the bugs in Service_ModeChange where it’s impossible to disable the screen caching. Bear in mind, this was all before I started looking at Pi support, so I had to find solutions to the issues on RO3/4. Now the code is split into four versions, I can change the two RO5 versions (IOMD and non-IOMD) to use valid OS calls. |
Jeffrey Lee (213) 6048 posts |
My OS_Memory 24 implementation is now in CVS. After looking over ROL’s docs a second time it looks like their OS_ValidateAddress call still issues Service_ValidateAddress, so for now I’ve gone with the path of least resistance and made our new OS_ValidateAddress do the same. This means things like ADFFS should continue to work correctly (e.g. your *LOAD into screen memory seemed to work OK). FYI, our Service_ValidateAddress will report the following as being valid:
And the OS_Memory 24 implementation only checks the page tables for the following situations:
For everything else, OS_Memory 24 relies on the dynamic area list and a bunch of fixed address ranges. It also assumes that there aren’t any dynamic areas (apart from the system heap) which were created above the IO memory region (this would only be possible with code which manually specifies an address for the DA). Dynamic areas created within application space should work fine, however.

After checking the OS_Memory 24 results against the page tables (and fixing a couple of other kernel bugs in the process!) it looks like it’s correctly reporting all memory regions, apart from anything that was mapped in manually via direct page table manipulation/OS_SetMemMapEntries.

As far as my original goal of making the kernel less likely to get stuck in infinite abort loops goes, I’ve now updated the kernel to check the error handler + buffer addresses and the exception register dump buffer using OS_Memory 24. The recovery isn’t always very graceful, but it should be better than getting a hard crash when something goes wrong. |
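(To illustrate the shape of that last change – this is not the kernel source; the helper name, buffer layout and calling context are invented – the recovery path can build its dump in a temporary buffer and only copy it to the environment’s dump area once OS_Memory 24 says that area is completely writable in privileged modes, using the flag bit quoted earlier in the thread.)

```c
#include <string.h>
#include "swis.h"

#ifndef OS_Memory
#define OS_Memory 0x68
#endif

#define DUMP_WORDS 17   /* r0-r15 plus PSR – an assumed layout, for illustration */

/* Copy a register dump built on the stack out to the environment's buffer,
 * but only if OS_Memory 24 reports the whole buffer as writable in
 * privileged modes; otherwise leave it alone rather than risk another abort. */
static void deliver_register_dump(unsigned int *env_dump,
                                  const unsigned int temp_dump[DUMP_WORDS])
{
    unsigned int flags = 0;

    if (_swix(OS_Memory, _INR(0,2)|_OUT(1),
              24, (unsigned int)env_dump, DUMP_WORDS * sizeof *env_dump,
              &flags) == NULL
        && (flags & (1u << 3)) != 0)   /* completely writable in privileged modes */
    {
        memcpy(env_dump, temp_dump, DUMP_WORDS * sizeof *env_dump);
    }
}
```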
Jon Abbott (1421) 2651 posts |
Sounds sensible. Looking at the documentation, am I correct in assuming that if it returns R1=0 the range is invalid? You may also want to change “it does NOT issue Service_ValidateAddress”. One thing I’d like to see at some point is a means to determine if an address/range is executable (and include ROM). This would be useful for an Abort handler to determine whether it should action and return, or clean up. Thanks for checking ADFFS. |
Jeffrey Lee (213) 6048 posts |
Correct.
OS_ValidateAddress will issue Service_ValidateAddress, but OS_Memory 24 won’t.
Yeah, that’ll be coming once I get nonexecutable pages/DAs working.
No problem! |