OS_ValidateAddress shortcomings
Jeffrey Lee (213) 6048 posts |
I’ve been looking at ticket #279, and in particular trying to add some sanity checks to the kernel/OS so that it can try and avoid getting stuck in an infinite abort loop if you do something silly like give it a duff error handler pointer. However the current implementation of OS_ValidateAddress doesn’t seem like it’s up to the job:
So the solution to me seems to be to implement something better, e.g. based around ROL’s OS_Memory 24 call. But since RISC OS 5 hasn’t (yet) gone down the route of turning all memory regions into dynamic areas, we’ll still need some extra code in there to manually check all the non-DA regions.

I’m also wondering how (or if) the call should fit in with Service_ValidateAddress. I’m guessing the original intention of that call was to allow podule drivers to flag memory attached to their podule as being valid. I can see us needing a similar call to deal with IO memory; although 90% of IO memory will be IO registers or invalid, there might be 10% or so which is a valid target for some operations (e.g. VRAM on Iyonix/Pi). So maybe we’ll need a service call to allow reporting of that information. But it doesn’t look like ROL’s OS_Memory 24 makes a distinction between IO registers and memory (regular RAM/ROM or IO RAM/ROM). And neither is Service_ValidateAddress a very good fit, as it won’t deal with the situation where half the range is IO registers and half the range is IO memory. Plus I don’t really like the idea of involving service calls at all – if the kernel is in the middle of recovering from an error and is trying to work out whether writing to a particular address is going to make it go bang, it should probably be doing the least amount of work possible instead of calling out into other modules which may also be broken (disregarding the fact that if the module chain is broken, the chances of the OS being left in a recoverable state are very slim anyway… but we can always hope for the day when the OS is able to recover from such situations).

Apart from adding some protection to the error handler/buffer, I’m also hoping to add some protection to the abort register dump area. This one’s a bit trickier; I suspect the best course of action is for the kernel to create a temporary register dump on the stack, go through the majority of the recovery process (disabling FIQs, resetting the SVC stack, etc.) and then check to see if the buffer specified by the current environment is any good or not.

Anyone have any thoughts on the matter? |
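(For reference, this is roughly how C code calls the existing OS_ValidateAddress via the DDE/GCCSDK swis.h veneer – a minimal sketch, not from the thread; the helper name is mine. The entry/exit conditions follow the PRM: R0/R1 are the start/end of the range and the C flag is set on exit if any of it is invalid. The only question it can answer is “is this logical RAM?”, which is the limitation being discussed above.)

```c
#include <stdbool.h>
#include "swis.h"

/* Returns true if the range [start, end] is considered valid logical RAM
 * by the legacy OS_ValidateAddress call. */
static bool range_is_logical_ram(const void *start, const void *end)
{
    _kernel_oserror *e;
    int flags;

    /* On exit the C flag is set if any part of the range is invalid. */
    e = _swix(OS_ValidateAddress, _INR(0,1)|_OUT(_FLAGS),
              (int)start, (int)end, &flags);

    return (e == NULL) && ((flags & _C) == 0);
}
```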
Martin Avison (27) 1494 posts |
I have run into problems in the past, just trying to establish whether a range of memory addresses was valid to read or not. What I thought would be simple with OS_ValidateAddress certainly did not give the whole picture, but I cannot remember the details. So I would certainly vote for something similar to OS_Memory 24. |
Jon Abbott (1421) 2651 posts |
I always thought OS_ValidateAddress was supposed to do what it says… validate an address, be it IO, RAM, ROM or any other flavour of readable/writable memory. Supporting just logical RAM seems somewhat illogical to me. An issue I currently have in ADFFS is validating whether memory is valid from an Abort handler; it currently falls flat on its arse because there’s no way to validate ROM. If you can fix that, either by changing the behaviour to support all memory types or by extending it so it’s possible to validate ROM, that would be a step forward. |
Sprow (202) 1158 posts |
PRM1-388 first line on the page “Check that a range of addresses are in logical RAM” looks pretty unequivocal. The snag here is that code can exist in places other than logical RAM. In the confines of just the kernel, I wonder how cheap it would be to set a flag, try the access anyway, and catch it in the abort handler – checking the flag and diverting back if it is set to conclude that it’s unsafe? It’s a bit like lazy task swapping, just lazy execute validation. That way it’s entirely defined by the MMU tables & possibly quicker than trying to do the reverse mapping? Service_ValidateAddress looks to pair with OS_ValidateAddress, so I think a new service call/SWI pair is in order to query executable regions, though I’d probably exclude IO regions. |
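(Sketched out in C, the probe idea might look something like this – purely illustrative, and it assumes the data abort handler is taught to check the flag, record the fault and resume past the offending instruction. That handler-side change is the hard part and isn’t shown; it is also where the extra overhead discussed in the next post comes from.)

```c
/* Hypothetical sketch of "try the access and catch the abort".
 * Assumes the data abort handler checks probe_active and, if set, records
 * the fault in probe_faulted and skips the faulting load rather than taking
 * the normal abort path. */
static volatile int probe_active;
static volatile int probe_faulted;

static int address_is_readable(const volatile unsigned char *addr)
{
    probe_faulted = 0;
    probe_active  = 1;
    (void)*addr;            /* may data-abort; handler skips it while probing */
    probe_active  = 0;
    return !probe_faulted;
}
```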
Jeffrey Lee (213) 6048 posts |
I think the call was intended as a way for SWIs which perform block read/writes from memory (filesystem stack, DMA manager, etc.) to make sure that the area they’ve been given is a safe place to access before they get halfway through and fail horribly. Or even worse, fail silently, e.g. accidentally DMA’ing to/from a bunch of IO registers.
That would add extra overhead to abort handling, which wouldn’t be ideal. And it wouldn’t catch all the situations I’m interested in – e.g. it wouldn’t prevent the kernel scribbling all over IO registers, and it wouldn’t catch attempts to execute code from DAs/regions which are meant to be nonexecutable. Unless you also had code in the prefetch abort handler, which would add even more abort handling overhead. |
Jon Abbott (1421) 2651 posts |
We possibly need a multi-purpose replacement then: one call that returns only logical RAM for backward compatibility, an OS_Memory 24-alike which handles all memory types, and possibly another for just executable memory. In which case, possibly leave OS_ValidateAddress as is, implement OS_Memory 24, and either add another OS_Memory reason code for executable memory or extend OS_Memory 24 (if that’s possible) so you can specify the memory types you want to validate. However, the problem we then face is the service call, which for backward compatibility sounds like it should only return logical RAM.
Does the ROL implementation stick to just RAM, with OS_Memory 24 used to cover all memory types, or does OS_ValidateAddress simply pass the call on to OS_Memory 24? |
Jeffrey Lee (213) 6048 posts |
OS_Memory 24 returns a bunch of flags describing the region, so I don’t think we need lots of separate calls. Just one multi-purpose one, and then it’s the caller’s responsibility to check the flags on exit.

“PRM1-388 first line on the page ‘Check that a range of addresses are in logical RAM’ looks pretty unequivocal.”

ROL’s OS_ValidateAddress passes the call onto OS_Memory 24. For compatibility with the old OS_ValidateAddress, I think all it has to do is check the returned flags to see if the area is completely read and write accessible in privileged modes (bits 2 & 3 set), and has no physically mapped areas (bit 12 clear). Or, if it is physically mapped, it must be fully “IO RAM”. This should make it accept memory with the same attributes as OS_ValidateAddress did, although there may still be some discrepancies (ROL’s docs aren’t clear on what, if anything, they’ve done with Service_ValidateAddress – potentially they’ve ditched it entirely since they’re using dynamic areas for mapping physical memory. Plus there will be areas which OS_Memory 24 will be checking which OS_ValidateAddress never did, e.g. kernel buffers, CAM/page tables, etc.)

As far as making a ROOL version of OS_Memory 24 goes, I think the only things we’d need to do are add a flag pair for returning nonexecutability information, and a flag pair for returning “IO RAM” information. ROL’s physically mapped areas are identical to ROOL’s IO memory pool (OS_Memory 13), in that they’re a mix of “IO registers” and “IO RAM”, so for all IO regions we should have the physically mapped flag set, and then use the service call to allow “IO RAM” regions to be identified (which for compatibility will end up having both bits set, even if the address range is fully within an IO RAM region).

As far as lookup speed goes, I’m thinking that the OS should really keep a lookup table for indexing dynamic areas by address. E.g. keep all the DAs in a linked list sorted by address, and then use a LUT (256 entry? 4K entry?) for fast indexing into the right part of the linked list. In fact the OS may already be doing this – I haven’t looked in too much detail at the Ursula DA management optimisations. That would help both the performance of this call and anything else where fast address-to-DA lookups are required (e.g. if/when we add support for per-DA abort handlers). |
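(A sketch of the compatibility mapping described above: treat a range as valid in the old OS_ValidateAddress sense if OS_Memory 24 reports it completely readable and writable in privileged modes with no physically mapped pages. The bit numbers are the ones quoted in the post, the R1 = base / R2 = length register interface is assumed from ROL’s documentation, the helper name is mine, and the extra “fully IO RAM” case is left out for brevity.)

```c
#include <stdbool.h>
#include "swis.h"

#ifndef OS_Memory
#define OS_Memory 0x68
#endif

#define CP24_PRIV_READ   (1u << 2)   /* completely readable in privileged modes */
#define CP24_PRIV_WRITE  (1u << 3)   /* completely writable in privileged modes */
#define CP24_PHYS_MAPPED (1u << 12)  /* range contains physically mapped (IO) space */

/* Old-style OS_ValidateAddress semantics built on top of OS_Memory 24. */
static bool old_style_valid(unsigned int base, unsigned int length)
{
    unsigned int flags;

    if (_swix(OS_Memory, _INR(0,2)|_OUT(1), 24, base, length, &flags) != NULL)
        return false;                /* the call itself failed */

    return (flags & CP24_PRIV_READ)  != 0 &&
           (flags & CP24_PRIV_WRITE) != 0 &&
           (flags & CP24_PHYS_MAPPED) == 0;
}
```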
Jon Abbott (1421) 2651 posts |
Do you need to do anything special for lazy page mapping? Abort handlers might want to know if the page they’re about to access is actually mapped or not. I fell foul of this particular issue in my Abort handler. |
WPB (1391) 352 posts |
Having had a quick scan, I don’t think there’s any more useful information in Gerph’s rambles, but I’ll link to it just in case it helps. You’ve probably read it already anyway! |
Jeffrey Lee (213) 6048 posts |
ROL’s OS_Memory 24 does have a flag for indicating whether DAs have custom abort handlers, so it would make sense if application space had that flag set to indicate that lazy task swapping is in use. Other than that I don’t think we need anything special, or at least not yet. E.g. if/when we implement abortable DAs we’d probably want to make the kernel check that the abort handler for a particular DA isn’t itself located within an abortable DA, to try and avoid a circular chain of aborts. |
Jon Abbott (1421) 2651 posts |
The RISC OS Abort handler would probably need additions for this. Going by the mess lazy task swapping made of my Abort handler, it looks like the last Abort handler to register owns the hardware vector, which means every Abort handler currently needs to be re-entrant and check whether the Abort is actually in its memory range and an Abort type it deals with. RISC OS would need to own the hardware vector and be aware of the memory ranges and Abort types that each Abort handler can deal with. God knows how you could make that backward compatible, as all and sundry tap into the vectors.

In the case of CLib, I had to give it a false range so it didn’t mis-report a write Abort as a problem, and left my Abort handler to either deal with it or pass it on down the Abort chain. Not ideal, as any valid aborts are no longer dealt with by CLib but by the OS, since CLib is further up the chain and you can only pass down the chain, if you understand what I mean. The consequence being that any Abort is reported by the OS, and CLib remains as the last owner of the Abort handler, leaving my Abort handler orphaned in the middle when it tries to clean up and deregister. I’m digressing somewhat off-topic, but it would be useful if hardware vector handlers could unregister when they’re in the middle of the chain.

Back on topic, I had a look at OS_Memory 24 and the one thing it doesn’t include is a flag for “executable”. This would be useful in a crash scenario, where an Abort handler might want to produce a dump report, which leads me on to the next point: how does OS_Memory 24 interact with Service_ValidateAddress? At the minute the service doesn’t return any flags – could we extend this so service claimants can return the memory type? Would the service not also need extending to support ranges that overlap several handlers? Somehow allow the handler to return a list of ranges and types, or perhaps issue a new service call after a module is loaded or an app is run, asking for memory ranges and types, and leave the OS to deal with the request on behalf of the handler? |
Jeffrey Lee (213) 6048 posts |
Yes, that’s one of the things I’m planning on adding (although it’ll be a ‘nonexecutable’ flag rather than ‘executable’, for compatibility with ROL’s version).
That’s one of the things we still need to work out! My theory is that Service_ValidateAddress was only added in order to allow podules to report that any podule-attached RAM is a valid and safe memory location. For everything else, the kernel should be able to work out the memory flags itself, either by checking dynamic areas or by checking the page tables. So for OS_Memory 24, this means the following:
There are some shortcomings with this approach (e.g. it doesn’t deal well with memory ranges which cross from one region into another), but it should be roughly equivalent to the current behaviour. Also note that – at least as far as CVS is concerned – I can’t seem to find any code which implements Service_ValidateAddress. So I’d classify this as the kind of problem that’s probably left as-is until we better understand the requirements of the system. For example, instead of a better Service_ValidateAddress we may decide that it’s better to add some extra OS_Memory calls so that the kernel can be explicitly notified where the “IO RAM” regions are instead of having to issue service calls all the time. |
Jeffrey Lee (213) 6048 posts |
Ah, spoke too soon – the Wimp actually implements it! It looks like (pre-RISC OS 3.5) it was used in order to flag the free pool as being valid, because the free pool actually shared application space with the current Wimp task. I.e. rather than mapping the pages out, the Wimp merely shuffled the mappings around so that the current task was at the bottom of application space and the free pool (marked as inaccessible in user mode, and above the application space limit known to the kernel) sat after it. |
Jon Abbott (1421) 2651 posts |
RO doesn’t currently check the page tables, does it? I had to add a Service_ValidateAddress handler to validate the memory ADFFS maps. It certainly makes sense to leave all the memory validation to RO and try to restrict the service call to podules etc. One question: are lazily mapped pages reported as valid memory even if they’re not mapped? An OS_Memory call to force lazy areas to be mapped would be useful. And what about screen caching? If it does get implemented in the same way as RO4, by generating page faults on a write, should it be reported as valid? OS_Memory 24 probably covers most of this, I suspect.
I’d leave the service as is for backward compatibility; extra OS_Memory calls seem the safer method.
A feature I exploited to back port OS_DynamicArea to RO3.1 :) |
Jeffrey Lee (213) 6048 posts |
While checking the RISC OS 3 PRM I noticed that Service_ValidateAddress is apparently for internal use only. So it looks like we should be free to cut it out of OS_Memory 24 completely and come up with our own solution once the time comes. |
Jeffrey Lee (213) 6048 posts |
I think I’ve now got a fully working OS_Memory 24 implementation. I just need to do one more round of testing (check the results against the page tables to see if there are any gaps/inconsistencies) and then I can start updating things to use it.
Luckily an address-sorted linked list is already in use by the DA code, so it wasn’t too much work to also add an address-to-DA lookup table. |
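(Purely as an illustration of the sort of structure being described – not the kernel’s actual code or data layout – an address-to-DA lookup might combine the address-sorted list with a coarse table like this, so a query only walks the handful of areas near the address.)

```c
#include <stddef.h>
#include <stdint.h>

typedef struct da_node {
    struct da_node *next;     /* next DA in ascending address order */
    uint32_t        base;     /* base address of the area */
    uint32_t        size;     /* size of the area in bytes */
} da_node;

#define LUT_SHIFT 24                      /* 4GB / 256 entries = 16MB slots */
#define LUT_SIZE  (1u << (32 - LUT_SHIFT))

/* Each entry points at the first node of the sorted list whose area ends
 * after the slot's base address; rebuilt whenever DAs are created/resized. */
static da_node *da_lut[LUT_SIZE];

/* Find the dynamic area containing addr, or NULL if none does. */
static da_node *da_from_address(uint32_t addr)
{
    for (da_node *n = da_lut[addr >> LUT_SHIFT]; n != NULL; n = n->next) {
        if (n->base > addr)
            return NULL;                  /* list is sorted; we've gone past */
        if (addr - n->base < n->size)
            return n;
    }
    return NULL;
}
```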
Jon Abbott (1421) 2651 posts |
One test I’d like to see work under ADFFS (with the Service_ValidateAddress code removed) is:

*RemapVideoMemory 13 160

The other issue I came across was using OS_File to load a file into a DA, but it sounds like you now have that one covered. My workaround was to use OS_GBPB, which didn’t have the issue – so it may not be validating the memory it’s writing to. |
Jeffrey Lee (213) 6048 posts |
That would be a bit tricky, since you are changing the page tables without the OS’s knowledge. There are a few solutions I can think of:
I’ve just had a quick play with Aemulor, and that seems to work by creating valid DAs for the areas which it maps into application space, so I think it should work OK with my current OS_Memory 24 implementation (I’ll be doing some extra checks to make sure). However it doesn’t use the technique ADFFS uses to clone screen memory, so its requirements are a bit simpler than yours. Any thoughts on what kind of changes would work best for you? |
Jon Abbott (1421) 2651 posts |
When it comes to running games, ADFFS is very hands-off and leaves as much as possible running natively. For the RO3.1 screen memory, it copies the first 480KB of L2PT entries from DA2 to <2000000 and >2000000 to mirror an Arc. I’m not sure if that’s possible or not with current SWIs.

The JIT, however, is frequently changing the access permissions on L2PT entries to track self-modifying code. Although that could be done with current SWIs, it would severely impact performance, swapping in/out of Und/Abort mode to call memory-related SWIs. Is the SWI dispatcher capable of working if called in Und/Abort?

The reason I originally dropped OS_SetMemMapEntries calls and went to direct L2PT manipulation was the flag corruption I was seeing, problems tertiary-mapping memory pages from DA2, and the bugs in Service_ModeChange where it’s impossible to disable the screen caching. Bear in mind, this was all before I started looking at Pi support, so I had to find solutions to the issues on RO3/4. Now the code is split into four versions, I can change the two RO5 versions (IOMD and non-IOMD) to use valid OS calls. |
Jeffrey Lee (213) 6048 posts |
My OS_Memory 24 implementation is now in CVS. After looking over ROL’s docs a second time it looks like their OS_ValidateAddress call still issues Service_ValidateAddress, so for now I’ve gone with the path of least resistance and made our new OS_ValidateAddress do the same. This means things like ADFFS should continue to work correctly (e.g. your *LOAD into screen memory seemed to work OK). FYI, our Service_ValidateAddress will report the following as being valid:
And the OS_Memory 24 implementation only checks the page tables for the following situations:
For everything else, OS_Memory 24 relies on the dynamic area list and a bunch of fixed address ranges. It also assumes that there aren’t any dynamic areas (apart from the system heap) which were created above the IO memory region (this would only be possible with code which manually specifies an address for the DA). Dynamic areas created within application space should work fine, however.

After checking the OS_Memory 24 results against the page tables (and fixing a couple of other kernel bugs in the process!) it looks like it’s correctly reporting all memory regions, apart from anything that was mapped in manually via direct page table manipulation/OS_SetMemMapEntries.

As far as my original goal of making the kernel less likely to get stuck in infinite abort loops goes, I’ve now updated the kernel to check the error handler + buffer addresses and the exception register dump buffer using OS_Memory 24. The recovery isn’t always very graceful, but it should be better than getting a hard crash when something goes wrong. |
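(To illustrate the shape of that last change – this is not the kernel source; the helper name, buffer layout and calling context are invented – the recovery path can build its dump in a temporary buffer and only copy it to the environment’s dump area once OS_Memory 24 says that area is completely writable in privileged modes, using the flag bit quoted earlier in the thread.)

```c
#include <string.h>
#include "swis.h"

#ifndef OS_Memory
#define OS_Memory 0x68
#endif

#define DUMP_WORDS 17   /* r0-r15 plus PSR – an assumed layout, for illustration */

/* Copy a register dump built on the stack out to the environment's buffer,
 * but only if OS_Memory 24 reports the whole buffer as writable in
 * privileged modes; otherwise leave it alone rather than risk another abort. */
static void deliver_register_dump(unsigned int *env_dump,
                                  const unsigned int temp_dump[DUMP_WORDS])
{
    unsigned int flags = 0;

    if (_swix(OS_Memory, _INR(0,2)|_OUT(1),
              24, (unsigned int)env_dump, DUMP_WORDS * sizeof *env_dump,
              &flags) == NULL
        && (flags & (1u << 3)) != 0)   /* completely writable in privileged modes */
    {
        memcpy(env_dump, temp_dump, DUMP_WORDS * sizeof *env_dump);
    }
}
```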
Jon Abbott (1421) 2651 posts |
Sounds sensible. Looking at the documentation, am I correct in assuming that if it returns R1=0 the range is invalid? You may also want to change “it does NOT issue Service_ValidateAddress”. One thing I’d like to see at some point is a means to determine if an address/range is executable (and include ROM). This would be useful for an Abort handler to determine whether it should action and return, or clean up. Thanks for checking ADFFS. |
Jeffrey Lee (213) 6048 posts |
Correct.
OS_ValidateAddress will issue Service_ValidateAddress, but OS_Memory 24 won’t.
Yeah, that’ll be coming once I get nonexecutable pages/DAs working.
No problem! |