Zero page protection
Theo Markettos (89) 919 posts |
David Given, who wrote the R2 reimplementation of the RISC OS kernel, noticed that many modules make reference to zero-page location &108. This turns out to be IRQsema, the IRQ semaphore. With this in place, zero page cannot have memory protection. I notice a CVS comment:

date: 2005/09/16 15:48:08; author: srevill; state: Exp; lines: +1 -1
Added IRQsema to the list of values which can be read back from OS_ReadSysInfo 6 (subreason code is 23 for IRQsema). This will be useful if zero page is ever protected and the IRQ semaphore moved.

I’m mostly just publicising this issue so what he found out doesn’t get lost. There appears to be space in zero page to shift up all these variables clear of the first 4KB, which can then be mapped out. Would it then just be a case of changing all the modules to use the above call rather than just assuming a static compiled-in IRQsema value, or is there something else involved? |
Ben Avison (25) 445 posts |
Zero page has had memory protection – in the sense that it’s read-only in USR mode – since RISC OS 3.8. That alone broke quite a few third-party applications. I seem to remember that Mike Stephens (the kernel guru at the time) tried making zero page completely inaccessible in USR mode, but it broke so many applications that he gave up on the idea. When it comes to modules, we’re currently stuck with it needing to be read-write access in privileged modes. IRQsema is currently one of the easiest and quickest ways for any module to force non-reentrancy in its SWIs – on entry to the SWI handler, check whether IRQsema is nonzero, and if so then the interrupt stack is threaded and the SWI is being called in an unsafe manner. One danger is that third-party modules, not just OS ones, may be doing this. Ideally all such modules would use OS_ReadSysInfo 6,23 to locate IRQsema and fall back to &108 if that fails – it would have been handy if the SWI had been more widely announced to give developers plenty of time to prepare their software, but it seems to be the nature of these things that few will bother until their software breaks, if at all. The other big reason for modules needing read-write access to zero page is in order to install FIQ handlers – these live between &1C and &FC inclusive. For what it’s worth, most modules with bugs involving writing to null pointers probably had those bugs squashed in the late 1990s, because the StrongARM aborts on writes to the processor vectors in SVC26 mode. The whole issue cuts to the heart of a philosophical divide – are we trying to make RISC OS better, or are we trying to maintain the maximum possible compatibility with previous versions (subject to hardware constraints)? On those CPUs that support “high vectors” – where the processor vectors can be relocated to logical addresses from &FFFF0000 upwards – it would theoretically be possible to move everything out of zero page and to completely unmap it in all modes.
For the record, this applies to some v4 and v5 CPUs (the kernel would have to work this out at run time) and all v6 and v7-A CPUs. So what would need changing? The 32-bit memory map was designed to leave the top 64K free for this very purpose, so nothing needs relocating. Fortunately all access to the processor vectors is already abstracted through OS_ClaimProcessorVector, so that’s a nice localised change. Everything that uses IRQsema would need to use OS_ReadSysInfo instead. All modules with FIQ handlers would need to be changed to check with the kernel what address to install the handler at – and possibly need some changes to the FIQ handler itself (remember that addresses &1C-&FC all have the nice feature for ARM code that they are valid immediate constants). You could consider some sort of abort fixup code to allow compatibility with a whitelisted set of modules – but since FIQ handlers are by their nature rather speed sensitive, this would only be useful for users of IRQsema. I wonder what people’s opinions would be on making such a change, bearing in mind the pros and cons? |
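To illustrate the remark about immediate constants: a classic ARM data-processing instruction can only encode an immediate that is an 8-bit value rotated right by an even number of bits. The sketch below checks that rule in C. The function names are made up for illustration, and &FFFF001C is only an assumed example of where the FIQ vector would sit with high vectors enabled (base &FFFF0000 plus the usual &1C offset).

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n taken modulo 32). */
uint32_t rol32(uint32_t value, unsigned n)
{
    n &= 31u;
    return n ? (value << n) | (value >> (32u - n)) : value;
}

/* Returns true if 'value' can be written as an ARM data-processing
   immediate: an 8-bit constant rotated right by an even amount.
   Undoing each even rotation (by rotating left) must leave a value
   that fits in 8 bits. */
bool arm_imm_encodable(uint32_t value)
{
    for (unsigned rot = 0; rot < 32u; rot += 2u) {
        if (rol32(value, rot) <= 0xFFu)
            return true;
    }
    return false;
}
```

With this check, &1C and &FC both pass, while an address such as &FFFF001C does not, so a handler that builds its own addresses with a single MOV or ADD immediate would probably need a register load or a PC-relative calculation instead.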
Peter Naulls (143) 147 posts |
I think it’s imperative that it be done, and in many ways it’s highly embarrassing (not that RISC OS doesn’t have many things like that ;-) that applications can still readily and validly read from low memory. Not to mention a vast amount of developer time wasted on chasing things that should have stopped immediately. I regularly use “Prot1k”, and the only thing that obviously breaks is ShareFS, although no doubt there are some applications that would also. A white list in that case might be a partial solution, but at the same time, it could much more quickly identify subtle memory problems in those applications. Bear in mind that a null pointer dereference is probably one of the most common programming errors. There was some umming and ahhing over this when the Iyonix came out, and some somewhat specious arguments that there was some valid purpose for applications very occasionally to read low memory, but this didn’t hold any water. Obviously, modules are a different matter, and such changes would be nice to work towards, but I don’t think they have the same urgency. So, IMO, all that’s immediately required is the equivalent of Prot1k (and perhaps later, more of lower memory), fixing of ShareFS, and checking of a bunch of major apps. I just wish this had been done around the time of RISC OS 4. It would, I think, have spared much grief. |
Steve Revill (20) 1361 posts |
I agree that making it altogether inaccessible is the way to go. Let’s see what breaks and fix it! The Iyonix and Beagle would be good testbeds for this because we can relocate their hardware vectors to a non-zero page location and map-out zero page for all modes. |
Jeffrey Lee (213) 6048 posts |
I had a bit of spare time over the weekend so I started to look into this. At the moment my aim is to relocate the first 16K of memory to &FFFF0000 (I’m ignoring the 16K of scratch space for now), and to have it controlled at kernel compile time instead of dynamically selected at runtime (just to keep things simple for this first version). As mentioned above, any code that uses IRQsema will need updating to read the location using OS_ReadSysInfo 6, and the FIQ claim mechanism will probably need updating to indicate the location of the FIQ vector. But what about the other kernel public workspace values that lie within the first 16K? Does anyone know if they’re purely ROM-internal things or whether any external/third party code makes use of them? |
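As a rough sketch of the lookup-with-fallback pattern suggested earlier in the thread (ask OS_ReadSysInfo 6 for item 23, and use &108 if that fails), here is a host-testable C model. The SWI itself is not called: `sysinfo6_reader` is a hypothetical callback standing in for it, the two reader functions are test doubles, and &FFFF0108 is just an illustrative relocated address. Treating a returned address of zero as failure is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LEGACY_IRQSEMA_ADDR 0x108u /* fixed zero-page address from the thread */
#define SYSINFO6_IRQSEMA    23u    /* OS_ReadSysInfo 6 item number for IRQsema */

/* Hypothetical stand-in for the SWI: returns true and stores the address
   on success, false if the kernel does not know the item. */
typedef bool (*sysinfo6_reader)(uint32_t item, uint32_t *address_out);

/* Ask the kernel where IRQsema lives; fall back to the legacy address. */
uint32_t locate_irqsema(sysinfo6_reader read_item)
{
    uint32_t address = 0;
    if (read_item != NULL
        && read_item(SYSINFO6_IRQSEMA, &address)
        && address != 0) {
        return address;
    }
    return LEGACY_IRQSEMA_ADDR;
}

/* Non-reentrancy check described in the thread: a nonzero IRQsema means
   the interrupt stack is threaded. */
bool irq_stack_threaded(const volatile uint32_t *irqsema)
{
    return *irqsema != 0;
}

/* Test double: an old kernel that does not support the call. */
bool old_kernel_reader(uint32_t item, uint32_t *address_out)
{
    (void)item;
    (void)address_out;
    return false;
}

/* Test double: a kernel that reports a relocated (made-up) address. */
bool new_kernel_reader(uint32_t item, uint32_t *address_out)
{
    if (item != SYSINFO6_IRQSEMA)
        return false;
    *address_out = 0xFFFF0108u;
    return true;
}
```

A module would do the lookup once at initialisation and cache the result, so the check in each SWI handler stays a single load and compare.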
Colin Ferris (399) 1818 posts |
Hi- I wonder if this has anything to do with this – I have been using a program called ‘Protect’ by Andreas Feldner 1996 with my RPC/Iyonix which has three choices |
Jeffrey Lee (213) 6048 posts |
Yeah, that’s the kind of thing that I’m looking for. Cheers! |
Sprow (202) 1158 posts |
You’re brave! Things I’ve looked at recently would be LanManFS (I think I expunged cheeky zero page reads since there was a SWI equivalent) and SpriteExtend (much zero page reading of VDU workspace). I suppose if you’re mainly concerned about 3rd party app breakage then an intermediate step would be to assemble the ROM with PublicWS high and not worry about ROM modules. |
Jeffrey Lee (213) 6048 posts |
I probably wouldn’t be doing this if I didn’t have JTAG. I may be brave, but I’m certainly not foolish ;-) A quick search through the OMAP ROM source code suggests that the following modules access the exported kernel values:
That’s not too bad, considering all the hidden kernel bits that will need fixing. The only one that will cause any real trouble is the Debugger, since it uses MOV PC,#DebuggerSpace+offset as the breakpoint instruction. The easiest solution to that would probably be to map in a page at &1000 (or &2000, &3000) specifically for the debugger to use. Or maybe switch to using the BKPT instruction if we don’t mind the fact it’ll clobber some of the abort registers. |
Jeffrey Lee (213) 6048 posts |
Some issues that will need to be dealt with:
The kernel is now in the state where it should compile and run, so once I’ve fixed up a couple more modules I should be able to build a bare-bones ROM image and fix any mistakes. |
Colin Ferris (399) 1818 posts |
Is there an example of how to replace the PEEK at &FF8 from !DDT module? |
Jeffrey Lee (213) 6048 posts |
I haven’t looked at the disassembly, but I suspect it would be difficult to fix it just by patching the binary. It would involve either changing it to use the new address of &FFFF0FF8 (which would render the binary useless until you get my kernel/ROM changes), or changing it to use Wimp_ReadSysInfo 5 (which would probably require some registers to be preserved across the SWI call, and so taking up increased binary space). I’ll try and make sure that ROOL have released a fixed version before I check in and enable all my kernel changes. I spent most of the weekend trying to get the kernel running. It’s now at the point where a near-complete ROM image can make it into the desktop and run some software, although some of the ROM apps crash and there’s no FileCore yet so I can’t really test much else. Sometime soon I’ll be able to start checking in some of my changes. It’ll be the simple stuff first – fixing modules which had null pointer dereferences, and making things use OS_ReadSysInfo 6 to look up kernel addresses. Kernel changes will have to wait until I’ve done some more testing (and checked that I haven’t broken the option to keep zero page at &0!). Plus I obviously won’t be able to enable the zero page relocation until ROOL are able to make fixed versions of any closed-source components (ShareFS, DDT, etc.) |
nemo (145) 2552 posts |
Is there any hope of a compatibility mode? Duplicate read-only mapping to the usual address plus suppressed write exceptions? Just a thought. |
Jeffrey Lee (213) 6048 posts |
It’s certainly possible. However I’m expecting most of the breakage to come from programs which are reading from null pointers. Once I’ve got the ROM running properly I should be able to get a better idea of how many third-party programs are affected. |
Andrew Rawnsley (492) 1445 posts |
I’d echo the hope that this is something that could be an option for applications (like exceptions on/off for ARMv7). I can see this being really useful for testing code, but you can be sure that older programs will have problems, and the source for them is unlikely to be around for fixing :( |
Trevor Johnson (329) 1645 posts |
But there’s some support for an emulator, which would help a lot wouldn’t it? |
Andrew Rawnsley (492) 1445 posts |
Possibly, but I know there are 32bit apps which work on ARMini which aren’t specifically ARMv7 safe/compiled. For example, our HTMLEdit software works fine, but I haven’t (yet) re-compiled it with the necessary compiler flags set. I would imagine a fair few other programs fall into the same boat, and anything that directly reduces the existing levels of compatibility (even if that is accidental compatibility) should probably be optional. The last thing RISC OS needs is fewer compatible applications :) |
nemo (145) 2552 posts |
I’d have deleted this if I were allowed. |
Jeffrey Lee (213) 6048 posts |
I’ve now got everything except ShareFS (which is closed-source) and RTSupport working. Once I’ve got RTSupport working I’ll probably upload a ROM image for people to try out (it’ll save me from doing all the testing, at least ;-)). Then I should be able to concentrate on tidying up the last few loose ends so I can check in the changes. |
Trevor Johnson (329) 1645 posts |
Nice one! Was the JTAG as necessary as you thought it’d be? |
Jeffrey Lee (213) 6048 posts |
I’d say it’s been pretty much invaluable. Most of the time I was only using it to look at the registers and stack, which I could have done without the use of JTAG, by writing abort handlers that just dump the data to the serial port. But there were also a few cases where I used it to step through code, which would have been impossible to do otherwise. Hopefully the kernel and most of the modules will just work the first time I try it on the Iyonix – the difference between an ARMv3-ARMv5 kernel and an ARMv6/ARMv7 kernel is basically just one file. |
rob andrews (112) 200 posts |
Hi Jeffrey, is this true of the kernel for the new A9 CPUs? |
Jeffrey Lee (213) 6048 posts |
Assuming you’re only interested in using one core, I think the current kernel will work on an A9 with only minimal changes. I managed to get RTSupport working, so here is a test image for people to try out. Remember that there’s no ShareFS, and there’s a nasty bug in VProtect so you’ll either have to patch your copy or remove it from the boot sequence (it gets loaded by !Boot.Utils.BootRun). I haven’t tried the ROM image on anything much beyond the boot sequence and ROM apps, so feel free to throw everything you’ve got at it and find out how many crashes you get ;-) |
Jeffrey Lee (213) 6048 posts |
This week I’m focusing on getting everything into a state where it can be checked in. Specifically I’m aiming to get all the module changes checked in, so that all I’m left with is the kernel, which I can then check in a while later once I’ve given it some more testing. However as part of getting the kernel ready I/we need to decide what to do about the FIQ claim mechanism. There are two problems that need solving: how to tell programs where the FIQ vector is located, and how to stop old programs from crashing the machine by claiming the vector and then crashing when trying to write their code to zero page. For the first problem, I’m thinking the best solution would be to add a new flag to either OS_PlatformFeatures 0 or OS_ReadSysInfo 8 that indicates whether high processor vectors are in use. (OS_PlatformFeatures would probably make the most sense for this, since the current flags deal almost entirely with aspects of exception/interrupt handling). The other solution I can think of would be to expose the vector location and length via OS_ReadSysInfo 6, but that might be a bit of overkill since I don’t think there’s any reason why we’d need to move/expand/shrink the vector in the future. For the second problem, things are a bit trickier. The current FIQ claim mechanism is based around service calls (Service_ClaimFIQ & friends), but apart from the service call number there aren’t any registers defined for passing parameters. So we can either add a new trio of service calls which the kernel will respond to when high processor vectors are in use (although really only the two claim calls need replacing), or specify that if a program is trying to claim the high FIQ vector it must set (e.g.) R2 to a magic value of &48494748 (“HIGH”). Although I’m not a fan of using magic values to extend APIs, it seems like it’s a sensible way of handling it in this case, especially since R2 should be being preserved by any other modules listening in on the call. Any thoughts? |
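For anyone wondering where the proposed magic value comes from, the small C sketch below packs the four characters of “HIGH” into a word, most significant byte first, and shows the kind of comparison a claimant check might make. This only models the proposal in the post above; the function names are invented for illustration and nothing here is an existing RISC OS API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Proposed marker value from the post above: "HIGH" as hex digits. */
#define FIQ_CLAIM_HIGH_MAGIC 0x48494748u

/* Pack four characters into a 32-bit word, first character in the
   most significant byte (the order in which the hex value is read). */
uint32_t pack_tag(const char tag[4])
{
    return ((uint32_t)(unsigned char)tag[0] << 24)
         | ((uint32_t)(unsigned char)tag[1] << 16)
         | ((uint32_t)(unsigned char)tag[2] << 8)
         |  (uint32_t)(unsigned char)tag[3];
}

/* Hypothetical check: does the R2 value passed with the claim mark
   the caller as aware of the high FIQ vector? */
bool claim_targets_high_vector(uint32_t r2)
{
    return r2 == FIQ_CLAIM_HIGH_MAGIC;
}
```

An old module that leaves R2 holding whatever it happened to contain would almost certainly fail this comparison, which is the point of using a marker rather than a simple flag bit.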
Ben Avison (25) 445 posts |
I’m not too fussed which API you choose for signalling that the CPU is configured for high vectors mode. However, I’m confused about your thinking for the FIQ claim service calls. The high and low FIQ vectors aren’t separate devices you might want to claim, it’s the same resource (of which there is only one) but whose address is no longer fixed. There’s no sense in which “the kernel responds” to the service call. It’s a negotiation between multiple driver modules as to which one of them gets to have use of FIQ facilities for a time. For example on old IOMD hardware, the Econet module owned FIQs most of the time in order to handle incoming network traffic, but while a floppy disc operation was in progress, ADFS would grab control of FIQs, releasing them again as soon as it had finished. If your thinking was that an attempt to claim FIQs at the old address would simply never be granted, then there’s a problem – Service_ClaimFIQ (as opposed to Service_ClaimFIQInBackground) is basically defined to block until FIQs have been released. So any caller of Service_ClaimFIQ is likely to just go ahead and start writing over zero page as soon as the service call returns anyway. Yes, it will likely be doing so with FIQs disabled – but then I believe the abort handler forcibly re-enables FIQs, so such an action shouldn’t crash the machine. |