Any FileCore experts in the house?
Jeffrey Lee (213) 6048 posts
I’m looking into the bug Chris Gransden reported with the OMAP build, where attempting to create a RAM disc causes an abort. The problem code is line 899 of FileSys.FileCore.s.FormSWIs – Density is a single-byte value at offset #3 of the disc record, but LDR is used to load it, causing an abort when alignment exceptions are enabled.

Although I can easily fix the abort by changing it to LDRB, allowing the RAM disc to be created, clicking on its icon to open the directory viewer results in a “broken directory” error. But if alignment exceptions are turned off, or if I change the code to emulate the ARMv5 behaviour (i.e. LDREQ LR, [R5]), then everything works fine.

This makes me wonder what the correct course of action should be. The RISC OS 3.6-ish initial version of FormSWIs has the same code, so for 15 years or more LayoutFreeSpace has had the behaviour of never running the “has boot block and need to faff with that situation” code (since the unaligned LDR would load Density, SectorSize, SecsPerTrack, and Heads – and it would be a very strange disc indeed for all 4 of those values to be zero).

So should we try to fix FileCore to properly handle discs with 1 zone and a density of 0 (whatever that “proper” handling should be), or should we force it to always follow the alternate behaviour? Like I said, I haven’t checked what the cause of the broken directory error is, but I’d guess that at least part of the problem is that RamFS uses a disc record with a density of 0 – but what if other filesystems make the same mistake and end up breaking (due to assumptions they make about the disc layout) if FileCore suddenly starts treating the discs differently?
Ben Avison (25) 445 posts
In the seeming absence of any experts, here’s my 2p… I’ve been poring over the code for a few hours.

The problem seems to occur in the code that sets up the disc map – specifically, the allocation bytes of each map block – during a format operation. In particular, it has to do with the initial allocation of the disc object with fragment id 2, which reserves the disc space for the map itself, for the boot block (except on single-zone floppies, which don’t have a boot block), and (on pre-long-filenames discs) the root directory.

On pre-long-filenames discs, the caller has to initialise the disc record with the offset within disc object 2 which corresponds to the root directory – although the formatting code appears to assume the root directory is placed as soon after the map as possible. On long-filenames discs, the caller has to provide a fragment id for the root directory, which the format code places in a separate disc object directly following the fragment of disc object 2 that contains the map.

The boot block is always 3K into zone 0, whereas the map starts at the beginning of zone (nzones/2). This means the format code has to be able to initialise map blocks that cover the map, or the boot block, or neither, or (specifically when nzones=1, on a hard disc) both. It is in determining whether it is a hard disc – by loading the “density” parameter from the disc record and checking whether it is 0 – that the problem arises.

It would seem this code has been broken as far back as records go, and that the case where both the map and the boot block are in the same zone is never executed unless you deliberately set the neighbouring bytes of the disc record to 0 (which is probably an invalid disc record for other reasons). Instead, if you format a single-zone hard disc image, you’re in danger of not protecting the boot block. It could end up in the free space or, if you’re lucky, in the middle of the root directory – either way it could be overwritten. In practice, though, as long as you’ve used a format where (idlen+1)*LFAU is 4K or more, you’re saved by the granularity – in such cases, the fragment containing the map cannot be small enough to exclude the boot block.

The implementation of the not-taken branch is very simple – it just uses a constant value. However, I think it is just being conservative: if the sector size is 1024 bytes (which FileCore supports), then you can’t fit both the twin copies of the map and a pre-long-filenames root directory in before the boot block – so it makes sure the fragment is large enough to fit a pre-long-filenames root directory after the boot block instead. You could try being more intelligent about it, but I doubt it’s worth the effort to save at most 2K of disc space.

So, I would argue the correct fix from FileCore’s point of view is to replace the LDR on line 899 of s.FormSWIs with an LDRB. However, that doesn’t explain why you’re seeing a broken directory when this fix is applied – I’ve not tried looking into that yet.
Dave Higton (281) 668 posts
Sorry, I’ve been unable to find Chris Gransden’s original report (a pointer would be welcome), but there’s an interesting observation: for me, and for John Chetwynd, a RAM disc set by Configure seems to work (and in my case, I’ve been trying to break a directory this evening, without success), but dragging the size in the task manager instantly causes a crash if alignment exceptions are on. What sequence of operations causes a broken directory?
Jeffrey Lee (213) 6048 posts
The original report was here. Good job I remembered which thread it was in :)
If you edit the source as described in my post – change line 899 of FileCore.s.FormSWIs to LDRB instead of LDR – and then (IIRC) create a RAM disc through the task manager (which originally would have just aborted), you should get the broken directory error on opening it.
Dave Higton (281) 668 posts
So, once the LDR has been changed to LDRB, it appears that RAMFS only malfunctions when the slider in the Task Manager is used to resize the RAMFS allocation (including from zero to non-zero). No-one seems to be able to cause a malfunction when the size is fixed non-zero on boot-up and then left alone. It’s the dragging that causes the problem. Correct?
Dave Higton (281) 668 posts
In search of what the drag does, I issued the command:
where the size was previously 20480 KiB. There was an immediate crash. |
Jeffrey Lee (213) 6048 posts
“So, once the LDR has been changed to LDRB, it appears that RAMFS only malfunctions when the slider in the Task Manager is used to resize the RAMFS allocation (including from zero to non-zero). No-one seems to be able to cause a malfunction when the size is fixed non-zero on boot-up and then left alone. It’s the dragging that causes the problem. Correct?”

Yes, I think that’s all correct. Although I think everyone’s been focusing on trying to understand what FileCore’s trying to do, rather than investigating the different situations in which RAM disc creation does or doesn’t crash.
Martin Bazley (331) 379 posts
Could this bit, at the beginning of OS_ChangeDynamicArea, be relevant?
Even if R10 ! #CDASemaphore is a word-aligned address, R10 ! (#CDASemaphore-1) most certainly won’t be. That may not be the only problem, of course.
Jeffrey Lee (213) 6048 posts
Nope, that code’s perfectly fine. Look at the value of R10 it’s using as the base register :) The crash Dave was seeing when dragging the slider or using the ChangeDynamicArea *command would probably have been due to the dodgy code in FileCore (since that’s what I was doing when I found the code in the first place).
Trevor Johnson (329) 1645 posts
I don’t know whether this should be noted here, in the ‘RamFSFiler Icon’ thread, or in a new thread. Can anyone verify the outcome(s) below on their system? (I guess it won’t be exhibited on the Iyonix.)
1. Edit: It’s labelled ‘pause’ rather than ‘pause/break’.
Sprow (202) 1158 posts
I’ve cast an eye over this problem this morning and it’s a fun one. With a handy file full of zeroes (on a FileSwitch FS!) the problem is triggered at the boundary from 960k to 964k. A variation on this theme was fixed in RAMFS in 2002, and this is the point where the disc transitions from 1 zone to 2 zones.

With 2 zones, the root directory is placed in the 2nd zone and all is well and good regardless of the LDR/LDRB fix. With 1 zone, FileCore places the map at the start, then there’s some broken logic to place a boot block (because density = 0), but the free space map doesn’t account for this, and then for good measure the root directory is written over the boot block.

On attempting to mount the disc it is incorrectly identified as a floppy, and then the root isn’t where it was claimed to be. If you comment out the floppy recognition code it still fails, because the boot block got overwritten. The free space calculation looks a bit dodgy too when doing ‘FREE’.

This is a long-filenames thing – a special edit of RAMFS to force it to use new map/new dir only works fine. The PRM says E format floppies with 1 zone have no boot block. So there are 3 problems, assuming RAM discs are ‘fixed’ not ‘floppy’.