OS_GBPB enumerating directories
Colin (478) 2433 posts |
I never thought enumerating directories with OS_GBPB would surprise me anymore but while investigating why LanManFS left files when deleting directories I discovered that arithmetic is done on the r4 value returned – I had always thought the value was opaque. I’ve not seen any documentation to the contrary – I’m sure someone will tell me if it’s common knowledge. The basic enumeration scheme goes like this:
That prints out the directory but if you want to delete the files instead you need
Which means the offset needs to be treated as an index into the directory, and the index of the next item you know (offset) will change by -1 for every prior object that is deleted. Both Wipe and FilerAct do this. I’ve not seen it documented anywhere so thought it may be of interest. I had the idea of using the offset value as a context pointer for LanManFS – it didn’t work and this is why. |
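A minimal sketch of the behaviour described above, written as a Python model rather than real RISC OS code. It assumes a FileCore-like filing system in which the offset is simply an index into a sorted list; `ModelDir`, `read_entries` and `delete_all` are hypothetical names made up for illustration.

```python
class ModelDir:
    """Toy directory whose enumeration offset is a plain index (an assumption)."""

    def __init__(self, names):
        self.names = sorted(names)

    def read_entries(self, offset, count):
        """Mimic the shape of an OS_GBPB read: return (names, next_offset)."""
        chunk = self.names[offset:offset + count]
        next_offset = offset + len(chunk)
        if next_offset >= len(self.names):
            next_offset = -1  # -1 signals the end of the directory
        return chunk, next_offset

    def delete(self, name):
        self.names.remove(name)


def delete_all(directory, count, adjust):
    """Delete every entry; 'adjust' applies the Wipe/FilerAction-style correction."""
    offset = 0
    while offset != -1:
        chunk, offset = directory.read_entries(offset, count)
        for name in chunk:
            directory.delete(name)
            if adjust and offset != -1:
                offset -= 1  # each deleted earlier entry shifts later ones down by one
    return directory.names


print(delete_all(ModelDir("abcdef"), 2, adjust=False))  # some entries survive
print(delete_all(ModelDir("abcdef"), 2, adjust=True))   # nothing survives
```

Without the adjustment, the model skips entries because the list shifts underneath the stored offset. With it, every entry is deleted, but only because the offset really is an index here.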
Steve Fryatt (216) 2107 posts |
I believe that this is a known “feature” (or bug, depending on preference) of FilerAction. It breaks Acorn’s own documentation and whilst it does work on Filecore-based systems, there are a number of other systems where deleting files can have interesting and not-completely-expected (from the user’s point of view) consequences. Or, in other words, the value is opaque, and several non-Acorn FSs treat it as such, but Acorn used “insider knowledge” when writing FilerAction and then departed before seeing the problems that this caused. |
Jeffrey Lee (213) 6048 posts |
I’d always thought that R4 was an opaque value as well. But the PRM only describes it as an “offset” which starts from 0 and ends at -1, so I guess some people inside Acorn interpreted it as being a numerical index. There have been a few other threads on the forum casting doubt on FilerAction’s use of R4, and Steve Fryatt claims newer PRMs have clear warnings that R4 is meant to be opaque – but I can’t find those newer PRMs myself (Nothing obvious in the DDE’s PRM2 or PRM5A, and nothing obvious in ROL’s online RO 3 or RO 6 PRMs). The StrongHelp manuals do contain a warning, but something more official would be nice. |
Colin (478) 2433 posts |
Yes. Basically the value returned to the FS becomes meaningless unless you know what Wipe and deleting via FilerAct do. I have a cunning plan though – funny how writing something down gives you an idea. |
Steve Fryatt (216) 2107 posts |
Neither can he, at present… The ROOL one does, and has done so since 2009, according to the Wiki history, so that might have been the one that I was thinking of at the time. All of the PDFs that I can find state “offset”, but I suspect they’re all the same edition. ROL’s HTML version says “offset”, too. To be fair, “don’t read anything into the values from R4” has been something that I’ve “just known” for as long as I’ve been using OS_HeebieJeebie, which might be how all good myths start. |
Jeffrey Lee (213) 6048 posts |
Looks like it’s just a copy & paste from the StrongHelp manual. |
Colin (478) 2433 posts |
That got through that hurdle. I just put a handle id in the high bits of offset and faked the increasing index in the low bits then just ignored the index. Worked ok deleting the pi sources. |
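The packing trick in the post above can be sketched as follows. This is a Python illustration only: the 16-bit split and the function names are assumptions, not details given in the post.

```python
INDEX_BITS = 16  # assumed split between handle and index; not from the post
INDEX_MASK = (1 << INDEX_BITS) - 1


def pack_offset(handle, index):
    """Combine a handle id (high bits) with a fake running index (low bits)."""
    return (handle << INDEX_BITS) | (index & INDEX_MASK)


def unpack_handle(offset):
    """Recover the handle and ignore the index, whatever the caller did to it."""
    return offset >> INDEX_BITS


offset = pack_offset(handle=5, index=10)  # pretend 10 entries were returned so far
offset -= 3                               # caller deleted 3 of them and adjusted
print(unpack_handle(offset))              # prints 5
```

The handle survives only while the fake index is at least as large as the number of subtractions, which is why the index has to keep increasing. In a 32-bit register the handle would also need to leave the top bit clear, so that a packed value can never be mistaken for -1.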
Martin Avison (27) 1497 posts |
After discovering that r4 is affected by deletes and seeing it is ‘opaque’, I now read the whole directory into memory, then go through memory doing any deletes. It would be nice if the PRM could be clarified about GBPB and r4. |
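The read-everything-first approach described above can be modelled like this. It is a Python sketch using a toy index-based `read_entries` (a hypothetical stand-in for the real call) and makes no claim about how any actual program implements it.

```python
def read_entries(names, offset, count):
    """Toy enumeration: return (chunk, next_offset), with -1 at the end."""
    chunk = names[offset:offset + count]
    next_offset = offset + len(chunk)
    return chunk, (-1 if next_offset >= len(names) else next_offset)


def snapshot_then_delete(names, count=2):
    """Enumerate the whole directory first, then delete from the snapshot."""
    snapshot, offset = [], 0
    while offset != -1:
        chunk, offset = read_entries(names, offset, count)
        snapshot.extend(chunk)
    for name in snapshot:
        names.remove(name)  # nothing is being enumerated while we delete
    return names


print(snapshot_then_delete(list("abcdef")))
```

Because no delete happens during the enumeration, the offset is only ever passed back unchanged, so it can stay opaque.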
Colin (478) 2433 posts |
It doesn’t matter what the filesystem does, Wipe and FilerAction will remove 1 from the next offset value for every file in the list of returned filenames that it deletes. So other filesystems must be working around that in some way. It probably only works properly for filecore – the rest just ignore the value. I think the Wipe and FilerAction behaviour needs standardising – I don’t think we can change that at this point. So we need this added to the OS_GBPB documentation
Sounds like a recipe for disaster to me. And while we are at it
Just in case you are inclined to just give the offset a big value as you are going to ignore it anyway. FilerAction says it fixes a problem with broken archives with that one. Or better still deprecate it and give us a new system where you can open and close the directory so that Filing systems can hold state and know when you are finished with the directory. At present if you abort a search the filing system doesn’t know you are finished with the ‘offset’. If the search doesn’t find anything the filing system returns -1 so it knows you are finished so can tidy up. |
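To illustrate the open/close idea suggested above, here is a purely hypothetical Python model. No such interface exists in the source; `ModelFS`, `open_dir`, `read_dir` and `close_dir` are invented names used only to show why an explicit close helps the filing system.

```python
class ModelFS:
    """Toy filing system that keeps per-handle enumeration state."""

    def __init__(self, names):
        self.names = list(names)
        self.open_handles = {}  # handle -> read position
        self.next_handle = 1

    def open_dir(self):
        handle = self.next_handle
        self.next_handle += 1
        self.open_handles[handle] = 0
        return handle

    def read_dir(self, handle, count):
        pos = self.open_handles[handle]
        chunk = self.names[pos:pos + count]
        self.open_handles[handle] = pos + len(chunk)
        return chunk

    def close_dir(self, handle):
        del self.open_handles[handle]  # the FS now knows the caller has finished


def first_match(fs, wanted):
    """Search that may abort early; the handle is released either way."""
    handle = fs.open_dir()
    try:
        while True:
            chunk = fs.read_dir(handle, 2)
            if not chunk:
                return None
            if wanted in chunk:
                return wanted
    finally:
        fs.close_dir(handle)


fs = ModelFS("abcdef")
print(first_match(fs, "c"), fs.open_handles)
```

In this model an aborted search still reaches `close_dir`, so the filing system never has to guess when its state can be discarded.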
Rick Murray (539) 13861 posts |
I don’t think the PRMs explicitly state this, but they hint pretty strongly at this when they describe that there are only two guarantees: 1, that R3 is the number of entries read (which can be zero!) and: 2, that R4 is the number to provide the next time, or -1 when done. I can imagine that FileCore uses some sort of index offset (so it probably looks like an incrementing number) as the entries are generally returned in alphabetically sorted order.
That makes sense. Deleting multiple entries of a directory that you’re currently reading, it’s bound to go wrong.
It doesn’t actually state anywhere that there is a sequence. That there is…is an assumption. ;-) Likewise, I’ve yet to see a filesystem actually return zero entries (given a large enough buffer to hold longer filenames). However, that it is documented suggests that some eccentric filesystem may in fact do so. |
Rick Murray (539) 13861 posts |
Probably a good thing nemo isn’t here.
No, they probably either fault, round to the next valid entry, or end up skipping things because of this. Just put this into practice:
>LOAD "_scandir"
>LIST
10 DIM buf% 128
20 d$ = "$.!Boot"
30
40 offset% = 0
50 REPEAT
60   SYS "OS_GBPB", 11, d$, buf%, 1, offset%, 128, "*" TO ,,,read%, offset%
70   IF ( (offset% <> -1) AND ( read% <> 0 ) ) THEN
80     SYS "XOS_GenerateError", (buf% + 29) TO fn$
90     PRINT fn$+", "+STR$(offset%)
100   ELSE
110     PRINT "Returned zero entries / end of list."
120   ENDIF
130 UNTIL (offset% = -1)
140 END
150
>RUN
!Boot, 1
!Help, 2
!Run, 3
Backup, 4
BootLog, 5
Choices, 6
Library, 7
Loader, 8
Log, 9
Resources, 10
RO310Hook, 11
RO350Hook, 12
RO360Hook, 13
RO370Hook, 14
RO400Hook, 15
RO500Hook, 16
RO510Hook, 17
RO520Hook, 18
Themes, 19
UKSA_mod, 20
Utils, 21
Returned zero entries / end of list.
>20d$ = "$.!Boot.Loader"
>RUN
MLO, 2
U-BOOT/BIN, 3
uEnv/txt, 5
USER/TXT, 6
RISCOSXM, 7
CONFIG/TXT, 8
LICENCE/BROADCOM, 11
RISCZP/IMG, 12
Pi 1 firmware, 14
Pi New firmware, 17
START/ELF, 18
FIXUP/DAT, 19
BOOTCODE/BIN, 20
CMOS, 22
GDRISCOS/IMG, 23
C17-7/TXT, 24
RORC14/IMG, 25
Pi2 firmware, 27
CMDLINE/TXT, 28
riscos525/img, 31
riscos525o/img, 34
LOST/DIR, 35
wimphackRISCOS/IMG, 38
RISCOS/IMG, 39
Android, 41
_RISCZP/IMG, 42
RISCOSZP/NEW, 43
Returned zero entries / end of list.
>
Note the Filecore entries all nicely in order. Note the DOSFS entries also in order, but with missing numbers. |
Rick Murray (539) 13861 posts |
This is what happens on DOSFS if you blindly count up from zero until you get -1 returned…
*BASIC
ARM BBC BASIC V (C) Acorn 1989
Starting with 4190460 bytes free
>CHAIN "_scandir2"
MLO, 1
MLO, 2
U-BOOT/BIN, 3
uEnv/txt, 4
uEnv/txt, 5
USER/TXT, 6
RISCOSXM, 7
CONFIG/TXT, 8
LICENCE/BROADCOM, 9
LICENCE/BROADCOM, 10
LICENCE/BROADCOM, 11
RISCZP/IMG, 12
Pi 1 firmware, 13
Pi 1 firmware, 14
Pi New firmware, 15
Pi New firmware, 16
Pi New firmware, 17
START/ELF, 18
FIXUP/DAT, 19
BOOTCODE/BIN, 20
CMOS, 21
CMOS, 22
GDRISCOS/IMG, 23
C17-7/TXT, 24
RORC14/IMG, 25
Pi2 firmware, 26
Pi2 firmware, 27
CMDLINE/TXT, 28
riscos525/img, 29
riscos525/img, 30
riscos525/img, 31
riscos525o/img, 32
riscos525o/img, 33
riscos525o/img, 34
LOST/DIR, 35
wimphackRISCOS/IMG, 36
wimphackRISCOS/IMG, 37
wimphackRISCOS/IMG, 38
RISCOS/IMG, 39
Android, 40
Android, 41
_RISCZP/IMG, 42
RISCOSZP/NEW, 43
Returned zero entries / end of list.
> |
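The two DOSFS listings are consistent with a filing system that rounds any offset up to the next valid entry. A small Python model of that guess, using the first four names and offsets from the output above; the multi-slot interpretation and the helper names are assumptions, not something DOSFS is known to do.

```python
# (name, end_slot): in this model an entry covers every slot below end_slot
# that is not covered by an earlier entry. Values are taken from the listing.
ENTRIES = [("MLO", 2), ("U-BOOT/BIN", 3), ("uEnv/txt", 5), ("USER/TXT", 6)]


def read_one(offset):
    """Return (name, next_offset) for the first entry ending after 'offset'."""
    for name, end_slot in ENTRIES:
        if end_slot > offset:
            return name, end_slot
    return None, -1  # nothing left: zero entries read, end of list


def follow_returned_offsets():
    """Enumerate properly, feeding each returned offset back in unchanged."""
    seen, offset = [], 0
    while offset != -1:
        name, offset = read_one(offset)
        if name is not None:
            seen.append(name)
    return seen


def count_blindly(limit):
    """Enumerate by assuming offsets are 0, 1, 2, ... as in _scandir2."""
    return [read_one(offset)[0] for offset in range(limit)]


print(follow_returned_offsets())
print(count_blindly(6))
```

Feeding back the returned offset visits each name once, while counting blindly repeats any name that covers more than one slot, which matches the pattern of duplicates in the second listing.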
André Timmermans (100) 655 posts |
Well, since we are deleting the objects, a correct solution would be to reset offset to 0 when nread <> 0.
I have already seen that behaviour when asking for a single entry at a time. Probably a version of DOSFS or FAT32FS, because IIRC it was returning nread 0 and an “offset” corresponding to a deleted entry in the directory map. Which is probably what Rick sees with his _scandir2 program. |
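The suggestion of resetting the offset to 0 whenever entries were read (and deleted) can be sketched as below. This is a Python model with a toy `read_entries`; the names are hypothetical and every delete is assumed to succeed.

```python
def read_entries(names, offset, count):
    """Toy enumeration: return (chunk, next_offset), with -1 at the end."""
    chunk = names[offset:offset + count]
    next_offset = offset + len(chunk)
    return chunk, (-1 if next_offset >= len(names) else next_offset)


def delete_with_restart(names, count=2):
    """Delete everything, restarting from offset 0 after each non-empty read."""
    offset = 0
    while offset != -1:
        chunk, offset = read_entries(names, offset, count)
        for name in chunk:
            names.remove(name)
        if chunk:
            offset = 0  # never reuse an offset issued before the deletes
    return names


print(delete_with_restart(list("abcdef")))
```

The offset is never modified arithmetically, so it can remain opaque. A real implementation would need to cope with entries that cannot be deleted, otherwise restarting from 0 would loop forever.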
Chris Hall (132) 3566 posts |
It doesn’t actually state anywhere that there is a sequence.
For Filecore the sequence is alphabetically sorted as the directory entries are required to be held on disc sorted. However it is not clever enough to use an insertion sort when creating new entries. My NAS clearly does not hold files in any easily predictable order as !Cat returns them in an odd order. Saving the ‘Cat’ file automatically sorts them.
This is what happens on DOSFS if you blindly count up from zero until you get -1 returned
CP/M used to use more than one directory entry for files longer than sixteen ‘chunks’ (an 8 bit disc address) |
Colin (478) 2433 posts |
Just in case someone reads this
may miss the last entry as
Darn. Thanks for pointing that out. It makes things difficult and I think scuppers my idea for holding a handle in offset. I was using offset=0 to ‘open’ the directory and when -1 is returned I’d close releasing resources. I figured I’d get a few non closed handles when aborting a search but figured I could time them out and garbage collect. But if you are going to just call OS_GBPB with offset=0 it could create thousands of open handles before I could time out. Back to the drawing board. |
André Timmermans (100) 655 posts |
That’s something I proposed a long time ago to speed up such operations, because at present the path string must be interpreted and routed to the appropriate FS and object on every call. |
Martin Avison (27) 1497 posts |
If GBPB is given a larger buffer and asked to fill it with lots of entries, the number of calls is reduced considerably, and it is usefully faster, ISTR. |
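The effect of asking for more entries per call is simple arithmetic, shown here with a Python model that counts calls. It assumes the buffer is always large enough for the requested number of entries; `count_calls` is a hypothetical helper.

```python
def count_calls(total_entries, per_call):
    """Count how many enumeration calls are needed to read a whole directory."""
    names = [f"file{i:03d}" for i in range(total_entries)]
    offset, calls = 0, 0
    while offset != -1:
        chunk = names[offset:offset + per_call]
        calls += 1
        offset += len(chunk)
        if offset >= len(names):
            offset = -1  # end of directory
    return calls


print(count_calls(100, 1))   # one entry per call
print(count_calls(100, 25))  # 25 entries per call
```

In the model, 100 entries take 100 calls at one entry per call and 4 calls at 25 per call. Each call saved also saves one round of path interpretation.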
Colin (478) 2433 posts |
I’m looking at the problem from the FS perspective. |