Considerations for FS bounty 3
Julie Stamp (8365) 479 posts |
I was thinking what goes into choosing the disc format for this. I guess there are some practical considerations like
and then there are performance considerations like
|
Rick Murray (539) 13958 posts |
Plus data recovery. We’d be stuffed without DiscKnight. So a mature format may have better recovery options. But, on the other hand, what other format can cater for RISC OS metadata such as file types? |
Jean-Michel BRUCK (3009) 380 posts |
I don’t know if the FS needs to remember the position of the binders, but that would answer that question. I use !Links to create links, it works well and is very useful. If this possibility was native in the FS it would allow a better harmonization. |
Stuart Swales (8827) 1373 posts |
You can store such metadata in extended attributes, and part of bounty #3 is the ability to store arbitrary other data with a file too. Very much doubt that you’d be able to port data in place from FileCore to any other format. But bounty #2 doesn’t appear to be making much progress: https://www.riscosopen.org/content/documents/bounties |
Mr Rooster (1610) 21 posts |
If you’re considering a new FS format for RISC OS then NTFS is probably worth a look. A full NTFS implementation would probably be overkill, and you’d probably not want to bother with its native wide string support, but as a filesystem it’s quite elegant. Basically: Everything is a file, and a ‘file’ is an entry in the master file table (stored in a file called $MFT). A file is a list of one or more attributes. An attribute has a type, an optional name and some data. So a file’s data is stored in an attribute with a ‘DATA’ type. Named ‘DATA’ attributes are used for alternate streams, as an example. Another attribute type is a B-tree index, with a given name. There’s a specific named one that’s used for directory entries. A RISC OS implementation wouldn’t need some of the more advanced features (e.g. compression), and could ditch the wide chars for filenames. RISC OS file type info would be its own attribute. More info here |
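As a rough illustration of the "a file is a list of attributes" model described in the post above, here is a minimal Python sketch. It is not NTFS's on-disc layout; the class names and the `RISCOS_TYPE` attribute type are hypothetical, invented only to show how a RISC OS file type could sit alongside the data as just another attribute.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    type: str                   # e.g. "DATA", or the hypothetical "RISCOS_TYPE"
    data: bytes
    name: Optional[str] = None  # None means the unnamed (default) attribute


@dataclass
class FileRecord:
    """One entry in a toy master file table: nothing but a list of attributes."""
    attributes: List[Attribute] = field(default_factory=list)

    def find(self, type: str, name: Optional[str] = None) -> Optional[Attribute]:
        # Return the first attribute matching both type and name, if any.
        for attr in self.attributes:
            if attr.type == type and attr.name == name:
                return attr
        return None


record = FileRecord([
    Attribute("DATA", b"Hello from RISC OS"),                 # main file contents
    Attribute("DATA", b"thumbnail bytes", name="preview"),    # alternate stream
    Attribute("RISCOS_TYPE", (0xFFF).to_bytes(2, "little")),  # Text filetype
])

filetype = int.from_bytes(record.find("RISCOS_TYPE").data, "little")
print(record.find("DATA").data, record.find("DATA", "preview").data, hex(filetype))
```

A record that has no use for a given attribute simply omits it, which is the property the post highlights.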
Rick Murray (539) 13958 posts |
While looking to see if NTFS had active patents (doesn’t appear to), I came across this: Am I missing something? Like, where’s the “invention” here? |
David J. Ruck (33) 1675 posts |
Of all the filing systems in the world which you could implement on RISC OS, NTFS would be the very, very last. It’s vast, horribly inefficient, a memory hog, and took years even with major commercial sponsorship to get it into the Linux kernel, which now means Microsoft is bound to introduce a breaking change. There are many, many other truly open filing systems, all of which have plenty of metadata storage for RISC OS filetypes, and including those based on B-Trees and proper snapshot capabilities. |
Rick Murray (539) 13958 posts |
Everybody’s favourite fool ChatGPT suggested:
Personally, given that most contemporary installations are liable to be using some sort of flash [1] rather than spinning rust, it would be highly preferable to have a Flash friendly filing system – one that doesn’t cover the device with endless writes to the same places [2] and also supports the use of TRIM to allow the device to better manage itself.
[1] SD cards, USB sticks, SSDs, etc.
[2] https://heyrick.eu/blog/index.php?diary=20240504 – I talk about how my satellite recorder destroys flash media really quickly due to how many writes happen when recording live programming; a cheap little harddisc is much more reliable. |
Stuart Swales (8827) 1373 posts |
btrfs is the only fs in the last 30 years that has trashed itself for me after a month of use. |
David J. Ruck (33) 1675 posts |
Before we can think of using any grown up filing system, we have to implement proper low level disc caching, as the filing system structures are all vastly larger than those used by FileCore, which are small enough to permanently keep in memory. Otherwise you’ll end up with something very slow. |
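To make the caching point above concrete, here is a minimal sketch of a least-recently-used block cache in Python. It is an illustration only, not a description of anything in RISC OS: `BlockCache` and `fake_disc_read` are hypothetical names, and a real low-level cache would also have to deal with write-back, block sizes and multiple drives.

```python
from collections import OrderedDict


class BlockCache:
    """Keep the most recently used disc blocks in memory, up to a fixed count."""

    def __init__(self, read_block, capacity):
        self._read_block = read_block  # function that fetches a block from disc
        self._capacity = capacity      # maximum number of cached blocks
        self._blocks = OrderedDict()   # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, block_no):
        if block_no in self._blocks:
            self._blocks.move_to_end(block_no)  # mark as most recently used
            self.hits += 1
            return self._blocks[block_no]
        self.misses += 1
        data = self._read_block(block_no)
        self._blocks[block_no] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)    # evict the least recently used
        return data


def fake_disc_read(block_no):
    # Stand-in for a slow disc read: returns a 512-byte block.
    return bytes([block_no % 256]) * 512


cache = BlockCache(fake_disc_read, capacity=2)
for n in (0, 1, 0, 2, 1):
    cache.get(n)
print(cache.hits, cache.misses)
```

With structures too large to hold in memory permanently, the hit rate of a cache like this is what decides whether the filing system feels fast or slow.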
André Timmermans (100) 656 posts |
Also annoying is that all developments (like the bounties here or on the ROD/R-Comp side) happen behind closed doors without any idea of how they go at it. To me the FS bounties look more like trying to patch things together to somewhat handle things it was never designed for. Personally I’d like to see first draft APIs to: Then, when we have a reasonable, clean, agreed upon API, they can start implementing and choose the preferred file system formats to provide with RISC OS. |
Stuart Swales (8827) 1373 posts |
Yup, and yet…
Per bounty 3: “…the map, used to describe where fragments of a file are on the disc, has grown to over 4MB. If we allow many more drives per filing system than the current limit of 8, the computer’s address space will quickly become exhausted”. So … This kind of smells of “we want this to work on minimal RISC PCs as well as 8GB Pi4s” |
Sprow (202) 1164 posts |
“the filing system structures are all vastly larger than those used by FileCore, which are small enough to permanently keep in memory.” I don’t think you can reliably draw that conclusion from reading that aspect of the bounty. DiscOp64 was expanded to support up to 256 drives per filing system and on my Titanium I have SDFS/SCSIFS/ADFS in regular use, which would be 768 drives with a map of 4MB in current technology → 3GB of address space used on maps. It’s not talking about amount of RAM being used, because I’d hope that could be dynamically sized or even paged in/out of disc, it’s talking about address space exhaustion which happens just as quickly on an 8MB A7000 as an 8GB Pi 4. I recently got a confusing “Memory cannot be moved” error when I took an IDE drive out of one computer and put it in an A7000. After a few minutes of headscratching I realised the drive was large enough/had been formatted with a reasonably fine grained LFAU that the map was too big to fit in the RMA. So we already have a situation where some olde computers can’t mount some discs. Maybe any future FileCore extensions would only be enabled if you have > 512MB of RAM or something? |
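The arithmetic in the post above can be checked with a few lines of Python. The figures (three filing systems, 256 drives each, a 4MB map per drive) come from the post; treating every drive as having a full-size map is the worst case it describes, and the 4GB figure is simply the size of a 32-bit address space.

```python
MB = 1024 * 1024
GB = 1024 * MB

filing_systems = 3       # SDFS, SCSIFS and ADFS, as in the post
drives_per_fs = 256      # DiscOp64 limit quoted in the post
map_size = 4 * MB        # worst-case map size quoted from the bounty
address_space = 4 * GB   # total 32-bit address space

total_drives = filing_systems * drives_per_fs
total_map_space = total_drives * map_size

print(total_drives)                     # number of drives
print(total_map_space / GB)             # map space in GB
print(total_map_space / address_space)  # fraction of the address space
```

The result does not depend on how much RAM is fitted, which is the point being made: the limit is the address space, not the memory.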
Rick Murray (539) 13958 posts |
Or maybe just better error checking so that it can say something useful here rather than just parroting the error message that the OS provides. I’m not keen on “if this then that” unless absolutely necessary, because there are always outliers. I’ll give an example I ran into yesterday. I tried the AI editor that is built into new versions of Google Photos. It refused to touch my photos, claiming that photo spheres and virtual reality photos are not supported. Point is, start making assumptions….. …better to trap the error when trying to allocate memory and provide a more appropriate message to tell the user that this disc won’t work on this machine because etc etc etc.
Aren’t the maps held in DAs (specifically, ones inaccessible to user mode code) these days? Or were you using an older RISC OS in the A7000? |
Glenn R (2369) 125 posts |
“FileCore in use”? What about porting ffsv2 from BSD? Or does that not have the necessary metadata support? (NTFS)
Compared to ADFS perhaps? Modern NTFS is actually pretty decent. It supports ACLs, journalling, TRIM, sparse files, ADS (Alternate Data Stream, kinda like Resource and Data forks on the Mac), compression, encryption, and a whole load of other useful stuff. I don’t tend to get excited about the file system when it ‘just works’. Which, in fairness, modern NTFS does ‘just work’. |
nemo (145) 2644 posts |
MetaData in the FS is an extremely bad idea. It is fragile, non-portable, obscure and unpredictable for users. It is particularly problematic without universal application-implemented support – imagine a task that ‘saves’ a modified file by writing to a temporary file, deleting the original and renaming/copying the temp (eg SparkFS) – this would lose all metadata every single time. Do not do this. MetaData in the file however is where the modern world has moved – ISO 26300 and ISO 29500 are both open, patent-unencumbered, zip-based standards that could be easily and transparently supported in RISC OS. By demonstration, ImageFS is a good model (which could be improved upon). |
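The failure mode described above (save to a temporary file, delete the original, rename the temp) can be shown with a toy in-memory model. This is only a sketch: `ToyFS` is a hypothetical stand-in that keeps metadata attached to each file entry, and it does not model any real filing system or application.

```python
class ToyFS:
    """In-memory stand-in for a filing system that stores metadata per file entry."""

    def __init__(self):
        self.files = {}  # name -> {"data": bytes, "meta": dict}

    def write(self, name, data):
        # Creates the entry if needed; an existing entry keeps its metadata.
        entry = self.files.setdefault(name, {"data": b"", "meta": {}})
        entry["data"] = data

    def set_meta(self, name, key, value):
        self.files[name]["meta"][key] = value

    def delete(self, name):
        del self.files[name]

    def rename(self, old, new):
        self.files[new] = self.files.pop(old)


fs = ToyFS()
fs.write("Report", b"draft 1")
fs.set_meta("Report", "author", "example")

# Saving in place: the entry survives, so its metadata does too.
fs.write("Report", b"draft 2")
meta_after_in_place = dict(fs.files["Report"]["meta"])

# Saving via a temporary file: the new entry never had the metadata.
fs.write("Report~tmp", b"draft 3")
fs.delete("Report")
fs.rename("Report~tmp", "Report")
meta_after_temp_save = dict(fs.files["Report"]["meta"])

print(meta_after_in_place, meta_after_temp_save)
```

Unless every application copies the metadata across explicitly, the second save pattern drops it without any error being raised.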
David Pilling (8394) 96 posts |
Why me 8-) Oh, not me anymore, something ROOL are up to. |
nemo (145) 2644 posts |
One only needs a single pin to pop a balloon. The excellent SparkFS was merely a handy pin. |
Steve Pampling (1551) 8228 posts |
And one mischievous |
Mr Rooster (1610) 21 posts |
Genuinely curious why you say this? It’s a FS designed in the 80s for machines with MBs of memory in the MHz era of computing. Its underlying data structures are simple and fairly elegant. There’s not a lot of dead wood in there. For example, its use of attributes means you simply omit attributes for items you’re not interested in; they don’t take up dead space in the underlying FS structures. These would simply not be needed on a RISC OS implementation. It’s a 30+ year old FS, with the last version issued around 2001, yet supports multi terabyte FSs with millions of files. I think that’s fairly decent. Also, to be clear, I wasn’t suggesting implementing NTFS, I was suggesting using its underlying data structures to implement an FS.
Do you have any references for this? How is it vast? In what way is it inefficient? The main issue comes if you end up with a heavily fragmented MFT, but all FSs have problems like this. It also has other desirable properties, as it’s fairly flexible. (e.g., extending a partition is essentially just making the bitmap file bigger).
That’s not relevant, at all, as you’d not be using the GPL code in RISC OS unless I’ve missed something? Also, I’m not sure it’s 100% accurate either. What commercial sponsorship do you refer to? I thought there was a fairly basic read-only implementation for years, then Paragon ‘gifted’ their implementation to the kernel and got annoyed when they were asked to conform to standards, and have not done a right lot with it since?
The filesystem hasn’t changed since 2001. (More modern features are implemented as new attributes), why would they do this, and what relevance would it have to a ground up implementation of an NTFS alike filing system that would be required for RISC OS?
There are, I only offered NTFS up as an example, but it feels like the objections are to Microsoft, rather than the underlying FS. Your other options would be one of the UFSs from BSD land or something like ext4 from Linux. Personally I’d not touch ext4 with a barge-pole as the GPL would scare me. There’s things like XFS and JFS but I’d argue their complexity is greater than NTFS. (XFS for example is designed for high async I/O, which is probably not a major consideration for RO yet. ;) ) Then you get to the really modern FSs. I wouldn’t go near BTRFS as I like my data integrity and you’re not getting something like ZFS near RISC OS in a million years. Or what about the Be filesystem from BeOS? There’s a whole book on that one….
This is why I suggested it, I got interested in its internals a couple of years ago and realised, unlike a lot of MS stuff, it’s actually pretty well designed. A lot of the features you mention have been implemented since the last version change of the underlying filesystem, they’re just implemented using data stored in new attributes. At the end of the day, it’s a filesystem designed in the 80s, on computers with MBs of memory and processors running at MHz speed. The last revision of the filesystem was 2001, in an era of computers with 100s of MBs of memory and single CPUs running around the GHz level. Yet it still works now on multi TB drives with millions of files that can be gigabytes in size. (I have files on my HD that are larger than the HD I had in 2001). |