Iyonix - Disc Error 20
Jon Abbott (1421) 2651 posts |
I’m seeing “Disc error 20” with the 01-10-17 build of 5.23 when copying files over – I was overwriting !Boot at the time and had three of them. |
Jon Abbott (1421) 2651 posts |
I’d completely forgotten I’d put a 4GB CF in this Iyonix, just realized when I took the lid off. This will be the long-running issue of ADFS not fully supporting modern IDE devices, which has plagued many a machine from Arcs to RiscPCs. Retrying the write works, so it’s probably something like the drive saying “my buffer is full, stop sending data”, which is being reported as an unknown command response by ADFS. EDIT: Looks like it might be a DRQ timeout issue; changing the timeout from 700 to 65536 appears to have resolved the problem – for this drive at least. |
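To illustrate the DRQ timeout theory above, here is a minimal Python sketch of a bounded polling loop. It is a simulation only, not the ADFS code: `FakeDrive`, `wait_for_drq` and the figure of 1500 polls are hypothetical, and one loop iteration is assumed to stand for one delay-and-check step. The BSY and DRQ bit positions are the standard ATA status register bits.

```python
STATUS_BSY = 0x80  # ATA status register: device busy
STATUS_DRQ = 0x08  # ATA status register: data request


class FakeDrive:
    """Hypothetical stand-in for an IDE device.

    Reports BSY for the first `ready_after_polls` status reads,
    then asserts DRQ.
    """

    def __init__(self, ready_after_polls):
        self.ready_after_polls = ready_after_polls
        self.polls = 0

    def read_status(self):
        self.polls += 1
        if self.polls > self.ready_after_polls:
            return STATUS_DRQ
        return STATUS_BSY


def wait_for_drq(drive, max_polls):
    """Poll the status register up to `max_polls` times.

    Returns True if DRQ was seen, False if the budget ran out
    (the case that would surface as a disc error).
    """
    for _ in range(max_polls):
        if drive.read_status() & STATUS_DRQ:
            return True
        # A real driver would insert a short fixed delay here.
    return False


SLOW_CARD_POLLS = 1500  # assumed: a card that is slow to assert DRQ

short_ok = wait_for_drq(FakeDrive(SLOW_CARD_POLLS), 700)
long_ok = wait_for_drq(FakeDrive(SLOW_CARD_POLLS), 65536)
```

With the assumed slow card, the 700-poll budget expires before DRQ appears, while the 65536-poll budget does not, which matches the behaviour described in the post.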
Jon Abbott (1421) 2651 posts |
Out of curiosity, I’ve done some testing on a RiscPC with RO3.71 using CompactFlash cards I have to hand that do not work with RISCOS. With the original 700uS timeout, near enough every write results in “Disc Error 20” or a corrupt free space map. Increasing the timeout I’ve successfully written several GB to each card without error. This poses two questions:
|
Jeffrey Lee (213) 6048 posts |
Nice work on tracking this down! I’ve seen this error (or something like it) on my RiscPC when using a CF card, but mostly ignored it since it only seemed to affect large FS operations.
IOC/IOMD version counts transitions in the timer register. So it may end up waiting longer in some situations (interrupt causes it to miss a transition), and the first half-microsecond of any DoMicroDelay call will be shorter than it should be, but since the DRQ check calls DoMicroDelay in a loop, subsequent calls should wait for the correct amount. Meanwhile the HAL version uses HAL_CounterDelay, which (according to the comments, at least) looks like it will wait for n+1 timer clock transitions for both the Tungsten + IOMD implementations, so each delay should always be >= 1us.
It looks like the 700uS timeout was dropped from the spec. Wikipedia has a handy collection of links to various spec versions. Page 39 of the ATA-1 spec mentions the 700uS timeout for DRQ. By ATA-2, it looks like it’s been removed – page 44 has a note mentioning the removal, and after searching through the ATA-6 spec I can’t see any obvious DRQ timeouts anywhere. |
Jon Abbott (1421) 2651 posts |
Since my post, I’ve looked at the tech docs of a random selection of HDs and CFs; some did state 700uS, but a few were 50mS and one Seagate HD stated 30 seconds – to account for drive spin-up time. In fact, the CFs I tested both stated 700uS, but clearly were taking longer to assert DRQ. The text just above the DRQ wait loop states that background transfers are done with IRQ off, so although increasing the timeout is a quick workaround, the wait probably needs recoding to issue a callback (or be threaded, I suppose) if it doesn’t receive DRQ in a timely manner. From the drive tech docs I’ve read, the first DRQ assert does not appear to raise an IRQ, but subsequent ones do, so IRQs could possibly be used for multi-sector writes. It’s probably the first DRQ assert that’s triggering the issue, however, whilst the drive prepares to receive sector data. I’m going to knock up a Utility to patch 26bit ADFS, as long waits with IRQ off probably won’t impact IOC/IOMD to any great extent. I did notice some odd sound glitching on IOMD whilst copying data, but I can live with that now that I can finally replace all my HDs with CFs. |
Jeffrey Lee (213) 6048 posts |
Yeah. I’d say a reasonable fix for background transfers would be to wait for 700us in a blocking manner, and then fall back to a TickerV routine which polls the register (with a 30s timeout or so). ADFS already has a multi-function TickerV routine (look for WinTickCount + WinTickAddress), so it shouldn’t be too hard to hook something into there. With a bit of fiddling you could probably end up with something which could be applied to old machines directly via ROMPatch. |
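The two-phase wait suggested above (block briefly, then fall back to periodic polling with a long timeout) could look roughly like the following Python simulation. Everything here is a hypothetical sketch: `wait_two_phase`, `make_device` and the constants are illustrative names, the ticker is assumed to fire once per centisecond, and a real implementation would be driven by the ticker rather than by a loop.

```python
BLOCKING_POLLS = 700        # short busy-wait budget, as in the original timeout
FALLBACK_TIMEOUT_CS = 3000  # 30 seconds, assuming one tick per centisecond


def make_device(ready_after_polls):
    """Hypothetical device: DRQ reads as set after a given number of polls."""
    state = {"polls": 0}

    def read_drq():
        state["polls"] += 1
        return state["polls"] > ready_after_polls

    return read_drq


def wait_two_phase(read_drq, run_between_ticks=lambda: None):
    """Return (phase, ticks_used), phase being 'blocking', 'ticker' or 'timeout'."""
    # Phase 1: short blocking poll, which covers well-behaved drives.
    for _ in range(BLOCKING_POLLS):
        if read_drq():
            return ("blocking", 0)
    # Phase 2: one poll per tick, so other work can run in between.
    # Modelled as a loop here; a real driver would return and be
    # called back on each tick instead.
    for tick in range(1, FALLBACK_TIMEOUT_CS + 1):
        run_between_ticks()
        if read_drq():
            return ("ticker", tick)
    return ("timeout", FALLBACK_TIMEOUT_CS)


fast = wait_two_phase(make_device(100))
slow = wait_two_phase(make_device(705))
dead = wait_two_phase(make_device(10**9))
```

The point of the split is that fast drives never pay the cost of waiting for a tick, while slow ones no longer hold the machine in a tight loop for the whole wait.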
Matt Price (2343) 71 posts |
Jon would your ADFS patch work with RISC OS 3.1x as well? I’m very interested to hear that the ADFS patch on 3.7x allows you to use CF Cards that previously didn’t work. This is a good prospect for my RPC in the future. With your patch to ADFS v3.27 (3.7x), v3.30 (4.02) and v3.33 (your softload mod.) I can now finally set my RISC PC as the file server for my A3010 and virtual Arcs, so everything has access to a single, comprehensive data source. At the moment I have literally dozens of .hdf images and HostFS folders that I have accumulated over the past 20 years, each one holds a different copy of the data or is slightly out of sync if I haven’t used it for a few months. I assume in the end when it’s all done that I can take a .hdf image of my spinning rust HDD and use it on RPC_Emu or VirtualAcorn? |
Jon Abbott (1421) 2651 posts |
Yes, I believe it will resolve the HD/CF/SD compatibility issues on all RISC OS versions. If you want to test it, you can generate a patched ADFS Module with this BASIC program:
|
Chris Evans (457) 1614 posts |
The catch 22 is where do you load the patch from! |
Jon Abbott (1421) 2651 posts |
You load it in the boot sequence, it only affects sector writes. |
Steve Pampling (1551) 8170 posts |
“The catch 22 is where do you load the patch from!” I think Chris was probably referring to a circumstance where the CF is the boot disc, although unless you’re logging stuff during the boot [1] there is no write to be affected. [1] The standard boot doesn’t log anything. |
Jon Abbott (1421) 2651 posts |
That’s the scenario I tested on an Iyonix and a RiscPC. Provided no writes take place before the patched ADFS is loaded, booting from an incompatible device doesn’t appear to be a problem. Ideally you’d want to patch ADFS as the first line within !Boot.!Boot if writes are likely (i.e. logging); if you know there are no writes during the boot, PreDesk would probably suffice. It needs wider testing. I tried a selection of 2GB, 4GB and 8GB CF cards, none of which previously worked as boot devices, but all worked fine with the patched ADFS in the boot sequence. I don’t have any incompatible HDs or SDs to test further. |
Colin Ferris (399) 1814 posts |
Would this patch improve things with the “Ide to Sata” interfaces – on the Iyonix? |
Jon Abbott (1421) 2651 posts |
If the problem is “disc error 20” when writing to disc or issues formatting, then yes, it should resolve the incompatibility. If you have an adapter to hand you can test; I’d suggest using a blank HD and finding a reliable way to reproduce the problem before trying the patched ADFS. You’ll probably need the patched ADFS to format it first though. |
Matt Price (2343) 71 posts |
Ah! This may explain why I’ve had problems with Harinezumi on my RISC PC always generating an empty log. I was going to ask for help, but noticed that the author stated its compatibility and support is RISC OS 5 only. I’ve been running the ADFS patch in Utils:BootRun just after VProtect. When I fire the RISC PC up later I’ll move it to Run first line and see if that solves the problem. |
Martin Avison (27) 1494 posts |
I know Reporter does not offer the same facilities as Harinezumi, but it can log all commands & errors during boot … but it logs to memory, not to disc. You can write to disc after boot (assuming boot gets to a desktop). |
Matt Price (2343) 71 posts |
I wonder if whoever is sitting on Dave Holden’s (not forgotten) APDL hardware inventory will start releasing IDE mini-podules now. Especially the Castle ADFS version, which frankly has gone from being the worst one to have to the best overnight with Jon’s patch. Previously it only worked with old spinning disc drives or very, very specific IDE->CF adapters, and even then only two or three CF cards would work. I tested loads of adapters and CF/SD cards with the Castle ADFS podule and found that only one dual CF adapter would work, and only with two 256MB SanDisk CF cards. Both had to be inserted, and it only worked with the 256MB size. Go up to the exact same CF card but at 512MB and it didn’t work. |
Chris Evans (457) 1614 posts |
The person selling off on eBay the selection of APDL’s hardware that wasn’t scrapped has included UniPods but no IDE interfaces, so I deduce there weren’t any in stock when Dave sadly died. The ARCIn IDE podules Dave was selling were designed (and made?) by Baildon. One of the Baildon partners is trying to put the podules back into production but it is taking a lot of time. AIUI some parts aren’t available now and the software updates Dave did weren’t passed back to Baildon. |
Steve Pampling (1551) 8170 posts |
They’d almost certainly need the source anyway as, assuming the modules I’ve seen are any guide to the 32-bit compatibility, the code isn’t 32-bit safe. |
Rick Murray (539) 13840 posts |
Oh, yeah. Now that is clever… …let’s take something that runs at the same privilege as the OS itself… …and wing it… …because RISC OS is totally capable of recovering from branching somewhere like &3015A134 [1] in SVC mode… <sigh> [1] A randomly picked address in the RMA with the ‘V’ bit pushed into PC, as might happen if an ORRS into R14 is missed. |
Jon Abbott (1421) 2651 posts |
That would probably be David Bradforth. |
Matt Price (2343) 71 posts |
They did sell a couple of APDL IDE interfaces last year. I think one went for well over £150. |
Jon Abbott (1421) 2651 posts |
It’s been reported that some CF’s on RO3.1 report Disc Error 23 when reading. I suspect later ADFS builds will suffer the same issue, so I’ll have a look into this at some point. Hopefully I can get my hands on some failing devices to test. |
Jon Abbott (1421) 2651 posts |
Whilst looking at Disc Error 23, I believe I’ve spotted a potential issue in DoSwiIDEUserOp which will cause long transfers to potentially generate the error. I believe the STR R5,WinTickCount at line 796 should be inside the loop, which starts one line lower, labelled 10. As it’s currently implemented, Disc Error 23 will occur if the overall transfer takes longer than the timeout value passed in R5 to DoSwiIDEUserOp (or 10 seconds if none is specified). I believe the timeout should be reset on a valid sector read. |
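The difference between arming the countdown once before the loop and re-arming it on every pass can be shown with a small Python model. This is a sketch of the reasoning only, not a translation of the ADFS source: `transfer` and its arguments are hypothetical, and the per-sector times are made-up figures in centiseconds.

```python
def transfer(sector_times_cs, timeout_cs, reset_per_sector):
    """Simulate a multi-sector read; return sectors completed before timeout.

    sector_times_cs: assumed time each sector takes, in centiseconds.
    reset_per_sector: False models a countdown stored once before the loop,
    True models one that is stored again at the top of every pass.
    """
    remaining = timeout_cs
    done = 0
    for t in sector_times_cs:
        if reset_per_sector:
            remaining = timeout_cs  # countdown re-armed for this sector
        if t > remaining:
            return done  # would surface as a timeout error
        remaining -= t
        done += 1
    return done


# Assumed workload: 2000 sectors at 1cs each, with a 10 second timeout.
sectors = [1] * 2000
armed_once = transfer(sectors, 1000, reset_per_sector=False)
armed_each = transfer(sectors, 1000, reset_per_sector=True)
```

In this model the transfer armed once gives up part-way through even though no single sector was slow, whereas the transfer re-armed per sector only fails if an individual sector exceeds the timeout.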
Jeffrey Lee (213) 6048 posts |
Timeouts are always a bit tricky when things get broken down into sub-operations. Both WinIDEStartTransfer and DoSwiIDEUserOp treat the timeout as covering one IDE operation. But there’s a discrepancy in how long the default timeout is (30 seconds for most things, 10 seconds for DoSwiIDEUserOp), and there’s a discrepancy in the maximum length of the transfer (WinIDEStartTransfer is 256 sectors (WinIDETimeoutTransfer), DoSwiIDEUserOp is 65536 sectors). 256 (512 byte) sectors is only 128KB, so 30 seconds should be more than adequate. But 65536 sectors is 32MB, so 30 seconds might be inadequate if you’re using a particularly slow device. So at the least WinIDETimeoutUser should probably be increased to 30 seconds to match the other timeouts. Possibly even more, to protect large transfers (30 seconds + 1 second per megabyte?). But that may depend on what’s actually using the SWI (which I don’t expect to be much) |
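The transfer sizes quoted above, and the suggested "30 seconds + 1 second per megabyte" rule, can be checked with a few lines of Python. The function names and the choice to round partial megabytes up are assumptions made for illustration; only the sector counts, the 512 byte sector size and the 30 second base come from the post.

```python
SECTOR_BYTES = 512
KB = 1024
MB = 1024 * 1024


def transfer_bytes(sectors):
    """Size of a transfer of `sectors` 512-byte sectors, in bytes."""
    return sectors * SECTOR_BYTES


def scaled_timeout_s(sectors, base_s=30, per_mb_s=1):
    """Base timeout plus a per-megabyte allowance (partial MB rounded up)."""
    whole_mb = -(-transfer_bytes(sectors) // MB)  # ceiling division
    return base_s + per_mb_s * whole_mb


small_kb = transfer_bytes(256) // KB    # WinIDEStartTransfer maximum
large_mb = transfer_bytes(65536) // MB  # DoSwiIDEUserOp maximum
large_timeout = scaled_timeout_s(65536)
```

Under these assumptions the largest DoSwiIDEUserOp transfer would get roughly a minute, compared with the current 10 second default.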