Iyonix - Disc Error 20
Jon Abbott (1421) 2651 posts |
Although capping transfers resolves the symptom, it’s not the root cause. Once I’ve tracked that down, it should be a relatively innocuous patch. I suspect it’s a timing issue, possibly with the micro delays between status checks. |
Jon Abbott (1421) 2651 posts |
From some quick testing on an A4000 this morning, it’s the overall transfer that’s timing out, causing WinIDETimeout to get called. Increasing the DRQ timeout does not resolve the issue, so it’s probably an issue in the command setup between sector transfers.
For writes, the cutoff is 1024 bytes (2 sectors); reads appear to have an issue beyond 512 bytes (1 sector). Verify works regardless of length. EDIT: Where is the loop that’s handling the transfers? I’m sure it’s staring me in the face, but I can’t see it in the source. |
Jon Abbott (1421) 2651 posts |
I’ve just posted some interim patches over on StarDot to cap the transfers to 1 sector and increase the DRQ timeout on RISC OS versions before 3.5. This has resolved all the CF issues on my A4000 as far as I can tell, but needs wider testing to confirm. I’m fairly certain there’s a bug in the command setup after one sector is transferred, although I can’t for the life of me see the relevant loop in ADFS to investigate. My guess is that there’s not a long enough pause between transferring the sector data and sending the next command. |
Jeffrey Lee (213) 6048 posts |
WinIDEDoForeground and WinIDEDoBackground both kick off transfers, with WinIDEStartTransfer doing most of the leg-work (e.g. DRQ timeout loop). The bulk of the transfer is interrupt-driven, with WinIDEIRQHandler reading/writing individual sectors from the data FIFO as required (and kicking off subsequent ops via WinIDEStartTransfer if required – the code managing this looks a bit non-trivial, so I’m not sure exactly when it gets triggered). WinTickerHandler is used for detecting timeouts and other miscellaneous things (e.g. the disc error 20 fix). For foreground transfers, WinIDEDoForeground just sits in a loop waiting for either the IRQ handler or TickerV to mark the transfer as complete. All of this is in Adfs14. |
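The division of work described above (an interrupt handler moving one sector at a time, a ticker watching for timeouts, and a foreground loop waiting for either to flag completion) can be pictured with a toy Python model. This is only a sketch and not ADFS code: `ToyTransfer`, `irq`, `tick` and `run_foreground` are hypothetical names, and a fixed list of events stands in for real interrupts so the result is deterministic.

```python
class ToyTransfer:
    """Toy model of an interrupt-driven transfer; not taken from ADFS."""

    def __init__(self, sectors, timeout_ticks):
        # Assumes sectors >= 1 and timeout_ticks >= 1.
        self.sectors_left = sectors
        self.ticks_left = timeout_ticks
        self.status = None  # None while in progress, then "ok" or "timeout"

    def irq(self):
        """Stand-in for the IRQ handler: move one sector, flag completion."""
        if self.status is None:
            self.sectors_left -= 1
            if self.sectors_left == 0:
                self.status = "ok"

    def tick(self):
        """Stand-in for the ticker handler: count down and flag a timeout."""
        if self.status is None:
            self.ticks_left -= 1
            if self.ticks_left == 0:
                self.status = "timeout"


def run_foreground(transfer, events):
    """Feed 'irq'/'tick' events until either handler marks completion."""
    for event in events:
        if transfer.status is not None:
            break
        if event == "irq":
            transfer.irq()
        else:
            transfer.tick()
    return transfer.status


if __name__ == "__main__":
    print(run_foreground(ToyTransfer(2, 3), ["irq", "tick", "irq"]))
    print(run_foreground(ToyTransfer(2, 2), ["irq", "tick", "tick", "irq"]))
```

In this model a transfer fails purely because the interrupts arrive too slowly relative to the ticker, which is the kind of overall-transfer timeout being discussed in this thread.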
Jon Abbott (1421) 2651 posts |
The issue is more than likely in WinIDECommandDisc then? |
Jon Abbott (1421) 2651 posts |
I’ve been trying to figure out what’s different when capping WinIDEMaxSectorsPerTransfer to one sector, which cures Disc Error 23 on the A4000/A5000, but can’t see any obvious difference other than perhaps the timing between IDE commands. As far as I can tell, there’s no difference in the command sequence that’s sent to the drive; it’s just a difference in how ADFS deals with the transfer internally. Admittedly, it’s not easy to follow the code as it’s jumping in and out of IRQ and calling code in the RMA to do the actual data transfer, so I’ve possibly missed a key difference somewhere. Perhaps we should just cap WinIDEMaxSectorsPerTransfer to one sector for RISC OS 3.11 and be done with it; I’m not sure you’d notice the speed difference anyhow on such slow machines. It would be nice to know the root cause though. |
Jeffrey Lee (213) 6048 posts |
WinIDEMaxSectorsPerTransfer is used to limit the sector count that’s specified in the IDE command header (WinIDEParmSecCount used by WinIDECommandDisc). So it’ll be the difference between CMD-SECTOR-SECTOR-SECTOR…. and CMD-SECTOR-CMD-SECTOR-CMD-SECTOR… I think this might be where the WinIDEStartTransfer call from within the IRQ handler comes into play – if the transfer had been clamped due to WinIDEMaxSectorsPerTransfer then it will issue another command to transfer the next chunk.
It should be pretty easy to test, just by using the OS_File load/save operations on a “large” file. |
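The two command sequences contrasted above can be reproduced with a short Python model. It is a sketch under stated assumptions rather than ADFS code: `command_sequence` and `max_sectors_per_transfer` are hypothetical names standing in for the clamping that WinIDEMaxSectorsPerTransfer applies to the sector count of each command, and the cap of 255 in the first call is an arbitrary "large" value.

```python
def command_sequence(total_sectors, max_sectors_per_transfer):
    """Return the sequence of bus events for one transfer.

    Hypothetical model: each "CMD" is one IDE command whose sector count
    has been clamped to max_sectors_per_transfer, followed by one "SECTOR"
    entry per sector moved through the data FIFO.
    """
    if total_sectors < 0 or max_sectors_per_transfer < 1:
        raise ValueError("invalid sector counts")
    events = []
    remaining = total_sectors
    while remaining > 0:
        chunk = min(remaining, max_sectors_per_transfer)
        events.append("CMD")
        events.extend(["SECTOR"] * chunk)
        remaining -= chunk
    return events


if __name__ == "__main__":
    print("-".join(command_sequence(3, 255)))  # CMD-SECTOR-SECTOR-SECTOR
    print("-".join(command_sequence(3, 1)))    # CMD-SECTOR-CMD-SECTOR-CMD-SECTOR
```

With a cap of one sector, every sector is preceded by its own command, so any delay or status check performed during command setup happens once per sector instead of once per transfer.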
Chris Evans (457) 1614 posts |
I think in the past we’ve used the ROOL updated modules for the I-cubed card and had the same failure, but when we tried to double check by updating the NIC’s firmware, the computer stiffed 88% of the way through the flash update. Result: one bricked NIC. A pity that the flashing program can’t be told which podule slot it is in. |
Jon Abbott (1421) 2651 posts |
I wouldn’t advise flashing EtherLAN NICs as it will lock you into the specific version of CLib that’s contained in the flash image. You’re better off leaving it with the old 26-bit modules and softloading EtherH during the boot sequence, after the latest CLib has been loaded.
Quite, either a timing issue or a status check issue. Probably the former at a guess. |
Ian Bradbury (2561) 8 posts |
I posted a fix for NICs not working with CF cards on StarDot last year: the problem is caused by the “Ready” signal on Pin 27 of the IDE interface, which is shared with the network card. It’s probably meant to be pulled high weakly and driven low by open drain outputs on the network and IDE interface; however, it looks like the SD adapters and some CF cards are driving it high, which causes any network card that makes use of that signal to fail. |
Jeffrey Lee (213) 6048 posts |
I’m assuming that the new ROMPatch has been sufficiently tested now, so I’ve committed it to CVS. https://www.riscosopen.org/viewer/revisions/logs?ident=1517607335-562617.html |
Colin Ferris (399) 1814 posts |
Which machines does this affect – ie RPC/Iyonix? (Does softloading of the OS work with HDs?) |
Jon Abbott (1421) 2651 posts |
Anything using ADFS.
Yes, it was committed to CVS on Feb 2nd. |
Jeffrey Lee (213) 6048 posts |
“Which machines does this affect – ie RPC/Iyonix?” Not quite – it’s really anything which uses the IDE version of ADFS. The new C version (ADFS 4.00+) that’s used for SATA on the Titanium & OMAP5 is fine. And presumably the old ST506 version is fine as well.
Yes. The bug affects the ability to write to the disc, so read operations like running a softload should work fine.
Yes (assuming the text on the page is correct – the URL says 5.22) |
Mike Howard (479) 216 posts |
I know this topic is fairly old but I’m currently having Iyonix woes and I’m now wondering if a current issue is related to this topic. My initial problem determination seems to have been way off. I just thought the IDE bus was toast but it looks like multiple problems, including the PSU & graphics card. Anyway, to the point of my post. With the old motherboard, new PSU and new graphics card, the majority of problems seem solved. Except ‘Disc error 23’. During the transfer of a lot of data and some large files (50-500MB etc), I (nearly) always get a ‘Copying files – Error’. This can happen on its own or be brought on by using, for example, the adjust size icon on a filer window. The error will be ‘Disc error 23 at :4/0000000DE8159000’. Obviously, the address is different each time. Trying the same copy again will usually work, or selecting skip will ignore the problem file and continue. Thought I’d mention it in case there is something that can be done about it. It may of course be a hardware issue, though the disc works fine in another machine. These ‘copies’ are just tests and are within the same IDE disc. |
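When comparing several of these failures it can help to turn the reported location into numbers. The Python sketch below does that under two assumptions that are not confirmed anywhere in this thread: that the part before the slash is the drive number, and that the part after it is a hexadecimal byte offset into the disc. `parse_disc_error_location` is a hypothetical helper, and the 512-byte sector size is the one quoted earlier in the thread.

```python
SECTOR_SIZE = 512  # bytes; the sector size quoted earlier in this thread


def parse_disc_error_location(location):
    """Split a location such as ':4/0000000DE8159000' into its parts.

    Assumes the text before the slash is the drive number and the text
    after it is a hexadecimal byte offset into the disc.
    Returns (drive, byte_offset, sector_number).
    """
    if not location.startswith(":") or "/" not in location:
        raise ValueError(f"unrecognised location: {location!r}")
    drive_text, address_text = location[1:].split("/", 1)
    drive = int(drive_text)
    byte_offset = int(address_text, 16)
    return drive, byte_offset, byte_offset // SECTOR_SIZE


if __name__ == "__main__":
    drive, offset, sector = parse_disc_error_location(":4/0000000DE8159000")
    print(f"drive {drive}, byte offset {offset:#x}, sector {sector}")
```

If those assumptions hold, logging the offsets from several failed copies might show whether the errors cluster in one area of the disc or are scattered across it.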
Jon Abbott (1421) 2651 posts |
I assume you’ve updated to a recent RISC OS build that has the DRQ fix covered by this thread? Disc error 23 on early Acorn machines was being caused by large DMA transfers; we worked around it by capping the DMA limit to 1024 bytes. This was covered in the parallel thread to this on StarDot, where a patched ADFS 2.68 is provided. Although not applicable to an Iyonix, it is likely the same issue, so we could possibly modify the second patch file in that thread and see if the DMA limit can be capped on an Iyonix. I believe it’s being caused by a timing issue, but we didn’t investigate it further as I could only reproduce it on one device on an A5000. I managed to fry my Iyonix when flashing it at the time, so never got to test the device in question on it. I would advise taking a backup to another device and seeing if the problem still occurs; you’ll certainly want to take a backup before trying any hacks on ADFS to see if it is related to transferring more than one block at a time. |