RISC OS Open: Forum: Iyonix

Nov 22, 2017 1:56pm

Jon Abbott (1421) 2651 posts

If disc error 23 is just a case of truncating read ops to a certain length then it should be relatively straightforward to slot that in as well

Although capping transfers resolves the symptom, it doesn’t address the root cause. Once I’ve tracked that down, it should be a relatively innocuous patch. I suspect it’s a timing issue, possibly with the micro delays between status checks.

Nov 24, 2017 9:59am

Jon Abbott (1421) 2651 posts

From some quick testing on an A4000 this morning, it’s the overall transfer that’s timing out, causing WinIDETimeout to get called. Increasing the DRQ timeout does not resolve the issue, so it’s probably an issue in the command setup between sector transfers.

if I cap the transfers to 1024 bytes, the error doesn’t occur

For writes, the cutoff is 1024 bytes (2 sectors); reads appear to have an issue beyond 512 bytes (1 sector). Verify works regardless of length.

EDIT: Where is the loop that’s handling the transfers? I’m sure it’s staring me in the face, but I can’t see it in the source.

Nov 24, 2017 1:18pm

Jon Abbott (1421) 2651 posts

I’ve just posted some interim patches over on StarDot to cap the transfers to 1 sector and increase the DRQ timeout for RISC OS versions before 3.5. This has resolved all the CF issues on my A4000 as far as I can tell, but needs wider testing to confirm.

I’m fairly certain there’s a bug in the command setup after one sector is transferred, although I can’t for the life of me see the relevant loop in ADFS to investigate. My guess is that there isn’t a long enough pause between transferring the sector data and sending the next command.

Nov 24, 2017 1:23pm

Jeffrey Lee (213) 6048 posts

EDIT: Where is the loop that’s handling the transfers? I’m sure it’s staring me in the face, but I can’t see it in the source.

WinIDEDoForeground and WinIDEDoBackground both kick off transfers, with WinIDEStartTransfer doing most of the leg-work (e.g. DRQ timeout loop). The bulk of the transfer is interrupt-driven, with WinIDEIRQHandler reading/writing individual sectors from the data FIFO as required (and kicking off subsequent ops via WinIDEStartTransfer if required – the code managing this looks a bit non-trivial, so I’m not sure exactly when it gets triggered). WinTickerHandler is used for detecting timeouts and other miscellaneous things (e.g. the disc error 20 fix).

For foreground transfers, WinIDEDoForeground just sits in a loop waiting for either the IRQ handler or TickerV to mark the transfer as complete.

All of this is in Adfs14.
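As a rough illustration of the structure described above, here is a minimal Python sketch of a foreground transfer that waits for either an interrupt handler or a ticker to mark it complete. This is not the ADFS code (which is ARM assembler in Adfs14): the `Transfer` class, the function names and the scripted event list are all hypothetical, and real interrupts arrive asynchronously rather than from a list.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    """Hypothetical stand-in for the state shared by the handlers."""
    sectors_left: int      # sectors still to move through the data FIFO
    ticks_left: int        # timeout budget, counted in ticker calls
    status: str = "busy"   # "busy", "done" or "timeout"


def irq_handler(t: Transfer) -> None:
    """Model of the IRQ side: move one sector, finish when none remain."""
    if t.status != "busy":
        return
    t.sectors_left -= 1
    if t.sectors_left == 0:
        t.status = "done"


def ticker_handler(t: Transfer) -> None:
    """Model of the ticker side: count down and flag a timeout at zero."""
    if t.status != "busy":
        return
    t.ticks_left -= 1
    if t.ticks_left == 0:
        t.status = "timeout"


def do_foreground(t: Transfer, events) -> str:
    """Wait until either handler marks the transfer as no longer busy.

    `events` is a scripted sequence of "irq" / "tick" strings that stands
    in for asynchronous interrupts (an assumption made for the sketch).
    """
    for event in events:
        if event == "irq":
            irq_handler(t)
        else:
            ticker_handler(t)
        if t.status != "busy":
            break
    return t.status
```

With three sectors and a generous timeout, `do_foreground(Transfer(3, 5), ["irq", "tick", "irq", "irq"])` ends as `"done"`; shrink the budget to two ticks and let two ticks arrive before the last sectors, and it ends as `"timeout"` instead.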

Nov 24, 2017 6:51pm

Jon Abbott (1421) 2651 posts

The issue is more than likely in WinIDECommandDisc then?

Nov 27, 2017 3:35am

Jon Abbott (1421) 2651 posts

I’ve been trying to figure out what’s different when capping WinIDEMaxSectorsPerTransfer to one sector, which cures Disc Error 23 on the A4000/A5000, but can’t see any obvious difference other than perhaps timing between IDE commands. As far as I can tell, there’s no difference in the command sequence that’s sent to the drive; it’s just a difference in how ADFS deals with the transfer internally.

Admittedly, it’s not easy to follow the code as it’s jumping in and out of IRQ and calling code in the RMA to do the actual data transfer, so I’ve possibly missed a key difference somewhere.

Perhaps we should just cap WinIDEMaxSectorsPerTransfer to one sector for RISC OS 3.11 and be done with it; I’m not sure you’d notice the speed difference anyhow on such slow machines. It would be nice to know the root cause though.

Nov 27, 2017 11:43am

Jeffrey Lee (213) 6048 posts

WinIDEMaxSectorsPerTransfer is used to limit the sector count that’s specified in the IDE command header (WinIDEParmSecCount used by WinIDECommandDisc). So it’ll be the difference between CMD-SECTOR-SECTOR-SECTOR…. and CMD-SECTOR-CMD-SECTOR-CMD-SECTOR… I think this might be where the WinIDEStartTransfer call from within the IRQ handler comes into play – if the transfer had been clamped due to WinIDEMaxSectorsPerTransfer then it will issue another command to transfer the next chunk.
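To make the two sequences concrete, here is a small Python sketch of how clamping the per-command sector count changes what goes over the bus. The function names are made up for illustration and nothing here is taken from the ADFS source; it only models the chunking described above.

```python
def plan_commands(total_sectors: int, max_sectors_per_transfer: int) -> list[int]:
    """Split a request into per-command sector counts, clamped to the limit."""
    if total_sectors < 0 or max_sectors_per_transfer < 1:
        raise ValueError("need total_sectors >= 0 and a limit of at least 1")
    counts = []
    remaining = total_sectors
    while remaining > 0:
        chunk = min(remaining, max_sectors_per_transfer)
        counts.append(chunk)       # one IDE command covers this many sectors
        remaining -= chunk
    return counts


def bus_sequence(counts: list[int]) -> str:
    """Render the per-command counts in the CMD-SECTOR notation used above."""
    return "-".join("-".join(["CMD"] + ["SECTOR"] * n) for n in counts)


# Three sectors with no effective clamp: one command, three data phases.
print(bus_sequence(plan_commands(3, 8)))   # CMD-SECTOR-SECTOR-SECTOR
# The same request clamped to one sector per command.
print(bus_sequence(plan_commands(3, 1)))   # CMD-SECTOR-CMD-SECTOR-CMD-SECTOR
```

The data moved is identical in both cases; the clamped version simply issues a fresh command before every sector.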

I’m not sure you’d notice the speed difference anyhow on such slow machines.

It should be pretty easy to test, just by using the OS_File load/save operations on a “large” file.
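A minimal timing harness along those lines might look like the Python sketch below. It uses ordinary Python file I/O as a stand-in rather than calling OS_File directly, the function name and file size are arbitrary choices, and the read may be served from a cache on some systems, so treat the numbers as indicative only.

```python
import os
import tempfile
import time


def time_save_load(path: str, size_bytes: int) -> tuple[float, float]:
    """Write size_bytes to path, read them back, and return both timings."""
    data = bytes(size_bytes)   # a buffer of zero bytes

    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # ask the OS to push the data to the device
    save_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with open(path, "rb") as f:
        loaded = f.read()
    load_seconds = time.perf_counter() - start

    if loaded != data:
        raise IOError("read-back data does not match what was written")
    return save_seconds, load_seconds


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "bigfile")
        save_s, load_s = time_save_load(target, 4 * 1024 * 1024)
        print(f"save: {save_s:.3f}s  load: {load_s:.3f}s")
```

Running it once with the cap in place and once without, on the same disc and the same file size, should show whether the difference is noticeable.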

Nov 27, 2017 9:46pm

Chris Evans (457) 1614 posts

Also, what module versions are in the NIC/RiscPC disc image?

I think in the past we’ve used the ROOL updated modules for the I-cubed card and had the same failure, but tried to double-check by updating the NIC’s firmware, only for the computer to stiff 88% of the way through the flash update. Result: one bricked NIC. A pity that the flashing program can’t be told which podule slot it is in.

Nov 27, 2017 10:43pm

Jon Abbott (1421) 2651 posts

I wouldn’t advise flashing EtherLAN NICs, as it will lock you into the specific version of CLib that’s contained in the flash image. You’re better off leaving it with the old 26-bit modules and softloading EtherH during the boot sequence, after the latest CLib has been loaded.

So it’ll be the difference between CMD-SECTOR-SECTOR-SECTOR…. and CMD-SECTOR-CMD-SECTOR-CMD-SECTOR

Quite, either a timing issue or a status check issue. Probably the former at a guess.

Dec 17, 2017 9:47pm

Ian Bradbury (2561) 8 posts

I posted a fix for NICs not working with CF cards on StarDot last year:

The problem is caused by the “Ready” signal on Pin 27 of the IDE interface, which is shared with the network card. It’s probably meant to be pulled high weakly and driven low by open drain outputs on the network and IDE interface. However, it looks like the SD adapters and some CF cards are driving it high, which causes any network card that makes use of that signal to fail.
The fix is fairly straightforward: put a diode in the Ready line between the CF cards and the IDE socket on the RISC PC, with the cathode of the diode (banded end) towards the CF card.
I used a Schottky diode as that had a lower voltage drop, but any small signal diode like a 1N4148 should work. I found it easier to cut wire 27 of the IDE cable to fit the diode rather than modify the PCB.

Feb 2, 2018 9:41pm

Jeffrey Lee (213) 6048 posts

Over here is an updated ROMPatch for RISC OS 3.50/3.60/3.70/3.71/4.02 which includes the disc error 20 fix. If suitably brave people can give it a test then I’ll get the changes checked into CVS.

I’m assuming that the new ROMPatch has been sufficiently tested now, so I’ve committed it to CVS.

https://www.riscosopen.org/viewer/revisions/logs?ident=1517607335-562617.html

May 14, 2018 8:02am

Colin Ferris (399) 1814 posts

Which machines does this affect – i.e. RPC/Iyonix? (Does softloading of the OS work with HDs?)
Does RO5.24 come with the patch?
Are the ROMs for the RPC 5.24?

May 14, 2018 9:13am

Jon Abbott (1421) 2651 posts

Which machines does this affect – i.e. RPC/Iyonix?

Anything using ADFS.

Does RO5.24 come with the patch?

Yes, it was committed to CVS on Feb 2nd.

May 14, 2018 9:54am

Jeffrey Lee (213) 6048 posts

Which machines does this affect – i.e. RPC/Iyonix?

Anything using ADFS.

Not quite – it’s really anything which uses the IDE version of ADFS. The new C version (ADFS 4.00+) that’s used for SATA on the Titanium & OMAP5 is fine. And presumably the old ST506 version is fine as well.

(Does softloading of the OS work with HDs?)

Yes. The bug affects the ability to write to the disc, so read operations like running a softload should work fine.

Are the ROMs for the RPC 5.24?

Yes (assuming the text on the page is correct – the URL says 5.22)

Dec 5, 2018 6:50pm

Mike Howard (479) 216 posts

I know this topic is fairly old, but I’m currently having Iyonix woes and I’m now wondering if a current issue is related to this topic. My initial problem determination seems to have been way off. I just thought the IDE bus was toast, but it looks like multiple problems, including PSU & graphics card.

Anyway, to the point of my post. With the old motherboard, new PSU and new graphics card, the majority of problems seem solved. Except ‘Disc error 23’.

During the transfer of a lot of data and some large files (50-500MB etc.), I (nearly) always get a ‘Copying files – Error’. This can happen on its own or be brought on by using, for example, the adjust size icon on a filer window.

The error will be ‘Disc error 23 at :4/0000000DE8159000’

Obviously, the address is different each time. Trying the same copy again will usually work, or selecting skip will ignore the problem file and continue.
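For anyone wanting to log where these failures land, here is a small Python sketch that pulls the fields out of an error string of the form quoted above. It rests on assumptions: that the text after the colon is the drive, that the hex value is a byte offset into the disc, and that sectors are 512 bytes. The function name and `SECTOR_SIZE` constant are hypothetical.

```python
import re

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def parse_disc_error(message: str) -> dict:
    """Extract error code, drive and address from a 'Disc error' message."""
    match = re.search(r"Disc error (\w+) at :(\w+)/([0-9A-Fa-f]+)", message)
    if match is None:
        raise ValueError("not a recognised disc error message")
    address = int(match.group(3), 16)   # the address is printed in hex
    return {
        "error": match.group(1),        # kept as text, no base assumed
        "drive": match.group(2),
        "address": address,
        # Assumes the address is a byte offset; see the note above.
        "sector": address // SECTOR_SIZE,
    }


info = parse_disc_error("Disc error 23 at :4/0000000DE8159000")
print(info["drive"], hex(info["address"]), info["sector"])
```

Collecting a few of these and comparing the addresses would show whether the failures cluster in one area of the disc or are scattered, which might help separate a media fault from a transfer problem.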

Thought I’d mention it in case there is something that can be done about it.

It may of course be a hardware issue, though the disc works fine in another machine.

These ‘copies’ are just tests and are within the same IDE disc.

Dec 6, 2018 7:36am

Jon Abbott (1421) 2651 posts

I assume you’ve updated to a recent RISC OS build that has the DRQ fix covered by this thread?

Disc error 23 on early Acorn machines was being caused by large DMA transfers; we worked around it by capping the DMA limit to 1024 bytes. This was covered in the parallel thread to this on StarDot, where a patched ADFS 2.68 is provided. Although that patch is not applicable to an Iyonix, it is likely the same issue, so we could possibly modify the second patch file in that thread and see if the DMA limit can be capped on an Iyonix.

I believe it’s being caused by a timing issue, but we didn’t investigate it further as I could only reproduce it on one device on an A5000. I managed to fry my Iyonix when flashing it at the time, so never got to test the device in question on it.

I would advise taking a backup to another device and seeing if the problem still occurs. You’ll certainly want to take a backup before trying any hacks on ADFS to see if it is related to transferring more than one block at a time.

Iyonix - Disc Error 20
