Ticket #411 (WorksForMe)
Thu Jun 11 20:50:32 UTC 2015
Random freezing of Iyonix during hard disk activity
Reported by: | Christopher Martin (1504) | Severity: | Critical |
Part: | | Release: | |
Milestone: | | Status: | WorksForMe |
Details by Christopher Martin (1504):
Since installing 5.20 on an Iyonix, the system has been completely freezing at random times during intense IDE disk write activity. Typical activities include copying files, downloading or processing videos and other large files, and compiling software from sources. Installing 5.22 appears to have made the problem even more frequent. (Although I have also turned off buffering and minimised caching in an attempt to ward off filesystem corruption, and this has caused disk access to increase.)
Freezes seem to occur when the system is multitasking. I had thought it was related to disk-intensive activity running in a taskwindow, but freezes have also occurred when downloading files by web browser or processing downloaded emails. Fortunately, the machine has never frozen while running DiskKnight to repair damage to the filesystem.
The freeze is total. The hard disk light stays brightly lit and steady; the mouse pointer neither moves nor changes; the keyboard does not respond; and a complete power-off at the wall socket is necessary to reboot. As far as I can tell, despite the disk light being lit, there is no further disk activity once the freeze occurs.
Being unable to edit and rebuild software without the machine freezing, I have restored 5.18 on the machine and (fingers crossed) it has not frozen since.
Through the forum topic <https://www.riscosopen.org/forum/forums/5/topic… I am aware of one other user who has experienced the same problem on their Iyonix.
Changelog:
Modified by Christopher Martin (1504) Thu, June 18 2015 - 05:47:36 GMT
Sad to say, I have now had the occasional freeze with 5.18, but the frequency is far less than with 5.20 or 5.22.
Modified by Sprow (202) Sat, July 25 2015 - 07:32:14 GMT
Given the large number of people with Iyonix machines which don’t exhibit this (on 5.18, 5.20, 5.22 – a span of several years), it sounds more like a machine-specific problem.
You could rent a spare parts kit
http://www.cjemicros.co.uk/micros/individual/ne…
as the power supply will be implicated; some pattern of file accesses could tip it over the edge, causing a hang.
Modified by Christopher Martin (1504) Mon, August 10 2015 - 21:58:35 GMT
I understand that a faulty power supply or disk drive can cause problems. But mine is a home-brew Iyonix in Australia, so a spare parts kit from CJE isn’t really practical.
But on a hunch, I replaced the HourGlass module with one that implements the same interface as the real one but actually does nothing. The result is that freezing rarely happens now. I have reinstalled RISC OS 5.22 and re-enabled disk caches and buffers. The machine has gone from being unusable to nearly 100% reliable, just by neutralising the hourglass.
The occasional freeze now seems to be limited to internet activity in a taskwindow.
I am wondering if recent RISC OS versions are more “aggressive” in the way they handle interrupts such that a race condition can occur, or a deadlock, or a re-entrancy fault. I suspect an interrupt storm as a more likely cause than a file access pattern.
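For illustration only, a do-nothing replacement of the kind described above could be sketched in C as follows. This is not the module that was actually installed on the machine. The register block type, the function name `hourglass_stub_swi`, the error block layout, and the error number are assumptions made for the sketch, and the seven SWI offsets follow the Hourglass SWI numbering as commonly documented. A real RISC OS module would also need a module header and veneers, which are omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets within the Hourglass SWI chunk (assumed numbering:
   On, Off, Smash, Start, Percentage, LEDs, Colours). */
enum {
    HG_ON, HG_OFF, HG_SMASH, HG_START,
    HG_PERCENTAGE, HG_LEDS, HG_COLOURS,
    HG_SWI_COUNT
};

/* Hypothetical register block handed to the SWI handler (R0-R9). */
typedef struct { uint32_t r[10]; } swi_regs;

/* Hypothetical error block: a number followed by a message. */
typedef struct { uint32_t errnum; char errmess[32]; } swi_error;

/* Error number is an assumption for this sketch. */
static const swi_error bad_swi = { 0x1E6, "SWI not known" };

/* Accept every known Hourglass SWI, touch no registers, drive no
   pointer or LED hardware, and report success (NULL). Offsets outside
   the chunk are rejected so callers still see an error for bad SWIs. */
const swi_error *hourglass_stub_swi(unsigned offset, swi_regs *regs)
{
    (void)regs;                 /* registers are deliberately left as-is */
    if (offset >= HG_SWI_COUNT)
        return &bad_swi;
    return NULL;
}
```

Because the stub keeps the same entry points, software that calls the hourglass keeps working; it simply never gets an hourglass, and no hourglass-related timer or pointer code runs during disk activity.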
Modified by Sprow (202) Fri, August 21 2015 - 07:46:39 GMT
You could buy a generic PC power supply from a local store to test. You might want to clean the contacts on the graphics card PCI edge connector by unplugging and reseating it.
The OS is certainly getting faster over time, as inefficient code is located and replaced. Could this be Ticket #324? Are you able to run for a period with no networking, or perhaps via a filtering switch?
Modified by Sprow (202) Sat, April 23 2016 - 10:50:13 GMT
- Status changed from Open to WorksForMe
Put down to a machine-specific issue.