Raspberry Pi 4 crashing during intensive disc activity
Matthew Phillips (473) 721 posts |
We’re busy converting lots of lovely fresh OpenStreetMap data, but I am finding that the Raspberry Pi 4, although it is nice and quick, is crashing quite regularly. The map data conversion process doesn’t do anything except read lots of stuff off disc, transform it, and write lots of files to disc. After it has been running a few hours, I find that Alarm’s clock is no longer ticking the seconds, and the whole machine has totally hung. No error on screen or anything.
As one of the datasets needed a couple of days to process, I was periodically pausing the conversion, copying the paused data elsewhere, and if I then encountered a crash, deleting the failed data and copying the paused version back so I could resume. To start with, the crashes were only occurring when my conversion program was running, so naturally I assumed it was a fault in my program. But then I began to see it fail in exactly the same way when copying the paused data back into position, or when deleting a failed set of data. It seems, therefore, that some aspect of the disc system is unreliable when used intensively (or possibly unreliable full stop).
I had a vague recollection that other people were encountering FileCore-related problems with RPi4 a few years ago, but Google hasn’t located any reports for me.
The disc is a 240GB SSD connected via a USB-SATA adaptor, accessed through SCSIFS and formatted in a FileCore format. After converting a lot of data, the disc is now much more full: 49GB free. I don’t recall getting failures much when the disc was emptier, but that may be a red herring. Similarly, I did wonder if the fact that my conversion program had dynamic areas occupying well over 2GB might cause issues or affect the disc system in some way, but that cannot be the case as it has now crashed several times when hardly any software is running, and I have just been either copying or deleting data straight after a reset. Any ideas?
Lately I have been running CPU Clock so I can see the temperature display just when it crashes. Nothing of concern there. |
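For anyone wanting to automate the pause, copy and restore routine described above, here is a minimal sketch in Python. It is not taken from the conversion program: the function names `save_checkpoint` and `restore_checkpoint` and the directory arguments are hypothetical, and it assumes the conversion has already been paused so the working directory is not changing while it is copied.

```python
import shutil
from pathlib import Path


def save_checkpoint(work_dir: Path, backup_dir: Path) -> None:
    """Copy the paused working data to backup_dir, replacing any older backup.

    The copy goes to a temporary sibling directory first, so a crash in the
    middle of copying leaves the previous backup untouched.
    (Hypothetical helper; names are illustrative only.)
    """
    tmp_dir = backup_dir.with_name(backup_dir.name + "_new")
    if tmp_dir.exists():
        shutil.rmtree(tmp_dir)          # clear leftovers from an earlier failed copy
    shutil.copytree(work_dir, tmp_dir)  # full copy of the paused data
    if backup_dir.exists():
        shutil.rmtree(backup_dir)       # only now discard the old backup
    tmp_dir.rename(backup_dir)


def restore_checkpoint(work_dir: Path, backup_dir: Path) -> None:
    """Delete the failed working data and copy the last backup into place."""
    if not backup_dir.exists():
        raise FileNotFoundError(f"no checkpoint found at {backup_dir}")
    if work_dir.exists():
        shutil.rmtree(work_dir)         # remove the failed set of data
    shutil.copytree(backup_dir, work_dir)
```

Copying to a temporary directory before removing the old backup means there is always at least one complete checkpoint on disc, which matters if the machine can hang during the copy itself.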
Chris Mahoney (1684) 2165 posts |
That’s a lot of data! If I have time on the weekend (and it’s looking a bit busy, so I may not) then I’ll try running the SQLite test scripts on my Pi 4. They’re only around 600 MB of actual code, but it writes a little bit and reads quite a lot. It’ll be interesting to see whether there are any issues there (I normally run on a Pi 3, which works fine). |
David J. Ruck (33) 1636 posts |
I’ve done a lot of sustained disc activity to a USB3-SATA attached SSD on my oldest RISC OS 4GB Pi 4B with no problems, although it’s only a 128GB SSD with over 50% free. |
Jon Abbott (1421) 2651 posts |
Interesting that you’re seeing lock ups as I have a similar issue but on a Pi3. With a recent OS nightly build from a few weeks ago (19th March 2022), it’s yet to hang. It only seems to hang if left alone at the desktop; I’ve yet to see it hang if I’m actively using the machine or have a game running on soak test. So in my case it’s not filesystem related, but could be power saving or USB stack: they’re the two I’ve focused on as likely suspects. What OS build version are you running? Is it worth repeating your test with a nightly, if you haven’t already? |
Matthew Phillips (473) 721 posts |
I’ve only observed the hanging when I have left the machine alone: it’s busy doing lots of processing and disc activity so more active on the USB front perhaps than soak-testing a game. The data conversion takes such a lot of effort I don’t tend to use the machine for anything else at the same time. I don’t want to risk having to restart if something else causes a crash, for example! I’ve not tried another OS build yet. It’s currently on a build from 2 April 2022. One symptom I should mention is that when the hanging has occurred the hard disc’s LED continues blinking regularly. It’s absolutely regular, not the random flashing like when the data conversion is in progress. It does make me wonder whether there could be a bad patch on the disc and the machine is locking up waiting for the disc to respond? This would be consistent with the problem becoming more frequent as the disc got fuller. Just to point out, I have left the machine for several hours, when this happens, to be absolutely sure it’s not just in the middle of doing something complicated. I suppose I could try plugging in a different disc and see if that has any issues. At present I’m converting the data in smaller chunks and that has run overnight without hanging. |
Frederick Bambrough (1372) 837 posts |
I wonder if what I’m seeing might be related, at least in part. The USB sockets on my RPi4 seem pretty naff. Movement of the keyboard cable will cause either loss of keyboard and mouse or a complete freeze. The former requires off/on, the latter re-plugging (actually I’m using a switched hub though I’ve had the same without the hub). I thought to see what happens with Raspbian. The disconnect still happens but the Pi reconnects automatically so if not forewarned one might not notice. Happens on all 4 sockets. |
George T. Greenfield (154) 749 posts |
Have you tried running a DiscKnight check or repair on the suspected drive? |
Bryan (8467) 468 posts |
My two thoughts are:- |
David J. Ruck (33) 1636 posts |
What else is on USB? The problem I have with RISC OS (but not Linux) on the Pi (and mini.m) is when I change which machine the Logitech wireless dongle is attached to using a USB soft switch (KVM). Occasionally this results in a complete hang of the machine, which is annoying, but I don’t switch that often, normally using the machines via VNC. |
Matthew Phillips (473) 721 posts |
Generally the only other thing plugged in via USB is a Cherry keyboard. The keyboard has a built-in USB hub into which I connect a Logitech wireless mouse. What I usually do, after setting the conversion process going, is unplug the keyboard, as it’s really the keyboard we use with the Iyonix. The only other things connected would be the ethernet lead, the HDMI cable, and the RTC hat.
@Frederick: I sometimes find the machine crashes when I plug the keyboard back in, but this is not the main problem and doesn’t explain it hanging with nothing but the disc plugged in. The SATA adaptor is firmly connected to the board inside the case supplied by RISCOSBits, so it can’t be a wobbly connection there.
@Bryan: I’ll take a look at config.txt for the CPU speed.
@George: I’ve not checked the disc with DiscKnight yet: good idea. Not sure running on RPCEmu would help much because the OS would be rather different at the disc level.
I’ve had the same software, OSMConvert, running reliably for well over 24 hours on the RPi3, converting data for Africa. The RPi3 sometimes (maybe once a week) complains that it can’t see the disc (different model of SATA adaptor) and I have to turn the whole thing off and on again, but it doesn’t suffer from this fairly frequent hanging: there’s always an error box and other stuff continues to work. |
Jon Abbott (1421) 2651 posts |
I’ve had the NIC unplugged since I started testing on this OS build and it hasn’t hung once. I’ve just plugged the NIC back in to transfer some files and within a few minutes it hung. I’m not sure if that’s coincidence or not, but I’ll continue to test with/without the NIC to see if it’s consistent. |
Matthew Phillips (473) 721 posts |
I ended up trying a different disc connected via one of the four USB sockets on the board. When I had the repeated hangs I was using an SSD inside the case connected to a daughter board. I didn’t get any hangs with the on-board USB socket. So it seemed the hanging was owing to either
I tried the SSD from inside the case using the same adaptor plugged in to an on-board USB socket. No problems. I then tried the other drive and other adaptor via the daughter board. Then I put it all back to how it was and still had no problems. Converted some more data overnight. It hung again. The machine is now with my parents, so limited scope for experimenting further. I’ve asked them to log details of any hangs. |
David J. Ruck (33) 1636 posts |
The USB to SATA adaptors I use for SSD are just the straight cable ones, less to go wrong and I don’t see any need for an external case with an SSD as opposed to a more fragile spinning drive. |
Matthew Phillips (473) 721 posts |
Same here. All the drives were SSD connected by a simple USB to SATA adaptor. The machine in question is a PiHard from RISCOSBits. It has a Pi4 inside a case with a daughter-board whose main purpose is to bring the HDMI, power and audio connections round to the same side of the case as the USB and Ethernet. But it also provides an extra USB socket inside the case, into which the SATA adaptor for the internal SSD is connected. |
Jon Abbott (1421) 2651 posts |
I had two more hard locks yesterday, one occurred immediately when dragging a file from FTPc to the RAM drive, so I’m inclined to think the issue is probably USB related. |