Curious hourglass behaviour
Pages: 1 2
Dave Higton (1515) 3534 posts |
I’m writing a multitasking file comparison app, and I’m giving it a good thrashing on some ~2GB files. The app uses very short time slices; one test I use is to play Patience. If Patience operates smoothly with almost imperceptible delays, I think the multitasking is good enough. My app is using almost all the CPU time, according to TaskUsage. It doesn’t use the hourglass at all. The curious thing is that, for periods of several minutes, I’ve seen an hourglass appear, either continuously or flashing. My app seems to provoke it, although it doesn’t directly cause it (no calls to the hourglass module). It’s not the only time I’ve seen this; other very busy apps, also not calling the hourglass module, seem to have done the same. Is there any way to find which app is putting the hourglass up? Has anyone else seen the same behaviour? BBxM, RO 5.27 (20-Nov-18). |
Andrew Conroy (370) 740 posts |
Do you have any ShareFS mounts active? ShareFS can hourglass if it can’t contact the mounted drive (if that’s the correct term). |
Dave Higton (1515) 3534 posts |
There are two other machines on the LAN that have drives shared via ShareFS. I’ll have to check whether they are mounted on my machine when it happens. Although there would be lots of LAN traffic, I can’t see why there should be enough to prevent ShareFS traffic getting through. Maybe the fact that the machine is so busy reduces the number of callbacks that can happen? |
Dave Higton (1515) 3534 posts |
Some more tests show that the hourglass happens even though no ShareFS shares are mounted on the BBxM. There is just a “Drives” icon on the icon bar. |
Chris Evans (457) 1614 posts |
I don’t think it’s relevant to your problem, but the colour of the hourglass can easily be changed. The only thing I know of that does change the hourglass colour is ShareFS, which turns it red if it hits a problem: it stays the normal blue when it’s simply busy, but changes to red if it loses the connection. |
Jon Abbott (1421) 2651 posts |
That possibly explains why my Hourglass is always red, I’ve always assumed it was an issue with GraphicsV not passing on the palette when you switch graphics drivers.
That’s curious as the disc imager in ADFFS does a fairly intensive compression, compressing 512 bytes on each Wimp_Poll, but I don’t think I’ve ever seen an hourglass appear. How long after your compression starts does the Hourglass appear? I’d also be interested to know how you got the multitasking to be smooth whilst compressing, as despite my code calling Wimp_Poll thousands of times during the compression cycle, the machine becomes pretty unresponsive. |
Colin (478) 2433 posts |
There is still ShareFS activity when no shares are mounted – to enumerate the shares on the network. This can hit problems just like any other ShareFS transfer – which is why there have been problems with shares not appearing. I’d ensure that every computer on the network, including the one with problems, has ShareFSWindow configured to 1. |
David Feugey (2125) 2709 posts |
Hum. Before having SMP, it would be good to offload this kind of component to cores not (yet) used by RISC OS. Most AMP OSes use only core 0 for the OS and (possibly) the others only for apps. RISC OS could reserve some cores for critical parts of the OS – for example SSL code, or the whole networking stack. Sorry to be off topic (is it?). Edit: a few minutes later, I think it’s really a good idea. It would be much simpler to offload some parts of the OS to other cores than to try to adapt the OS to make apps run on other cores. IMHO, a very valuable short-term solution. A Tube-like system? Only Jeffrey can do it :) |
Dave Higton (1515) 3534 posts |
I’m doing comparison rather than compression. I’ve had the same sort of results with encryption. The good thing about both comparison and encryption is that the two data streams (both in for comparison, in and out for encryption) are of identical size, and the processing is almost nothing. The good speed and responsiveness come from file transfers that are block aligned, and in fact are integral powers of 2, which gives the fastest possible disc performance. I’m using 32kiB or 64kiB. I also did a pair of backup and restore apps that are on my website at http://davehigton.me.uk and use zlib for compression and decompression. For compression I use 32kiB chunks of input; clearly I have no choice over the output chunk size. This all seems to fairly fly along too. Only some file types are compressed, but compressing any 32kiB block seems to be almost instantaneous. I’m only using Deflate, which is the quickest (for backup, I think that’s the best choice – your app may have different criteria). |
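A minimal sketch of that chunked-Deflate scheme, in Python rather than the BASIC of the actual apps (the 32kiB input chunk size comes from the post above; the use of zlib’s fastest compression level is my assumption about what “quickest” means here):

```python
import zlib

CHUNK = 32 * 1024  # 32kiB input blocks, as described above

def compress_chunks(data: bytes) -> bytes:
    """Deflate a byte stream in fixed-size input chunks.

    One zlib stream is fed CHUNK bytes at a time; the output chunks
    are whatever sizes zlib produces, matching the observation that
    you get no choice over the output chunk size.
    """
    comp = zlib.compressobj(level=zlib.Z_BEST_SPEED)  # assumed setting
    out = []
    for i in range(0, len(data), CHUNK):
        out.append(comp.compress(data[i:i + CHUNK]))
    out.append(comp.flush())
    return b"".join(out)

original = b"some fairly repetitive test data " * 4096
packed = compress_chunks(original)
assert zlib.decompress(packed) == original   # round-trips
assert len(packed) < len(original)           # and actually shrinks
```

Feeding a single stream chunk by chunk like this keeps memory per poll small, which is what lets the work be spread across many Wimp_Poll slices.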
Dave Higton (1515) 3534 posts |
Maybe a minute or so before it first appears. During a long period of application activity, it comes and goes. |
Dave Higton (1515) 3534 posts |
ShareFSWindow has always been 1 on all my machines. It’s one of the first things I set up when commissioning a new machine. I’d like to understand why a busy CPU and/or network causes ShareFS to protest. If there were significant periods of single-tasking, I could understand it – but there aren’t. The apps I’m working on are considerate citizens :-) They don’t take any long time slices once the files are open. |
Dave Higton (1515) 3534 posts |
On the topic of speed: the file comparison app used to compare each block in BASIC. I’ve changed it to do a word length comparison in assembly language, which is extremely quick of course; if there is any difference, it reverts to the BASIC to count the diffs at byte resolution. The time for ~2GB files came down from about 1642 seconds to 905 seconds. |
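The structure of that fast-path/fallback compare can be sketched in Python (a stand-in for the word-length assembler compare plus BASIC byte counting; the block size is taken from the earlier posts, everything else is illustrative):

```python
def count_diff_bytes(a: bytes, b: bytes, block: int = 32 * 1024) -> int:
    """Count differing bytes between two equal-length buffers.

    Fast path: compare each block wholesale (standing in for the
    word-at-a-time assembler compare).  Slow path: only blocks that
    differ are re-examined byte by byte, as the BASIC fallback does.
    """
    assert len(a) == len(b)
    diffs = 0
    for i in range(0, len(a), block):
        ca, cb = a[i:i + block], b[i:i + block]
        if ca == cb:
            continue  # identical block: no per-byte work at all
        diffs += sum(x != y for x, y in zip(ca, cb))  # slow path
    return diffs
```

Since most blocks of two nearly-identical files match exactly, almost all the time is spent on the cheap wholesale test, which is why the speed-up (1642 s down to 905 s) is so large.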
Colin (478) 2433 posts |
From what I’ve pieced together so far, the problem is the lack of preemption, which means that multitasking has to happen at the application end rather than the device end; a SWI call can’t have the thread wait in the background for a resource to become available, as there is only one thread. Networking requires callbacks to move the data from the interrupt context to the main thread, and if the callback doesn’t happen, incoming data just uses all the mbufs and you get dropped packets. When packets get dropped, ShareFS sits waiting for a reply that has been lost, and you get a timed-out hourglass. You can also get an hourglass on your machine if the remote machine is having problems replying. If you are doing something in a loop in user space, you need a SWI call every so often to trigger callbacks. In supervisor mode, an OS_LeaveOS/OS_EnterOS combination and a bit of glue triggers callbacks, though this is not without problems, as witnessed by LanManFS and LanMan98FS – the main problem being that you want network callbacks triggered but don’t want other callbacks triggered. A recursive copy command, for example, will stop callbacks because it doesn’t call a SWI in user mode after the initial SWI call. It only works with, for example, LanManFS because LanManFS explicitly triggers callbacks using the OS_LeaveOS/OS_EnterOS method. Don’t know if this helps you any, but it helped me a bit writing it down :-) |
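A toy model of the mbuf side of this, with entirely made-up numbers, just to illustrate why delayed callbacks turn into dropped packets: packets arrive into a fixed pool at interrupt time, but are only drained when a “callback” runs.

```python
from collections import deque

def simulate(pool_size: int, arrivals_per_tick: int,
             callback_every: int, ticks: int):
    """Fixed packet pool filled at 'interrupt time', drained only
    when a callback runs.  Returns (delivered, dropped).
    All parameters are invented for illustration.
    """
    pool = deque()
    delivered = dropped = 0
    for t in range(ticks):
        for _ in range(arrivals_per_tick):
            if len(pool) < pool_size:
                pool.append(t)
            else:
                dropped += 1          # no free mbuf: packet lost
        if t % callback_every == 0:   # the callback finally runs
            delivered += len(pool)
            pool.clear()
    return delivered, dropped

# Frequent callbacks: nothing is lost.
_, lost_fast = simulate(pool_size=16, arrivals_per_tick=4,
                        callback_every=1, ticks=100)
# Callbacks delayed (app busy in user mode): the pool overflows.
_, lost_slow = simulate(pool_size=16, arrivals_per_tick=4,
                        callback_every=10, ticks=100)
assert lost_fast == 0 and lost_slow > 0
```

The point is only that loss depends on the *interval between drains*, not on the average data rate, which is why a busy-but-polite application can still starve the network stack.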
Jeffrey Lee (213) 6048 posts |
Yes, offloading processing done by modules to other cores will be easier than offloading processing done by applications (and is already possible using the prototype SMP module).
Incorrect.
Depends on the flavour of callback. For non-transient callbacks, yes, it’ll only happen on return from a SWI to user mode. But transient callbacks (which are the more commonly used variety) will trigger when a SWI returns to user mode or when an IRQ returns to user mode (and also, I believe, when an RTSupport routine returns to user mode). Apart from the occasional bug/issue related to returned errors blocking callbacks, I wouldn’t expect an app which sits in user mode and calls zero SWIs for long periods of time to cause any problems. |
Dave Higton (1515) 3534 posts |
My thanks to Colin and Jeffrey for the above information. However, my app is doing the most simple and straightforward thing possible: it’s calling Wimp_Poll many times a second, every second (unless there’s any delay in fetching data from the NAS drive or the local SSD). Doesn’t that give callbacks the best chance of being called? One of the files is from NAS and is therefore putting significant traffic over the LAN. But the numbers are a file of ~2GB in about 900 seconds, which by my arithmetic ends up as about 22Mb/s or so plus TCP and IP overheads. The switch is an HP 10/100/1G, so I can’t see it contributing to the problem. The BBxM is running RO 5.27 (20-Nov-18) with SharedCLibrary 5.97 (11 Jun 2018). MBufManager doesn’t seem to offer any commands, so is there any way of checking for mbuf exhaustion? |
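That back-of-envelope rate can be sanity-checked mechanically; the ~10 bits per payload byte used here is a rough rule of thumb for Ethernet framing plus IP and TCP headers, not a measured figure:

```python
def effective_mbit_per_s(payload_bytes: float, seconds: float,
                         bits_per_byte: float = 10) -> float:
    """Rough on-the-wire rate for a transfer, assuming ~10 bits per
    payload byte once framing and IP/TCP headers are averaged in."""
    return payload_bytes * bits_per_byte / seconds / 1e6

rate = effective_mbit_per_s(2e9, 900)  # the ~2GB-in-900s transfer
assert 22 < rate < 23                  # roughly 22 Mbit/s
```

About 22 Mbit/s is well under a 100Mb/s link, let alone gigabit, so raw bandwidth alone doesn’t explain any stall.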
Chris Evans (457) 1614 posts |
If you are not using ShareFS whilst running your program why not try turning it off to see if it is the culprit? |
Colin (478) 2433 posts |
Data can arrive at the computer faster than that – callback intervals can make the overall transfer rate slower. Just to confirm it’s a networking problem: if you compare two files on the same SSD, where you previously saw the problem, does the problem go away?
You would have thought so. |
Rick Murray (539) 13851 posts |
I think I may have seen this, and I’m not using ShareFS! The machine, a Pi2 (ARMv7), may, for unknown reasons, simply freeze for anywhere between 10 and 40 seconds. I don’t recall if the Hourglass is on (I think it is, but can’t swear to it), but the SD activity indication is solidly on (no flicker). NumLock toggles the LED; Alt-Break does nothing. The volume is good according to DiscKnight.
Correct me if I’m wrong, but aren’t these supposed to be negotiated protocols, rather than “throw something at the wall and see what sticks”? |
Steve Pampling (1551) 8172 posts |
On average the header contributions will end up giving you roughly 10 bits per byte transferred, so for easy maths just shift the decimal point one place, divide by the time taken (900 sec), and then divide by 10^6 for the number of million bits per second (about 22). Since 22 is around 22% of the theoretical maximum on the BBxM’s (nominal) 100Mb/s interface, then unless there’s a “disc” transfer bottleneck I’d be looking for a duplex mismatch on the links on the route, because that tends to be the answer in most network transfer problems where the transfer speed is around 20-25% of the theoretical available. Unless you have a managed switch you stand little chance of seeing any evidence other than lots of re-sends in a Wireshark capture (which you may find difficult to do unless the endpoints can run Wireshark, or you’re using a managed switch that can do port mirroring – SPAN if you’re a Cisco droid). |
Colin (478) 2433 posts |
Yes, but the requested data goes into mbufs, and the mbufs are consumed in a callback. The data can arrive quickly and be put into mbufs, while the callback happens some time later. If too much data is requested, it can exhaust the mbufs before a callback is called, causing dropped packets – I think this is more of a problem on GB networking. I’ve seen 250ms callback intervals. mbufs are shared between all sockets – in and out. |
Steve Pampling (1551) 8172 posts |
You connect a device to a network and the negotiation on capability is between the device and the network ingress port. Stick a 1Gb/s port at one end of a network and a 100Mb/s port at the other, and you find the 100Mb/s port getting data arriving faster than it expects, with the local switch buffer (if any) filling and then packet drops with resends. |
Dave Higton (1515) 3534 posts |
I think I should clarify a couple of things. First: although the hourglass comes on, the machine still multi-tasks as smooth as silk. So nothing appears to be being blocked. Second: I’m not streaming data to the machine willy-nilly; I’m requesting or sending chunks, normally 32kiB or 64kiB, via OS_GBPB 4 or 2. So, unless the OS is doing some read-ahead, I don’t see why the buffers should be overwhelmed. But I will use showstat when I run the comparison again – thank you, Colin, that gives a lot of useful info. |
Dave Higton (1515) 3534 posts |
I RMKilled ShareFS and ran a long compare again. The hourglass still appeared (though, as always, multi-tasking was still fully operative). As before, one file was read via LanMan98, the other from local SSD. showstat -a shows hardly any small mbufs in use, no large mbufs in use, and no mbuf exhaustions. |
Dave Higton (1515) 3534 posts |
Two files from local SSD, ShareFS still NOT running: no hourglass during the transfer. |
Dave Higton (1515) 3534 posts |
I rebooted so that everything is as normal, and re-ran the tests with two files on local SSD. No hourglass. So it looks like the issue is related to the network, but not specifically to ShareFS. That is on the assumption that it was enough to rmkill ShareFS – would I need to kill Freeway (or anything else) in addition? I really don’t know exactly what Freeway does. |