RISC OS Open: Forum: Slow Pandaboard Transfer Speeds

Jul 5, 2020 5:39pm

Chris C. (2322) 197 posts

Hi All,

Trying to troubleshoot my PandaBoard ES transfer speeds. I max out at about 900Kb/s on my 50Mb connection. I went through troubleshooting the PandaBoard ES this morning, made a fresh firmware card with the 5.24 ROM.

I’ve got a 1G switch connected to my Cable Modem, the router shows a 100M connection from the PandaBoard ES, and the PandaBoard ES confirms the same. Any ideas or tweaks I can do to speed up transfers, or is that speed normal? I’ve tried transferring a 100MB file from https://speedtest.tele2.net/100MB.zip and it takes roughly 20 minutes. Cables are about 2 years old and so is the router.

Some stats:

*help etherusb
==> Help on keyword EtherUSB
Module is: EtherUSB 0.41 (21 Mar 2018) James Peacock

*ejinfo
EtherUSB driver for USB network adapters, version 0.41, by James Peacock
DCI version 4.07, prefix ‘ej’, maximum 16 units
Supported backends:
SMSC75xx – SMSC 75xx
SMSC78xx – SMSC 78xx
SMSC95xx – SMSC 95xx
MCS7830 – MosChip MCS7830
Pegasus – Pegasus USB Ethernet Adapter
AX88772 – ASIX AX88772
AX88172 – ASIX AX88172
CDC – USB Communications Class Ethernet [Not functional yet]

ej0: SMSC95xx, USB bus 1, device 7, Devices:$.USB7, up

Interface driver : ej
Interface unit : 0
Interface location : USB bus 1, device 7, Devices:$.USB7
Interface EUI48 : 02:1B:9C:xx:xx:xx
Interface backend : SMSC95xx
Interface media : 100baseTX full duplex
Interface polarity : Correct
Controller mode : Multicast
Packets sent : 15032
Packets received : 32648
Bytes sent : 831602
Bytes received : 47207984
Send errors : 0
Receive errors : 0
Undelivered packets : 744
Send queue overflows : 0

Standard clients:

Type 8035 (AddrLvl 01, ErrLvl 00) handler=(fc33487c/30003504)
Type 0806 (AddrLvl 01, ErrLvl 00) handler=(fc33487c/30003504)
Type 0800 (AddrLvl 02, ErrLvl 00) handler=(fc33487c/30003504)

Jul 5, 2020 7:49pm

Timothy Baldwin (184) 242 posts

The RISC OS implementation of TCP is slow.

What client software are you using? Netsurf is/was slow.

What is the round-trip time for this connection? You can measure that using ping.

Jul 5, 2020 8:38pm

Chris C. (2322) 197 posts

I used wget and FTPc, after I read about NetSurf being so slow.

Jul 6, 2020 1:46pm

Chris C. (2322) 197 posts

*ping speedtest.tele2.net
PING speedtest.tele2.net (90.130.70.73): 56 data bytes
64 bytes from 90.130.70.73: icmp_seq=0 ttl=53 time=170.000 ms
64 bytes from 90.130.70.73: icmp_seq=2 ttl=53 time=160.000 ms
64 bytes from 90.130.70.73: icmp_seq=3 ttl=53 time=160.000 ms
64 bytes from 90.130.70.73: icmp_seq=4 ttl=53 time=160.000 ms
64 bytes from 90.130.70.73: icmp_seq=5 ttl=53 time=170.000 ms
64 bytes from 90.130.70.73: icmp_seq=6 ttl=53 time=160.000 ms

speedtest.tele2.net ping statistics -

7 packets transmitted, 6 packets received, 14% packet loss
round-trip min/avg/max = 160.000/163.333/170.000 ms

Jul 6, 2020 3:03pm

Martin Avison (27) 1498 posts

I would not have thought speed would have much effect on a ping.
I tried it on this Titanium over a gigabit LAN and got:

speedtest.tele2.net ping statistics -

101 packets transmitted, 96 packets received, 4% packet loss
round-trip min/avg/max = 12.148/14.167/72.882 ms

Jul 6, 2020 3:08pm

Chris C. (2322) 197 posts

OK, I found and tried a speed test closer to me (California) and got a much better result. A 100MB download test ran at about 900KB/s; now that’s what I am talking about. The other test results must be poor due to their distance from me.

Jul 6, 2020 3:23pm

Rick Murray (539) 13862 posts

Not necessarily distance.
When running speed tests to see what the line was getting (in the days before the Livebox told you such things), I used to choose a server in Cornwall rather than the one a stone’s throw up the road. I’m guessing the local one is on a congested line shared with many people, or maybe there are just numerous hops before it gets to the backbone, or maybe it’s just a crap server. Either way, the one four times further away in a different country across a large body of water…was reliably faster.

Jul 6, 2020 3:37pm

Rick Murray (539) 13862 posts

Just remembered, I have Ookla Speedtest on my phone.

Unfortunately it shows a list of “nearby” servers (under 500km away); it’d have been interesting to have tried other countries (like SKorea, US).

Anyway…

The results are backwards (most recent at top). So the first test that I ran was to Cowes, Isle of Wight. It managed a pretty decent speed. Technically the best of the lot.
The second test was to a nearby server in Rennes, Brittany. Hosted by Orange, who is my service provider. A much smaller ping time, which could make a difference on today’s web where lots of little bits are pulled from everywhere (meaning lots of requests to different servers). Interesting to see that it’s the worst download speed. Honestly you’re unlikely to notice in reality, but if you think closer server means faster data, yeah, about that…
The final test was to Welwyn Garden City, one of the further away servers in the list. Its ping is slightly better than Cowes, but its download rate is closer to (but still better than) the local server. Upload is abysmal; I’m guessing it might be an anomaly of the server or its connection, perhaps it is more restricted one way than the other?
Either way, the obvious morals of this story are:

Closer isn’t always better
Never rely upon just one result to tell you the truth
Don’t get obsessed over local servers in speed tests because “the intertubes” are everywhere, and some countries have an infrastructure barely better than acoustic modems. ;-)
…while the citizens of other countries lose their <beep> if the latency is enough to upset the behaviour of their gaming.

Jul 6, 2020 3:53pm

Chris C. (2322) 197 posts

I think I’m set. Just want to make sure everything is set up correctly on my end. It was puzzling why I was topping out at about. The fun part was that I had a 5.22 ROM on my PandaBoard ES SD card but somehow I was still booting and showing 5.24 until I changed the SD card for a fresh one. Got that all sorted now. Phew.

Jul 7, 2020 1:18am

Chris C. (2322) 197 posts

Kind of curious as to what you see on your end.

wget http://speedtest-ca.turnkeyinternet.net/100mb.bin

*ping speedtest-ca.turnkeyinternet.net
PING speedtest-ca.turnkeyinternet.net (209.240.111.30): 56 data bytes
64 bytes from 209.240.111.30: icmp_seq=0 ttl=54 time=20.000 ms
64 bytes from 209.240.111.30: icmp_seq=1 ttl=54 time=10.000 ms
64 bytes from 209.240.111.30: icmp_seq=2 ttl=54 time=20.000 ms
64 bytes from 209.240.111.30: icmp_seq=3 ttl=54 time=10.000 ms
64 bytes from 209.240.111.30: icmp_seq=4 ttl=54 time=10.000 ms

speedtest-ca.turnkeyinternet.net ping statistics -

5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 10.000/14.000/20.000 ms

Jul 7, 2020 9:53am

David J. Ruck (33) 1637 posts

ChrisC: The RISC OS network stack is certainly slow; we can run Linux on Raspberry Pis and ARMX6 devices and compare directly with RISC OS on the same hardware. But it’s not that slow, on my Mini.M RISC OS can manage up to about 200 Mbit/s upload and download to other local gigabit devices, so clearly it can more than saturate a 100MBit Ethernet.

However, you won’t get anywhere close to measuring the speed of your internet connection using a RISC OS browser; they are all just too slow. Most of the speed measurement pages rely on a lot of JavaScript, and JavaScript engines are slow on RISC OS. Even just doing a simple file download with NetSurf only does a few hundred KBytes per second on a connection which all the other machines can get 9.1MBytes/s from.

Your best bet is using a file transfer protocol rather than a browser, such as FTP, but check from another machine as well, as some ISPs throttle FTP heavily. If your router supports a VPN to a remote site where you can download files with Lanman, that will also show better results.

Jul 7, 2020 10:31pm

Timothy Baldwin (184) 242 posts

“But it’s not that slow, on my Mini.M RISC OS can manage up to about 200 Mbit/s upload and download to other local gigabit devices, so clearly it can more than saturate a 100MBit Ethernet.”

But that is a local network with presumably sub-millisecond round trip time; the route that Chris was testing had 160 milliseconds of round trip time. The bandwidth that RISC OS can achieve is inversely proportional to the round trip time.

TCP flow control works by the receiver informing the sender how many more bytes it may send. RISC OS will give the sender permission to send just 17376 bytes, and the sender will not send more than that until it receives an acknowledgement. It might go as follows:

RISC OS sends permission to send 17376 bytes.
That message takes 0.1 seconds to reach the server.
The server sends 17376 bytes.
Which also takes 0.1 seconds to reach RISC OS.
RISC OS then requests another 17376 bytes.

Each such cycle takes 0.2 seconds, so there are 5 of them per second. That results in 17376 * 5 = 86880 bytes per second.
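
To put numbers on that, here is a minimal Python sketch of the same estimate. It assumes the simplified model described in the steps (at most one full window in flight per round trip, no loss, no window scaling). The function name and the table of cases are illustrative only; the 160 ms and 14 ms round-trip times are rounded from the ping averages posted earlier in this thread.

```python
# Simplified model: the sender may have at most one receive window
# in flight, so throughput cannot exceed window / round-trip time.
WINDOW_BYTES = 17376  # receive window figure quoted in the post


def window_limited_rate(window_bytes, rtt_seconds):
    """Upper bound on throughput in bytes per second."""
    return window_bytes / rtt_seconds


# Round-trip times: the worked example, then the two ping averages
# posted earlier in the thread (rounded).
cases = {
    "worked example (0.1 s each way)": 0.200,
    "speedtest.tele2.net (~160 ms)": 0.160,
    "speedtest-ca.turnkeyinternet.net (~14 ms)": 0.014,
}

for name, rtt in cases.items():
    print(f"{name}: {window_limited_rate(WINDOW_BYTES, rtt):,.0f} bytes/s")

# Time to fetch a 100 MB file (assumed here to be 100 * 10**6 bytes)
# at the worked-example rate.
minutes = 100 * 10**6 / window_limited_rate(WINDOW_BYTES, 0.200) / 60
print(f"100 MB at the worked-example rate: about {minutes:.0f} minutes")
```

Under this model the worked-example rate puts a 100 MB download in the same range as the roughly 20 minutes reported at the start of the thread, and the 14 ms path has a ceiling of a little over 1.2 MB/s, which sits above the ~900KB/s seen on the California server. It is only an upper bound; real transfers also pay for slow start, retransmissions and processing time.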

You can test it locally using an artificial delay such as NetEm in Linux, as I discuss here and here.

Also see the Wikipedia article on Bandwidth-delay product.
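
The bandwidth-delay product turns the same relationship around: it is the amount of data that has to be in flight to keep a link full. A small sketch, assuming the “50Mb” connection means 50 Mbit/s and using the 160 ms round trip from the ping to speedtest.tele2.net (the function name is illustrative):

```python
def bandwidth_delay_product(bits_per_second, rtt_seconds):
    """Bytes that must be in flight to keep the link full."""
    return bits_per_second / 8 * rtt_seconds


LINK_BPS = 50 * 10**6   # assumption: the "50Mb" connection is 50 Mbit/s
RTT = 0.160             # seconds, from the ping to speedtest.tele2.net
WINDOW_BYTES = 17376    # receive window figure quoted in the post

needed = bandwidth_delay_product(LINK_BPS, RTT)
print(f"window needed to fill the link: {needed:,.0f} bytes")
print(f"ratio to the quoted window: {needed / WINDOW_BYTES:.1f}")
```

With these assumptions the link needs about a million bytes in flight, well over fifty times the quoted window, which is why a high-latency path leaves most of the bandwidth unused.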

Jul 8, 2020 1:21am

Chris C. (2322) 197 posts

I saw your article Timothy. Cool stuff. I wanted to try some transfers with UDP just for fun, just to see the speeds. For now, I’m satisfied with the results I got. I’ll take a look at that article.

Jul 8, 2020 10:17am

David J. Ruck (33) 1637 posts

The real question is why is RISC OS so much worse on higher latency connections than other OSes? My point was locally it can just about exceed 100Mb Ethernet, but over a 72Mb/s broadband it can be 50x slower than the same device running Linux.

Jul 8, 2020 11:12am

Colin (478) 2433 posts

My guess is lack of PMT.

Without PMT (pre-emptive multitasking), when sending, the Ethernet driver has no way of giving control back to the desktop, so when the device is overrun with packets from RISC OS it has to drop them and rely on retries, which slows things down. There should have been a system to inform the socket module that a packet wasn’t consumed so that it could issue an EWOULDBLOCK, and it looks like the driver has that feature – but it doesn’t work.

When receiving, you can only read from the device during the interrupt until mbufs are exhausted; then the device starts dropping packets, so the other end has to retry – slowing things down. The mbufs are processed in a callback (which makes using sockets from a module problematic). Then you have to wait for the app to be paged in to consume the mbufs, freeing them for receiving more data.

I’ve never programmed PMT devices but envisage a system where interrupts and processes have their own thread making the whole system much simpler – though my model of PMT may be utopian.

DMA would probably help somewhere in the chain but it’s beyond my paygrade to figure out where.

I sometimes see mbufs cited as being a problem. I don’t think so: they are just the socket module’s solution to buffering interrupt data back to user mode, a problem which all drivers face.

That’s how I see things anyway.

Jul 8, 2020 12:07pm

Rick Murray (539) 13862 posts

“My guess is lack of PMT.”

My guess is a lack of competent buffering. Remember, it’s not just networking that is slow. The filesystem is too (one of the main reasons I’ve not built myself a newer ROM, it takes forever to delete/copy the thousands of files).

What worked with 10baseT and PIO harddiscs doesn’t really scale to modern networking and storage technologies.

Jul 8, 2020 12:13pm

Rick Murray (539) 13862 posts

“but envisage a system where interrupts and processes have their own thread making the whole system much simpler”

Perhaps simpler and easier for the programmer, but within the system the processor is only able to do one thing at a time (we’ll gloss over multiple cores and hyperthreading for now), storage devices only handle one request at a time, etc etc.
It’s still smoke and mirrors giving an illusion of multitasking, it’s just different coloured smoke and fancy engraved mirrors, that’s all.

But, all that said, Ethernet ought to be capable of much better buffering and the apps able to understand that data might come in in 256K chunks. That way, nothing needs to be dropped unless things get really badly held up.

Jul 8, 2020 12:32pm

Colin (478) 2433 posts

Yes, but with threads you have flow control; RISC OS has no flow control. When sending, buffers have little impact. When an app is paged in and sends something, it is essentially a single-tasking machine and, unlike receiving, where you have to get data from an interrupt context back to the app, sending should have a direct link with the device and ‘should’ be as fast as you can get. However, the driver is written with the RISC OS multitasking environment in mind and does not block if it can’t put data on the device, and I think that is the main bottleneck. If the send was in a thread, the thread could block until the device was free, so retries were avoided.

One problem I see with mbufs is that send and receive share an mbuf pool so you could get the situation where receiving exhausts mbufs and replies can’t be sent as there are no mbufs to carry the data to the device.

Jul 8, 2020 1:18pm

Jeffrey Lee (213) 6048 posts

I was hoping I’d have something more useful to contribute, but here it is anyway:

This thread and the linked ticket have some notes about my previous investigations into network performance (on a local, low-latency connection).
FIQProf is good for seeing what the CPU’s doing, assuming you have a supported machine (Iyonix, OMAP3, iMx6, maybe Pi), and preferably can build your own ROMs (profanal can extract the symbols from the source tree to give file, function & line locality to addresses).

A FIQProf profile of a (single-tasking?) file upload/download over a high-latency link might allow you to quickly identify where all the busy-wait loops are and what the triggers are for leaving those loops. FIQProf made it trivial to identify that USB mass storage was slow because SCSISoftUSB was issuing one USB transfer per TickerV tick.

So basically my contribution is “stop speculating and profile the code, you big dummies!” ;-)

Jul 8, 2020 1:34pm

Colin (478) 2433 posts

I wish I didn’t have a goldfish memory then I could save repeating my drivel and the internet would have a little less to clog it up.

Jul 8, 2020 2:48pm

Colin (478) 2433 posts

It would be handy if you could limit the size of the buffer Filer_Action uses for transferring files by some means other than the next slot. If I transfer a 1GB file with LanManFS from my ARMX6 to my Pi 4 running Raspbian with a USB3 HDD, it takes 182secs with the 4MB next slot I normally use and only 58secs with a 640k next slot. You can see the transfer pausing with the 4M next slot.
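
For comparison, those two timings can be converted into transfer rates. A quick sketch, assuming “1GB” means 10**9 bytes (with 2**30 the rates are about 7% higher but the ratio is unchanged); the dictionary keys are just labels:

```python
FILE_BYTES = 10**9  # assumption: "1GB" taken as 10**9 bytes

# Seconds per transfer, as reported in the post.
timings = {"4MB next slot": 182, "640k next slot": 58}

# Convert each timing to megabytes per second.
rates = {slot: FILE_BYTES / secs / 10**6 for slot, secs in timings.items()}
for slot, rate in rates.items():
    print(f"{slot}: {rate:.1f} MB/s")

# How the large-slot transfer compares with the small-slot one.
ratio = rates["4MB next slot"] / rates["640k next slot"]
print(f"4MB slot runs at {ratio:.0%} of the 640k speed")
```

That works out to roughly 5.5 MB/s against roughly 17 MB/s, so the larger slot runs at about a third of the speed of the smaller one.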

Jul 8, 2020 3:15pm

David J. Ruck (33) 1637 posts

I’ve got a 16MB slot on my machines so I can do GCC stuff in TaskWindows without messing around, and most of the copying is done with !DirSync which uses a buffer the size of the next slot.

I assume the reason a smaller buffer is faster is that it is the only situation in which RISC OS can do some degree of overlapped I/O, i.e. read up to the socket buffer size of data while the data is being written to disc, or write out a TCP buffer while the disc is being read.

The “use big buffers” thing probably dates back to copying files between floppy discs with a single drive, but as RISC OS has never supported overlapped disc I/O (i.e. reading from one file while writing to another), which is effective with small buffers, we’ve never changed the disc tools. Now that those disc tools are being used with remote filing systems, they are working non-optimally.
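
As a rough illustration of what overlapped I/O means in general, here is a generic Python sketch. It is not how Filer_Action, !DirSync or any other RISC OS tool works; the function name, chunk size and queue depth are arbitrary choices. A reader thread fetches the next small chunk while the main thread is still writing the previous one, so neither side waits for a whole large buffer to fill or drain.

```python
import io
import queue
import threading


def overlapped_copy(src, dst, chunk_size=64 * 1024, depth=2):
    """Copy src to dst, reading the next chunk while the last is written."""
    chunks = queue.Queue(maxsize=depth)  # bounded, so reads cannot run far ahead

    def reader():
        while True:
            data = src.read(chunk_size)
            chunks.put(data)  # an empty bytes object marks end of input
            if not data:
                break

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    total = 0
    while True:
        data = chunks.get()
        if not data:
            break
        dst.write(data)
        total += len(data)
    thread.join()
    return total


# Usage with in-memory streams standing in for a socket and a disc file.
source = io.BytesIO(b"x" * 200_000)
target = io.BytesIO()
copied = overlapped_copy(source, target, chunk_size=4096)
print(copied, "bytes copied")
```

The sketch has no error handling (an exception on either side would leave the other waiting), and with a single huge chunk there is nothing left to overlap, which is the effect described above.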

Jul 8, 2020 4:23pm

Rick Murray (539) 13862 posts

“and the internet would have a little less to clog it up.”

Every day, YouTube serves a billion hours of video to users (and about 70% of them being mobile devices).
Every minute, five hundred hours worth of video is uploaded to YouTube. Which implies seven hundred thousand hours worth of new video every day.

And this is only YouTube. There’s also Facebook, Instagram, blah blah, not to mention all sorts of foreign services like Dailymotion and Youku.

That entire message of yours is smaller than Google’s tracking cookie.

Just putting your comment into context. ;-)

Jul 8, 2020 4:41pm

Colin (478) 2433 posts

I’ve been trying to wrap my head around this for ages. The etherth device has a 16 packet ring buffer (about 24kB) so I don’t see dma being a factor – it’s not like USB where you can give it a large block of memory and say fill it and do something else in the meantime – which riscos doesn’t. Generally you get to the stage where a packet buffer isn’t available and you have to drop the packet – in the etherth driver there is a tiny delay before dropping in the hope that a packet buffer becomes available and you don’t have to drop. This would imply that the transfer is going at 1Gb/S except that it is not as you have to drop packets and that causes delays.

A 320kB next slot is slower than a 640kB next slot – it’s about 70% of the speed – and 4MB is 33% of the speed. With the current system, without any way to stop TCP retries (flow control), I can envisage a situation where a better stack takes up less CPU time, is faster, but transfer speeds are worse.

I usually get to this point in my ponderings, decide to accept RISC OS for what it is and forget about its problems – and hope Jeffrey fixes them :-).

Edit 480kb/s not 1GB/s armx6 uses the USB clock for Ethernet.

Jul 8, 2020 5:31pm

Steffen Huber (91) 1958 posts

“Edit 480kb/s not 1GB/s armx6 uses the USB clock for Ethernet.”

480 MBit/s hopefully :-)
