Polling/Vector Speed
Malcolm Hussain-Gambles (1596) 811 posts |
It would be lovely if RISC OS could poll (e.g. using vectors etc. in modules, or wimp_poll_idle) at a finer rate than 100Hz. I’m sure there are other uses, but having faster networking for NAS devices etc. would be great, especially on the “next-generation” machines. At the moment, sitting on the NULL event is the only way I can see to get vaguely reasonable speeds. It doesn’t seem to have a seriously detrimental effect on desktop performance, but it just seems wrong. So I’m hoping there are other reasons people would like this – would it be a candidate for a bounty? Whether it would need wimp_poll_not_so_idle and os_finevector, rather than changing the existing calls, I haven’t got a clue! |
Rick Murray (539) 13840 posts |
I think the “typical” way around this is to wrap the data transfer stuff in a module which can work in the background (maybe off an interrupt or event), which buffers the data and then sets a pollword for the Wimp task to notice and respond to. Certainly, one cannot expect RISC OS to poll within any specific time frame – ChangeFSI and NetSurf are both known for hourglassing, and all bets are off if something pops up a standard error box. ;-) |
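(To illustrate the pattern Rick describes – background code buffering incoming data and raising a pollword for the foreground task – here is a minimal, portable C sketch. All names are hypothetical; on RISC OS the producer would be module code running off an interrupt or event, and the Wimp would deliver a Pollword_NonZero event when the word becomes non-zero.)

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names: a tiny ring buffer filled by a "background"
 * producer (on RISC OS, module code off an interrupt or event),
 * plus a pollword the foreground Wimp task checks. */
#define BUF_SIZE 64

static uint8_t ring[BUF_SIZE];
static volatile unsigned head, tail;   /* head = write index, tail = read index */
static volatile int pollword;          /* non-zero => data waiting */

/* Producer: called from the background when bytes arrive. */
static void bg_deliver(const uint8_t *data, unsigned len)
{
    for (unsigned i = 0; i < len; i++) {
        unsigned next = (head + 1) % BUF_SIZE;
        if (next == tail) break;       /* buffer full: drop (or flag overflow) */
        ring[head] = data[i];
        head = next;
    }
    pollword = 1;                      /* Wimp task would get Pollword_NonZero */
}

/* Consumer: the foreground task drains the buffer on a pollword event. */
static unsigned fg_read(uint8_t *out, unsigned max)
{
    unsigned n = 0;
    while (tail != head && n < max) {
        out[n++] = ring[tail];
        tail = (tail + 1) % BUF_SIZE;
    }
    if (tail == head) pollword = 0;    /* drained: clear the pollword */
    return n;
}
```

The point of the split is that the foreground never blocks and never busy-polls the socket: it only drains the buffer when the pollword says there is something to fetch.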
Rick Murray (539) 13840 posts |
To continue – one of the problems of RISC OS is that there is no reliable mechanism for calling a piece of code after a set amount of time has expired. Sure, we have CallAfter and CallEvery to call code at >=2cs granularity, and we could hook TickerV for centisecond timing, but the problem is that we may well be called in the middle of some non-reentrant code, which seriously limits what OS facilities are available – all the more important these days, as USB devices are implemented as filesystem devices and the filesystem just can’t be used in that way (it’ll throw “FileCore in use” errors). So we have our nice CallAfter, CallEvery, or TickerV, and we need to use them to schedule a Callback. The difference here? A Callback gets back to your code when RISC OS is “idle” and it is safe to use most of the OS facilities. The caveat? It will get back to you in time n, where n can be anything from nanoseconds to entire seconds – there’s no way of knowing. Couple this with borked events (I’ve never ever had any joy with being notified of MIDI data via USB; I still poll the device, as that’s what reliably works, even though it sucks as a method) and it does make some aspects of programming a whole big bag of joy. ;-) |
Malcolm Hussain-Gambles (1596) 811 posts |
From my perspective, I can predict fairly reliably when there will be data on the sockets to read. newsuk does this to some extent by modifying the poll_idle time depending on the amount of data read off the socket. However, for faster reads the only choice is to capture all polls; on a RISC PC this may not be many, but on a Pandaboard around 1 in 10 polls are pointless, and on the upcoming machines this is likely to increase. I don’t really care if some other program causes jams or error boxes – that will freeze the download, which is irritating, but there is nothing I can do about it. My understanding of the pollword was that the wimp task has to be polling for it to spot it? If that is the case it’s kind of pointless, as it would probably be just as efficient to do a select (for client-side use; possibly not if you were doing server-side stuff, waiting for incoming connections). |
Colin (478) 2433 posts |
You only need a wimp poll event (pollword non-zero or null) to determine that data is available in the first place. After that, you keep reading the socket until you have received all of your data or get an EWOULDBLOCK error. If you feel that you are going to hog the processor for too long, add a wimp poll every so many bytes or centiseconds. You don’t need a wimp poll for every data read/write. |
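(Colin’s read-until-EWOULDBLOCK loop, sketched in portable C. On RISC OS the socket calls come from the Internet module via TCPIPLibs rather than POSIX, but the control flow is the same; drain_socket is a hypothetical helper name.)

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Drain a non-blocking socket until EWOULDBLOCK, as described above:
 * one "data available" notification, then read in a loop.
 * Returns total bytes read, or -1 on a real error. */
static long drain_socket(int fd, char *buf, size_t bufsize)
{
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf + total, bufsize - total);
        if (n > 0) {
            total += n;
            if ((size_t)total == bufsize) break;   /* caller's buffer full */
        } else if (n == 0) {
            break;                                  /* peer closed */
        } else if (errno == EWOULDBLOCK || errno == EAGAIN) {
            break;                                  /* no more data for now */
        } else {
            return -1;                              /* genuine error */
        }
    }
    return total;
}
```

Between calls to a helper like this, the application is free to wimp_poll as often (or as rarely) as it likes – the socket reads themselves never need a poll each.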
Malcolm Hussain-Gambles (1596) 811 posts |
That’s not really true: you can only read a maximum of 17k at a time, so you have to read, then return control to the Wimp – otherwise you lock up the machine (e.g. downloading a 2GB file would lock it up for quite a while). |
Colin (478) 2433 posts |
If what you say were true then networking wouldn’t work in a single-tasking program, which it does. The fastest way to transfer data is to continually read data in a loop; wimp polling is just an extra function in that loop which slows your loop down. |
Rick Murray (539) 13840 posts |
It is better for your transfers to poll (not PollIdle) at a rate that depends upon how fast data is coming. The problem is when you do not have data being transferred – do you keep PollIdling, or do you wait for an external notification? Both are workable, though which you use depends upon how the code is written – the pollword method presupposes some sort of back end. |
Malcolm Hussain-Gambles (1596) 811 posts |
Sorry, I’m not explaining myself well – of course it will work single-tasking; that’s exactly what my current d4r program does. @Rick: is it possible to poll at a certain rate?!!!! I’d never even thought that was possible! |
Colin (478) 2433 posts |
Which is why I said that if you are hogging the processor you need to add the odd wimp poll during the transfer. Your method seems complicated to me. If you are waiting for data, wait for a null event with wimp_pollidle. Once you are receiving data, keep reading it until you have it all. If you feel this hogs the processor too much, throw in a wimp_poll at a frequency of your choice. You can’t have both high speed and multitasking – it’s a trade-off of one against the other. |
Malcolm Hussain-Gambles (1596) 811 posts |
I’ve managed to get close to high speed while multitasking well. I agree my method is complicated, but it seems to sustain a good balance. The other issue is that writing to SD cards is slow, and I also suspect it shares the same channel as the Ethernet – hence the use of a secondary (large) buffer using flex. The complicated nature of this is why my end goal is to have it all in a module: that way the whole complexity is hidden. |
Rick Murray (539) 13840 posts |
Depends what side you are looking at it from. You can pass a number of centiseconds to PollIdle; but you can also keep track of the time taken in your program – for example you can set up a CallAfter to notify you in, say, 5cs, at which time you set a flag for the main thread to notice and when it does so, you poll (normal poll, to return on a Null event for the next bit).
Shouldn’t it depend more upon how fast the machine can respond, not how fast the transfer is running? Otherwise you run the risk of replicating that annoying NetSurf behaviour where it whinges about how slow the disc cache is, on a system that more than likely would benefit from a cache on anything faster than a floppy disc… Yes, SD writes may be slow. But, then, so is my internet. It runs at 2mbit flat out, and I don’t think the Pi (Vonets adaptor, very weak signal) has ever managed anything close to 1.5mbit (I’m guessing this is something in the order of 180KiB/sec?).
I think RISC OS allocates 16KiB for each socket, though I cannot check: attempting to do so crashes the machine when tried in a TaskWindow (never-ending errors), and returns zero (!!?) when tried in ShellCLI, after which the machine crashes.
Useful. For anybody interested: returning from ShellCLI reports that “Application may have gone wrong” (with no indication of which application is at fault), and if you press Quit a few times (it is a repeating error), the report is simply “Error”. If you instead press Describe, the message is “SWI &205110 not known”. Dismiss this enough times and it goes away, and then “the desktop collapses” – everything crashes in turn until you are left with a blank black screen and no way to get out of it. Here’s a video I made earlier: http://www.youtube.com/watch?v=ALV3uLjI0Do You can, if you are brave, adjust the size of the socket buffer using setsockopt; the “limit” mentioned in the PRM is applied in the sbreserve() function, and the value “sb_max” is defined in TCPIPLibs:sys.h.socketvar.
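(For anyone wanting to try the setsockopt route: a hedged POSIX C sketch, with set_rcvbuf as a made-up helper name. On RISC OS the calls come from TCPIPLibs, and sbreserve() will refuse anything above sb_max, so reading the value back with getsockopt is the only way to know what you actually got.)

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask for a larger receive buffer and report what was actually granted.
 * The stack clamps the request to its own maximum (sb_max on the
 * BSD-derived RISC OS Internet module), so always read the value back
 * rather than trusting that the setsockopt call took full effect. */
static int set_rcvbuf(int fd, int wanted)
{
    int actual = 0;
    socklen_t len = sizeof actual;

    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &wanted, sizeof wanted);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len);
    return actual;          /* what the kernel really granted */
}
```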
Oh, yes, I’m having lots of fun seeing what doesn’t sanitise inputs so the end result becomes a horrible gooey mess. ;-) |
Colin (478) 2433 posts |
How is that any different from just reading the monotonic time every null event, or indeed setting PollIdle so that it polls every 5cs? It seems to me that you are just creating a 5cs clock. Setting a flag in the background won’t improve the latency between the 5cs being up and you being notified of it in a null event. The only reason to use a module – which you would need for CallAfter – is to be notified of events via pollword non-zero, which has the effect of giving your event top priority, i.e. pollword events happen before desktop events, null events after. |
Rick Murray (539) 13840 posts |
Simple. This is in code like the following:

  while there is data
    read some data from the socket
    if we have taken too long
      wimp_poll
  endwhile

So it isn’t done in the regular polling loop; it is done when you are active, in order to determine when to force a poll so that the system will keep running while downloading something large. Being based upon time, it will be more consistent than polling every n KiB (which would be highly dependent upon connection speed).
That’s exactly what I’m doing.
You do know, I trust, that if you are not writing an application in BASIC and you remove the event prior to polling (if it is pending), you can schedule a CallAfter into your own code in application space? No module needed. And yes, I’ll grant you, it would be pretty dumb to wait for a poll event to know when you should poll – that doesn’t make any sense. ;-) So your code could check a “timerexpired” variable in its activity loop and, if it is found to be non-zero, perform a poll. And your CallAfter code? A tiny bit of assembler that sets “timerexpired” to non-zero and also sets “timerpending” to zero (so you know you don’t need to clear it before polling). While this is more complicated than simply reading the ticker yourself, it does remove the overhead of calling OS_ReadMonotonicTime every time you go through the motions. Whether or not you think this is a concern is up to you. ;-) Calling the SWI repeatedly would have the same effect; that is why I suggested both options – check your own time use, or let the OS prod you. Either will work. Thus, your activity could be something like:

  set_timer
  while (timerexpired == 0)
    do_stuff
  endwhile

or:

  timenow = ReadMonotonicTime
  while (ReadMonotonicTime < (timenow + 5))
    do_stuff
  endwhile |
Colin (478) 2433 posts |
It doesn’t matter where you do a wimp poll; you can write the code to use the main loop if you are that way inclined – I used a multi-threading system in FTPc all driven by the main loop. I agree that time is the limiting factor, not transfer rate.
As you show, using OS_ReadMonotonicTime is the same as using CallAfter, so why use the latter when we already have a monotonic clock to time things? And no, I don’t think calling OS_ReadMonotonicTime is a concern, given the number of SWI calls used in interrupt contexts. Most OS_ReadMonotonicTime calls save a wimp poll – the main user of CPU time – and the time taken by the call will probably be covered by the background process filling the buffer. Just my 2p; I just don’t see the need to overthink this. On a minor point, I think you should use while (ReadMonotonicTime – start_time < 5), as it copes with time wraparound. It may not matter in this case, but one day it might. |
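(Colin’s wraparound point made concrete in C: compute the elapsed time as an unsigned 32-bit subtraction, which remains correct when the centisecond counter wraps past zero between the two readings. timed_out is an invented helper.)

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed-time test that survives clock wraparound: in unsigned
 * 32-bit arithmetic, (now - start) is the true elapsed time even
 * when the centisecond counter (OS_ReadMonotonicTime on RISC OS)
 * has wrapped around between the start and now readings. */
static int timed_out(uint32_t now, uint32_t start, uint32_t limit_cs)
{
    return (uint32_t)(now - start) >= limit_cs;
}
```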
Malcolm Hussain-Gambles (1596) 811 posts |
@Rick Why not just come out and say it? It’s not being brave, it’s just a sign of insanity. So that sounds just up my street ;-) The main reason for being a module is to allow this complex and nasty stuff to be totally abstracted. |
Rick Murray (539) 13840 posts |
Sometimes the harder ways can be more fun to play with.
Good luck with that. If reading the buffer size trashes the machine… ;-)
Not really. Remembering the old Google Reader and to a degree how Google’s News app behaves – I would say it connects to retrieve the articles, then defers the images until they would actually be visible. There is no point fetching an awful lot of images for things the user may not even look at. For example with Google News I have Headlines, UK, Sci/Tech, Health, Europe, France and Japan as my categories. However the main content that I look at is the composite view of the top few items in each category; and the times that I look in categories, I rarely read the Health one. It is only there for big stories like “Coffee proven to make men impotent” followed the next week by “Coffee proven to have no effect on sterility”, you know how it is…
;-) I felt like that the last time I went near the font manager.
Background downloads FTW! |
Malcolm Hussain-Gambles (1596) 811 posts |
I agree the harder ways are far more interesting. Trashing my machine is a normal day for me. One of the other things you can do is check the HTTP headers to see whether things have been updated, for both the RSS and the images. Background downloads are definitely the goal – no worries about SSL/TLS, no worries about anything. I want this data, now please go get it… I hope I get there! |
Rick Murray (539) 13840 posts |
That is quite often the case for me – though trashing things is… rarely by intention. More a case of the infamous Tea Buffer Exhausted error.
Well, there is logic to it. Unless your code is completely self contained and won’t have any external input or user facing API, you cannot guarantee that the input is going to make sense. Sanitising input is something RISC OS perhaps ought to do more of. ;-)
Oh, I don’t know. All the people who were saying how much better Opera was than Firefox were silenced in 2010 when it was discovered that a problem in header parsing could lead to the browser being hijacked (remote execution). In order to downplay this, Opera said that it would rarely be a serious security issue as the browser would crash or be terminated. Unfortunately they were relying upon DEP, which the lower-end systems either didn’t support, or only supported for specific tasks. In other words, the only safe option was to cease using Opera until the problem was resolved. Buffer overflows and the like are almost always the cause of zero-days. In our favour, to be able to attack your program (or NetSurf, etc.) through a buffer overflow – should one be discovered – would require extensive knowledge of how RISC OS uses the stack, how to insert code and jump to it, and so on. There are many more lucrative targets and easier ways in (like an app that just requests details of your contacts, etc.). With my server (the one very briefly shown in the video linked above), I receive countless uninvited visitors. Thankfully most are silent, because the login process doesn’t look or feel like a Unix login, so the script obviously stalls. Unfortunately the script is stupid, and frequently the same IP will repeatedly connect until the “firebomb” trap is triggered, at which point the IP address will receive a rejection notice (and they carry on trying to connect for a while). Thankfully nobody has tried to do anything malicious yet, but I cannot assume that they won’t. I might dig up my VisualBasic install to see what happens if I dump large amounts of data at the server in one glob, and other nasty stuff like that… I am not going to try things like flood pings with malformed packets – that’s out of my hands. Let’s hope the Internet module isn’t (too) buggy.
Not to mention the variety of different setups that you may encounter – a 30-something MHz ARM6 is going to have a little less oomph behind it than a Pandaboard. Likewise, what might work nicely on a 20 megabit connection may be horrible at 33 kilobit (modem speed).
Definitely.
Theoretically? Maybe on the sort of hardware Google and Facebook use; I think RISC OS would flake long before then (isn’t there a limit something like 56 handles at one time? or was that the DCI2 stack?).
Mmm, NetSurf’s RISC OS caveats section says that the number of sockets that can be open may be as low as 64 [ http://wiki.netsurf-browser.org/Caveat_RISC_OS ]. I can’t find any specific mention of socket limits in any official documentation… suffice to say that 3000 at once… it ain’t happ’nin’. |
Malcolm Hussain-Gambles (1596) 811 posts |
The other nice thing is that, given the workload and my internet speed, it should probably “feel” like 3000 concurrent downloads. As for different CPU speeds, I do test my programs on a RISC PC, Pi/Pi2 and Pandaboard. |
Alan Robertson (52) 420 posts |
This. +1 |
Malcolm Hussain-Gambles (1596) 811 posts |
The next on my to do list is pipelining – this should also reduce the number of connections vastly, whilst still allowing high concurrency rates. But slightly back on thread, it would be lovely to have a finer polling speed! |
Colin (478) 2433 posts |
Do you mean a finer polling speed or a finer clock? You can use the HAL timers for a finer clock. I can’t see any use for a finer wimp_pollidle. As I see it, if you are reading data you always want the wimp polls you make while reading to return null events as soon as possible, so you need wimp_poll. If data is not available, notification of its arrival can be achieved via the Internet event and pollword non-zero, which informs your program of data as soon as possible. Either way your program gets CPU time as soon as possible. If you are concerned about receiving a null event when no data arrives, what is the problem? File transfer is a legitimate use of null events – if you can’t use them for that, when can you use them? We may as well deprecate their use. |
Malcolm Hussain-Gambles (1596) 811 posts |
At the moment I have to use wimp_poll with null events and discard 90% of them. |
Colin (478) 2433 posts |
If you are getting null events when you don’t have any data, then you don’t need null events to continue your thread – you can use pollword non-zero and the Internet event to tell you when data is available. Personally, I don’t care if 90% of null events do nothing; the computer has to do nothing somewhere. |