Cooperative Multitasking
Lothar (3292) 134 posts |
While everyone wants RISC OS to go for Preemptive Multitasking, under Windows / Linux the growing trouble with handling resources makes programmers want to go for Cooperative Multitasking, by means of callbacks, background workers, cooperative threads, and co-routines: https://luminousmen.com/post/asynchronous-programming-cooperative-multitasking |
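For readers who have not met co-routines before, here is a minimal sketch of the idea in Python. It is only an illustration: the names `counter` and `run_round_robin` are made up for this example and have nothing to do with RISC OS or the linked article. Each task does a small step and then hands control back voluntarily with `yield`.

```python
from collections import deque

def counter(name, n, log):
    """A cooperative task: does one small step, then yields control."""
    for i in range(n):
        log.append(f"{name}:{i}")
        yield  # voluntary hand-back to the scheduler

def run_round_robin(tasks):
    """Run generator-based tasks in turn until all have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished, drop it
        queue.append(task)      # otherwise requeue it at the back

log = []
run_round_robin([counter("A", 2, log), counter("B", 3, log)])
print(log)
```

Nothing here can interrupt a task that never yields, which is exactly the trade-off the rest of this thread is about.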
Paolo Fabio Zaino (28) 1882 posts |
Hey Lothar :)
Who’s “all”? I am good with RISC OS being Cooperative; the article you’ve shared seems to have reached a conclusion similar to the one I reached a few years ago, for IoT and embedded applications. In the past, when RISC OS had a chance to become a popular desktop OS (we are talking 90s/end of 90s), I was one of the people who proposed the transition to pre-emptive multi-tasking, but this was for a number of reasons that were true at that time. Back at the end of the 90s:
However there was also a set of issues to move to pre-emptive:
Fast-forward to today… Right now there are still people who wish to improve RISC OS as a modern Desktop (and there is nothing wrong with that, except now it’s going to be a loooot of work to try to compete with OSes that have evolved for the last 30 years). Cooperative multi-tasking still has strengths in certain markets like the aforementioned embedded software and IoT, but that is not a WIMP-orientated market. Cooperative multi-tasking can also be very efficient for Desktop, don’t get me wrong, but that benefit comes with a price: the user has to be aware of what he/she is doing, and the software has to be aware of what it’s doing. I think a potentially good future for a RISC OS Desktop could be in between: user-space Tasks scheduled in Cooperative fashion, while the OS Kernel gets rewritten to be fully re-entrant and to support a pre-emptive kernel threading model, to allow a more responsive multi-core access as well as more responsive I/O operations. It could be possible to achieve this using just a cooperative approach too, but that could result in even more complex kernel code… Anyway thanks for sharing that article! :) |
Lothar (3292) 134 posts |
> to allow a more responsive multi-core access
In my understanding, the current SMP module rather implements HMP, but even with this it should be possible: instead of having a WIMP task doing calculations in its main loop, through null reasons, that WIMP task could repeatedly start asynchronous threads on the remaining cores for doing calculations, and just poll these threads’ completion flags. Nothing needs to be re-entrant or thread-safe. And with 4 cores, it should be 300% faster … But I have to admit, I do not yet understand the SMP module sample code well enough to try this with my WIMP fractal demo.
> A generic user may run “only god knows” what on his computer
I heard a rumor that on recent MacOS, to start an APP it needs to be signed, and must ask for online starting permission from Apple … |
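The "start threads, then poll their completion flags" pattern described above can be sketched roughly as follows. This is plain Python threading, not the SMP module's API, and Python threads will not give a real multi-core speed-up; the helper names (`start_worker`, `main_loop`) and the sum-of-squares "calculation" are placeholders invented for the illustration.

```python
import threading
import time

def start_worker(rows, results):
    """Start a background calculation and return its completion flag."""
    done = threading.Event()

    def work():
        for row in rows:
            # Stand-in for the real calculation (e.g. one fractal row).
            results[row] = sum(i * i for i in range(row + 1))
        done.set()  # the completion flag the main loop polls

    threading.Thread(target=work, daemon=True).start()
    return done

def main_loop(n_workers=3, n_rows=12):
    """Hand out rows to workers, then poll the flags until all are set."""
    results = {}
    flags = [start_worker(range(w, n_rows, n_workers), results)
             for w in range(n_workers)]
    polls = 0
    while not all(flag.is_set() for flag in flags):
        polls += 1
        time.sleep(0.001)  # stand-in for going back to the desktop poll loop
    return results, polls

results, polls = main_loop()
print(len(results), "rows done after", polls, "polls")
```

Each worker writes only its own rows, so in this sketch the workers never touch the same data, which is the property the post relies on when it says nothing needs to be re-entrant.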
David Feugey (2125) 2709 posts |
Or even AMP. AMP is not a bad approach. Neither is CMT, but perhaps it could be enhanced with time limits on tasks. A super Wimp2 :) |
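One possible reading of "time limits on tasks" is a cooperative scheduler that cannot interrupt anything, but does notice when a task held on too long before yielding. A rough sketch, assuming generator-based tasks; `run_with_time_limits`, `FakeClock` and `task` are invented for this example and are not how Wimp2 works. The fake clock is only there to make the demo repeatable.

```python
import time
from collections import deque

def run_with_time_limits(tasks, budget_s, clock=time.perf_counter):
    """Round-robin over generator tasks, counting slices that overrun.

    A cooperative scheduler cannot stop a task mid-slice; it can only
    see, once the task yields, that the slice exceeded budget_s.
    """
    queue = deque(tasks.items())
    overruns = {}
    while queue:
        name, task = queue.popleft()
        start = clock()
        try:
            next(task)
            finished = False
        except StopIteration:
            finished = True
        if clock() - start > budget_s:
            overruns[name] = overruns.get(name, 0) + 1
        if not finished:
            queue.append((name, task))
    return overruns

class FakeClock:
    """Deterministic clock for the demo; tasks advance it themselves."""
    def __init__(self):
        self.now = 0.0
    def __call__(self):
        return self.now
    def advance(self, seconds):
        self.now += seconds

def task(clock, slice_costs):
    for cost in slice_costs:
        clock.advance(cost)  # pretend this much work was done
        yield

clock = FakeClock()
overruns = run_with_time_limits(
    {"polite": task(clock, [0.001] * 3),
     "greedy": task(clock, [0.001, 0.5, 0.001])},
    budget_s=0.01, clock=clock)
print(overruns)
```

What to do with an overrun (warn the user, lower the task's priority, or force a switch as Wimp2 did) is the hard design question; the sketch only covers the detection part.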
Paolo Fabio Zaino (28) 1882 posts |
@ Lothar
> to allow a more responsive multi-core access
That assumes the only thing the extra cores are running is your calculation routines, while the article you’ve shared implies a scheduler for the extra cores as well. So please let me give some considerations here just to make sure we are all on the same page:
The above are just some considerations/assumptions; there is more, but I am trying not to write an essay lol :D It is true, however, that if the Kernel could support thread scheduling on multiple cores, allowing them also to compete, it would be a big step forward for RISC OS, if nothing else just to interrupt our WIMP Task execution more often so that the WIMP would feel more “PMT”-like (this assumption comes from the fact that a task could still be running part of itself on a separate core, for the code that doesn’t need to access the WIMP API) |
Paolo Fabio Zaino (28) 1882 posts |
@ David
Yup that’s the point of the hybrid approach. However it’s easier said than done lol, let me give some considerations for the case above:
As an example:
|
Lothar (3292) 134 posts |
In my understanding, the mentioned problems need not happen with an AMP approach: RISC OS and the WIMP will still see only the main core, so nothing changes. But a WIMP task, like ChangeFSI, could claim the additional cores, using the existing SMP_CreateThread. When this is done, nothing changes for the other running WIMP programs. But ChangeFSI will get much faster. Now let’s say Iris Browser is started, and wants to claim an additional core to make a YouTube video faster. If all additional cores are already claimed, Iris Browser would run just as expected. If an additional core can be claimed, Iris Browser will get faster. If ChangeFSI makes a mistake, its claimed additional cores will get locked. But ChangeFSI should recognize that the additional cores’ completion flags do not arrive, and call SMP_DestroyThread for them. If this does not work, RISC OS will crash. But currently, RISC OS crashes anyway if an error cannot be handled. So nothing changes :-) There is no need to show messages about tasks taking too much time, even though there are such annoying messages in Windows. Instead, the Task Manager could show claims on the additional cores. And Task Manager Quit should call SMP_DestroyThread. There may be one problem with the current SMP module: its threads are not limited to one thread per additional core. Therefore threads must regularly call SMP_Yield. In my view, only one thread per additional core should be allowed, because with SMP_Yield two threads could lock each other, and therefore lock the additional core: https://gitlab.riscosopen.org/jlee/SMP/-/blob/master/docs/SMP |
Paolo Fabio Zaino (28) 1882 posts |
@ Lothar
Interesting thoughts, thanks for sharing. About your concerns with the current SMP module: yup, the README reports that. A bit more detail below. The actual thread_yeld function calls spinrw_write_lock to set up a write spinlock; that function is from SyncLib and will wait forever if it can’t set the write spinlock (it is designed to do so). That function also disables IRQs and has to be executed in privileged mode, hence there is also a context switch involved in calling thread_yeld (if the thread is in user-space). However, spinrw_write_lock should sleep the thread that is waiting to get the write lock, hence it’s possible that, with IRQs still disabled and other threads trying to join, there might be situations where that core could be made inaccessible. If this happens and the thread is not asynchronous with the main thread on the WIMP core, then the WIMP Task may get stuck and freeze the Desktop. But if the WIMP Task is asynchronous with such a thread, then the WIMP Task may just wait for a signal back and so keep working regularly without ever completing its job (which would be more like nobody knows what’s happening). That is a different outcome from the crash scenario, and not what we want in either case. Using a single thread per core is surely safer, but will leave users confused about performance, because there will be big bumps: when a core is available, performance could increase a lot, and when it’s not available, performance may not increase at all and the running software may seem much slower for no apparent reason… So as you can see, here comes the famous “price to pay” for these approaches. The user is required to know what he/she is doing and so does the software. The big rise of PMT simply solved all these issues from the user’s perspective and the generic developer’s perspective, hence convenience at the price of optimisation and latency, but for years CPUs and computers got faster and faster and so this was not a concern. Just my 0.5c |
Lothar (3292) 134 posts |
> but will make a user confused on the performance, because there will be big bumps
It will then just feel like Windows, when “Windows Modules Installer Worker” or “Defender” pop up and take nearly 100% load on all cores :-) |
Paolo Fabio Zaino (28) 1882 posts |
loooool :D |
David J. Ruck (33) 1636 posts |
This is a good point. You still are not going to be able to port large applications such as web browsers and have them work at a reasonable speed even on high end hardware, unless they can do proper PMT and SMT. Even for the trivial stuff that I write, I’m not going to re-structure code which works everywhere else just for RISC OS if it uses some neither-fish-nor-fowl scheme of CMT on the primary core and yielding threads on other cores. |
Steve Pampling (1551) 8172 posts |
Everybody is lazy, programmers habitually so, that’s why they create libraries instead of recoding and testing each time. I think what you meant was “sloppy in what they did in the code and equally sloppy in their testing.” |
Paolo Fabio Zaino (28) 1882 posts |
@ Druck
Absolutely correct, which is why, for the Cloverleaf project and RISC OS Direct, which seem to want to push RISC OS Desktop to the masses, I am not so sure it’s being done the right way. I repeat myself (sorry for being pedantic): right now RISC OS has potential for embedded applications and IoT with the smallest number of changes. For an effective modern Desktop experience, however, it probably needs some re-thinking, where “probably” means I am trying to be well mannered and not aggressive. With the complexity that CMT + AMP will introduce for both the average Joe user and the general developer, I think RISC OS may not have huge success as a modern Desktop; I hope I am wrong. On top of that we have the security issues, so really not sure what to say guys, however I hope for the best as always. |
Charlotte Benton (8631) 168 posts |
Ironically, technological “progress” may be pushing things in a favourable direction, thanks to the recent trend of doing f*****g everything with f*****g web applications. If the Iris project delivers a browser capable of both running all the usual “productivity” stuff, and facilitating all modern ways of picking fights with complete strangers, then RISC OS as a whole would be far more usable. To this end, a good multitasking model (albeit as a sticking-plaster rather than a long term solution) might be “Iris, when running, gets to do whatever the hell it likes”. |
Stuart Swales (1481) 351 posts |
Soon we will have Irisc OS. |
Rick Murray (539) 13850 posts |
What could possibly go wrong?
If forcing preemption, every X centiseconds since the last poll unless the program says otherwise. If you’re wondering about the “unless the program says otherwise”, think how printing works…
Yeah, I’ve come across that. Not often, but it can happen on XP. Trying anything with the program gets the “bong!” error sound and redraws go to hell with bits of stuff splattered all over.
I disagree. How and why the multitasking works is not a user problem (unless said user is a programmer). I run apps on my phone. What goes on inside? NMFP, I just want the app to work.
Surely this is just an implementation problem? Threads, when created, ought to be shared out as necessary with no requirement of the program to know what core it is running on. Why? Think what ought to logically happen on a single core machine. |
Paolo Fabio Zaino (28) 1882 posts |
I agree with your point, but I also have to mention that thanks to the continuous changing I have a job! Otherwise we would still be using Dec vt100 terminals ;) |
Paolo Fabio Zaino (28) 1882 posts |
How should that check work? Rick, I wish it was that simple; here is a real life example for you then:
(hint: they are both working fine, just waiting for the OS to give them their mutex back…) In another thread you’ve mentioned Tanenbaum; the example above is THE example of how PMT can get complicated. The solution in this case is to use graph theory and map all the resources on a system and blah blah blah, tons of literature on the matter. Now this means that on RISC OS not only do we need to add the PMT, but also the above, and believe me when I say that getting that stuff right is a job on its own. You guys have mentioned zombie processes… let’s think about it for a moment, grab your mug of tea, have a sip, think about it… and yes coffee for me!!!! :D |
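The "graph theory" part can be illustrated with a tiny wait-for graph. The example from the post is not reproduced in the thread as it stands, so this sketch assumes the classic case: two processes, each holding a mutex the other one wants. The function name `find_deadlock` and the process names are invented, and the sketch assumes each process waits on at most one other process.

```python
def find_deadlock(waits_for):
    """Return the processes forming a cycle in the wait-for graph, or None.

    waits_for maps each blocked process to the process that holds the
    resource it is waiting for (at most one per process in this sketch).
    """
    for start in waits_for:
        seen = []
        node = start
        # Follow the chain of "is waiting for" until it ends or repeats.
        while node in waits_for and node not in seen:
            seen.append(node)
            node = waits_for[node]
        if node in seen:
            return seen[seen.index(node):]  # the cycle itself
    return None

# A waits for a mutex held by B, while B waits for a mutex held by A.
print(find_deadlock({"A": "B", "B": "A"}))
# A waits for B, B waits for C, C is running: slow perhaps, but no deadlock.
print(find_deadlock({"A": "B", "B": "C"}))
```

Detecting the cycle is the easy part. Deciding which process to kill or roll back, and keeping the graph up to date for every resource in the system, is where the "job on its own" comes in.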
Paolo Fabio Zaino (28) 1882 posts |
@ Stuart
loooool true! :D |
Paolo Fabio Zaino (28) 1882 posts |
@ Rick
> The user is required to know what he/she is doing
In a CMT that means no help/support from the OS itself on a number of things:
Are the above enough or do you need more? (not trying to change your opinion, just providing more details to explain mine) Just my 0.5c |
David J. Ruck (33) 1636 posts |
What you are describing isn’t porting, it’s re-writing from scratch for an OS which works in a completely different way to anything else. That was tried and failed when browsers were 1/1000th of the complexity they are now. The only way to port something is to produce compatible versions of the support libraries it uses, and to have an OS which doesn’t lack fundamental features or need a completely different way of working. Specifically we are lacking SMT, i.e. threads which can use all OS APIs, and all the side effects of PMT such as blocking IO, which RISC OS can’t do having only CMT. |
Steve Pampling (1551) 8172 posts |
It’s almost paid for this place, probably a “snap” at Chez Ruck and a number of others round ‘here’ |
David Feugey (2125) 2709 posts |
True and not so true. For Firefox, for example, there is one codebase that could be easier to adapt than the others: the Android version. |
Charlotte Benton (8631) 168 posts |
Quite a lot of things, I readily concede. |
Rick Murray (539) 13850 posts |
Well, there’s your first problem. RISC OS is basically a one-thing-at-a-time OS. How we multitask is a matter of smoke and mirrors and not making assumptions (like expecting font, colour, or OS_GBPB position [1] to be the same between polls). Hell, one cannot reliably assume file handles are constant across polling (especially if something naughty did a *Close and a bunch of handles were reopened with one of them using the same handle you’re using). How will it work in practice? Well, there’s the question. That process will probably stop pending something somewhere servicing what it is asking for.
Yeah, a resource deadlock. A possible way around that is not to have processes that require resources from other processes that require resources from it. After all, this isn’t exactly a Unix system. ;-)
Yup. Read that. Didn’t he call it the Diner’s Problem or something?
You don’t really “get” zombie processes on RISC OS. The application will either crash (and the OS usually takes care of error message and recovery, even these days trying to gloss over the backtrace gibberish) or it will appear to freeze. At which point the user can press Alt-Break to try to kill the app. [ personally, I don’t bother with the process manager, I fire up ProcessExplorer and do it directly ] So, you were saying what about CMT requiring more user knowledge? It’s not an issue of what type of multitasking is in use, it’s an issue of how the system is designed. For example, RISC OS can (usually) survive power cuts or pulling the plug, so long as this doesn’t happen at the exact instant of a disc write. XP, on the other hand, needs to be shut down gracefully. I made a fair bit of tea-and-biscuits money back 15-odd years ago dealing with XP bluescreening to UNMOUNTABLE_BOOT_VOLUME because some twit yanked the plug rather than waiting a minute or two to shut down properly (of course, trying to get proper clueless users to understand that “Shutdown” is behind the “Start” button is a battle in itself). When using a computer, the user needs to have some idea of how to do certain tasks. Things can (and will) go wrong, and each problem at user level (in other words not BSOD stuff) will have a solution that should work. How to get rid of unwanted/errant tasks. You know, swipe-up on iOS or swipe-sideways on Android (though I think they’ve changed it again) once you’ve brought up the list of active applications. And yeah, that’s something else useful to know. Whether it is CMT or PMT is really not relevant in the discussion.
My XP is really quite stable, and only dies horribly if I use USB serial and USB networking at the same time. One or other of them is a lame-ass driver that is broken. On the other hand, I got a LiveCD version of Ubuntu to kernel panic simply by trying it out and running a few apps. (…LOL)
Which is why the OS helpfully provides an hourglass. It’s smart enough that it won’t turn on for a third of a second, so you can drop it into your program as necessary, and it will only appear when the activity is slow enough to be noticed. But most of all, it’s probably worth looking at the algorithm to work out if there are places where one can drop in a few calls to Wimp_Poll, set to return immediately on a NULL event (see, it does have a use). This means the app can do its stuff and the desktop keeps on chugging. That’s how Manga parses its big list ’o stuff.
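The shape of "drop in a few calls to Wimp_Poll" is roughly this, sketched in Python rather than anything RISC OS specific. The `poll` callback is a stand-in for a Wimp_Poll call that returns at once on a null event, and `parse_in_chunks` with its trivial "parsing" is a made-up placeholder, not how Manga actually does it.

```python
def parse_in_chunks(items, chunk_size, poll):
    """Process items, calling poll() between chunks so the desktop keeps running."""
    total = 0
    for start in range(0, len(items), chunk_size):
        for item in items[start:start + chunk_size]:
            total += len(item)  # stand-in for the real parsing work
        poll()                  # stand-in for Wimp_Poll returning at once on a null event
    return total

polls = []
total = parse_in_chunks(["ab", "cde", "f", "gh", "ijk"], 2, lambda: polls.append(1))
print(total, "characters parsed,", len(polls), "polls")
```

The chunk size is the tuning knob: too small and the job crawls because of polling overhead, too big and the desktop stutters between polls.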
There’s such a thing as buffering and flow control. I never had a problem sending and receiving emails while intensive operations were ongoing, like debatching news or processing fido packets. The machine (an A5000, 25MHz ARM3) would stutter as the job was done, but by and large everything kept on going. An email client probably shouldn’t make silly assumptions regarding the ‘speed’ of the data link, the user might be using a 14k4 modem, for example. As for the internal time out of a socket, I think it’s a good long time. Not long enough to support carrier pigeon packets, but long enough to deal with hiccups and glitches. I’ll let Authentic Steve chime in here, he probably knows this.
UDP doesn’t give the guarantees of TCP. Anything using it ought to be either managing itself, or able to cope with missing packets.
Copy-paste the above paragraph regarding the hourglass. :-)
I am looking at this specifically from the point of view of RISC OS. There will soon be a method to allow software to reside on other cores. It should also involve task switching (so that it can correctly run on n cores, where n is >=1, spreading out the workload as necessary). That’s where I’m approaching this from. So, yeah, if we’re going to add any sort of PMT to RISC OS, it’s going to be PMT in the very loosest definition possible. Basically enforced time slicing, akin to how Wimp2 did things. And, note, that ran into a number of difficulties regarding message/event queueing and needed to do some hacky patching to handle interrupting something reading from file. More details elsewhere, suffice to say, “it’s not going to be easy”. [3]
[1] Special exception for FilerAction, because that’s far from the only dumb assumption it makes…
[2] Not strictly true, some module tasks can run in place, modules themselves, and utilities (that load into the RMA). But anything with filetype Absolute (&FF8) expects to start at &8000, and BASIC programs start up “sort of there” (usually ~&8F00 after some workspace for BASIC to use).
[3] Defining “not easy” as “a ground up rewrite with an entirely different API that can actually do this stuff in the sort of way that a preempted system needs in order to function”. That kind of not easy. |