Thinking ahead: Supporting multicore CPUs
Rick Murray (539) 13850 posts |
I don’t want to get banned, excommunicated or exorcised… <shhh>…isn’t this what The Other Guys did?</shhh> |
Steve Pampling (1551) 8172 posts |
As Rick said “what The Other Guys did” – or more accurately what Justin did, because I think he was largely (if not totally) working alone. |
Rick Murray (539) 13850 posts |
A longer reply now that I’m not trying to write something on a phone. How much interdependency is there in the kernel? If we are going to extract the OS_Byte/OS_Word into its own module, could we maybe make a crack at defining a more cohesive API for certain functions than a pile of magic numbers? Wouldn’t it be better to have a keyboard handler provide ReadKey, IsDown, Status (etc) instead of… whatever… various OS_Byte calls? The Others, I think, did this work to tidy up parts of the OS that were a bit higgledy-piggledy. Our goal here is slightly different, and I think it could be extremely interesting if the core kernel was small and hived off most of the work to helper modules. Why? Well, think about multicore devices: it might be possible to start the kernel on the other cores with a different set of helper modules (that can deal with communication with the primary core). If, of course, this is feasible. ;-) Sorry, can’t offer to help, unless you can pull off some epic magic where RISC OS runs on the Pi and the second core is the Beagle xM communicating via Ethernet. :-P |
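To make the "named calls instead of magic numbers" idea above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `FakeKernel`, `Keyboard` and `is_down` are hypothetical names, and the register encoding is deliberately simplified rather than a description of how OS_Byte 129 really packs its arguments.

```python
# Illustrative sketch only: FakeKernel and Keyboard are hypothetical names,
# not part of RISC OS. The register encoding below is simplified on purpose.

OS_BYTE_INKEY = 129  # reason code used for INKEY-style keyboard reads


class FakeKernel:
    """Stand-in for the real SWI interface so the sketch runs anywhere."""

    def __init__(self):
        self.keys_down = set()

    def os_byte(self, reason, r1, r2):
        # Only the one reason code used by this sketch is modelled.
        if reason == OS_BYTE_INKEY and r2 == 0xFF:
            # Assumption: r1 carries the key number directly.
            return (0xFF, 0xFF) if r1 in self.keys_down else (0, 0)
        raise ValueError("reason code not modelled in this sketch")


class Keyboard:
    """Named wrapper, so callers never see the reason code."""

    def __init__(self, kernel):
        self._kernel = kernel

    def is_down(self, key_number):
        r1, _r2 = self._kernel.os_byte(OS_BYTE_INKEY, key_number, 0xFF)
        return r1 == 0xFF
```

A caller would then write `Keyboard(kernel).is_down(98)` rather than remembering which reason code and register values to pass. A helper module could expose calls like this while still forwarding to the vectored SWI underneath.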
Steve Pampling (1551) 8172 posts |
http://www.jaffasoft.co.uk/wwv2/activeapps/ which references http://zeromq.org/ as a possible route for re-implementation. The latter does reference a license under LGPL so you may wish to run screaming from that. |
Chris Evans (457) 1614 posts |
Steve: I suspect you realise it, but for others:
The ‘largely’ part is, I believe, true when talking of the Kernel. There were quite a few others involved in many other parts of the OS. It is interesting to note that he got quite a bit of criticism at the time for going for a microkernel. |
William Harden (2174) 244 posts |
Rick: If you don’t mind an international buy, there are a couple of A3 Pandaboards going on eBay for around £60: http://www.ebay.co.uk/itm/Pandaboard-Rev-A3-Board-Assy-750-2152-021-A-O1468-/231221217538?pt=LH_DefaultDomain_0&hash=item35d5dbc902 |
Ralph Barrett (1603) 154 posts |
Just bought one :-) GBP 103.00 with postage and duty – ouch! :-( Now, what do I do with a Panda? Ralph |
Jeffrey Lee (213) 6048 posts |
Yes, making things thread-safe would probably be a bit easier if we didn’t have OS_Byte/OS_Word making a mess of things. Of course, there’s nothing stopping us from defining (and implementing) a more cohesive API now, if you’re offering ;-) In terms of inter-dependencies in the kernel, I don’t think it’s actually that bad. From the various comments and bits of old code I’ve seen in the kernel sources I get the impression that the kernel actually started out as a microkernel, and then had all the other bits (VDU, CLI, keyboard drivers, etc.) bolted on to it during development (with some of those bits being prototyped as modules to begin with). So it shouldn’t be too hard to split things back up again. Even untangling OS_Byte/OS_Word shouldn’t be that bad, since they are both vectored SWIs. Of course developer time is limited, so diving headlong into splitting up the kernel and making the OS thread-safe probably isn’t the best decision. Instead, we should probably focus on defining the threading API and getting a bespoke microkernel running on the second core to handle thread scheduling. With a little bit of support from the host OS (should be possible using just a softloaded module) that would be enough to allow threaded apps to be developed (as long as the code on the second core only uses the threading SWIs!). Then from there we can tackle the problems that application developers raise – e.g. extend the microkernel so that thread-safe host-side modules can be called, get OS_Heap and the C library thread safe, etc. After all, there’s not much point making the whole OS thread-safe if it turns out that the multi-threaded code that people want to write is only interested in 10% of it. |
William Harden (2174) 244 posts |
Jeffrey – agree. I think what you are proposing makes perfect sense for many reasons. Firstly, it will help with multi-threaded work. Secondly, if we ever had to transition to later versions of ARM and address incompatibilities (i.e. run the ‘32-bitting’ gauntlet), the smaller the core code needed to get the system up, the better. |
Greg (2474) 144 posts |
I’m not anywhere near the skill sets of the guys that are bringing RISC OS up to date, but while Jeffrey and friends are busy figuring out the best way forward with multi core, should we also be thinking of moving into 64 bit? Maybe just bearing it in mind while the move to multi core takes place, so as to make the move to 64 bit a tad smoother. Seems to make sense, as the current 64 bit ARM SoCs will run 32 bit code as well, which should allow for an ideal platform to test on, as the code could be a mixture of 32 / 64 bit. I assume this will likely be the next step after multi core. |
Jeffrey Lee (213) 6048 posts |
On paper, producing a 64bit-only build of RISC OS is quite simple (just rewrite as much as possible in C and use 8/4 byte pointers as appropriate), but there’ll be a lot of work involved in order to make it happen. Getting 32bit programs to run within a 64bit OS will be a bit trickier. I think the only sensible way would be to go with the hypervisor approach and run a 32bit build of the OS which talks to virtual hardware provided by the 64bit host. With tight integration in the Wimp it shouldn’t be too bad from a user’s perspective. The only other alternative I can think of would be to reserve the low 4G of memory for 32bit code/data and allow (32bit-friendly) modules to be called from either 32bit or 64bit mode. I.e. so that when a module is called in 32bit mode it has to make sure any I/O buffers that are passed in/out are located in the low 4G of memory and only use 4 byte pointers. But I think that would turn into too much of a support burden for developers and require lots of nastiness to allow 64bit code to deal with the two different pointer sizes at runtime. |
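As a rough illustration of the "low 4G" rule described above, the check a dual-mode module would have to make on every buffer might look like the following Python sketch. The function names are hypothetical and addresses are modelled as plain integers; nothing here is taken from an actual RISC OS design.

```python
# Illustrative sketch: addresses are plain integers, and both function
# names are hypothetical rather than part of any RISC OS API.

FOUR_GIB = 1 << 32  # first address a 4-byte pointer cannot represent


def fits_in_low_4g(address, length):
    """Return True if [address, address + length) lies below 4 GiB."""
    if address < 0 or length < 0:
        raise ValueError("address and length must be non-negative")
    return address + length <= FOUR_GIB


def buffers_rejected_for_32bit(buffers):
    """Given {name: (address, length)}, list the buffers a 32bit caller
    could not address, in sorted name order."""
    return sorted(
        name
        for name, (address, length) in buffers.items()
        if not fits_in_low_4g(address, length)
    )
```

Every module entry point that accepts a buffer would need a check of this kind, plus a copy or an error path when it fails, which gives a feel for the support burden mentioned in the post.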
David Feugey (2125) 2709 posts |
64 bit is cool, but virtualisation is even more… Think for example of small sessions of RISC OS, to host some applications. One session per core = AMP system. Many sessions with virtualisation = protected environment for applications. Even better than memory protection :) Hardware virtualisation is available with Cortex-A15 and later, and said to be easy to use (according to an ARM executive). Perhaps it’s time to have a layer to implement AMP and virtualised systems? Note: a virtualisation-compliant version of RISC OS will also mean the possibility of using it on every Linux ARM system. Less cool than bare metal, but not bad either. |
Jess Hampshire (158) 865 posts |
If you are looking at virtualisation being the only option to run existing programs on a 64 bit system, wouldn’t it make more sense to implement that on an existing alternative system (such as BSD) and add all the good bits of RISC OS to the host system, leaving out all the bits that aren’t so good? |
David Feugey (2125) 2709 posts |
Unfortunately, virtualisation will not transform a 32 bit monocore OS into a 64 bit multicore one. It’ll provide no benefits for RISC OS. Anyway, I agree on one point: RISC OS should be adapted to qemu-arm, so it will be available on almost every ARM motherboard, with a few limitations. |
h0bby1 (2567) 480 posts |
aaaaa |
h0bby1 (2567) 480 posts |
aaaaa |
David Feugey (2125) 2709 posts |
That’s very interesting, but don’t forget that existing applications (there are thousands of them) must work with the new scheduler. That’s why I suggested ‘advanced CMT’ + ‘cluster like AMP’ and not CMT + SMP. |
h0bby1 (2567) 480 posts |
aaaaa |
h0bby1 (2567) 480 posts |
aaaaa |
David Feugey (2125) 2709 posts |
Perhaps it could be a good idea to apply all of this first on a non Wimp version of RISC OS? |
h0bby1 (2567) 480 posts |
aaaaa |
h0bby1 (2567) 480 posts |
aaaaa |
Rick Murray (539) 13850 posts |
Quick reply, I should be asleep now..!
Depends upon what you mean by “performance”. We accept that computers take a lot of electricity and time to switch tasks all over the place, because being able to run a number of favourite applications at the same time is more useful than running one application quickly. It’s a bit like how we accept slow “virtual memory” because the delays in swapping are better than the result of running out of memory.
That’s not so realistic when you consider what is actually going on inside a modern computer. Let’s just say that Windows threads carry an ID and ProcessExplorer can list them and… it would be a large number.
Depends upon your OS and level of multiprocessor support. At its most basic, the kernel-per-core could present each core as a separate “computer”. Applications running would think that they are the OS and the CPU and they would be unaware of the other stuff.
Except, perhaps, a Raspberry Pi. ;-) [I’m referring to the FIQ “fix” to stop USB keyboards losing data; though how a 700MHz processor can’t keep up with typing is mind-boggling.]
Mutex semaphores come in useful for arbitrating who can poke what and when.
Go tell that to Microsoft. Win16 was co-operative. Win32 was not. Yet Windows95 could run Windows 3.1 software. How? An emulation layer that provided a co-operative model…which was itself pre-empted. [more or less]
Or take the much better option that Windows XP started, namely fail out any application that so much as attempts to access hardware without jumping through the correct hoops. Seriously, people, if you are going to claim exclusive access to a piece of hardware and run your code as user level code in a pre-empted system, you deserve all the pain that you’ll have coming. RISC OS makes it easy to access hardware, but this ease comes with some responsibilities. Part of this is why we have a Device Claim Protocol.
The OS deals with this, not the application. Every desktop application “thinks” that it is loaded at &8000, and has from there to <WimpSlot> memory for their own use. Etc (there’s more, but that’s for another day).
Yes. I have an older MFC C compiler and I only ever used it to build DOS programs. Never figured out how to do anything much beyond the demos under Windows. |
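The remark above about mutex semaphores arbitrating "who can poke what and when" can be sketched in a few lines of Python. This is a generic illustration using the standard `threading.Lock`, not RISC OS code; the `Device` class and its `register` field are made-up stand-ins for a shared piece of hardware.

```python
import threading


class Device:
    """Hypothetical shared resource guarded by a mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self.register = 0

    def poke(self, delta):
        # Only one thread at a time may run this read-modify-write.
        with self._lock:
            value = self.register
            self.register = value + delta


def hammer(device, times):
    """Worker: poke the device repeatedly."""
    for _ in range(times):
        device.poke(1)


def run_demo(workers=4, times=1000):
    """Start several workers against one device and return the final value."""
    device = Device()
    threads = [
        threading.Thread(target=hammer, args=(device, times))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return device.register
```

Because every update happens while holding the lock, the final value is always `workers * times`. Without the lock, two threads could read the same old value and one update would be lost.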
h0bby1 (2567) 480 posts |
aaaaa |
h0bby1 (2567) 480 posts |
aaaaa |