Idea for discussion: Rewriting RISC OS using a high-level language
Andrew Rawnsley (492) 1445 posts |
You’ll have to forgive me – personally I’ve never really gone too far with C++ because my RISC OS work simply hasn’t needed it. I know Alan here does use it, but maybe within the tight confines of what’s available. From an early era (Cfront era) I was given the impression that C++ tended to result in somewhat over-blown, inflated and sub-optimal code, so I stuck with the simplicity of plain C. This may well no longer be the case, so please take my comments with that proviso. Ultimately that’s my concern with C++ – I see so many large, bloated C++ projects that I wonder if it can ever be a sensible choice for ARM dev. My recent experience with Ubuntu on ARM was so treacle-like (it made the RISC OS Firefox port look quite reasonable performance-wise!) that it rather put me off high-level language-based OSs, but I suspect this may have rather more to do with the layers of stuff (X, window managers, compositors and so on) running on a modern Linux distro, rather than the core OS itself. Add to all of this that historically Norcroft generated much more efficient ARM code than GCC (I believe the difference is much smaller with GCC 4), and I ended up very biased. Not a good thing :( If C++ genuinely brings benefits in terms of stability without significant overheads, then I can see the desirability of that… if we can get the coders in to do it! |
Terje Slettebø (285) 275 posts |
Hi Andrew. Thanks for your thoughtful reply, much appreciated. I’d also like to commend you on your software, such as Rcomp’s Messenger Pro, which I find to be an excellent mail client. That just goes to show that it is possible to write excellent software also in C (RISC OS is another example). :) I come from a C background, myself, or actually a bunch of languages… I started out with BASIC on a FX-702P “pocket computer”, then moved to VIC-20, BBC Micro, etc. (I guess I’m dating myself, here… :) ). On the BBC, I started to learn a little 6502 assembly, but only a little, as writing 6502 assembly was quite elaborate, with only a single general-purpose register available, and two “index registers”… It’s a little miracle that the greatest space game of all time (IMHO), Elite, was written in it! At the end of the 80’s I got myself an Archimedes (after a brief “detour” to Amiga 1000, which I wanted to like, but never did), happy to be back to an OS that was familiar and well-designed… :) There, besides continuing with BASIC, I learned Pascal, and then C, which I liked much more than Pascal, as I found Pascal terribly “wordy”: You had to write a lot to get anything done. With C, you just did it! The terseness of C code appealed to my sense of aesthetics. Around the middle of the 90’s, I came across C++, and started learning about it. I’ve always been interested in computer graphics and animation, so at the time, I had written quite a bit of ARM assembly, as well (a processor I really find elegant), and I was starting to move towards C. Learning about C++, some of its features immediately struck me as useful for what I was doing, such as function inlining, classes and operator overloading. 
You see, I was working on a 3D graphics library, written in C (earlier written in ARM assembly), and I had code like this:

    struct Vector3D

Then I had macros like this:

    #define set_vector3(v,xvalue,yvalue,zvalue) v.x=xvalue; v.y=yvalue; v.z=zvalue;

And I used it like this:

    Vector3D result,a,b;
    set_vector3(a,1,2,3);

The reason I used macros was to avoid the overhead of function calls for this simple code. As you can imagine, I didn’t find this particularly elegant… Then along came C++, and I realised I could write it like this:

    Vector3D a(1,2,3);
    Vector3D result=a+b;

How’s that for elegance… :) Moreover, with a decent compiler, this should result in code identical to the C version, and indeed, on the PC it did, but sadly not using Norcroft (or GCC, which at the time was an ancient version of the compiler, even for that time) on the Archimedes… This put me off using C++ on RISC OS, but although I haven’t actually disassembled the same code when compiled with GCC today, I’d be very surprised if the generated assembly code weren’t practically identical to that generated from the C version. Yes, C++ has a reputation for resulting in bloated, slow applications, and the reasons are probably several:
However, it’s important to note that this last objection is not a fault with the language as such, but with the use of it. Bjarne Stroustrup has been quoted, saying: C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off. What he meant (as he explains there): “As you protect people from simple dangers, they get themselves into new and less obvious problems.” An example of a “simple problem” may be making sure you always match an allocation with a deallocation. This kind of trade-off is something each person will have to make for themselves (hence my “live and let live” comment :) ): Some people may prefer a simpler language, like C, even if it may result in having to write more code. C++ is a more complex language and harder to master (although it does have a gentle learning slope, if you start from C), but once you’ve mastered it, you may write pretty succinct code that expresses your intention well, uncluttered by implementation details. Some people have the perception that C++ is inherently slower than C, but at least with a modern compiler, that simply isn’t the case. C is effectively a subset of C++, so there’s no reason that the same C program, compiled using a C++ compiler, should not produce identical, or virtually identical, results. Some features, like exception handling, may make the executable somewhat larger, but modern exception handling results in essentially zero run-time overhead as long as no exception is thrown, and the overhead – both in time and space – of a C program with proper error handling (essentially simulating exception handling) could easily be much larger than that of C++. To address these things, the C++ standards committee issued a Technical Report on C++ Performance a few years ago. |
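For illustration, the Vector3D idea from this post might be sketched in C++ roughly as follows. The member type (double) and the exact shape of the class are assumptions, since the original struct definition is not shown; the point is only that the constructor and operator+ are small inline functions that a compiler is free to expand in place, much like the macro did.

```cpp
// Small value type. The constructor is defined in-class and is therefore
// implicitly inline. The use of double for the members is an assumption.
struct Vector3D {
    double x, y, z;

    Vector3D(double xv, double yv, double zv) : x(xv), y(yv), z(zv) {}
};

// Component-wise addition, written as an ordinary inline function
// instead of a macro.
inline Vector3D operator+(const Vector3D& a, const Vector3D& b) {
    return Vector3D(a.x + b.x, a.y + b.y, a.z + b.z);
}
```

With this in place, `Vector3D a(1,2,3), b(4,5,6); Vector3D result = a + b;` compiles as written. Whether a given compiler really produces the same code as the macro version would have to be checked by inspecting its output.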
Terje Slettebø (285) 275 posts |
I can also add that I suspect that people who know how to program in assembly code (and are willing to do so) are a dying breed, so if Jeffrey (and the other RISC OS contributors) were to hang up their RISC OS programming coats one day, and most of RISC OS is still written in assembly code, then I fear for RISC OS’s future… Reimplementing RISC OS in C/C++ is a large task, but it should be doable, given enough people working on it (although I have no illusions here, there should at least be more people who know how to program in C/C++ than there are people who know ARM assembly programming), and such an undertaking could help secure RISC OS’s survival in the long term… With some help from the GCC SDK mailing list, I’ve been able to get a working example module using C++ (I had some link errors at first, which came from the linker being called with “gcc”, rather than “g++”; I’m still something of a newbie when it comes to GCC…). What I thought could be a good idea first was to pick a small module, and write a “proxy”: A version of the module that just passes on all the calls to the original module, to see if the function call standard and stack setup are compatible. Making such a proxy may be useful in any case: It would allow us to monitor what calls are being made, and perhaps implement modules on a SWI by SWI basis (letting the rest be forwarded to the original). Edit: I’ve found my small module: “ARM” Here’s the plan: 1. Develop a proxy, that just forwards to the original. 2. Reimplement the ARM module. I’ll be back… |
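The proxy idea above could be sketched along these lines. This is only a host-side illustration of the dispatch logic: `Registers`, `SwiHandler` and `ProxyModule` are hypothetical names, not RISC OS or GCCSDK types, and a real module would have to follow the actual module entry conventions.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical register block passed to a SWI handler (not a RISC OS type).
struct Registers {
    std::uint32_t r[10] = {};
};

// Hypothetical handler signature: SWI offset within the module, plus registers.
using SwiHandler = std::function<void(std::uint32_t, Registers&)>;

class ProxyModule {
public:
    explicit ProxyModule(SwiHandler original) : original_(std::move(original)) {}

    // Install a reimplemented handler for a single SWI offset.
    void reimplement(std::uint32_t offset, SwiHandler handler) {
        overrides_[offset] = std::move(handler);
    }

    // Record the call, then dispatch to the reimplementation if one exists,
    // otherwise forward to the original module.
    void dispatch(std::uint32_t offset, Registers& regs) {
        log_.push_back(offset);
        auto it = overrides_.find(offset);
        if (it != overrides_.end()) {
            it->second(offset, regs);
        } else {
            original_(offset, regs);
        }
    }

    // SWI offsets seen so far, in call order.
    const std::vector<std::uint32_t>& log() const { return log_; }

private:
    SwiHandler original_;
    std::map<std::uint32_t, SwiHandler> overrides_;
    std::vector<std::uint32_t> log_;
};
```

The sketch uses `std::function` and standard containers for brevity; inside a real module, where heap use may be restricted, a fixed table of function pointers would probably be a better fit.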
Terje Slettebø (285) 275 posts |
Having thought some more about this, I’ve come up with a potential plan for reimplementing RISC OS modules (or indeed any modules), that may be of interest to anyone else contemplating the same:
The idea is to essentially duplicate the work of the original module in the new module, and let them run together on the same input, and only when you see they produce the same result, switch over to the new functionality. This approach should reduce the risk of crashes or bugs, by not switching over to the new functionality until you’ve been able to determine that the new module works the same way as the old one. |
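A minimal sketch of this "run both, compare, then switch" approach is shown below. `ShadowRunner` and the fixed threshold of consecutive matches are assumptions made for illustration; the post itself leaves the switch-over decision to whoever is watching the results. The sketch also assumes both implementations are free of side effects, which will often not hold for real OS calls.

```cpp
#include <functional>
#include <utility>

// Runs the old and the new implementation on the same input. The old result
// is returned until the new one has matched it a given number of times in a
// row; after that, only the new implementation is called.
template <typename Input, typename Output>
class ShadowRunner {
public:
    using Impl = std::function<Output(const Input&)>;

    ShadowRunner(Impl old_impl, Impl new_impl, unsigned required_matches)
        : old_(std::move(old_impl)), new_(std::move(new_impl)),
          required_(required_matches) {}

    Output call(const Input& in) {
        if (switched_) {
            return new_(in);
        }
        Output expected = old_(in);
        Output candidate = new_(in);
        if (candidate == expected) {
            ++matches_;
        } else {
            matches_ = 0;   // a mismatch resets the run of matches
            ++mismatches_;
        }
        if (matches_ >= required_) {
            switched_ = true;
        }
        return expected;    // the old result stays authoritative for this call
    }

    bool switched() const { return switched_; }
    unsigned mismatches() const { return mismatches_; }

private:
    Impl old_;
    Impl new_;
    unsigned required_;
    unsigned matches_ = 0;
    unsigned mismatches_ = 0;
    bool switched_ = false;
};
```

In practice the mismatch count would be the interesting output, since every mismatch points at a behavioural difference worth investigating before any switch is made.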
GavinWraith (26) 1563 posts |
Perhaps this could be called just-in-time reimplementation? ;) |
Chris (121) 472 posts |
Not sure if this is relevant to the discussion, but Graham Shaw did start work some time ago on rewriting the Wimp and Filer in C++. The source is held in a Subversion repository: see here (for the Window Manager source – other modules are accessible from the dropdown menu at the top of the page). It looks rather as if this work has ground to a halt, but it might be useful as a starting point. |
Steffen Huber (91) 1953 posts |
Graham also started RTK, which looked like a nice toolkit for C++ app usage. It had a real book as documentation, too. Graham also started RiscTerm with Kerberos authentication. And the Packaging Project along with the package manager. And a replacement Shared C Library. And these projects are only what I remember. Anyone heard of Graham recently? Would be interesting to hear which projects he is still interested in. |
Terje Slettebø (285) 275 posts |
Well, that’s certainly relevant… I’ve downloaded both and will have a look at them. It appears that the WindowManager project is barely started, but the Filer project has much more code, so anyone contemplating working on the Filer should definitely have a look there.
I’ll have a look at that one, too. Thanks for the pointers. :) |
Jeff Doggett (257) 234 posts |
Recoding the filer in a high level language should also have the side effect of increasing the maximum filesize from 2GB to 4GB. I’ve a suspicion that there’s at least one RISC OS coder who doesn’t understand the difference between instructions like BGT and BHI. |
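The BGT/BHI point is the difference between a signed and an unsigned comparison. The sketch below shows the same distinction in C++ for 32-bit sizes; the function names are made up for the example. A size of 2GB or more has its top bit set, so a signed comparison treats it as negative.

```cpp
#include <cstdint>

// Signed comparison, the kind of test a BGT branch performs after CMP.
// Values with the top bit set are treated as negative here. (The conversion
// to a signed type wraps modulo 2^32 on GCC and is well-defined in C++20.)
bool exceeds_signed(std::uint32_t size, std::uint32_t limit) {
    return static_cast<std::int32_t>(size) > static_cast<std::int32_t>(limit);
}

// Unsigned comparison, the kind of test a BHI branch performs after CMP.
bool exceeds_unsigned(std::uint32_t size, std::uint32_t limit) {
    return size > limit;
}
```

For a 3GB size (0xC0000000) checked against a 1GB limit (0x40000000), `exceeds_unsigned` returns true while `exceeds_signed` returns false, which is how a signed test effectively caps sizes at 2GB. In a high-level language, declaring the size as an unsigned type lets the compiler pick the right instruction.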
Terje Slettebø (285) 275 posts |
Not to mention moving to 64-bit filesize… |
hilltop (573) 11 posts |
I suppose you could use OSLibSupport32 and “raise SIGOSERROR;”. So your code just has to use try/except (macros which map to setjmp/longjmp) and check for your function ‘erroring’ in the same way as a non-X SWI does. It could be argued that C++ support in OSLib at least is behind that for C – does anyone know of a way of catching a non-X SWI in C++? |
hilltop (573) 11 posts |
Not sure if you mean applications or OS level modules by ‘world’ here, but there is a FAQ in the realms of Linux along the lines of ‘why isn’t Linux (going to be) coded in C++’ which is answered by Linus himself in various places and provides maybe a cautionary tale. (He’s a C++ hater – git is also written in C.) Personally I’m inclined to think that the bar is higher for creating good code in C++ than something which works in C.
I wonder what there is in terms of source-available C++ software for RISC OS? I can only think of !PDF and the UPP code. !PDF has a Wimp class library, again I’m not sure of anything (other than the previously mentioned RTK) similar available for reuse/development. This means again, sadly, the bar is higher for C++. |
Terje Slettebø (285) 275 posts |
I assume it may be this one: http://www.tux.org/lkml/#s15-3 I’ll have a look at the arguments presented there.
I’m not sure what you mean, here, but I think it may be argued that C++ is actually a more beginner-friendly language than C, and that it may be easier to write good code there than in C (for both beginners and experts), because you have facilities that shield you from low-level details. Case in point: Try writing code for adding two strings (of arbitrary length) in C and C++. In C++, it’s simply: std::string a(…), b(…); std::string result=a+b; With C, you have to muck around with resource management and ownership issues. And this is just the tip of the iceberg… |
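To make the comparison concrete, here is a rough side-by-side sketch. `concat_c` and `concat_cpp` are names invented for the example; the C-style version is only one of several reasonable ways to handle ownership.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// C style: the caller owns the returned buffer and must free() it.
// Returns nullptr if the allocation fails.
char* concat_c(const char* a, const char* b) {
    std::size_t la = std::strlen(a);
    std::size_t lb = std::strlen(b);
    char* out = static_cast<char*>(std::malloc(la + lb + 1));
    if (out == nullptr) {
        return nullptr;
    }
    std::memcpy(out, a, la);
    std::memcpy(out + la, b, lb + 1);  // also copies the terminating '\0'
    return out;
}

// C++ style: std::string owns and releases its own storage.
std::string concat_cpp(const std::string& a, const std::string& b) {
    return a + b;
}
```

The C-style caller has to check for a null result and remember to call `free()`; the C++ caller has neither obligation, although an allocation failure would surface as an exception instead.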
Jeffrey Lee (213) 6048 posts |
There’s also my decgen tool, and I’m sure there will be other bits and pieces released by other people. I think one of the big turn-offs for C++ on RISC OS is that compiling C++ (with GCC) takes a lot longer than compiling C code, so unless you’ve got a cross-compiler set up it’s a bit of a pain to have to develop in it.
Unfortunately, writing good code often requires you to know the low-level details. If you have no idea how std::string works, you may think that “a+b” is a valid way of concatenating two strings, no matter what situation you’re in. But if you know how it works then you’ll know that concatenating the strings will require a new heap allocation, something that may be impossible or inadvisable in certain situations. So a different approach to the code, or a different string class, is necessary. |
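As a sketch of the "different approach" mentioned above, the following concatenates into a caller-supplied buffer and never touches the heap. `concat_into` is a hypothetical helper, not part of any library, and failing outright when the result does not fit is just one possible policy.

```cpp
#include <cstddef>
#include <cstring>

// Concatenates a and b into a caller-supplied buffer without any heap
// allocation. Returns false if the result (including the terminating '\0')
// does not fit; the buffer then holds an empty string.
bool concat_into(char* out, std::size_t capacity, const char* a, const char* b) {
    if (capacity == 0) {
        return false;
    }
    std::size_t la = std::strlen(a);
    std::size_t lb = std::strlen(b);
    if (la >= capacity || lb >= capacity - la) {
        out[0] = '\0';
        return false;
    }
    std::memcpy(out, a, la);
    std::memcpy(out + la, b, lb + 1);  // also copies the terminating '\0'
    return true;
}
```

The price is that the caller has to choose a capacity up front and handle the "does not fit" case, which is exactly the kind of detail std::string otherwise hides.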
Terje Slettebø (285) 275 posts |
Having had a look at the arguments, they are as follows (note that this is about the Linux kernel, and as such only applies to a relatively small part of RISC OS, which could well stay in assembly or C, as far as I’m concerned):
Elsewhere in that entry it’s argued that today it does have such an efficient implementation, so this is unlikely to be a valid argument anymore.
That may well be the case, but as the number of developers on RISC OS itself is close to none, anyway, this may not matter much, as long as the principal developers are not strongly against using C++ for at least some of the OS.
C++ is not an “OO language”, and never was. C++ is a language with support for OO programming, but also support for data-abstraction, generic programming, etc. In short C++ is a multi-paradigm language, and no argument against using such a language for OS design is given there.
I agree that the potential for reduced performance is there (for any reimplementation), as well as for increased performance, actually (a “smarter” implementation), and that this would need to be benchmarked, and any necessary optimisations done. In general, however, the clearest code, the code that most clearly expresses your intent, gives the compiler the best chance of producing efficient code. C++ has an advantage here, in that it’s possible to put more of your intent into the code. One potential downside is that, just by looking at the code, it may be harder to see any performance problems (compared to the equivalent C code), since quite a bit of it is done “under the covers”, but a good knowledge of what C++ actually does “under the covers” certainly helps to avoid performance problems.
As the RISC OS kernel is written 100% in assembly code, the advantage of going from C → C++ does not have to be demonstrated, and the move from assembly to C/C++ should be easier to justify, as long as the performance is about the same.
This is not an issue when the original is assembly code.
Yes, and the same goes for C, although it may be more obvious, there, since the size of the executable is perhaps easier to judge from the size of the code in C. C++ is a powerful tool, and as such must be used responsibly. I think this is a better approach, than to ban powerful tools because they might be misused…
That may be, given that the compiler does more for you, so you need to be aware of what it does, to avoid bloat and inefficiencies. I do think this education is well paid back by shorter development time and simpler development, though. Some comments from Linus Torvalds:
I’ve tried to find the explanation for this opinion, but having searched the Internet, I found only endless quotes of this claim. Endless repetition of the same claim doesn’t make it true… No C++ expert I know thinks that C++ exception handling is “fundamentally broken”. On the contrary, it’s generally regarded to be one of the more well-designed features of C++.
I see no disadvantage in having memory management handled automatically; it’s certainly not done behind your back, just less visibly in the code, so that you may focus on the application logic, rather than low-level details like resource management.
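As a small illustration of memory management that is automatic but not hidden, consider the sketch below. `Buffer` and `process` are invented for the example: the allocation and release are both written out in the class, and the compiler only guarantees that the destructor runs on every path out of the scope.

```cpp
#include <cstddef>
#include <cstdlib>

// Owns a malloc'ed block; the destructor releases it on every path out of
// the enclosing scope, so no explicit free() call is needed at the call site.
class Buffer {
public:
    explicit Buffer(std::size_t size)
        : data_(static_cast<unsigned char*>(std::malloc(size))),
          size_(data_ != nullptr ? size : 0) {}

    ~Buffer() { std::free(data_); }  // free(nullptr) is a no-op

    // Copying is disabled so the block cannot be freed twice.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    unsigned char* data() { return data_; }
    std::size_t size() const { return size_; }

private:
    unsigned char* data_;
    std::size_t size_;
};

// Both return paths release the buffer without any cleanup code here.
bool process(std::size_t size) {
    Buffer buf(size);
    if (buf.data() == nullptr) {
        return false;               // allocation failed
    }
    buf.data()[0] = 42;
    return buf.data()[0] == 42;
}
```

The equivalent C function would need a `free()` before each return, and forgetting one is the classic leak.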
Indeed you can, just like you can write structured code in assembly, but that’s not an argument against using C rather than assembly, is it? |
hilltop (573) 11 posts |
AFAIK the UtilityModule (RISC OS kernel proper) was rewritten from scratch (in assembler) for RISC OS 3.7. Then major changes were made for 32bit/HAL in RO5 (ignoring branch issues). So yes, it’s a very small part of the total, isn’t ‘old code’ relatively speaking, and would be the last part of the system to be considered if an ‘outside in’ approach for rewriting (application suite first, then peripheral modules, then core, then kernel) was taken. From a mechanics issue I would have thought that the SharedCLibrary would have to be ‘higher up’ the module chain than any modules written in C/C++. This may sound obvious, but changing the module order may cause things to break.
I think there is a good deal of inertia to overcome (and my argument also was not just in RISC OS-land), not just the rights and wrongs. If an OS-level module written in C++ can demonstrate its usefulness, in terms of features, stability and speed, then acceptance would be a lot closer. Anyone for CDFS++? ;-) |
Alan Buckley (167) 232 posts |
I’d just like to say I do like this idea of being able to develop for RISC OS in C++. I’ve been programming on Windows and RISC OS in C++ for many years and will follow the developments of this idea with much interest. Once the techniques to get it all working have been proved and documented I will see if there is anything I can do to help. To add to the list of RISC OS C++ programs: my !PackMan and !PackIt programs are both written in C++. I’m also developing the C++ library I created for these programs (and some other personal stuff) and am hoping to release a version with better documentation and examples in the near future. The library is called TBX and the current version used in the above programs is available via the RISC OS Packaging Project already. I’m not sure if this TBX library would be useful for the RISC OS modules though as it is aimed at creating applications using the Toolbox. |
Terje Slettebø (285) 275 posts |
Yes, good point, so we’ll have to make sure that any module that is depended on is already initialised (I guess like it is today with SharedCLibrary, now that Edit, etc. is also in ROM).
Yeah, that was also why I intentionally didn’t want to get into discussions of specific language at the start of this thread (using instead “C/C++” as the term), because I’ve been in way too many “C vs C++” discussions, and I’d rather write code than spend time discussing the merits of various languages… So, yeah, I think we should have the code first (regardless of the language used, C or C++), and after seeing how it looks and works, then let’s talk… :) |
Terje Slettebø (285) 275 posts |
Yeah, that’s a valid concern…
Yeah, ref Bjarne Stroustrup’s quote about blowing your whole leg off… :) C++ has to be used with care, I don’t deny that, and there are a lot of “gotchas”, but then again, C has its own “challenges”, where you’re basically left to take care of things on your own, so I’m not that convinced that it takes more skill to write code in C++ than in C, but then again, it may also depend on the domain and the sensitivity to things like resource usage.
Yeah, string concatenation should be done with the append operator, which here is “+=”. On the other hand, C has its own share of things you have to keep in mind, such as avoiding buffer overruns when concatenating strings, something you don’t need to be concerned with using std::string.
Yeah, you could say that inefficiencies tend to become more visible in C (since you have to do these things manually). On the other hand, the lack of facilities for abstractions in C also means that the intent of the code easily gets lost in a sea of implementation details, and that has its own problems, when it comes to being able to understand and work effectively with the code. If you don’t have an overview of what some code does, it may be harder to write efficient code, too, perhaps not seeing entirely different ways you may do something. |
Jeffrey Lee (213) 6048 posts |
If you’re looking for a non-trivial module to try and convert (either to C or C++) then may I suggest SCSIFS? :-) One of the things on my todo list is to add support for background file transfers, which will then allow us to enable FileCore’s disc caching code, which should make file access quite a bit faster on the BB. But the current SCSIFS source is a bit scary, so I was thinking about rewriting the module in C first. Although the module is fairly simple, it does need to interface with the FileCore & SCSISwitch modules, so it could be a good place to start if you want to get some practice with how to deal with unusual function call interfaces. |
Terje Slettebø (285) 275 posts |
No kidding… :) I’m afraid I would be rather “in over my head” working on that one, unfortunately, at least as a first project… My experience lies more in higher-level abstractions, and not so much in lower-level, hardware/device-driver kind of code, where I really don’t have much experience… I was thinking of taking a stab at WindowManager, but I’ve come to the conclusion, again, that that’s probably way too ambitious as a first project… In particular because it deals with things like setting up task contexts, swapping tasks in and out of the application space, etc., things I don’t know that much about. The “ideal” project for me I think would be something relatively small and well-defined in terms of functionality, and I’ve come to the conclusion that trying to reimplement OS_SpriteOp could give useful experience… Now, I understand that sprite handling (unfortunately) is in the kernel, and therefore can’t be easily changed by loading a new module (although you may be able to handle it by claiming SpriteV, like SpriteExtend apparently does), and that not that much work is planned for improving it, although there is some, and – finally – that a reimplemented version will have to be designed and tested carefully, to make sure that inefficiencies are not introduced, since it’s used all over the place. However, even given all that, I feel that – even if we never end up using the reimplemented version – this will at least give us some valuable experience when it comes to reimplementing OS functionality. Sorry I couldn’t be more helpful, Jeffrey, but I try not to take on tasks I won’t be able to do… :) |
Steffen Huber (91) 1953 posts |
Hi Terje, since I am just like Jeffrey a big fan of motivating you to rewrite SCSIFS, let me explain why this job is ideally suited for you ;-) You could start really small. Start to implement a generic filecore client module where you can just “plug in” the necessary routines to read/write data blocks from somewhere (e.g. from a file, or from a hardware device, or from a dynamic area). So replacing RAMFS would be a great start, and a good testbed for testing support for filecore buffering. You might run into problems with the file idea, because of filecore’s non-reentrant nature… I wouldn’t mind if you start to rewrite CDFS, too – there, you can start with a generic fileswitch client module where you just “plug in” routines to decode blocks of directory data, and again routines to read/write data blocks from somewhere. I think all this is a lot simpler than trying your luck with WindowManager! |
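The "plug in the read/write routines" idea could be expressed as a small interface with interchangeable backends, roughly as below. `BlockDevice` and `RamBlockDevice` are hypothetical names for this sketch and do not correspond to FileCore's real entry points; a file-backed or hardware-backed class would implement the same interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical interface for the pluggable routines: a filing system client
// would be written against this, not against a specific storage backend.
class BlockDevice {
public:
    virtual ~BlockDevice() = default;
    virtual std::size_t block_size() const = 0;
    virtual std::size_t block_count() const = 0;
    // out/in must point to at least block_size() bytes.
    virtual bool read_block(std::size_t index, std::uint8_t* out) = 0;
    virtual bool write_block(std::size_t index, const std::uint8_t* in) = 0;
};

// RAM-backed implementation, in the spirit of a RAMFS replacement.
class RamBlockDevice : public BlockDevice {
public:
    RamBlockDevice(std::size_t block_size, std::size_t block_count)
        : block_size_(block_size), block_count_(block_count),
          storage_(block_size * block_count, 0) {}

    std::size_t block_size() const override { return block_size_; }
    std::size_t block_count() const override { return block_count_; }

    bool read_block(std::size_t index, std::uint8_t* out) override {
        if (index >= block_count_) {
            return false;  // out of range
        }
        std::memcpy(out, storage_.data() + index * block_size_, block_size_);
        return true;
    }

    bool write_block(std::size_t index, const std::uint8_t* in) override {
        if (index >= block_count_) {
            return false;  // out of range
        }
        std::memcpy(storage_.data() + index * block_size_, in, block_size_);
        return true;
    }

private:
    std::size_t block_size_;
    std::size_t block_count_;
    std::vector<std::uint8_t> storage_;
};
```

The RAM backend is also convenient as a test double: the generic client code can be exercised on any host before a hardware backend exists.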
Terje Slettebø (285) 275 posts |
I’m happy to hear that, Alan. :) Like I mentioned earlier, I have a “live and let live” attitude to the use of programming languages (having used, and still using, a bunch of them myself), and I think the most important thing is to encourage development of RISC OS, and applications for it of course, regardless of what language is used… |
Terje Slettebø (285) 275 posts |
Hi Steffen.
Rewriting the filing system stack (either using C or C++) certainly sounds like an exciting prospect, but at my level (when it comes to knowledge of RISC OS internals), I’m afraid I need to pick something “higher up in the stack”, where less knowledge of the internals/hardware is needed… Yes, WindowManager is definitely out for the moment (even if I’ve already written a bunch of classes for it… ;) That was until I realised the non-function nature of Wimp_StartTask, etc…). I still feel that OS_SpriteOp is a reasonable project, something I should be able to do with my current RISC OS knowledge, and even here, there have been suggestions for improvements, such as RISC OS Select features like alpha channel and CMYK support. |
Trevor Johnson (329) 1645 posts |
There seems to be quite a bit of content on optimised assembly vs. C here – not sure how relevant this is but it may have some useful arguments worth considering. |