Getting modules ready for 64-bit RISC OS
Julie Stamp (8365) 474 posts |
When I’m working on OS modules at the moment I’m aware there are some big changes ahead (64-bit and SMP), and I’d like to be able to write my code now to work with that, so I don’t have to go back and change things later. I hope people have some ideas or checklists to help with this…
|
David J. Ruck (33) 1649 posts |
It’s impossible to make a module 64 bit ready until a 64 bit RISC OS API has been published. Just write the module now, and in 20 years’ time you can worry about making the changes for 64 bit. |
Rick Murray (539) 13908 posts |
☝️ This. I’ll worry about 64 bit when I see it. PS: Only twenty years, Druck? |
Rick Murray (539) 13908 posts |
On a more serious note…
You don’t. So unless somebody devises some weird mojo of stuffing two binaries into one module, you WILL NOT (categorically and without exception) get an ARM64 processor to execute anything that runs on RISC OS today whilst in 64 bit mode. ARM64 is not a superset of ARM32. It’s something completely different. Different encoding, different behaviour, different register sets… |
David J. Ruck (33) 1649 posts |
I think Julie, who has written some modules in C, is thinking more about what sort of things you should do or avoid to make the code easier to port to the eventual 64 bit API. In the same way, you can write C programs which will work on both 32 and 64 bit versions of Linux and Windows by being careful with explicitly sized variable types, and then it’s just a compile option to build a 64 bit variant. But until the 64 bit RISC OS API is defined and we get C libraries for it, we can’t offer similar advice.
I’m ever the optimist. |
Colin Ferris (399) 1822 posts |
Perhaps working a way of running modules in USR mode – for use in Linux RO. |
Charles Ferguson (8243) 429 posts |
It seems that most of the responses here are negative or sarcastic. I hope that I can offer something more concrete, which doesn’t drift too far into “told you so”. Also, please treat ‘you’ as ‘anyone writing RISC OS code’, not you specifically, Julie.

Getting modules (or for that matter, any other application or tool) into a state in which they can be used on a different architecture has been on the cards for a very long time. Fortunately, if you’ve been doing the sensible thing and writing your modules in C or another high level language, you’ve been working in the right direction. If you have been writing your code in a structured manner, with libraries to implement the operations you want to perform, and with typing that allows you to change behaviour, you’ll have been moving in the right direction. If you’ve been writing tests to ensure that the behaviour of those libraries and your component is correct, you’ll have been going in the right direction. If you’ve been focused on what you’re trying to do, rather than how you do it, then you’ll have been going in the right direction.

Fortunately, these are things that people should have been doing anyhow. They allow you to write well structured and maintainable code which is testable, and easily retargetable, should the interfaces change in the future.

So, having said what you should have been doing – fully with the knowledge that it’s unlikely – I’ll try to expand on this a little. If you write your code in small libraries that do the operation you want to do, with a clear interface and expectations, it will make the code easier to change if you need to change the underlying system that you want to use. It will make it easier to test in whatever system you are running it in.

Let’s make a contrived example. Let’s say you have a module which provides a rendering system for lines of text – an area that shows text, maybe.
Take the code that handles the list of text lines: the functions that allow content to be added, removed and inserted, and other such operations. All of this code should be in a library. But ONLY that code. None of the code that knows about rendering, or fonts, or anything else. Why? Because it’s not relevant to the region of text. You can then write tests for that code, and make sure that it does the right thing.

You can then provide some rendering routines in a library, but – and this is important – you only do the rendering concepts in the library. You don’t actually do the operations that will put anything on the screen. You can stub those routines, and allow them to be entirely divorced from the code that’s trying to render the text. A function for text size, a function for text plotting, a function for line drawing, maybe. In your test implementation you make those do nothing or return dummy data. Maybe you provide an implementation that uses the VDU system, rather than the Font system.

The point is that the OS interfaces – the parts which might change and might be different in the future – are kept away from the business logic of adding lines, or working out how a line should be laid out. In C, you might do this with a structure that defines how you do the rendering with function pointers. In C++ you have a class that defines the rendering methods for a context. In assembler, you implement similar things but using a dispatch table for operations. In Python, you implement a class that has different rendering methods like in C++.
If you were to do this in C it might look like this:

    typedef struct gcontext_s {
        void (*rectangle_fill)(unsigned long colour, int x0, int y0, int x1, int y1);
        void (*rectangle_outline)(unsigned long colour, int x0, int y0, int x1, int y1);
        void (*triangle_fill)(unsigned long colour, int x0, int y0, int x1, int y1, int x2, int y2);

        void (*line_start)(unsigned long line_colour);
        void (*line_line)(int x0, int y0, int x1, int y1);
        void (*line_end)(void);

        void (*fill_start)(unsigned long fill_colour);
        void (*fill_move)(int x, int y);
        void (*fill_line)(int x, int y);
        void (*fill_close)(void);
        void (*fill_end)(void);

        font_t (*text_findfont)(const char *name, int xsize, int ysize);
        void (*text_losefont)(font_t handle);
        void (*text_getstringsize)(font_t, stringbounds_t *bounds, int xlimit, const char *str, int strsize, char split_char);
        bounds_t (*text_getemsize)(font_t);
        coords_t (*text_paint)(font_t handle, int xb, int yb, unsigned long bg, unsigned long fg, const char *str, int strsize);
        coords_t (*text_paintattrib)(font_t handle, int x, int y, int x1, int y1, const char *str, int strsize, unsigned long bg, unsigned long fg, unsigned long attrib);
    } gcontext_t;

(Find the sources here: https://github.com/gerph/riscos-presenter/blob/master/h/gcontext)

Inside your code that’s doing the line manipulation you never have anything at all that’s related to the external structures that the OS provides. If you find yourself doing something like sizing a text string as a font, in the middle of adding the line, you’re doing things wrong – don’t mix underlying system calls with your business logic. If you do that, then you’ll find that if you need to work with different architectures or systems, you’re going to be working all over the codebase, not in isolated sections of code. Even if you’re using interfaces that you know need to be system calls, move them to an implementer function that does the semantics of the operation you want. For example, you could do a

Why do this?
What does this mean for moving to 64bit? It means all the world. If you can prove that your main logic is sound in itself with tests, then once you have a 64bit system you can know that it’s still going to work because those tests will tell you. “All you have to do is” implement the underlying system calls that might have changed. And that’s trivial compared to the business logic of the code (assuming you haven’t mixed the two together). If you then run the same code on Linux then you can very easily see that it’s going to work in a 64bit environment. If you want a possible example of how mixing your system calls and business logic makes it hard to know whether it will be retargettable to a different system, look at most of RISC_OSLib. For an example that is similar to the text area example I’ve just discussed, look at TextGadgets – for example https://gitlab.riscosopen.org/RiscOS/Sources/Toolbox/Gadgets/-/blob/master/c/TextMan#L1346 which is a rendering operation that seems to have been abstracted to a ‘draw the text region’, but actually launches into system calls from the start. This makes it a lot harder to decide if you need to change those parts of the code to make them safe. Of course, if you’re doing lower level things then you’re going to have bigger problems with interface changes, but again, keeping the abstraction of what you’re trying to do from how you do it will help. If for example you wanted to build a page table definition for the layout of the operating system (say, if you were implementing that in the Kernel), you could use a descriptor table that said what you wanted to happen, and have the functional implementation write the resulting page table descriptor into memory, with suitable operations to make it live (similar to what the HAL does right now, or the system init environment did in Select). Your code above the page table initialisation is exactly the same. It’s only the architecture specific part you have that’s different. 
And, again, if that code is isolatable you can compile and run that code in user mode on RISC OS right now, or on Linux or whatever, and check whether the data structures you’ve created are correct. Are there other things that you can do? Of course, but it will depend on the domain you’re working in. Module SWI interfaces will need to be different in many cases, so design the interface to fit what you want right now, and ensure that the back end code can be made to work if the external interface changes – a veneer that takes a structure of pointers to memory (+0 → 32bit word, +4 → 32bit word, etc) and converts it to a set of pointers for the general code in your module will work fine right now. And then when a 64bit system comes along and you need to have a different structure you either change the API definition in 64bit or offer an alternative SWI interface that is for the 64bit memory environment. All your back end code stays the same, but your veneer that takes the user’s input changes to handle it. Bit manipulation and magic values? Don’t rely on the size of a word, or the values when signed or unsigned. So if you take a 32bit value, ensure that you really are dealing with it in 32bits, but if you take an address pointer, treat it differently. In C you can use different types for these, obviously. If your interfaces take signed values, use the signed values. Never use 0xFFFFFFFF to mean -1, and take care not to look for -ve values when you really meant “I want -1” (eg a non-handle might be -1, but if you check for -ve you might find it gets tripped up in the future when pointers are up in high memory – that’s something that already happens but you should take care). Never try to stash more values into a word – if you have a 64bit value that could be an address or a value, and you used bit 63 to indicate that this was a number not a pointer, that’s fine in 32bit memory space, but not in 64bit memory space. These are all usual things to think on. 
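The points about sentinel values and word sizes can be sketched in C. This is only an illustration under stated assumptions: `handle_t`, `HANDLE_NONE`, `handle_is_valid` and `low_flags` are hypothetical names, not part of any RISC OS header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical handle type: pointer-sized, so it grows with the address space. */
typedef uintptr_t handle_t;

/* The "no handle" sentinel is all bits set at whatever width handle_t has,
 * rather than a hard-coded 0xFFFFFFFF. */
#define HANDLE_NONE ((handle_t)-1)

/* Fragile: treats any value with the top bit set as invalid, so a real
 * handle that happens to be an address in high memory is rejected. */
static int handle_is_valid_fragile(handle_t h)
{
    return (intptr_t)h >= 0;
}

/* Robust: only the exact sentinel is treated as "no handle". */
static int handle_is_valid(handle_t h)
{
    return h != HANDLE_NONE;
}

/* A value that is defined to be 32 bits wide stays 32 bits wide,
 * independently of the pointer size of the target. */
static uint32_t low_flags(uint32_t word)
{
    return word & UINT32_C(0x000000FF);
}
```

A handle whose top bit is set passes `handle_is_valid` but fails `handle_is_valid_fragile`, which is the "checking for -ve when you meant -1" trap described above.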
I’ve mentioned not mixing system calls into your business logic, but I’ll come back to a thing that I am very clear on… assembler has very little place in anything but performance critical code. Don’t write assembler – you’ll only have to write it again. Never inline assembler unless you have no other recourse. Performance of unmaintainable code is irrelevant. If you perpetuate such bad practices, you hurt yourself in the future by making things harder. If you need to use assembler, isolate it to separate implementation files and libraries – the same as you do with the system calls. Make sure that it’s easy to swap in and out. And then what about testing your module? Same principle as I’ve laid out previously – write your code to be able to be run in USR mode, either by stubbing, dummy functions or other mechanisms, and check that you’ve got something that works. Include the positive and negative cases. Include stupid cases. If you find a bug, write a test for it. Show that you can check the behaviour works on the current system, and when a new system comes along, you re-use those tests to check that the behaviour is the same. Then when it isn’t the same, you modify the code and re-test on the 32bit environment and see that you haven’t broken things. Yes, a lot of this is about structuring your code well, in more general, retargettable and testable ways. Yes, there’s a lot of “make it testable”. You cannot know what’s going to come, but you can make what you have done easily modifiable to make it so that it will work in a different world. Does this answer your questions about how you make your module 64bit ready? No. Can I answer that question? No. Can anyone ? No. Because such a system doesn’t exist yet, and the interfaces are not defined. 
But if you do all of the above – which you should be doing anyway because it will improve the maintenance, reliability and reusability of your code, you’ll find that you are in a much better position to just move a few things around, change a couple of interfaces and be able to meet the needs of a different system. |
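As a rough sketch of the stubbing approach described in this post, the following C fragment keeps a small piece of layout logic separate from rendering and tests it against a stub. The interface is a hypothetical, cut-down cousin of `gcontext_t`; `renderer_t`, `draw_lines` and the `stub_*` names are invented for illustration and do not come from the linked sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down rendering interface in the spirit of gcontext_t. */
typedef struct renderer_s {
    int  (*text_width)(const char *str);               /* width of a string */
    void (*text_paint)(int x, int y, const char *str); /* plot a string */
} renderer_t;

/* Business logic: right-align each line within `width`, one line per row.
 * It knows nothing about fonts, VDU or SWIs. Returns the number of lines. */
static int draw_lines(const renderer_t *r, const char *const *lines,
                      size_t count, int width, int line_height)
{
    size_t i;
    for (i = 0; i < count; i++) {
        int x = width - r->text_width(lines[i]);
        r->text_paint(x, (int)i * line_height, lines[i]);
    }
    return (int)count;
}

/* Test stub: every character is 8 units wide; paints are only recorded. */
static int stub_last_x, stub_last_y, stub_paint_calls;

static int stub_text_width(const char *str)
{
    return (int)strlen(str) * 8;
}

static void stub_text_paint(int x, int y, const char *str)
{
    (void)str;
    stub_last_x = x;
    stub_last_y = y;
    stub_paint_calls++;
}

static const renderer_t stub_renderer = { stub_text_width, stub_text_paint };
```

A test program passes `&stub_renderer` to `draw_lines` and checks the recorded coordinates. A real build would supply a second `renderer_t` whose functions make the actual OS calls, and that would be the only part needing attention if those calls change.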
Colin Ferris (399) 1822 posts |
Reading through this – it seems Gerph, wanting to produce new SWIs, creates an empty module of which the SWI calls a star command which can be debugged in USR mode. Which I suppose can be tested on Win/Linux machines. |
Charles Ferguson (8243) 429 posts |
Um… I’m not sure what you mean there. I’m not suggesting anything at all about what you should do with your module or how you should provide an API. And definitely I would never recommend that any SWI call should ever call a *Command. *Commands can replace the running application, whilst SWI calls can be used for any operations at all that you want to use, including those on ticker events and callbacks. Because SWIs can be used at any time, you must never call a *Command that might potentially disrupt the running application, which is something that a *Command can do. So no, I never want you to call *Commands from SWIs.

I’m saying that the underlying code that you produce should be retargettable to be able to be used (and tested) in different environments. I expect that you would recompile, or reuse, the code that you intend for the module in the safer environment (like USR mode on RISC OS, or on other systems). You don’t even go near the module itself to do that testing of the main code. However, this is talking largely about how you write your code to be testable – have a look at https://www.iconbar.com/forums/viewthread.php?threadid=12842&page=1 for a little more discussion about how to split your code up to be more testable and isolated. There was a presentation on this here: https://presentations.riscos.online/testing/index.html – I don’t think there was ever a video made of it.

If you want a more concrete example of a module that has isolated code in the manner that I suggest, take a look at the IconBorderFob module – https://github.com/gerph/iconborders-fob . This module provides an interface to render borders on Wimp icons. As such it registers a filter on the Wimp (using the IconFilters in the FilterManager) and replaces the border rendering code.
You never want to be writing such code only in the module because during development and testing, you’re going to make mistakes and the RISC OS problems with lack of isolation of such components mean that you’re likely to lock the system up and need to reboot many times when you mess up. So instead it was developed using reusable and functional components.

It has a library that implements the rendering component of the calls – issuing system calls to draw lines and shapes. This is in the [ch].graphics file. If the icon rendering were to be transferred to Linux and made to use Cairo, this is the only part of the rendering that needs to change. It’s easy to see what it does and how it does it, and if the interfaces on another system (like 64 bit ARM) are different, that’s all that needs to change. It uses different functions rather than a dispatch structure like I suggested for the text handling, above, but it’s the same principle.

The actual border rendering is defined in [ch].borders. This does the actual drawing – you have a defined interface (that looks something like the one used by the filter system) which is used to draw the borders on the screen. It calls the functions in c.graphics, so if you had changed the underlying system, you don’t need to change the business logic of the shape of the icons or how they’re drawn. Again, none of this code cares whether it’s in a module or not.

There’s a test program, c.testborders, which is written to exercise the rendering of the code. This just calls the c.borders interfaces in a similar way to how it would be used in the module, but it’s just an application like any other. You can test it out, see when it crashes or draws things stupidly, and modify the code all in the safety of user space. Or, if you had ported the primitives to Linux or Windows, in those operating systems instead. Not only do you not need to do anything about being a module, you might not even be using RISC OS.
Finally there’s the veneer code – in CMHG + c.veneer. All this does is the simple operation of interfacing between your functional library (c.borders) and the OS interfaces. Since this interfacing is minimal, there will be very little to go wrong so it is very easy to get right. The border code that lies underneath it has already been tested to make sure that it works, when used correctly. So the only thing you need to do in the veneer is make sure you use it correctly.

If the 32bit/64bit interfaces are different, the only parts that interface with the OS are the lower level rendering library and the outer veneer. Either of which is easy to change without affecting the rest of the system. If you had made the veneer code also deal with the business logic of drawing the borders, you’d be mixing the different areas of the code. You’d find that it was a lot harder to make the code safe in different environments – not knowing where the OS interface boundary is, or maybe having many different interface points littered through your code, which increases the amount of work you need to do.

The part that’s running in USR mode is the business logic and you just don’t get involved with the rest of the OS interfaces. Not all modules will have this type of separation, but most will. If you’re heavily implementing an OS interface, then you’re at the mercy of those OS interfaces being different in a new architecture, so maybe you can only isolate your code so much. Even things like filing systems don’t need to run in SVC space. For example, you can make the filesystem code run as a command line tool and check all of its behaviour in user space without having to care about the OS interfaces at all. Only the veneer between the code that does file system addressing and the OS interfaces need actually be system specific. |
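The veneer idea from the posts above (a block of 32-bit words at +0, +4 and so on, converted into whatever the general code uses) might look something like the following minimal sketch. The structure, the five-word layout and the function names are all assumptions made for illustration. They are not taken from IconBorderFob or any real SWI definition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Internal request used by the back end; independent of any caller layout. */
typedef struct border_request_s {
    int x0, y0, x1, y1;
    uint32_t flags;
} border_request_t;

/* Back end: pure logic, callable from an ordinary test program.
 * Returns the area enclosed by the requested box. */
static long border_area(const border_request_t *req)
{
    return (long)(req->x1 - req->x0) * (long)(req->y1 - req->y0);
}

/* Veneer for a caller that passes a block of five 32-bit words:
 * +0 x0, +4 y0, +8 x1, +12 y1, +16 flags.
 * Only this function knows about that layout. */
static void unpack_block32(const void *block, border_request_t *out)
{
    int32_t words[5];
    memcpy(words, block, sizeof words);  /* no alignment assumptions */
    out->x0 = words[0];
    out->y0 = words[1];
    out->x1 = words[2];
    out->y1 = words[3];
    out->flags = (uint32_t)words[4];
}
```

If a later interface passed a differently shaped block, a second unpacking function could sit beside `unpack_block32` while `border_area` and its tests stay untouched.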
Paolo Fabio Zaino (28) 1893 posts |
Beside the detailed and really well thought explanation from Gerph, I would suggest:
Making Your Module 64-bit Ready:
Instead of using datatypes like double or long int directly, do the following:

    typedef long int my_l_int_t;

And then use it as:

    void myfunc(my_l_int_t x, my_l_int_t y) { ... }
https://github.com/RISC-OS-Community/ZVector
This compiles with GCC on RO and on other OSes and with DDE, and runs fine on AArch32/64, x86, x86_64, RISC-V (32 and 64), PPC and even the 68K.

As Gerph mentioned, use libraries! Ideally we should create a repository of libraries that work in both cases and for both Apps and Modules, so that we don’t need to do the same work every single time! IMHO, most of the work is in ensuring that your code is 32/64 neutral.
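One possible way to make "32/64 neutral" concrete is to route every size-sensitive type through project typedefs and check the assumptions at compile time. The sketch below assumes a compiler that provides `<stdint.h>`. The `my_*` names and the `MY_STATIC_ASSERT` macro are hypothetical and are not taken from ZVector.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical project-wide aliases: adjust them in one place per target. */
typedef int32_t   my_word_t;  /* a value that must be exactly 32 bits */
typedef uintptr_t my_addr_t;  /* an address held as an integer */
typedef size_t    my_size_t;  /* an object size or element count */

/* Compile-time check that works without C11's _Static_assert: the array
 * size becomes -1, and the build fails, if the condition is false. */
#define MY_STATIC_ASSERT(cond, name) \
    typedef char my_assert_##name[(cond) ? 1 : -1]

MY_STATIC_ASSERT(sizeof(my_word_t) * CHAR_BIT == 32, word_is_32_bits);
MY_STATIC_ASSERT(sizeof(my_addr_t) >= sizeof(void *), addr_holds_pointer);

/* Example use: the total is computed in my_size_t, whatever its width. */
static my_size_t table_bytes(my_size_t entries)
{
    return entries * sizeof(my_word_t);
}
```

The same source then builds for a 32-bit or 64-bit target, and a wrong assumption shows up as a compile error rather than a runtime surprise.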
Testing for 64-bit Readiness:
I wrote my UltimaVM so that it can actually even run 64 bit code on existing RISC OS 5.28 and all the way down to RISC OS 3.11, compile on 64 and 32 bit hosts, and be built as a module (only a 32 bit module for now). So, it’s completely possible to do everything Gerph has mentioned and more. It just takes more work and more testing on RISC OS than it does on Linux. That is the not fun part, sorry! For collaboration and standardization, I recommend checking out the StandardRepoTemplate on the RISC OS Community GitHub. It includes helpful scripts and a structure conducive to using various code analysis tools (e.g., SonarQube, Fortify, CodeQL, Codacy, GitHub Actions, Sourcetrail and more) to improve code quality and productivity. BTW you helped me initially with this effort, did you stop following it? https://github.com/RISC-OS-Community/StandardRepoTemplate
Declaring 64-bit Readiness: The official declaration of 64-bit readiness likely falls to ROOL, similar to the process for 32-bit readiness. However, the focus should be on ensuring your code is both 32 and 64-bit compatible, emphasizing thorough testing and code abstraction. HTH |
Sprow (202) 1161 posts |
The short answer is you can’t, since we don’t yet know what a 64 bit module will look like, plus the slight issue of not having a 64 bit OS to run it on either, but it’s possible to prepare so at least things are no harder than they need to be in future. The sage words already given about keeping code modular and testable are worth following anyway; don’t wait for a 64 bit OS before starting those good habits! My mental checklist would be:
On that last point, here’s a bit of 32 bit compatible C
and here it is rewritten to be 64 bit compatible
If there does turn out to be a 1:1 mapping between the module header items in a 32 bit world and 64 bit, which is quite a big if, then all you’d need is to run CMHG with some different switches. That’d declare the module as 64 bit in the flags field. For SMP I could imagine CMHG will gain some more CMHG keywords to signal which SWIs are multicore safe. |
Julie Stamp (8365) 474 posts |
Eeek! I think there are at least a hundred components in All.git using _kernel_swi, this is really worrying :-/ |
Charles Ferguson (8243) 429 posts |
No, that’s bad advice.
|
Stuart Swales (8827) 1367 posts |
It wouldn’t hurt to define |
Paolo Fabio Zaino (28) 1893 posts |
I respectfully suggest a different approach. It might be more beneficial to encapsulate these calls within your own abstraction layer, particularly if ROOL has not already provided a design for it. |
Julie Stamp (8365) 474 posts |
That sounds a lot less stressful! I was wondering if cppcheck could help for this sort of thing? Can it be asked to report on casts from pointers to int? |
Charles Ferguson (8243) 429 posts |
I had assumed that Sprow meant only that these calls be /in/ the abstraction layer of libraries, as that had already been mentioned by you, me and others. |
Paolo Fabio Zaino (28) 1893 posts |
It should when you use --enable=all, but, to be 100% sure, I’d run:

    cppcheck --doc

With the specific version of cppcheck you’re using, that will show which checks are provided. You can also run:

    cppcheck --errorlist

There are definitely multiple tools that can report on casts from pointers to int. You should be able to check even just using GCC with -Wint-to-pointer-cast and -Wpointer-to-int-cast (IIRC clang also supports these 2 warnings).
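For reference, the kind of cast those warnings are aimed at, and a portable alternative, can be sketched as follows. The helper names are hypothetical, and the fragile cast is left in a comment so that the example builds cleanly.

```c
#include <assert.h>
#include <stdint.h>

/* Fragile: where pointers are wider than int this loses the top bits,
 * and GCC reports it under -Wpointer-to-int-cast:
 *
 *     int as_int = (int)ptr;
 */

/* Portable: uintptr_t is an integer type intended to hold a converted
 * pointer so that it can be converted back again. */
static uintptr_t pointer_to_integer(const void *ptr)
{
    return (uintptr_t)ptr;
}

static const void *integer_to_pointer(uintptr_t value)
{
    return (const void *)value;
}
```

Funnelling every such conversion through a pair of helpers like these also gives a single place to search for when auditing the code.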
Oh sorry, maybe my English caused a bit of confusion. What I meant was that it is probably best to write a more portable version of _kernel_swi_regs, like:

    typedef union { uintptr_t u; intptr_t s; } register_t;
    typedef struct { register_t r[MAX_REGISTERS]; ... } _my_kernel_regs_t;

and a portable abstraction for the syscall mechanism in place of _kernel_swi, and then within the abstraction translate it back to _kernel_swi, instead of using _swi / _swix. Internally, though, it would not matter that much at that point. However, it would matter in relation to what Julie commented, which was:
So, my limited English understanding suggests she is not keen on rewriting all the code that already uses _kernel_swi, hence even the abstraction doesn’t seem to be an option. Maybe I just misunderstood. |
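A minimal sketch of the abstraction layer suggested above, with pointer-sized registers and a replaceable back end, might look like this. Every name here (`sys_reg_t`, `sys_regs_t`, `sys_call`, `sys_set_backend`, `fake_backend`) is hypothetical. On RISC OS today a real back end would presumably copy the values into `_kernel_swi_regs` and call `_kernel_swi`, but that part is deliberately left out so the example runs anywhere.

```c
#include <assert.h>
#include <stdint.h>

#define SYS_MAX_REGISTERS 10

/* Pointer-sized register value: 32 bits today, wider on a 64-bit target. */
typedef union {
    uintptr_t u;
    intptr_t  s;
} sys_reg_t;

typedef struct {
    sys_reg_t r[SYS_MAX_REGISTERS];
} sys_regs_t;

/* Back end signature: returns 0 on success, non-zero on error. */
typedef int (*sys_backend_fn)(unsigned call_number,
                              const sys_regs_t *in, sys_regs_t *out);

static sys_backend_fn sys_backend;  /* no back end installed initially */

static void sys_set_backend(sys_backend_fn fn)
{
    sys_backend = fn;
}

/* The only entry point the rest of the code uses for system calls. */
static int sys_call(unsigned call_number,
                    const sys_regs_t *in, sys_regs_t *out)
{
    if (sys_backend == 0)
        return -1;
    return sys_backend(call_number, in, out);
}

/* Fake back end for tests: copies the registers and adds the call number
 * to r[0], so a test can see that the call went through. */
static int fake_backend(unsigned call_number,
                        const sys_regs_t *in, sys_regs_t *out)
{
    *out = *in;
    out->r[0].u = in->r[0].u + call_number;
    return 0;
}
```

Code written against `sys_call` never names a particular SWI mechanism, so swapping the back end is a change in one file.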
Dave Higton (1515) 3559 posts |
Has anyone done any work on a 64 bit HAL? |
tymaja (278) 178 posts |
Some thoughts (very good advice above!): Writing the module in C is a good idea. What I would do for anything, except my ARM64 BASIC, would be to write it in C, and when it works, test performance, and make a few ‘alternate’ versions of areas that are slow in C, and code those in ARM32 or ARM (a challenge, of course, is register widths if dropping to ARM32 assembly from C). Another thing is ‘special registers’: Register 18 is reserved on iOS, macOS, Linux, Windows etc, so don’t touch that register! Use macros for all stack operations, because X19, X30, SP need to be preserved across function calls too. X0-X7 are used for passing arguments in the ABI. This is useful: https://developer.arm.com/documentation/102374/0102/Procedure-Call-Standard For Linux, x9-x15: your code can ‘trash’ these; the caller has to save these before calling a function (caller-save). This may be helpful when doing small sections of ARM64 code in a C module! |
David J. Ruck (33) 1649 posts |
No, and no again. If it is slow in C with modern compilers (particularly on 64 bit ARM), re-writing in assembler is a fool’s errand: tiny gains at best. The way to make code faster is to come up with better algorithms. |
tymaja (278) 178 posts |
I agree with most of this and that most compilers are impressive nowadays – I wrote a ‘VDU-like’ framebuffer text plotter in C, without any optimisation, on RPi5, which used multiplies and many adds in the inner loop. It is fast despite this, and when moving the multiplies out of the inner loop, it actually slows down a bit – showing that it detected the redundant arithmetic in the inner loop, and optimised it rather well. Which I found impressive! I do believe there will always be a role for assembler in coding, but do agree it can sometimes give small gains for a lot of work compared to C. One of my interests is realtime 3D graphics, which is why I tend to gravitate towards assembler (although, again, the 3D drivers are the biggest overhead these days, so hand-optimised assembler won’t do much unless it is causing a slowdown that can’t be fixed any other way). At least one role for assembler would be to use the Carry bit. It is possible to manually ‘emulate’ carry in C, or use someone else’s libraries, but assembler can do a 64 / 128 bit int add in 2 instructions using ADC. Another would be in low level system code (including modules, if changing the mode / EL). Although most modules could be in USR mode anyway (like BASIC!) |
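For anyone wondering what manually emulating carry in C looks like, here is a small sketch of a 128-bit add built from two 64-bit halves. The `u128_t` type and `u128_add` function are made up for this example, and whether a given compiler turns it into an add followed by an add-with-carry is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit unsigned value held as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_t;

static u128_t u128_add(u128_t a, u128_t b)
{
    u128_t sum;
    uint64_t carry;

    sum.lo = a.lo + b.lo;          /* unsigned arithmetic wraps on overflow */
    carry = (sum.lo < a.lo);       /* wrapped result is smaller: carry out */
    sum.hi = a.hi + b.hi + carry;  /* propagate the carry into the top half */
    return sum;
}
```

The comparison `sum.lo < a.lo` stands in for the carry flag: an unsigned add wrapped exactly when the result is smaller than one of its operands.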
Rick Murray (539) 13908 posts |
The less, the better. In RISC OS, for making API calls, though this is usually hidden away in some library code. Elsewhere? Hardware glue logic, like the interrupt handler between the exception and when it hands over to C code. And… That’s about it. People don’t even program little microcontrollers in assembler these days – only very specific time/cycle critical code, but whether or not there is any depends upon the application, making a cheap oscilloscope, yes, making a smart toaster, not so much.
? When working in a higher level language, you shouldn’t be concerned with such things. Let the compiler worry about those details. Or do you mean the various quirky ways the carry bit is used in the RISC OS API? That’s why we have _kernel_swi_c() and it’s part of what I meant about the API written in assembly for assembler programmers. To put that last part into context, let me quote from the Arthur PRM, in particular the second part here: |
Steve Fryatt (216) 2112 posts |
I’d hope that any half competent compiler could do this, too (at least for 64-bit, and anywhere that implements 128). |
Rick Murray (539) 13908 posts |
Example program I threw together:

    #include <stdio.h>

    // WhyTF didn't they make a new data
    // type for 64 bit? "long long" is dumb.
    unsigned long long bigone = 0x1122334455667788;
    unsigned long long bigtwo = 0x2233445566778899;

    int main(int argc, char *argv[])
    {
        // This convoluted nonsense is to stop
        // the compiler just doing the calculation
        // itself and storing the result. ;)
        if ( argc > 1 )
            bigone = bigone * argc;

        bigone += bigtwo;

        printf("The result is &%llX.\n", bigone);

        return 0;
    }

Here’s the addition that was compiled.

    000080C0 : .... : E8950003 : LDMIA R5,{R0,R1}
    000080C4 : .... : E893000C : LDMIA R3,{R2,R3}
    000080C8 : .... : E0900002 : ADDS R0,R0,R2
    000080CC : .... : E0A11003 : ADC R1,R1,R3

This isn’t the late ‘80s/early ’90s. There are few good reasons to use assembler these days, and plenty of good reasons why you shouldn’t. RISC OS itself, ironically, being one of them. ;) |