Getting modules ready for 64-bit RISC OS

56 posts, 18 voices

Pages: 1 2 3

Mar 26, 2024 9:21am Julie Stamp (8365) 480 posts	When I’m working on OS modules at the moment I’m aware there are some big changes ahead (64-bit and SMP), and I’d like to be able to write my code now to work with that, so I don’t have to go back and change things later. I hope people have some ideas or checklists to help with this… How can I make my module 64-bit ready? How can I test if my module is 64-bit ready? Once I’ve reviewed my module, how can I declare that I think it’s 64-bit ready?

Mar 26, 2024 1:08pm David J. Ruck (33) 1696 posts	It’s impossible to make a module 64 bit ready until the a 64 bit RISC OS API has been published. Just write the module now, and in 20 years time you can worry about making the changes for 64 bit.

Mar 26, 2024 1:44pm Rick Murray (539) 14047 posts	☝️ This. I’ll worry about 64 bit when I see it. PS: Only twenty years, Druck?

Mar 26, 2024 1:48pm Rick Murray (539) 14047 posts	On a more serious note… Once I’ve reviewed my module, how can I declare that I think it’s 64-bit ready? You don’t. ARM64 != ARM32 (or ARM64<>ARM32 for those who read BASIC) So unless somebody devises some weird mojo of stuffing two binaries into one module, you WILL NOT (categorically and without exception) get an ARM64 processor to execute anything that runs on RISC OS today whilst in 64 bit mode. ARM64 is not a superset of ARM32. It’s something completely different. Different encoding, different behaviour, different register sets…

Mar 26, 2024 8:39pm David J. Ruck (33) 1696 posts	So unless somebody devises some weird mojo of stuffing two binaries into one module I think Julie who has written some modules in C, is thinking more on what sort of thing you should do/avoid to enable the code to be easier to port to the eventual 64 bit API. In the same way you can write C programs which will work on both 32 and 64 bit versions of Linux and Windows, by being careful with explicitly sized variable types, and then it’s just a compile option to build a 64 bit variant. But until the 64 bit RISC OS API and is defined and we get C libraries for it, we can’t offer similar advice. PS: Only twenty years, Druck? I’m ever the optimist.

Mar 27, 2024 7:36am Colin Ferris (399) 1847 posts	Perhaps working a way of running modules in USR mode – for use in Linux RO.

Mar 28, 2024 8:17pm Charles Ferguson (8243) 438 posts	It seems that most of the responses here are negative or sarcastic. I hope that I can offer something more concrete, which doesn’t drift too far into “told you so”. Also, please treat ‘you’ as ‘anyone writing RISC OS code’, not you specifically, Julie. Getting modules (or for that matter, any other application or tool) into a state for which they are capable of being used in a different architecture has been on the cards for a very long time. Fortunately, if you’ve been doing the sensible thing and writing your modules in C or other high level language, you’ve been working in the right direction. If you have been writing your code in a structured manner, with libraries to implement the operations you want to perform, and with typing that allows you to change behaviour, you’ll have been moving in the right direction. If you’re been writing tests to ensure that the behaviour of those libraries and your component were working correctly, you’ll have been going in the right direction. If you’ve been focused on what you’re trying to do, rather than how you do it, then you’ll have been going in the right direction. Fortunately, these are things that people should have been doing anyhow. They allow you to write well structured and maintainable code which is testable, and easily retargetable, should the interfaces change in the future. So, having said what you should have been doing – fully with the knowledge that it’s unlikely – I’ll try to expand on this a little. If you write your code in small libraries that do the operation you want to do, with a clear interface and expectations, it will make the code easier to change if you need to change the underlying system that you want to use. It will make it easier to test in whatever system you are running it in. Let’s make a contrived example. Let’s say you have a module which provides a rendering system for lines of text – an area that shows text, maybe. The code that handles the list of text lines, functions to allow content to be added and removed, inserted and other operations. All of this code should be in a library. But ONLY that code. None of the code that knows about rendering, or fonts, or anything else. Why? Because it’s not relevant to the region of text. You can then write tests for that code, and make sure that it does the right thing. You can then provide some rendering routines in a library, but – and this is important – you only do the rendering concepts in the library. You don’t actually do the operations that will put anything on the screen. You can stub those routines, and allow them to be entirely divorced from the code that’s trying to render the code. A function for text size, a function for text plotting, a function for line drawing, maybe. In your test implementation you make those do nothing or return dummy data. Maybe you provide an implementation that uses the VDU system, rather than the Font system. The point is that the OS interfaces – the parts that you’re going to find which might change and might be different in the future, you’ve kept away from the business logic of adding lines, or working out how a line should be layed out. In C, you might do this with a structure that defines how you do the rendering with function pointers. In C++ you have a class that defines the rendering methods for a context. In assembler, you implement similar things but using a dispatch table for operations. In Python, you implement a class that has different rendering methods like in C++. If you were to do this in C it might look like this: typedef struct gcontext_s { void (rectangle_fill)(unsigned long colour, int x0, int y0, int x1, int y1); void (rectangle_outline)(unsigned long colour, int x0, int y0, int x1, int y1); void (triangle_fill)(unsigned long colour, int x0, int y0, int x1, int y1, int x2, int y2); void (line_start)(unsigned long line_colour); void (line_line)(int x0, int y0, int x1, int y1); void (line_end)(void); void (fill_start)(unsigned long fill_colour); void (fill_move)(int x, int y); void (fill_line)(int x, int y); void (fill_close)(void); void (fill_end)(void); font_t (text_findfont)(const char name, int xsize, int ysize); void (text_losefont)(font_t handle); void (text_getstringsize)(font_t, stringbounds_t bounds, int xlimit, const char str, int strsize, char split_char); bounds_t (text_getemsize)(font_t); coords_t (text_paint)(font_t handle, int xb, int yb, unsigned long bg, unsigned long fg, const char str, int strsize); coords_t (text_paintattrib)(font_t handle, int x, int y, int x1, int y1, const char str, int strsize, unsigned long bg, unsigned long fg, unsigned long attrib); } gcontext_t; (Find the sources here: https://github.com/gerph/riscos-presenter/blob/master/h/gcontext) Inside your code that’s doing the line manipulation you never have anything at all that’s related to the external structures that the OS provides. If you find yourself doing something like sizing a text string as a font, in the middle of adding the line, you’re doing things wrong – don’t mix underlying system calls with your business logic. If you do that, then you’ll find that if you need to work with different architectures or systems, you’re going to be working all over the codebase not in isolated sections of code. Even if you’re using interfaces that you know need to be system calls, move them to a implemetor function that does the semantics of the operation you want. For example, you could do a `_kernel_swi(Font_ScanString)` inline with your operations if you really wanted to (ignoring the above advice), but that means that you have to understand the semantics of the (rather complex) Font_ScanString call. If, on the other hand, you called `get_string_bounding_box` and implemented the call in that function, you then have a very clear intention in the code that is calling it, and if you need to switch out the `Font_ScanString` call, it is significantly easier to do so. Why do this? It makes testing easier – because you can stub everything and make it less reliant on the system that it’s running on. It makes re-targetting to other systems easier – because that same business logic will compile on Linux and Windows. And probably your tests will run on those systems too. And wouldn’t you know it, those systems might have an underlying 64bit architecture, so if you can handle things running on those alien systems, you’re going the right way. It makes running in different environments easier – the above example has nothing to do with running in a module, and shouldn’t because the environment it runs in is irrelevant to the operations that are performed. If you want the code to be in an application or command line tool or module, the code is the same. It’s just the veneers that are different. It makes running with a different underlying implementation easier because of the above – if you change the underlying system calls you need to use, the tests that checked the behaviour of the libraries would still be usable. What does this mean for moving to 64bit? It means all the world. If you can prove that your main logic is sound in itself with tests, then once you have a 64bit system you can know that it’s still going to work because those tests will tell you. “All you have to do is” implement the underlying system calls that might have changed. And that’s trivial compared to the business logic of the code (assuming you haven’t mixed the two together). If you then run the same code on Linux then you can very easily see that it’s going to work in a 64bit environment. If you want a possible example of how mixing your system calls and business logic makes it hard to know whether it will be retargettable to a different system, look at most of RISC_OSLib. For an example that is similar to the text area example I’ve just discussed, look at TextGadgets – for example https://gitlab.riscosopen.org/RiscOS/Sources/Toolbox/Gadgets/-/blob/master/c/TextMan#L1346 which is a rendering operation that seems to have been abstracted to a ‘draw the text region’, but actually launches into system calls from the start. This makes it a lot harder to decide if you need to change those parts of the code to make them safe. Of course, if you’re doing lower level things then you’re going to have bigger problems with interface changes, but again, keeping the abstraction of what you’re trying to do from how you do it will help. If for example you wanted to build a page table definition for the layout of the operating system (say, if you were implementing that in the Kernel), you could use a descriptor table that said what you wanted to happen, and have the functional implementation write the resulting page table descriptor into memory, with suitable operations to make it live (similar to what the HAL does right now, or the system init environment did in Select). Your code above the page table initialisation is exactly the same. It’s only the architecture specific part you have that’s different. And, again, if that code is isolatable you can compile and run that code in user mode on RISC OS right now, or on Linux or whatever, and check whether the data structures you’ve created are correct. Are there other things that you can do? Of course, but it will depend on the domain you’re working in. Module SWI interfaces will need to be different in many cases, so design the interface to fit what you want right now, and ensure that the back end code can be made to work if the external interface changes – a veneer that takes a structure of pointers to memory (+0 → 32bit word, +4 → 32bit word, etc) and converts it to a set of pointers for the general code in your module will work fine right now. And then when a 64bit system comes along and you need to have a different structure you either change the API definition in 64bit or offer an alternative SWI interface that is for the 64bit memory environment. All your back end code stays the same, but your veneer that takes the user’s input changes to handle it. Bit manipulation and magic values? Don’t rely on the size of a word, or the values when signed or unsigned. So if you take a 32bit value, ensure that you really are dealing with it in 32bits, but if you take an address pointer, treat it differently. In C you can use different types for these, obviously. If your interfaces take signed values, use the signed values. Never use 0xFFFFFFFF to mean -1, and take care not to look for -ve values when you really meant “I want -1” (eg a non-handle might be -1, but if you check for -ve you might find it gets tripped up in the future when pointers are up in high memory – that’s something that already happens but you should take care). Never try to stash more values into a word – if you have a 64bit value that could be an address or a value, and you used bit 63 to indicate that this was a number not a pointer, that’s fine in 32bit memory space, but not in 64bit memory space. These are all usual things to think on. I’ve mentioned not mixing system calls into your business logic, but I’ll come back to a thing that I am very clear on… assembler has very little place in anything but performance critical code. Don’t write assembler – you’ll only have to write it again. Never inline assembler unless you have no other recourse. Performance of unmaintainable code is irrelevant. If you perpetuate such bad practices, you hurt yourself in the future by making things harder. If you need to use assembler, isolate it to separate implementation files and libraries – the same as you do with the system calls. Make sure that it’s easy to swap in and out. And then what about testing your module? Same principle as I’ve laid out previously – write your code to be able to be run in USR mode, either by stubbing, dummy functions or other mechanisms, and check that you’ve got something that works. Include the positive and negative cases. Include stupid cases. If you find a bug, write a test for it. Show that you can check the behaviour works on the current system, and when a new system comes along, you re-use those tests to check that the behaviour is the same. Then when it isn’t the same, you modify the code and re-test on the 32bit environment and see that you haven’t broken things. Yes, a lot of this is about structuring your code well, in more general, retargettable and testable ways. Yes, there’s a lot of “make it testable”. You cannot know what’s going to come, but you can make what you have done easily modifiable to make it so that it will work in a different world. Does this answer your questions about how you make your module 64bit ready? No. Can I answer that question? No. Can anyone ? No. Because such a system doesn’t exist yet, and the interfaces are not defined. But if you do all of the above – which you should be doing anyway because it will improve the maintenance, reliability and reusability of your code, you’ll find that you are in a much better position to just move a few things around, change a couple of interfaces and be able to meet the needs of a different system.

Mar 29, 2024 9:18am Colin Ferris (399) 1847 posts	Reading through this – it seems Gerf wanting to produce new SWIs creates a empty module of which the SWI calls a star command which can be debuged in USR mode. Which I suppose can be tested on Win/Linux machines.

Mar 29, 2024 10:20am Charles Ferguson (8243) 438 posts	Reading through this – it seems Gerf wanting to produce new SWIs creates a empty module of which the SWI calls a star command which can be debuged in USR mode. Um… I’m not sure what you mean there. I’m not suggesting anything at all about what you should do with your module or how you should provide an API. And definately I would never recommend any SWI calls should ever call a command. Commands can replace the running application, whilst SWI calls can be used for any operations at all that you want to use, including those on ticker events and callbacks. Because SWIs can be used at any time, you must never call a Command that might potentially disrupt the running application, which is something that a Command can do. So no, I never want you to calls *Commands from SWIs. I’m saying that the underlying code that you produce should be retargettable to be able to be used (and tested) in different environments. I expect that you would recompile, or reuse, the code that you intend for the module in the safer environment (like USR mode on RISC OS, or on other systems). You don’t even go near the module itself to do that testing of the main code. However, this is talking largely about how you write your code to be testable – have a look at https://www.iconbar.com/forums/viewthread.php?threadid=12842&page=1 for a little more discussion about how to split your code up to be more testable and isolated. There was a presentation on this here: https://presentations.riscos.online/testing/index.html – I don’t think there was ever a video made of it. If you want a more concrete example of a module that has isolated code in the manner that I suggest, take a look at the IconBorderFob module – https://github.com/gerph/iconborders-fob . This module provides an interface to render borders on Wimp icons. As such it registers a filter on the wimp (using the IconFilters in the FilterManager) and replaces the border rendering code. You never want to be writing such code only in the module because during development and testing, you’re going to make mistakes and the RISC OS problems with lack of isolation of such components mean that you’re likely to block the system away and need to reboot many times when you mess up. So instead it was developed using reusable and functional components. It has a library that implements that rendering component of the calls – issuing system calls to draw lines and shapes. This is in the [ch].graphics file. If the icon rendering were to be transferred to linux and made to use Cairo, this is the only part of the rendering that needs to change. It’s easy to see what it does and how it does it, and if the interfaces on another system (like 64 bit ARM) are different, that’s all that needs to change. It uses different functions rather than a dispatch structure like I suggested for the text handling, above, but it’s the same principle. The actual border rendering is defined in [ch].borders. This does the actual drawing – you have a defined interface (that looks something like the one used by the filter system) which is used to draw the borders on the screen. It calls the functions in c.graphics, so if you had changed the underlying system, you don’t need to change the business logic of the shape of the icons or how they’re drawn. Again, none of this code cares whether it’s in a module or not. There’s a test program, c.testborders, which is written to exercise the rendering of the code. This just calls the c.borders interfaces in a similar way to how it would be used in the module, but it’s just an application like any other. You can test it out, see when it crashes or draws things stupidly, and modify the code all in the safety of user space. Or, if you had ported the primitives to linux or Windows, in those operating systems instead. Not only aren’t you needing to do anything about being a module, you might not even be using RISC OS. Finally there’s the veneer code – in CMHG + c.veneer. All this does is the simple operation of interfacing between your functional library (c.borders) and the OS interfaces. Since this interfacing is minimal, there will be very little to go wrong so it is very easy to get right. The border code that lies underneath it has already been tested to make sure that it works, when used correctly. So the only thing you need to do in the veneer is make sure you use it correctly. If the 32bit/64bit interfaces are different, the only parts that interface with the OS are the lower level rendering library and the outer veneer. Either of which is easy to change without affecting the rest of the system. If you had made the veneer code also deal with the business logic of drawing the borders, you’d be mixing the different areas of the code. You’d find that it was a lot harder to make the code safe in different environments – not knowing where the OS interface boundary is, or maybe having many different interface points littered through your code, which increases the amount of work you need to do. The part that’s running in USR mode is the business logic and you just don’t get involved with the rest of the OS interfaces. Not all modules will have this type of separation, but most will. If you’re heavily implementing a OS interface, then you’re at the mercy of those OS interfaces being different in a new architecture, so maybe you can only isolate your code so much. Even things like filing systems don’t need to run in SVC space. For example, you can make the filesystem code run as a command line tool and check all of its behaviour in user space without having to care about the OS interfaces at all. Only the veneer between the code that does file system address and the OS interfaces need actually be system specific.

Mar 30, 2024 6:12pm Paolo Fabio Zaino (28) 1933 posts	When I’m working on OS modules at the moment I’m aware there are some big changes ahead (64-bit and SMP), and I’d like to be able to write my code now to work with that, so I don’t have to go back and change things later. I hope people have some ideas or checklists to help with this… Beside the detailed and really well thought explanation from Gerph, I would suggest: How can I make my module 64-bit ready? Making Your Module 64-bit Ready: Prefer C/C++ Over Assembly: Use C or C++ for your codebase to avoid architecture-specific issues. If assembly is necessary, isolate it in external files (e.g., myproject.arch.arm32), allowing for easy addition of myproject.arch.arm64 support later. Use GCC for 64-bit Testing: While DDE’s full support for AArch64 is pending, GCC offers a way to ensure your code is internally 64-bit ready by testing on AArch64 environments. Make DDE your secondary target to validate compatibility. Abstract Data Types: Instead of directly using data types like long int, define your own types that can be easily adjusted for different architectures: // Instead of using datatypes like double or long int do the following: typedef long int my_l_int_t; And then use it as: void myfunc(my_l_int_t x, my_l_int_t y) { ... } Abstract Software Interrupts (SWIs): To facilitate future adjustments without causing issues, abstract (or wrap) malloc with your own so you can switch as needed. Ensure Portability: I have already written libraries that compile and work fine on AArch64 and AArch32, here is an example: https://github.com/RISC-OS-Community/ZVector This compiles with GCC on RO and on other OSes and with DDE, runs fine on AArch32/64, x86, x86_64, RISC-V (32 and 64), PPC and even the 68K As Gerph mentioned, use libraries! Ideally we should create a repository of libraries that work in both cases and for both Apps and Modules, so that we don’t need to do the same work every single time! IMHO, most of the work is in ensuring that your code is 32/64 neutral. How can I test if my module is 64-bit ready? Testing for 64-bit Readiness: Abstract Your Code for Unit Testing: This approach enables you to test your module’s logic as a regular application, streamlining the identification and resolution of issues. Manage Memory Allocation: Implement wrappers for memory management functions like malloc (and using your custom datatypes for pointers) to monitor and control memory allocation, aiding in detecting potential 64-bit related issues because pointers will be different. I wrote my UltimaVM so that it can actually even runs 64 bit code on existsing RISC OS 5.28 and all the way down to RISC OS 3.11, compile on 64 and 32 bit hosts and can be built as a module (only 32 bit module for now). So, it’s completely possible to do everything Gerph has mentioned and more. It just takes more work and more testing on RISC OS than it does on Linux. That is the not fun part, sorry! For collaboration and standardization, I recommend checking out the StandardRepoTemplate on the Riscos Community GitHub. It includes helpful scripts and a structure conducive to using various code analysis tools (e.g., SonarQube, Fortify, CodeQL, Codacy, GitHub Actions, Sourcetrail and more) to improve code quality and productivity. BTW you helped me initially with this effort, did you stop following it? https://github.com/RISC-OS-Community/StandardRepoTemplate Testing your code in privileged mode: Until there is an official RISC OS AArch64 API and 64 bit modules requirement list, it’ll be not much use. However, if your code is solid and 32/64 bit, you can test the module as a 32 bit module and have an approximate idea. Once I’ve reviewed my module, how can I declare that I think it’s 64-bit ready? Declaring 64-bit Readiness: The official declaration of 64-bit readiness likely falls to ROOL, similar to the process for 32-bit readiness. However, the focus should be on ensuring your code is both 32 and 64-bit compatible, emphasizing thorough testing and code abstraction. HTH

Mar 30, 2024 10:02pm Sprow (202) 1168 posts	How can I make my module 64-bit ready? How can I test if my module is 64-bit ready? The short answer is you can’t since we don’t yet know what a 64 bit module will look like, plus the slight issue of not having a 64 bit OS to run it on either, but it’s possible to prepare so at least things are no harder than they need to be in future. The sage words already given about keeping code modular and testable is worth doing anyway, don’t wait for a 64 bit OS before starting doing those good habits! My mental checklist would be: Use a high level language, not assembler For C, use the definite width types in <stdint.h> rather than ‘int’ and ‘long’ For C, use size_t and [u]intptr_t when counting through arrays and messing with pointer arithmetic Watch out for uses of the SWIs and service calls which are known to have baked in 32 bit-isms Use the shared makefile fragments. These already select all the tool switches for you when building things for 26 bit APCS-R and 32 bit APCS-32, so would do similar for 64 bit tools too you’d hope Use _swix() and _swi() rather than _kernel_swi() because the latter takes an array of int registers, which will truncate the input arguments while _swix() and _swi() don’t On that last point, here’s a bit of 32 bit compatible C `_swix(OS_ValidateAddress, _INR(0,1) \| _OUT(_FLAGS), start, end, &carry);` and here it is rewritten to be 64 bit compatible `_swix(OS_ValidateAddress, _INR(0,1) \| _OUT(_FLAGS), start, end, &carry);` Once I’ve reviewed my module, how can I declare that I think it’s 64-bit ready? If there does turn out to be a 1:1 mapping between the module header items in a 32 bit world and 64 bit, which is quite a big if, then all you’d need is to run CMHG with some different switches. That’d declare the module as 64 bit in the flags field. For SMP I could imagine CMHG will gain some more CMHG keywords to signal which SWIs are multicore safe.

Mar 31, 2024 1:55pm Julie Stamp (8365) 480 posts	Use _swix() and _swi() rather than _kernel_swi() because the latter takes an array of int registers, which will truncate the input arguments while _swix() and _swi() don’t Eeek! I think there are about at least a hundred components in All.git using _kernel_swi, this is really worrying :-/

Mar 31, 2024 3:03pm Charles Ferguson (8243) 438 posts	Use _swix() and _swi() rather than _kernel_swi() because the latter takes an array of int registers, which will truncate the input arguments while _swix() and _swi() don’t No, that’s bad advice. `_kernel_swi` takes an array of register sized objects. There’s no reason to believe that it wouldn’t continue to do so. It’s only purpose is to pass register values to system calls, so it would be updated to take an array of register values, and you should assume as much. Or if you don’t assume that, assume that `_swix` will also truncate the values and that you’ll have to take different actions.

Mar 31, 2024 3:25pm Stuart Swales (8827) 1384 posts	It wouldn’t hurt to define `_kernel_swi` to take an array of `intptr_t` right now, though that just shoves the burden back on the callers not to cast wider values and pointers down to `int` as they will be doing already. Would be easy enough to trawl through the RISC OS source; hopefully would eventually get third-party code to migrate.

Mar 31, 2024 4:52pm Paolo Fabio Zaino (28) 1933 posts	Use _swix() and _swi() rather than _kernel_swi() because the latter takes an array of int registers, I respectfully suggest a different approach. It might be more beneficial to encapsulate these calls within your own abstraction layer, particularly if ROOL has not already provided a design for it.

Mar 31, 2024 4:58pm Julie Stamp (8365) 480 posts	It wouldn’t hurt to define kernel_swi to take an array of intptrt right now, though that just shoves the burden back on the callers not to cast wider values and pointers down to int as they will be doing already. Would be easy enough to trawl through the RISC OS source; hopefully would eventually get third-party code to migrate. That sounds a lot less stressful! I was wondering if cppcheck could help for this sort of thing? Can it be asked to report on casts from pointers to int?

Mar 31, 2024 4:58pm Charles Ferguson (8243) 438 posts	I respectfully suggest a different approach. It might be more beneficial to encapsulate these calls within your own abstraction layer, particularly if ROOL has not already provided a design for it. I had assumed that Sprow meant only that these calls be /in/ the abstraction layer of libraries, as that had already been mentioned by you, me and others.

Mar 31, 2024 9:44pm Paolo Fabio Zaino (28) 1933 posts	Can it be asked to report on casts from pointers to int? It should when you use enable=all, but, to be 100% sure, I’d run a: cppcheck --doc With the specific version of cppcheck you’re using, that will show which checks are provided. Eeventually you can also run a: cppcheck --errorlist Definitely there are multiple tools that can report on casts from pointers to int. You should be able to check even just using GCC with -Wint-to-pointer-cast and -Wpointer-to-int-cast (IIRC also clang supports these 2 warnings). I had assumed that Sprow meant only that these calls be /in/ the abstraction layer of libraries, as that had already been mentioned by you, me and others. Oh sorry, maybe my English caused a bit of confusion. What I meant was, probably best to write some more portable version of _kernel_regs like: typedef union { uintptr_t u; intptr_t s; } register_t; typedef struct { register_t r[MAX_REGISTERS]; ... } _my_kernel_regs_t; And an portable abstraction for the syscall mechnism in place of __kernel_swi and then within the abstraction translate it back to __kernel_swi – instead of – using _swi / _swix Althought if, internally it would not matter this much at that point. However, it would matter in relation to what Julie commented which was: Eeek! I think there are about at least a hundred components in All + That sounds a lot less stressful! So, my limited English understanding suggests she is not keen on rewriting all the code that is using already __kernel_swi, hence even the abstraction doesn’t seems to be an option. Maybe I just misunderstood.

Jun 15, 2024 7:44pm Dave Higton (1515) 3592 posts	Has anyone done any work on a 64 bit HAL?

Aug 16, 2024 11:16pm tymaja (278) 185 posts	“How can I make my module 64-bit ready? How can I test if my module is 64-bit ready? Once I’ve reviewed my module, how can I declare that I think it’s 64-bit ready?” Some thoughts (very good advice above!); Writing the module in C is a good idea. What I would do for anything, except my ARM64 BASIC, would be to write it in C, and when it works, test performance, and make a few ‘alternate’ versions of areas that are slow in C, and code those in ARM32 or ARM (a challenge, of course, is register widths if dropping to ARM32 assembly from C). Another thing is ‘special registers’; Register 18 is reserved on iOS, macOS , Linux, Windows etc, so don’t touch that register! Use macros for all stack operations, because; – the stack pointer must be kept 16-byte aligned at all times (you can load unaligned, even store unaligned, but the SP itself must always be 16-byte aligned, otherwise an exception is raised at the next attempt to load/store from stack (16 byte alignment checking can be disabled I believe, but for portable code to other platforms, this stack alignment rule must be followed!) X19, X30, SP need to be preserved across function calls too. X29cis the frame pointer (FP) and should point at a ‘stack frame’. On Apple OS, it must always point to a valid stack frame. X0-X7 are used for passing arguments in the ABI; This is useful: https://developer.arm.com/documentation/102374/0102/Procedure-Call-Standard For Linux; X0-X7 are used to return values to the user from a procedure call. Also X8 (‘XR’) is used for indirect return value – passing a pointer. Nit sure how RISC OS64 would use X0-X8, but I expect pointers could be passed in any registers. x9-x15 : your code can ‘trash’ these : the caller has to save these before calling a function (caller-save) X19-X28 : callee save; you can use these, but must leave them untouched when returning from a function (the caller expects these to be unchanged) X29 :FP X30:LR This may be helpful when doing small sections of ARM64 code in a C module!

Aug 17, 2024 1:19am David J. Ruck (33) 1696 posts	Writing the module in C is a good idea. What I would do for anything, except my ARM64 BASIC, would be to write it in C, and when it works, test performance, and make a few ‘alternate’ versions of areas that are slow in C, and code those in ARM32 or ARM (a challenge, of course, is register widths if dropping to ARM32 assembly from C). No, and no again. If it is slow in C with modern compilers (and particularly on 64 bit ARM) and re-writing in assembler is a fools errand, tiny gains at best. The way to make code faster is come up with better algorithms.

Aug 17, 2024 2:29am tymaja (278) 185 posts	I agree with most of this and that most compilers are impressive nowadays – I wrote a ‘VDU-like’ framebuffer text plotter in C, without any optimisation, on RPi5, which used multiplies and many adds in the inner loop. It is fast despite this, and when moving the multiplies out of the inner loop, it actually slows down a bit – showing that it detected the redundant arithmetic in the inner loop, and optimised it rather well. Which I found impressive! I do believe there will always be a role for assembler in coding, but do agree it can sometimes give small gains for a lot of work compared to C. One of my interests is realtime 3D graphics, which is why I tend to gravitate towards assembler (although, again, the 3D drivers are the biggest overhead these days, so hand-optimised assembler won’t do much unless it is causing a slowdown that can’t be fixed any other way. At least one role for assembler would be to use the Carry bit. It is possible to manually ‘emulate’ carry in C, or use someone else’s libraries, but assembler can do a 64 / 128 bit int add in 2 instructions using ADC. Another would be in low level system code (including modules, if changing the mode / EL). Although most modules could be in USR mode anyway (like BASIC!)

Aug 17, 2024 6:45am Rick Murray (539) 14047 posts	I do believe there will always be a role for assembler in coding The less, the better. In RISC OS, for making API calls, though this is usually hidden away in some library code. Elsewhere? Hardware glue logic, like the interrupt handler between the exception and when it hands over to C code. And… That’s about it. People don’t even program little microcontrollers in assembler these days – only very specific time/cycle critical code, but whether or not there is any depends upon the application, making a cheap oscilloscope, yes, making a smart toaster, not so much. At least one role for assembler would be to use the Carry bit. ? When working in a higher level language, you shouldn’t be concerned with such things. Let the compiler worry about those details. Or do you mean the various quirky ways the carry bit is used in the RISC OS API? That’s why we have _kernel_swi_c() and it’s part of what I meant about the API written in assembly for assembler programmers. To put that last part into context, let me quote from the Arthur PRM, in particular the second part here:

Aug 17, 2024 4:43pm Steve Fryatt (216) 2124 posts	assembler can do a 64 / 128 bit int add in 2 instructions using ADC I’d hope that any half competent compiler could do this, too (at least for 64-bit, and anywhere that implements 128).

Aug 17, 2024 5:50pm Rick Murray (539) 14047 posts	I’d hope that any half competent compiler could do this Example program I threw together: #include <stdio.h> // WhyTF didn't they make a new data // type for 64 bit? "long long" is dumb. unsigned long long bigone = 0x1122334455667788; unsigned long long bigtwo = 0x2233445566778899; int main(int argc, char argv[]) { // This convoluted nonsense is to stop // the compiler just doing the calculation // itself and storing the result. ;) if ( argc > 1 ) bigone = bigone argc; bigone += bigtwo; printf("The result is &%llX.\n", bigone); return 0; } Here’s the addition that was compiled. 000080C0 : .... : E8950003 : LDMIA R5,{R0,R1} 000080C4 : .... : E893000C : LDMIA R3,{R2,R3} 000080C8 : .... : E0900002 : ADDS R0,R0,R2 000080CC : .... : E0A11003 : ADC R1,R1,R3 This isn’t the late ‘80s/early ’90s. There are few good reasons to use assembler these days, and plenty of good reasons why you shouldn’t. RISC OS itself, ironically, being one of them. ;)