Rewriting assembler modules in C
Chris Evans (457) 1614 posts |
I was struck by Richard H’s comment in the “No more big 32-bit cores for RISC OS from 2022” thread of:
Has anyone written a guide on how best to rewrite assembler modules in C?

Before I’d read Richard’s comment I’d thought (probably in ignorance) that it would be done by writing to the published API with little if any reference to the assembler code; I suspect undocumented features will be a problem. I’ve yet to learn C and doubt I’ll ever have time. But I ask in the hope that the answer to the question will be useful to people like Richard. |
Lothar (3292) 134 posts |
In the GCC installation, look into Examples.Module.Simple |
Richard H (8675) 100 posts |
My goodness, I hadn’t even thought about using gcc to compile anything on RISC OS, let alone modules. I’ll have to give it a try. What is it like cross-compiling on Linux? Doable, or do I really need to be doing it in a RISC OS environment? |
Stuart Swales (1481) 351 posts |
There are those who swear by GCCSDK on Linux. If it’s to go in the RISC OS ROM, it needs to be compiled with Norcroft natively. |
Richard H (8675) 100 posts |
Thanks. I don’t really have a setup at the moment that would make it easy to develop regularly on RISC OS, so I might explore cross-compiling when I get some downtime (which at the current work levels will probably be some time in March). Mind you, the cordyceps raspberrypii fungus has already infected my brain, and I’m starting to think that maybe I need two Raspberry Pis sitting on the desk… |
Terje Slettebø (285) 275 posts |
@Chris Evans
As someone who’s done some work on this before, specifically with the Sprite system, I can offer some comments. A nice thing about RISC OS is that it’s well documented, and when it comes to the modules, the inputs and outputs of each system call are clearly specified. My approach was basically a black-box reimplementation: I took the specification and wrote the corresponding C++ code. I also looked at the assembly code to get some ideas for optimisations, particularly if I saw that my implementation was way slower than the original. This worked pretty well. However, sprites are fairly simple, which was one reason for starting there. Other modules may prove much more challenging, such as rewriting the Font Manager or the Draw module. Challenging indeed, especially since assembly code isn’t exactly easy to reverse-engineer to get back the original algorithms. |
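To make that black-box approach concrete, here is a minimal sketch in C of what “write to the published inputs and outputs” can mean in practice: implementing a sprite-area control block read purely from the PRM’s description of the sprite area header (four words at the start of the area). The type and function names are illustrative, not from any real source.

#include <stdint.h>

/* Sprite area header, as documented in the PRM. */
typedef struct sprite_area {
    uint32_t size;   /* total size of the area, in bytes        */
    uint32_t count;  /* number of sprites currently in the area */
    uint32_t first;  /* byte offset to the first sprite         */
    uint32_t free;   /* byte offset to the first free word      */
} sprite_area_t;

/* Reimplementing the 'read control block' operation is then just a
 * matter of returning the documented outputs. */
void spritearea_read_cb(const sprite_area_t *area,
                        uint32_t *size, uint32_t *count,
                        uint32_t *first, uint32_t *free_off)
{
    *size     = area->size;
    *count    = area->count;
    *first    = area->first;
    *free_off = area->free;
}

No reference to the original assembler is needed until the behaviour (or the performance) turns out to differ.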
Chris Evans (457) 1614 posts |
I had hoped Julie Stamp would contribute to this thread as I believe she has recent experience of rewriting a couple of modules in C. As there is no way on this forum to private message someone, if there is anyone with her contact details who can point her in the direction of this thread I’d appreciate it. |
David J. Ruck (33) 1635 posts |
The rewriting part is very much the smallest part of the process. The main task is to write a comprehensive set of regression tests, so you know the new C module works the same as the old assembler one. Ideally the tests can be written from the documented behaviour, but in practice documentation won’t cover all the edge cases, so a certain amount of reverse engineering of the module will be required to determine how it works. Once you’ve designed and implemented the tests, the module will almost convert itself. Then of course you release it into the wild and find that people use it in a way you, or the original authors, didn’t envision, and if you haven’t replicated every undocumented foible, there will be a stack of bug reports. |
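A minimal sketch of what such a regression test can look like in C, driving the old assembler module through its SWI (via _kernel_swi) and comparing against the new implementation; the new_check_module() function and the SWI number here are hypothetical stand-ins.

#include <stdio.h>
#include "kernel.h"   /* _kernel_swi, _kernel_swi_regs */

/* The replacement C implementation under test (hypothetical). */
extern int new_check_module(void *mod, unsigned long size);

static int compare_one(void *mod, unsigned long size)
{
    _kernel_swi_regs in, out;
    _kernel_oserror *err;
    int old_result, new_result;

    in.r[0] = (int)mod;
    in.r[1] = (int)size;
    /* 0x12345 is a placeholder for the old module's SWI number. */
    err = _kernel_swi(0x12345, &in, &out);

    old_result = err ? -1 : out.r[0];
    new_result = new_check_module(mod, size);

    if (old_result != new_result) {
        printf("MISMATCH for size %lu: old=%d new=%d\n",
               size, old_result, new_result);
        return 1;
    }
    return 0;
}

Feed it the documented cases first, then every undocumented edge case you discover, and keep the whole corpus around for the next rewrite.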
Julie Stamp (8365) 474 posts |
Sorry Chris, I had seen this; I was meaning to put some notes up on the wiki I had started, but I got waylaid. Richard, don’t worry about toolchains at this stage: if you’re curious about doing it, then go ahead with gcc on Linux; cross-compiling works fine. Once you’ve got it into shape, you could look into the DDE; it shouldn’t be too much work to “cross over” if you’ve written it sensibly. I did that with both Obey and ShellCLI, though now I start right away in Norcroft and shared makefiles because I’m getting used to those. Testing is important, but not straightforward, and I’d welcome a discussion about that in Code Review. |
Charles Ferguson (8243) 427 posts |
In my view, there are a few ways to re-write modules from assembler in other languages (and please do remember that it is a while since I did this from ARM to C). My discussion here will cover how I approached conversions, why I did some things the way I did, and how you might fit that into the system. It will be largely anecdotal, with explanations of why I did things certain ways, and how it helped.

The way that I was approaching module conversions for Select was to have no new modules in assembler if they could be avoided. That’s a pretty obvious thing to say, but it’s kinda important that you start out at that point.

Secondly, decide what you’re trying to do. It would be great to go and just start from scratch on some modules. I did this on ResourceFS (never released), and had a C version that worked enough to make things run, but… then you have to have the SCL loaded beforehand unless you don’t use it (a viable option, but more tedious), or you change the SCL to be a little more forgiving of the order of system startup (which I think I did, but that came later). So there’s that issue of what you’re going to do with it afterward, which can affect you.

Then there’s the issue of how you are going to get the code replaced – are you going to ‘big bang’ drop in your new version? (You’re going to need to be able to prove to yourself that it’s solid well before that point.) That’s where testing comes in – as David has immediately said.

A half-way house that I chose was to integrate C into the existing assembler modules, so that the code can be written in C and tested in application space (or, in my case, ‘orbiting around a totally different planet’ – tested on Linux). In this scenario you identify the parts of the code that are isolated enough to be implemented separately, and then you write that code as if it’s going to go into the module. Then you write some shims to make it look like the environment you’re actually targeting (the module), but which works like regular C code.

Take, for example, the C code in the Kernel (which does the parsing of mode strings, module header checks, abort management, some display device things, and something to do with the page tables and the sprite flags): it has a workspace file that looks like this:

/*******************************************************************
 * File:    workspace
 * Purpose: Definition of the workspace we're passed from the kernel
 * Author:  Justin Fletcher
 * Date:    06 Dec 2003
 ******************************************************************/

#ifndef WORKSPACE_H
#define WORKSPACE_H

/* Keep this in line with the data in hdr.KernelWS */
typedef struct global_s {
  struct abortindirection_s *AbortIndirection;
  struct displaydevicesworkspace_s *DisplayDeviceWorkspace;
} global_t;

#ifndef TEST
global_t __global_reg(6) *global;
#else
extern global_t *global;
#endif

#endif

The global register means that the static base is used to hold all references to the ‘global’ struct when called within the Kernel, and conditionally, when you’re in TEST, it’s just a global variable. You have to be careful when doing this sort of thing not to pass around pointers to functions if you don’t want relocation, and you can’t use static variables, because they will be referenced in ways that are more difficult to support. You need to turn off the software stack checking if you want to mimic the rest of the assembler modules, too. When you need to call your C code, you set up the function so that it transitions into the C (APCS) environment.
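The other half of that TEST trick is that, on the host, the workspace is just an ordinary global, so the same kernel-side C can be exercised from a normal main(). A minimal sketch of such a harness, assuming the workspace header quoted above (in real use TEST would more likely come from a -DTEST compiler flag):

/* Host-side test harness: with TEST defined, workspace.h declares
 * 'global' as a plain extern, and we provide it ourselves. */
#define TEST
#include <stdlib.h>
#include "workspace.h"

global_t *global;   /* satisfies the extern in the TEST branch */

int main(void)
{
    global = calloc(1, sizeof(*global));
    if (global == NULL)
        return 1;

    /* ...call the kernel-side C functions here exactly as the module
     * would, and check the results with your favourite test code... */

    free(global);
    return 0;
}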
In the Kernel, there’s a CUtils (actually it’s almost exactly the same CUtils file as is in the Filer, but I never made them common), which includes macros for calling the C functions, and some small implementations of functions that you might like to use from C – kalloc, kfree, memset, memcpy, strcmp, a __rt_udiv (for division – not sure where I was using that, but I assume it was in the display device operations off OS_ScreenMode), and a few others. printf is implemented in a C file, and is just a very simple implementation.

The C calling macro is very simple:

; Normally the macro corrupts r1-r3 (r0 corrupt if not returning value)
        MACRO
$label  CCall   $function, $stack, $a1, $a2, $a3, $a4
$label
      [ "$stack" <> ""
        Push    "$stack, v6, r10-r12"
      |
        Push    "v6, r10-r12"
      ]
        LDR     v6, =CWorkspace
      [ "$a1" <> "" :LAND: "$a1" <> "r0"
        MOV     a1, $a1
      ]
      [ "$a2" <> "" :LAND: "$a2" <> "r1"
        MOV     a2, $a2
      ]
      [ "$a3" <> "" :LAND: "$a3" <> "r2"
        MOV     a3, $a3
      ]
      [ "$a4" <> "" :LAND: "$a4" <> "r3"
        MOV     a4, $a4
      ]
        MOV     sl, sp, LSR #20
        MOV     sl, sl, LSL #20         ; stack limit is at the base of the stack
        ADD     sl, sl, #&21c           ; upped offset so that we have a valid
                                        ; value for BTS
        MOV     fp, #0                  ; no frame pointer (end of chain)
        BL      $function
      [ "$stack" <> ""
        Pull    "$stack, v6, r10-r12"
      |
        Pull    "v6, r10-r12"
      ]
        MEND

And this can be called from your assembler – about the simplest example I can see here is that of checking the module header validity:

CheckHeader SIGNATURE
        Entry   "R0-R3"
        ; check_module expects
        ;   r0-> module
        ;   r1 = size
        ;   r2 = 'bitness' (26 or 32)
        MOV     r0, r1
        MOV     r1, r2
        MOV     r2, #{CONFIG}
        CCall   check_module
        TEQ     r0, #0
        BEQ     %FT90

        MOV     lr, #0
        ; check each case and set error; needs more error messages
        TEQ     r0, #modvalid_not32bit
        ADREQ   lr, ErrorBlock_RMNot32bit
        TEQ     r0, #modvalid_badursulatable
        ADREQ   lr, ErrorBlock_RMBadServices
        ; we don't have an error for 'bad flag table' (and the C doesn't return it yet)
        ; TEQ     r0, #modvalid_badflagtable
        ; ADREQ   lr, ErrorBlock_RMBadFlags

        ; fallback case
        TEQ     lr, #0
        ADREQ   lr, ErrorBlock_BadRMHeaderField

        MOV     r0, lr
        LDR     r14, [sp, #Proc_RegOffset + 4*1]  ; find module base
        LDR     R4, [R14, #Module_Title]
        ADD     R4, R4, R14                       ; pointer to module name
        BL      TranslateError_UseR4              ; sets V
        STR     r0, [sp, #Proc_RegOffset]
        EXIT

90      CLRV
        EXIT

Essentially all this is doing is taking the assembler parameters, setting them up for the C call, and then turning the return code into an error message. The function in the implementation has a prototype:

module_valid_t check_module(module_t *mod, unsigned long size, int bitness);

which means that you can write a bunch of tests for it, passing in things that look like modules (or not) and seeing whether the function reacts in the right way. You then test that to your satisfaction outside of the Kernel – run it through your debugger on a sane OS if you find it hard to work out what’s going on – and then you wire it up, like above.

Why would you do this halfway house? Because… the prologue comment in the ‘checkmodheader’ file says it well enough…

 * Purpose: Check that a module's header is valid
 * Re-written in C because the assembler is just a waste
 * of time

By writing it in C and having a test system that you can use independently, you can avoid building a ROM, booting it, finding it doesn’t work, and then repeating. This is less of an issue for some modules, but even still, the advantage of writing your code this way – small, testable modules, in a high level language – means that you can find what you need to do in a good way without all that tedious mucking about in assembler.
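Because check_module() is now just an APCS function, it can be unit-tested on any OS. A small sketch, assuming the prototype above and that (as the assembler’s TEQ r0, #0 shows) a zero return means ‘valid’; the header file name is hypothetical.

#include <assert.h>
#include <string.h>
#include "checkmodheader.h"   /* hypothetical: declares module_t and check_module() */

int main(void)
{
    unsigned char fake[256];
    memset(fake, 0, sizeof(fake));

    /* A zero-filled block has no flags word, so demanding 32-bit
     * validity should produce some non-zero failure code (the exact
     * code is deliberately left unasserted in this sketch)... */
    assert(check_module((module_t *)fake, sizeof(fake), 32) != 0);

    /* ...and a zero-length 'module' should never pass either. */
    assert(check_module((module_t *)fake, 0, 32) != 0);

    return 0;
}

Run that under a debugger on your development machine and you never need to boot a ROM to find out why a header check misfires.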
I’ll give a slightly different example of assembler translated to C, which is again a Kernel feature. The replacement of the video system with VideoV meant that all the graphics operations were being farmed out to separate modules. I’ve barely started on this example, but I’m going to digress here because it’s pertinent.

Parts of the Kernel in RISC OS 4 were carved out of the system and moved to their own modules. This wasn’t just capricious, but because it isolated those components. It removes the collusion and makes them independent and replaceable – you can replace the ReadLine module with another one, and because it’s just a vector handler you’ll get the new one loaded without fuss. No need to mess with rebuilding a Kernel and a ROM and rebooting because you want to try to make a small tweak to its handling. Similarly CLIV, system variables, oh, and a few others. The whole point is that the Kernel is complicated enough already, and separating these out means you can deal with them independently and be reasonably confident that when you change them you won’t be affecting other parts of the Kernel. Plus, of course, those little tweaks don’t cause the Kernel version number to race (or mean that you don’t update the Kernel version number because otherwise it’ll race). AND it means that you can choose to just reimplement those interfaces however you wish – if you wanted to replace CLIV with a C module that did the same thing, you can.

So… the functions were being ripped out of the Kernel – there was a base Video module that did some functions, a VideoHWVIDC module that handled the VIDC hardware (mode selection, pointers) (and a VideoHWVF for ViewFinder), a VideoSW module that handled the graphics operations and VDU4/5 text, and a SpriteOp module that handled all the sprite operations (I don’t think that was ever released). Each of these modules was abstracted away from the Kernel by deciding what their code functions needed to do, and placing those operations on the end of a vector. This meant that they could be modified on the fly – something that’s not easy if, for example, they are parts of the VDU operations, where you’d have to trap a lot of state (a LOT of state) to manage that yourself if you sat on WrchV. This meant that those modules and their functionality could be replaced easily.

Ok, so why explain this… Teletext. This is a 16 colour mode which has a very low resolution. There was a higher resolution version, but it was limited to (essentially) double resolution. Which is great and all that, but it’s still all in assembler, and dear god I don’t want to write any assembler – not for Teletext. And the point of the abstraction was that we don’t want to be knowing that you were going to put an 8×8 pixel map into memory at a given location. The abstraction wants to say ‘put a “T” at 2,25’, and let the implementation get on with it. Of equal importance to not wanting to write assembler is the fact that 16 colour modes generally aren’t supported on most modern graphics systems, and the resolution that teletext used would need to be much higher. All of which meant that it was likely to be tedious to update in assembler.

So… Teletext was implemented in C in an application. It was designed with functions that would take these sorts of operations.
Some examples, to just cut and paste a couple out:

void teletext_cls(teletext_t *ttx, int x0, int y0, int x1, int y1);
void teletext_scroll(teletext_t *ttx, int x0, int y0, int x1, int y1, int dx, int dy);
void teletext_setchar(teletext_t *ttx, int x, int y, int code);

This was then built into an application that could do those operations – setting the characters, and rendering the screen. All the teletext graphics are implemented as ColourTrans and OS_Plot operations. The entire rendering system goes through a very light abstraction, ‘gcontext’ (graphics context), which is just a structure with a list of function pointers to do the operations you want – primitives like line drawing, rectangles, and text plotting at positions (there’s a sketch of this shape at the end of this part). This context can be switched out for something that does different operations – a very early version produced HTML so that I could see what it would really look like, but this was dropped because… well, it’s not useful. But this meant that you can use it on other systems, by talking to a different type of graphics context. This sort of design shouldn’t be too surprising, but it’s what allows the code to be retargeted and tested and exercised on other systems, or in different ways.

Anyhow, with the application written and seemingly working, it was hooked up to the vector interface that had been created for the teletext system. And it pretty much worked. Yeah, there were a bunch of problems with it in some cases, related to the real use of the system (reveal, flashing sections of the screen, getting stuck in a non-redrawing state, etc), but I think most of those were fixed. And then I added the ‘high resolution’ text, which was just using the FontManager instead of the system font to draw text. I had intended to get a Teletext font drawn so that it didn’t look so bad, but it was pretty neat. And you could then ignore the fact that it was a teletext mode, put yourself in mode “X1600 Y1200 T16M TX80 TY60”, and you had an antialiased command line in relatively high resolution (20px per character, I think) with 80 × 60 characters on the screen.

So that’s a long-winded way of saying ‘replace the innards of the system with an abstraction and implement it how you like’. You get the testing you want, you get the development environment that you like, and you don’t have to wire it up until you’re happy with it.

I’ve mentioned hybrid C-in-assembler modules, but there’s the other way around too. You can start with the C module and put assembler in it. I’m not sure which modules in RISC OS this might actually be useful for, but I could imagine a case where you’ve got a bunch of functionality in the assembler that you haven’t got around to converting yet, so you place all that assembler inside your C module that provides the external interface, and then you slowly erode the amount that the assembler does by converting it to C. I guess I kinda attempted something like that by lifting parts of FileSwitch out so that they could be called from a C module, but it was an utter disaster to work with and it never saw the light of day. I could see that RAMFS would be doable in this way – you provide just the block copies and heavy lifting in assembler, and re-write the rest in C. Honestly, RAMFS does nothing clever. It creates a DA, lets you resize it, formats it when first created, and after that it’s just a block interface to FileCore, which calls the copy interfaces in assembler.
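As promised above, a sketch of the shape a ‘gcontext’ can take: a structure of function pointers for the primitives, so the rendering code never knows whether it is talking to the VDU, an HTML generator, or a test recorder. The member names here are illustrative, not the original’s.

/* A minimal graphics-context abstraction: the renderer calls through
 * whichever implementation it is handed. */
typedef struct gcontext {
    void (*line)(int x0, int y0, int x1, int y1, unsigned int colour);
    void (*rect_fill)(int x0, int y0, int x1, int y1, unsigned int colour);
    void (*text)(int x, int y, const char *str, unsigned int colour);
} gcontext_t;

/* Example caller: drawing a block cursor with whatever backend the
 * context provides (the colour format is backend-defined here). */
static void draw_cursor(const gcontext_t *gc, int x, int y, int w, int h)
{
    gc->rect_fill(x, y, x + w, y + h, 0x00FFFF00u);
}

Swapping the implementation is then just a matter of passing a different structure – which is exactly what makes the code testable off-target.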
Another way you can attack the problem is to just implement functionality. I shall give the example of the SysLog module I created (before I joined ROL, but which was supplied). Jon Ribbens (I believe it was, although now I say that it might have been Andrew Clover… anyhow) created the SysLog module, distributed by their Doggysoft group. This was amazing, and it was assembler, because you want it to be fast and functional. Fine. I absolutely forget the reason why I created my version – in C. It was probably so that I could add networking to it. Anyhow, for that, I utterly ignored the code in the SysLog module, and just used their documentation. Because they wrote such great documentation, describing what it did and how it reacted, just like the PRMs. I worked to that definition of what the module should do, and implemented the parts I needed. I’m pretty sure I never implemented things like SysLog_IRQMode except as a stub, and that lost some functionality for certain types of debugging. Testing? Yeah, my SysLog module didn’t have much of that, but it was basically used with all the applications that I could get my hands on that used SysLog.

But that’s an example of how you convert it – not quite a clean-room implementation, because you’ve still got the original to compare against and can disassemble, if you like, but you’re working to its API, which you know and can understand. Modules you might do like this would be things like ResourceFS, and maybe MessageTrans.

I’ve talked about different ways of converting the modules, but largely I’ve been assuming you retain all the functionality. That’s not necessary throughout the development. Why? Because you can say “I don’t care about feature X” early on. So long as you remember that you’re going to need those features, you can design your new implementation to replace things in a suitable way.

I’m going to talk about the implementation of the FontManager in Pyromaniac initially for this example, because it’s pertinent. Briefly, Pyromaniac is a RISC OS implementation in Python, so the FontManager is a Python module that provides SWIs that make it look to RISC OS like it’s the regular FontManager. Except it’s not the regular FontManager. It started out by supporting 3 SWIs, IIRC – FindFont, LoseFont, Paint. And then only as stubs. FindFont did nothing but return a static handle; Paint would print a message saying what it was given. Why? Because you can throw things at it and see how it reacts and whether it matches what you believe the API says it should do. Follow the PRM, and when the PRM is opaque (it is good much of the time, but sometimes you’re left with a conceptual leap, even for me, and I think I know what I’m doing), go and read the source to the assembler module and see how the original reacts.

This approach of piecemeal adding things as you need them and find them to be important means that you can rapidly get something that works, and can expand out as you need to. Why is this a useful approach? Because there are things that are only needed in a limited number of cases. The bulk of the operations you’ll ever do will be on a subset of the system. Pyromaniac does not support a lot of things yet (in the FontManager), and in some cases it’ll never support them. Font_CacheAddr makes no sense, so it’s not implemented. If something calls it, it’ll get an error. And as we know, RISC OS programs can detect those errors and handle them… (yeah, I know, but that’s how it should be).
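A minimal sketch of that ‘stubs first, errors for the rest’ style as it might look in a C module’s SWI handler, using the standard CMHG/CMunge veneer signature; the SWI offsets and the error number are illustrative.

#include <stddef.h>
#include "kernel.h"

#define SWI_FINDFONT 0   /* offsets within the module's SWI chunk */
#define SWI_PAINT    6

static _kernel_oserror not_yet = { 0x800100, "Operation not yet implemented" };

_kernel_oserror *swi_handler(int swi_offset, _kernel_swi_regs *r, void *pw)
{
    (void)pw;
    switch (swi_offset) {
    case SWI_FINDFONT:
        r->r[0] = 1;        /* stub: always hand back a static handle */
        return NULL;
    case SWI_PAINT:
        /* stub: report what we were asked to do, then claim success */
        return NULL;
    default:
        return &not_yet;    /* well-behaved callers will cope */
    }
}

Each stub is replaced with a real implementation only when something you care about actually needs it.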
Concrete example – the Pyromaniac FontManager doesn’t support justification yet. When the Wimp tries to plot a menu which has a key shortcut at the end of the menu item, it uses the justification functions to make the key sequence appear on the right of the menu (a very clever trick, I like it). But the moment that the FontManager returns an error to say ‘Font_Paint is a lazy implementation: Justify coords not supported’, the Wimp goes ‘uh-oh, something’s up… I’m going to panic and revert to the system font’… and keeps working. That’s an example of approaching the system with a best-effort attitude. Good software will cope when errors are generated. You’ll still have problems, but you’re not making the porting of a module into a huge project that must get everything right first time. The time to ‘minimum viable product’ is reduced, and you can have something that people can use.

Aside from describing the approach, why am I talking about Pyromaniac in the context of converting modules from assembler to C? Because it fits nicely into the example case of ignoring the actual implementation of the system and only worrying about the API. The Pyromaniac system does not use RISC OS font files; it uses the ‘toy’ font system built into Cairo – a graphics library. This allows it to use TrueType fonts. The FontManager is, in this case, taking the RISC OS font operations and translating them into the primitives to pass to Cairo. It /can/ do a lot more than the RISC OS FontManager can in this way. Requests for ‘Homerton.Medium’ and the like are converted to something that’s appropriate (I think Homerton maps to ‘sanserif’, leaving it up to the library to decide what that means – a small sketch of this sort of mapping follows at the end of this part). There are some oddities in metrics (the alignment of characters) and how kerning, antialiasing and the like are handled, but that’s to be expected.

Yes, that’s an entirely separate implementation on another OS, but… porting a library to RISC OS and putting it behind an interface layer to make it look like an existing module, such as the FontManager, is just a matter of time and will. What do you get at the end of it? Maybe not exactly the same system, but probably good enough, and enough that you can just add bits as you find them to be a problem. But you’re in a better place, because you’ve got a system which can use much more advanced features than if you’d had to add them to the assembler module. It’s not like interfaces haven’t been deprecated before, and it’s certainly not surprising to have extended features.
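As promised, a small sketch of that interface-layer idea: translating RISC OS font identifiers into whatever the backend library understands. The mapping table is illustrative – the post only suggests that Homerton ends up as a sans-serif face.

#include <string.h>

/* Hypothetical mapping from RISC OS font families to backend faces. */
static const struct {
    const char *riscos;
    const char *backend;
} font_map[] = {
    { "Homerton", "sans-serif" },
    { "Trinity",  "serif"      },
    { "Corpus",   "monospace"  },
};

const char *map_font(const char *riscos_name)
{
    size_t i;
    for (i = 0; i < sizeof(font_map) / sizeof(font_map[0]); i++) {
        size_t n = strlen(font_map[i].riscos);
        if (strncmp(riscos_name, font_map[i].riscos, n) == 0)
            return font_map[i].backend;
    }
    return "sans-serif";   /* best-effort fallback */
}

So map_font("Homerton.Medium") yields "sans-serif", and the backend library decides what that means.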
There are some modules that I just wouldn’t consider for conversion from assembler at this time – FileCore is one. If you wanted a replacement block filing system, I wouldn’t start from there. Obey is another. Why? Because it handles a lot of complex environment that I think is really hard to get right within C, and I don’t think it lends itself to a C module. I just wouldn’t do it – I’d create something new and get people to transition to that, and maybe make it close enough to work. So what Julie Stamp has done to create a replacement module is pretty amazing to me. I’m not saying that it shouldn’t be done that way, but that I would have said that it was a really weird and awkward choice… and yet it works, and it’s ended up way more readable than I expected. I don’t know how maintainable it is, but honestly it’s got to be better than an assembler module.

I think – and it would be nice to see if she does put some notes up on how she approached it – that the approach of ‘take the intent of the functions and transcribe them to C’ is a good one (and one that I started for FileSwitch at one point). You can then change them to be less assembler-like and more structured C-like as you work with it, whilst keeping the functionality you know works.

Earlier I think I argued the opposite(ish) of David Ruck’s comment that if you release something with some features missing or different you’ll get a slew of bug reports. I agree wholeheartedly that this is a possibility. But also, if you keep all the same things from the past, you’re going to find that you’re drowning under a mountain of obsolescence. Maybe that’s an extreme case, but you still have to be pragmatic about things, because there are only so many hours in the day and so many people to be able to do things.

Testing is something that helps with this sort of thing massively, but one interesting part is that the goal of testing is not to fix issues. The goal of testing is to /identify/ issues. The difference isn’t always clear, because most of the time testing means you fix things (the test or the issue), but the identification is important because it gives you the information you need to decide if it is good enough. ‘Good enough’ may just mean that you’re documenting a difference. ‘Good enough’ in other cases may mean ‘I don’t care, it’s not a real case that people will hit’.

Simple example: code which calls (in SVC mode, within a module) SWI OS_ReadMonotonicTime without the X bit set has a bug. Any SWI can return an error, and you should never generate errors from SVC mode. But are you going to care? That SWI’s surely never going to error? That’s an easy fix, but you can certainly imagine cases of calls which don’t expect errors going wrong in reasonable circumstances.

The closer any reimplementation is to the original, the less chance there will be of problems. But also the greater the chance of making more work for yourself. Keeping everything working the way that it did in the past will cause you huge headaches, and some cases are obvious to replace.

Concrete example: when the pointer interfaces were moved out of the Kernel, certain applications stopped working. All the APIs were retained, and everything was exactly the same as it had been inside the Kernel, but it wasn’t working with an application. I believe the particular failure condition was that starting a drag within the application would get stuck, and you could never escape the application once you started that drag (I don’t remember all the details). Reason? The implementation wasn’t ‘exactly’ the same. The state for the pointer was no longer in the Kernel, but now in the module workspace. The application had been looking directly at zero page to find where it knew the data to be. Solution:

;
; Legacy support for the OS Pointer interfaces
;
; People - bad people - read zero page to get at certain variables
; Obviously this won't work in the new environment where the variables
; are in the OSPointer module workspace. So this legacy file will
; propagate them from our workspace to known locations in zero page.
;
; Beware if the kernel changes!
;

        [ LegacyZeroPageAddresses
Legacy_UpdateZeroPage Entry "R0-R11"

; Update MouseXY,MultXY
        ADR     R4, MouseX
        LDMIA   R4, {R0-R3}             ; mouse x,y,xmult,ymult
        MOV     R5, #&500               ; zero page
        STR     R0, [R5, #&5A0-&500]
        STR     R1, [R5, #&5A4-&500]
        STR     R2, [R5, #&5AC-&500]
        STR     R3, [R5, #&5B0-&500]

; Update mouse bounds
        ADR     R4, MouseBounds
        LDMIA   R4, {R0-R3}             ; bbox
        ADD     R5, R5, #&5B8-&500
        STMIA   R5, {R0-R3}             ; bbox

; Update pointer position
        ADR     R4, PointerShapeNumber
        LDMIA   R4, {R0-R2}             ; shape, x, y
        LDR     R4, =&1578
        STMIA   R4, {R0-R2}
        EXIT
        ]

        END

So the support is there, but it’s able to be removed. Part of my reaction was that I don’t care that I’ve broken someone who was doing it wrong, but the pragmatic response is that if it can be kept working, with caveats and the ability to remove it once the need has gone (i.e. when zero page became inaccessible entirely), then that is the way to go.

You can apply this same argument to so many things, though. FileCore stores the descriptor of the filesystem at the first word of the module workspace in that filesystem’s incarnation. I’m pretty certain that’s not documented, but it’s assumed by a number of things. Converting FileCore to C would probably suffer from that breaking. Big deal? Judgement call as to which is more important.

I’ve drifted slightly from the topic, I think, but I’ve one other recent example of writing assembler in a C module. For my hourglass maker (https://github.com/gerph/riscos-hourglass-maker) I implemented the hourglass manipulation code in assembler, which I wrote from scratch without actually looking at the original Hourglass module. Actually the assembler is written by a Python program that creates the assembler code, but that’s an implementation detail. The wrapper around that assembler is in C. Why didn’t I write the code that builds the hourglass data in C? I don’t know. The best answer I can give is that I wanted it to be fast, and ‘if you’re writing an hourglass module, you do it in assembler’. Oh, and ‘I’ve not written any real assembler for a few years’. Stupid, but… hey-ho. Especially stupid because you construct the data in assembler, and then you call OS_Byte and OS_Word, which then go and call multiple claimants and probably transform the data in a variety of ways before it finally hits the hardware. (And then I took the same algorithm and implemented it in Python for Pyromaniac, because nothing says sanity like taking a hand-crafted assembler algorithm and then putting it into an interpreted high level language verbatim.)

Finally, I’ll give a more weird example of a module conversion. I have a stub SoundDMA module I wrote in about 2004 which does most of the things that it needs to, except talking to the hardware. It’s written in Pascal, and uses a Pascal-to-C converter to turn that into regular C code that builds into the module. The point was to see / show how it was possible to write modules in other high level languages, which did a relatively low level operation. The result? Pascal is not the language to use for writing SoundDMA. But it’s quite possible to do that, if you wanted. It’s more readable than the assembler, but then it doesn’t jump through all those hoops that the assembler does, because the assembler is actually doing real work with hardware. |
Steve Pampling (1551) 8170 posts |
Then there’s the issue of how you are going to get the code replaced – are you going to ‘big bang’ drop in your new version (you’re going to need to be able to prove to yourself that it’s solid well before that point). That’s where testing comes in – as David has immediately said.

No half-way house. Reasoning: since we have different builds for different platforms, would it not be a good idea to either:
Following success on the IOMD build, it can be introduced to the other beta builds.
Julie Stamp (8365) 474 posts |
Ok, I’ve made a page here, including the collected wisdom of people who have helped me this year :-)
I think (or hope!) that this will be ok for us, since some machines already start it very early. |
Charles Ferguson (8243) 427 posts |
I’m not understanding your reasoning. Why would you want to differentiate the manner in which the code is implemented by the hardware platform?

My comment to which you are replying, about a halfway house, was to move some of the code within a component to C whilst keeping the assembler structure (the examples being the Kernel, the Wimp, the Filer, FileSwitch maybe), making the module less assembler-focused and allowing greater testing and management of the modules being tested. Modules like the Filer or the Wimp (two examples that are irrelevant to the hardware platform) are easy to build and test without caring about a ROM build. The whole point of making these components half-way houses is so that you don’t have to replace ALL the Wimp with a C implementation before you can run with it. But even still, it’s not at all necessary to build a ROM to do so. The comment was specifically addressing those cases of parts of the system which do not lend themselves to being easily rewritten from scratch.

To address your comment about ‘swapping ROM files in and out on RPCEmu’ being extremely easy… that may be true, but you’re looking at it from the wrong perspective. What you’re talking about is that it’s possible to perform full system integration testing. That’s great, but…
Similarly, dismissing the possibility of providing evolutionary moves from assembler to C in parts in this way (and I’m assuming that the counter-example of a C wrapper with an assembler back end is also a no-no for you, as that is just another form of half-way house) is impractical from a delivery standpoint. If you have just two different versions, implemented in different ways, you’re going to have to have a big-bang switch-over where you abandon what went before and move to the new system. That means that you see none of the benefits of changes to the structure until that switch-over. It also means that you either commit to abandoning any development on the assembler side of the system until it becomes obsolete at the switch-over, OR you commit to playing catch-up with the assembler version as it is enhanced, and you’re now chasing a moving target. Since the development of entire replacements is not insubstantial, the work needed and the time taken can leave the systems struggling to ever be delivered. And, of course, if you find yourself derailed and needing to focus on other tasks, the work you’ve done on the complete rework gets parked, and the lack of any feedback into the things that people are actually using means that there’s no benefit to anyone in what you’ve done to date.

Whereas with a half-way model, you have the opportunity to deliver incremental benefits, at increasingly easier-to-manage timescales, because the parts of the system you’re working with become easier as more is converted over. By all means provide different streams for people to use at the end of the process, but that’s a release management and distribution issue which is independent of the process of modifying modules. The need to provide a testing stream is not a requirement or dependency of how you write those modules. |
Charles Ferguson (8243) 427 posts |
Quick comments… pretty sure that library-enter-code is present in CMunge. Certainly the -blank option produced it for me:

; The library initialisation code sets up the C environment for use with
; this module. You may wish to perform some initialisation prior to the C
; environment being created. You must call _clib_initialisemodule prior to
; returning, or provide an identical functionality.
; You will probably never need this field.
library-enter-code: _clib_initialisemodule

Your comment about OSLib surprised me; it should absolutely not make the image larger. OSLib builds individual interfaces into different areas in the AOF, so only the interfaces that you use should be included. And they should be pretty small. They’re much more efficient than the _swix use. So I’m genuinely surprised that you find that it has generated larger components. |
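For anyone weighing the two styles up, this is roughly what the difference looks like at the call site – the same SWI via _swix and via the OSLib veneer (prototypes quoted from memory, so treat the details as approximate):

#include "swis.h"       /* _swix, _IN(), OS_Write0 */
#include "oslib/os.h"   /* xos_write0 */

void hello_swix(void)
{
    /* Generic dispatcher: the register mapping is decoded at runtime. */
    _swix(OS_Write0, _IN(0), "Hello via _swix");
}

void hello_oslib(void)
{
    /* Dedicated veneer: a few instructions in its own AOF area, so the
     * linker only pulls in the veneers you actually call. */
    xos_write0("Hello via OSLib", NULL);
}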
Chris Evans (457) 1614 posts |
Wow! Thanks Julie and Charles for your comprehensive responses. I’m sure they’ll be very useful. |
Julie Stamp (8365) 474 posts |
I missed that before. Looking at the docs included with the current !GCC (cmunge 0.76) it says
but I think calling _clib_initialisemodule wouldn’t do the right thing for module start? OSLib surprised me too! I’d originally thought that as a library the required bits would be put in some shared area in ROM. Ben did a breakdown of ShellCLI. On the latest Obey it’s 580 bytes (admittedly that’s using SWIs for file handling rather than clib). |
Julie Stamp (8365) 474 posts |
I think this is something for anyone looking at larger components, such as the window manager, to consider carefully. Unless I’ve got the wrong end of the stick somewhere, there is relevant work happening on FileCore, which I count as a “large component”. I wonder what approach is being used? |
Charles Ferguson (8243) 427 posts |
re: library-enter-code You might be right; there were a few bits like that which were done… oddly in CMunge. |
Richard H (8675) 100 posts |
Thank you, Charles and Julie: this is an absolute goldmine of really useful information, and I am grateful to you for taking the time to put it all down on e-paper. |
Steve Pampling (1551) 8170 posts |
I’m not – that’s what I get for posting quick comments while a system change is running. Far too brief and missing pertinent information. So here we go:

The idea was to use a “disposable” setup that everyone with a PC can run and experiment with by dropping in a ROM image with the changed items. Many people treat RPCEmu as a ‘work’ system, whereas I consider it an experimental system, and somewhat cheaper1 than even a Pi and a set of SD cards. Much like the online setup you’ve done, but sitting on the laptop/desktop of every volunteer RO tester.

We can’t really propose that the standard beta nightly release becomes that level of test playground without annoying a large number of general users who use the betas because it’s a long time between stables.

1 Cheerfully binned hardware that suffered a battery age issue can become a “needs a PSU to work” laptop to run experimental stuff on, at zero cost. |
Steve Fryatt (216) 2105 posts |
I don’t see why not: that’s exactly what they’re for… |
Charles Ferguson (8243) 427 posts |
Ah, sorry, I guess I read a little too much into what you were implying. Yes, it seems completely reasonable to have an additional system integration test image /in addition/ to the other forms of testing. As for how users use the distributed images… well, that’s a matter for them, and if they treat betas as working then maybe you need a different stream. Call it ‘alpha’, maybe? Or educate users that by running a beta you’re going to get breakages, and if you don’t want that, you run a stable version. Again, that’s a distribution issue rather than an issue with the development of a system. |
Steve Pampling (1551) 8170 posts |
I see the likelihood of this being something to do alpha testing with. |
Steve Pampling (1551) 8170 posts |
I guessed that – no worries, I’m hard to insult1.
That was the way I was thinking. Since we are talking about fairly core items for much of the pathway, the chance of a base-over-apex event is high, so I thought to avoid ROM-on-SD instances that would require swapping out the card, with consequent wear and tear on the SD socket.

1 Well, the insults are easy, but I don’t take offence easily. |
Julie Stamp (8365) 474 posts |
It says nightly on the downloads page? A testing stream does sound like something to think about, though I guess it would take some doing at ROOL HQ. At least early-stage testing (before merging in GitLab) is opt-in, and in most cases doesn’t need a ROM – with Ben’s work on CI it’s now easier than ever to drop in a softload of a module you want to take for a spin. |