SharedCLibrary_LibInitAPCS_R documentation

47 posts, 5 voices


Mar 25, 2014 12:51am

_swi(x) dates from 1991 though I’m unsure when it went into CLib. Another thread suggests it went into Desktop C in 1991, so maybe from RISC OS 3.1 onwards (3.1 being 1992)?

Mar 25, 2014 7:23am

Jon Abbott (1421) 2651 posts

As there’s no mention of _swi or _swix in any official PRM and they’ve been there for some time, I suspect they were meant for internal use only. Referencing the CLib documentation on riscos.com about _kernel_swi:

Warning: If you use this function to call a SWI that returns an error longer than 148 bytes, the register dump area is corrupted; even longer errors may corrupt other vital system data. You should ensure that no error will be returned – or work round this problem by instead using the internal function _swix, which is documented in the C library header files.

It’s a fair assumption that CLib chunk id 2 has had 183 official entries and 185 unofficial entries since RO3.1, so I’ll code accordingly.

Mar 25, 2014 11:16am

Jeffrey Lee (213) 6048 posts

Warning: If you use this function to call a SWI that returns an error longer than 148 bytes, the register dump area is corrupted

That doesn’t sound very useful! (Although I’m not sure what register dump area the docs are talking about)

I wonder if that’s just a bug in ROL’s version? From a cursory look at the sources it doesn’t look like ROOL’s version suffers from the same problem (the workspace allocation shows a full 256 bytes for the error buffer).

However we do definitely have a problem in that the routine to copy the error message to the buffer doesn’t check for buffer overflows, so if a module did generate an error string longer than 252 bytes we’d be in trouble.

Mar 25, 2014 11:51am

Jon Abbott (1421) 2651 posts

Stop copying at a zero or 251 chars and append a zero? Not ideal, but will prevent corruption in the unlikely event an error is >251 chars long.
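A minimal sketch of that bounded copy in C, purely as an illustration. The names copy_errmess and ERRMESS_SIZE are made up for this example and are not taken from the CLib sources, which do this in assembler.

```c
#include <stddef.h>

#define ERRMESS_SIZE 252  /* message part of a 256-byte error block */

/* Copy an error message into dst, stopping at the terminating zero or
   after ERRMESS_SIZE - 1 (251) characters, and always append a zero.
   Returns the number of characters copied, excluding the terminator. */
static size_t copy_errmess(char dst[ERRMESS_SIZE], const char *src)
{
    size_t i = 0;

    while (i < ERRMESS_SIZE - 1 && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return i;
}
```

An over-long message is silently truncated rather than rejected, which matches the "not ideal, but prevents corruption" trade-off described above.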

Incidentally, _swi and _swix are documented in CVS.

Mar 25, 2014 8:25pm

Rick Murray (539) 13840 posts

_swi(x) dates from 1991 though I’m unsure when it went into CLib.

19-Nov-91. It’s in Lib.RISC_OSLib.clib.s.cl_entries.

It is also in RISC OS 3.10. Just poked around with an emulator and ROM image. It is the final two entries of the old-style jump table, you’ll find it at &3978E9C assuming dump starts at ROM address, or you’re looking at a live ROM. ;-)

Warning: If you use this function to call a SWI that returns an error longer than 148 bytes, the register dump area is corrupted; even longer errors may corrupt other vital system data.

The kernel lib data area defines the error block as 63 (*4) bytes. That plus the error number is 256 bytes. I think if CLib was fundamentally broken in failing with errors only a little over half the permitted size, it might have been noticed long before now.
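The arithmetic can be checked with a small C sketch. The struct below has the same shape as _kernel_oserror in kernel.h (an error number followed by 252 bytes of message); the name oserror_block is just a stand-in for this example, and int32_t is used so the check also holds on a non-RISC OS host.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as _kernel_oserror: error number plus message text. */
typedef struct {
    int32_t errnum;        /* 4-byte error number */
    char    errmess[252];  /* 63 words of message text */
} oserror_block;

/* 63 words * 4 bytes + 4 bytes of error number = 256 bytes. */
_Static_assert(sizeof(oserror_block) == 63 * 4 + 4,
               "error block should be 256 bytes");
_Static_assert(offsetof(oserror_block, errmess) == 4,
               "message should follow the error number");
```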

For what it is worth, I just added this to the start of one of my programs:

static void clib_bomb(void)
{
   _kernel_swi_regs r;

   // one long command; adjacent string literals are joined by the
   // compiler, so it is split here for sane looking forum posting...
   char cmd[] = "*Info VeryLongName345678901234567890123456789012345"
                "678901234567890123=64=89012345678901234567890123456789=100=567890123"
                "4567890123456789012345678901234567890123456789012345678901234567890";

   r.r[0] = (int)cmd;
   _kernel_swi(SWI_OS_CLI, &r, &r);
   // R0 points to the error block; the message follows the 4 byte error number
   printf("R0 = \"%s\"\n", (char *)r.r[0] + 4);

   return;
}

My program starts up displaying:

'...big filename here!...' not found

where the message is 180 bytes of filename, plus 12 bytes of message around it. 192 bytes in total.

The program continues.
No problems.

Second test was setting the filename to something ridiculous, in the order of 500 characters. In this case, there is no “' not found”, the message is clipped at max length. And, guess what. The program continues without problem.

This is testing with RISC OS 5.21 with modern DDE. Don’t have an old-DDE setup to try it with.

Furthermore – Google tells me:

Your search – 148 site:http://gerph.org/riscos/ – did not match any documents.
[neither did using _kernel_swi find any mention of this]

I’m pretty sure if there was a glaring cockup of these proportions, Justin might have mentioned it…

So – riscos.com – I invite you to supply a snippet or two of code (come on Aaron, you believe you own all incarnations of RISC OS ever created, right? so posting some code won’t be a hardship) to back up this assertion. Because, as far as I can tell, the error block is the right size, plus I have deliberately invoked this situation with the largest size error possible. And it works.

By the way – if you look at the definitions, you will see that the “registerDump” is the 11th entry down. Then the handlers. Then some various important bytes. And afterwards the “errorBuffer” and its workspace.
The only exception to this is “FatalErrorBuffer” (just below) which tramples all over the place. I guess the logic is that none of that stuff is needed as the application dies after a fatal error. This may suffer “consequences” as variables at the end are used in handling fatal errors and it is possible that a long error message may trample these (by my reckoning, that ought to be 4+164 bytes).
That said, I should point out that fatal error != _kernel_swi().

so if a module did generate an error string longer than 252 bytes we’d be in trouble.

I noticed. But is this possible? The kernel’s error handler appears to truncate errors at 252 bytes, so in theory…

BTW, while I have you – why is VSet_GenerateError still playing with the V bit in LR?

Mar 25, 2014 8:49pm

Jeffrey Lee (213) 6048 posts

BTW, while I have you – why is VSet_GenerateError still playing with the V bit in LR?

The comment just above that line explains it all :)

; In    return address, r10-r12 stacked, lr has SPSR for return

In 32bit kernels (for RISC OS 5 at least!) the SWI dispatcher pushes the PC onto the stack and then sticks the SPSR in LR. This means all the kernel SWI handlers (which will be invoked with a branch instruction, not BL) can just poke LR in order to manipulate the PSR flags. Then once the SWI is done it just branches to one of the SLVK_* routines (around line 506), still with the SPSR in LR, which handles passing control back to the caller (triggering callbacks, invoking the error handler, etc.)
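To illustrate what "poking LR" amounts to once LR holds the saved PSR, here is a small C sketch of the bit manipulation. The helper names are invented for this example; the kernel itself does this in assembler, and the only fact relied on is that V is bit 28 of the PSR.

```c
#include <stdint.h>

#define PSR_V_BIT (UINT32_C(1) << 28)  /* overflow (V) flag in an ARM PSR */

/* Flag an error return by setting V in the saved PSR word. */
static uint32_t psr_set_v(uint32_t saved_psr)
{
    return saved_psr | PSR_V_BIT;
}

/* Flag a successful return by clearing V in the saved PSR word. */
static uint32_t psr_clear_v(uint32_t saved_psr)
{
    return saved_psr & ~PSR_V_BIT;
}

/* Report whether the saved PSR word currently signals an error. */
static int psr_v_is_set(uint32_t saved_psr)
{
    return (saved_psr & PSR_V_BIT) != 0;
}
```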

Mar 25, 2014 9:04pm

Rick Murray (539) 13840 posts

This means all the kernel SWI handlers (which will be invoked with a branch instruction, not BL) can just poke LR in order to manipulate the PSR flags.

Mmm… And in doing this, you don’t need a metric tonne of defines for different behaviours in a 26 bit (LR) vs 32 bit (CPSR) world? That’s actually pretty bloody smart. [Like!]

(^_^)

Mar 27, 2014 7:19pm

Jon Abbott (1421) 2651 posts

I need to rewrite _kernel_swi and _kernel_swi_c so I can hypervise the OS_CallASWIR12 call – that bit is straightforward. The problem I have is what to do with any error returned.

How can I get the error block from R0 back into C, so _kernel_last_oserror works?

Looking at the CLib source code, it calls the CopyError macro, which in turn calls the LoadStaticBase macro, which calls the LoadStaticAddress macro. I’m guessing from that, that it’s putting the error in the client’s Static area. At that point I’m lost!

I thought of calling _kernel_last_oserror to get the address of the existing error block and then overwrite it, but that will only work if there’s already been an error.

EDIT: I should probably have said, my code is sitting between the client and CLib. eg __kernel_swi(…) → my_kernel_swi, so I have access to anything the C client does.

Mar 27, 2014 7:47pm

Jeffrey Lee (213) 6048 posts

How about:

Program calls _kernel_swi
Hypervisor intercepts and performs call manually (with X flag set, and _kernel_NONX flag cleared)
Write the result registers to the output struct (including any error pointer – to match behaviour of _kernel_swi)
If no error, exit back to program
If error, call _kernel_swi(OS_GenerateError,outputregs,outputregs) in order to pass the error onto CLib. Make sure you get the _kernel_NONX flag set correctly (i.e. as per the original SWI number)
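The steps above can be sketched in C roughly as follows. This is a host-side illustration under stated assumptions, not the hypervisor's actual code: regs_t and err_t are stand-ins for _kernel_swi_regs and _kernel_oserror, the swi_fn hook type and the demo_* stubs are hypothetical, and the two hooks stand for "perform the SWI manually" and "the real CLib _kernel_swi". The X bit (bit 17), the _kernel_NONX value and the OS_GenerateError number (&2B) are the usual RISC OS values.

```c
#include <stddef.h>
#include <stdint.h>

#define SWI_X_BIT        UINT32_C(0x00020000)  /* bit 17: return errors to the caller */
#define KERNEL_NONX      UINT32_C(0x80000000)  /* same value as _kernel_NONX */
#define OS_GENERATEERROR UINT32_C(0x0000002B)

typedef struct { uintptr_t r[10]; } regs_t;               /* stand-in for _kernel_swi_regs */
typedef struct { int errnum; char errmess[252]; } err_t;  /* stand-in for _kernel_oserror */

/* Hypothetical hook type: returns NULL on success, else the error block. */
typedef err_t *(*swi_fn)(uint32_t swi_no, regs_t *in, regs_t *out);

static err_t *intercepted_kernel_swi(uint32_t swi_no, regs_t *in, regs_t *out,
                                     swi_fn raw_swi, swi_fn clib_kernel_swi)
{
    /* Perform the call manually, with X set and _kernel_NONX cleared. */
    err_t *err = raw_swi((swi_no | SWI_X_BIT) & ~KERNEL_NONX, in, out);

    if (err == NULL)
        return NULL;                 /* no error: exit back to the program */

    out->r[0] = (uintptr_t)err;      /* error pointer in R0, as _kernel_swi leaves it */

    /* Pass the error on to CLib, keeping the caller's original NONX choice. */
    return clib_kernel_swi(OS_GENERATEERROR | (swi_no & KERNEL_NONX), out, out);
}

/* Demo stubs so the flow can be exercised off-target. */
static err_t demo_error = { 1, "Demo error" };
static uint32_t last_clib_swi;

static err_t *demo_failing_swi(uint32_t swi_no, regs_t *in, regs_t *out)
{
    (void)swi_no; (void)in; (void)out;
    return &demo_error;              /* every call fails with the same error */
}

static err_t *demo_clib_kernel_swi(uint32_t swi_no, regs_t *in, regs_t *out)
{
    (void)out;
    last_clib_swi = swi_no;          /* record what CLib was asked to call */
    return (err_t *)in->r[0];        /* the error block handed over in R0 */
}
```

On a real system the NONX case would not be expected to return to the caller, since the error would go to the error handler instead; the stub simply records the SWI number so the flag handling can be checked.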

Mar 27, 2014 8:00pm

Jon Abbott (1421) 2651 posts

OS_GenerateError

I just had the exact same thought, I’ve coded it up and am testing.

Mar 27, 2014 10:27pm

Rick Murray (539) 13840 posts

If error, call _kernel_swi(OS_GenerateError,outputregs,outputregs) in order to pass the error onto CLib. Make sure you get the _kernel_NONX flag set correctly (i.e. as per the original SWI number)

Are you only looking to have _kernel_last_oserror work? I’m wondering if it would suffice to simply remember the error yourself and then trap/fake calls to read the last OS error…?

Mar 27, 2014 11:02pm

Jon Abbott (1421) 2651 posts

The method Jeffrey suggested mirrors exactly what _kernel_swi / _kernel_swi_c does and works perfectly.

It seems to be working, Populous and Quest for Gold are both loading on the Pi, so CLib seems to be functional.

Mar 28, 2014 7:47am

Jon Abbott (1421) 2651 posts

Gerph’s rambles on C library make for an interesting read. StubsG sounds intriguing, as does the 32bitification:

in the 26bit environment, the module generates veneers for use with the 32bit clients so that the clients operate correctly despite the calling conventions differing – this is similar to the extant interfaces for the APCS-A handling

I’ve implemented the reverse: APCS-R client → veneer → APCS-32 CLib

He mentions veneers for callbacks, the only ones I can see are in the language description block used by _kernel_init. As both Populous and Quest for Gold run fine under StrongARM emulation, but do odd things on physical StrongARM (and behave identically on the Pi), I’m wondering if I’ve missed any entry points back into the client, meaning sections of the client code aren’t being run through the JIT and are falling foul of the split cache.

I’m also wondering if I should at some point add veneers for APCS-A → APCS-32 as some of the early C games will use it. Some of the Magnetic Scrolls adventures spring to mind. I need to find some documentation on APCS-A to see if it’s possible.

Mar 28, 2014 11:25am

Jon Abbott (1421) 2651 posts

I’m wondering if I’ve missed any entry points back into the client, meaning sections of the client code aren’t being run through the JIT and are falling foul of the split cache.

What creates the following block at 8000? Is it CLib or Squash?

BL xxx
BL xxx
BL 8040
BL xxx
SWI OS_Exit
followed by 11 values
.8040

Under emulation (clicking Conquest on Populous), the BL at 8000 and the 11 values are overwritten, it then branches to 8000. On physical this isn’t happening, so should I be investigating the game code or something CLib related?

Mar 28, 2014 1:10pm

Jon Abbott (1421) 2651 posts

Following the code, I discovered _main needed a veneer, however after returning from _main and the veneer subsequently returning to CLib, it still breaks on physical. What does CLib execute after _main returns?

I’ve almost certainly missed an entry point somewhere. Should I be doing anything about setjmp / longjmp or jmp_buf ? I can’t find any detail on what’s in jmp_buf, but I’m guessing it’s a code/stack/return pointer etc.

The entry points to the client that I’ve veneered so far are:

InitProc (and the execution address returned in R0 which I’ve treated as _run)
_run
FinaliseProc
_main

I need to ensure any application space code is executed through the JIT, so all entry points need a veneer to get into and out of the JIT.

EDIT: The only other entry points I can find in the documentation are: _kernel_register_allocs, _kernel_register_slotextend and atexit.

EDIT2: I’ve added _kernel_register_allocs, _kernel_register_slotextend and atexit and am still seeing the same problem. _main returns to CLib, which then doesn’t continue execution.

EDIT3: After _main returns to CLib, CLib should exit back to the Obey script that ran the program. That happens under emulation, but not on physical, which has me completely perplexed as it’s outside of the JIT. It shouts cache problems to me, I can’t think what else would work under emulation and fail on physical.

Mar 29, 2014 12:24pm

Jon Abbott (1421) 2651 posts

I now have both Populous and Quest for Gold working on the Pi. It took a good week to code the veneers etc., but I got there in the end.

Thank you all for your assistance with this, not knowing anything about CLib I would have struggled otherwise.

APCS_A will be the next one to veneer, although I’ll leave that until it’s actually required. If anyone has any documentation for it, please let me know.

Mar 29, 2014 6:03pm

Jeffrey Lee (213) 6048 posts

APCS_A will be the next one to veneer, although I’ll leave that until it’s actually required. If anyone has any documentation for it, please let me know.

I assume you’ve seen the documentation in the PRMs? (PRM4-408)

Mar 29, 2014 7:22pm

Jon Abbott (1421) 2651 posts

I’ve read the section on the APCS-A procedure call standard here; from that I gather it just needs sl, fp, ip, sp swapped around. If it were that simple, I’d expect CLib to have inbuilt backward compatibility for it when APCS-R was introduced.

What I’m really after is CLib chunk 1 / 2 documentation from the APCS-A era, so I can confirm if any entry/exit changes are required to any CLib functions – some must have changed since 1987.

Mar 29, 2014 8:49pm

Theo Markettos (89) 919 posts

What creates the following block at 8000? Is it CLib or Squash?
BL xxx
BL xxx
BL 8040
BL xxx
SWI OS_Exit
followed by 11 values
.8040
Under emulation (clicking Conquest on Populous), the BL at 8000 and the 11 values are overwritten, it then branches to 8000. On physical this isn’t happening, so should I be investigating the game code or something CLib related?

This is the AIF header. If the file is squeezed all BLs but the first are replaced by BLNV, and the first BL branches to the unsqueeze code at the end of the file. When it is done, the BLNVs at the beginning are rewritten to BLs and then it branches back to &8000. That’s how it worked on RISC OS <3.7. The C compiler output unsqueezed binaries with vanilla AIF headers, and running them through ‘squeeze’ put on the compression code and BLNVs. You can tell what version of squeeze was used by looking at the very end of the file – it says something like rcc 4.0 for Acorn C v4.

However, some versions of this unsqueeze code were not StrongARM compatible. So the UnSqueezeAIF module (UnSqzAIF?) catches the initial load and does its own unsqueeze (or hands it around via Service_UKCompression). That means the AIF’s unsqueeze may not get run. If you’re doing something unusual that might trip you up. This is purely the OS loading the Absolute at this stage – CLib doesn’t get to know about it until the executable is running and calls a SharedCLib SWI.

Mar 29, 2014 9:09pm

Theo Markettos (89) 919 posts

What I’m really after is CLib chunk 1 / 2 documentation from the APCS-A era, so I can confirm if any entry/exit changes are required to any CLib functions – some must have changed since 1987.

I suspect you need Acorn C v1 or v2 documentation for that. C v3 (1989) might mention it in passing as in ‘the following is obsoleted…’. Or Arthur PRM/RISC OS 2 PRM likewise.

Mar 30, 2014 5:37am

Jon Abbott (1421) 2651 posts

This is the AIF header

Theo, thanks for that explanation, makes it very clear what’s going on. The SoE will be:

*Obey … → OS OS_FSControl
*Run …  → paravirtualized OS_FSControl → Service_UKCompression step 1
                                       → Service_UKCompression step 2
                                       → OS_SynchroniseCodeAreas
                                       → 8000 → _kernel_init → _main

It doesn’t look like Service_UKCompression is doing the decompression, but it doesn’t really make that much difference as the internal decompression goes through the JIT. I should probably look into this step though, as I’d expect Service_UKCompression to catch all compressed Absolutes that aren’t veneered.

The paravirtualized OS_FSControl mirrors the OS source code for this step, by checking the first instruction in an Absolute for BLNV or MOV R0, R0 and offering it to Service_UKCompression if it’s neither.
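As a rough illustration of that check, the sketch below classifies a header word by its ARM encoding and applies the "neither BLNV nor MOV R0, R0" rule. The function and enum names are invented for this example and it is not lifted from the OS sources; only the instruction encodings (BL with condition AL is &EBxxxxxx, BL with condition NV is &FBxxxxxx, MOV R0, R0 is &E1A00000) are relied on. Which header word should actually be inspected is a separate question.

```c
#include <stdint.h>

#define ARM_MOV_R0_R0 UINT32_C(0xE1A00000)  /* MOV R0, R0: the usual ARM no-op */

typedef enum { WORD_BL, WORD_BLNV, WORD_NOP, WORD_OTHER } aif_word_kind;

/* Classify one AIF header word by its instruction encoding. */
static aif_word_kind classify_aif_word(uint32_t word)
{
    if (word == ARM_MOV_R0_R0)
        return WORD_NOP;
    if ((word & UINT32_C(0xFF000000)) == UINT32_C(0xEB000000))
        return WORD_BL;      /* BL, condition AL (always) */
    if ((word & UINT32_C(0xFF000000)) == UINT32_C(0xFB000000))
        return WORD_BLNV;    /* BL, condition NV (never) */
    return WORD_OTHER;
}

/* The rule as described: offer the image to Service_UKCompression when
   the first instruction is neither BLNV nor MOV R0, R0. */
static int should_offer_to_ukcompression(uint32_t first_word)
{
    aif_word_kind kind = classify_aif_word(first_word);

    return kind != WORD_BLNV && kind != WORD_NOP;
}
```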

Where things could be failing (and I have seen fail several times) is the OS_SynchroniseCodeAreas step that’s done after the Service_UKCompression pre/post steps. From my tests OS_SynchroniseCodeAreas doesn’t always succeed in flushing the cache. This isn’t a fault of the code, but looks like an ARM issue, as I was seeing the same problem in the JIT when using flush D-cache entry after writing and executing a codelet. On both StrongARM and the Pi it randomly fails to flush the cache entry you specify. In the end, I had to resort to a full cache flush prior to the JIT switching to the execute stage. Somewhat frustrating, but not really a performance issue due to the very low number of execute stage switches the JIT performs – I’ve not worked it out statistically but it’s probably in the execute stage 99.9% of the time and 0.1% in the decode/compile stage.

Back on topic… it seems to all be working on the Pi with the latest RO alpha. StrongARM is doing something odd though when CLib is initialised. Games quite often write directly to the IRQ vector at &18. On the Pi this is emulated; on StrongARM I’ve made page zero read-only and proxy any writes to it so I can fix up the change from using B … to LDR PC, … that happened post RO3.11. This proxy write, which is done in SVC, reports “Abort of Data transfer” whilst CLib is initialised and I can’t for the life of me see how that’s happening, as SVC has r/w access to page zero!
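For anyone following along, here is a small C sketch of how a B instruction written to a vector such as &18 can be decoded back to the handler address it targets, which is the information a proxy would presumably need before substituting an LDR PC form. The function name is invented for this example and it is not the code used in the JIT; it only relies on the standard ARM branch encoding (target = address + 8 + signed 24-bit offset * 4) and ignores 26-bit address wrapping.

```c
#include <stdint.h>

/* If insn is an unconditional B, compute the address it branches to when
   placed at addr and store it in *target. Returns 1 on success, or 0 if
   insn is not a B with condition AL. */
static int decode_branch(uint32_t insn, uint32_t addr, uint32_t *target)
{
    uint32_t offset;

    if ((insn & UINT32_C(0xFF000000)) != UINT32_C(0xEA000000))
        return 0;

    offset = insn & UINT32_C(0x00FFFFFF);
    if (offset & UINT32_C(0x00800000))
        offset |= UINT32_C(0xFF000000);   /* sign-extend the 24-bit field */

    /* The PC reads as addr + 8; unsigned arithmetic wraps modulo 2^32. */
    *target = addr + 8 + (offset << 2);
    return 1;
}
```

For example, &EA001FF8 placed at &18 decodes to a branch to &8000.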

I suspect you need Acorn C v1 or v2 documentation for that.

I think I spotted a floppy of C v2 last night in one of the boxes the project was given, so will look on that to see if it includes the documentation. When I was looking at the !Si source the other day (with a view to modernising it and superseding !ARMSi), I noticed I’d coded the benchmarks in Norcroft C (annoyingly the source for these seems to be missing), I’m not sure if that uses CLib or not, so I may or may not have the documentation somewhere!

EDIT: My copy of C didn’t contain any documentation so I’m still after the C v1/2 documentation.

Mar 31, 2014 8:39am

Jon Abbott (1421) 2651 posts

The hypervised OS_FSControl mirrors the OS source code for this step, by checking the first instruction in an Absolute for BLNV or MOV R0, R0 and offering it to Service_UKCompression if it’s neither.

Shouldn’t OS_FSControl look at the 2nd word in an absolute to determine if it’s compressed? I was just looking at a recently compressed C file and noticed the MOV R0, R0 is at 8004, not 8000. Likewise for older C files, the first BLNV is also at 8004.

EDIT: I’ve found the current AIF header documentation, which has answered this, although it’s still not clear how you actually determine if an Absolute is actually compressed. The MOV R0, R0 is always present at 8000 in the current AIF standard, so Service_UKCompression is never called for current Absolutes, but may be for legacy files that follow the previous standard of which I’ve not found any documentation.
