OS_ClaimProcessorVector again

147 posts, 30 voices

Aug 31, 2021 8:01pm

Rick Murray (539) 13839 posts

What Paolo said

Yup, I was being sarcastic. ;-)
Those who aren’t interested in current goings on and wish to remain in the past, well, they won’t be here, this doesn’t concern them.

However, a modern PC can put in a pretty decent emulation of an ARM machine, plus with the ability to cross compile one isn’t stuck with things like the DDE and Zap/StrongEd but can use modern development tools (hmm, like CoPilot to steal the code for you!).

I think nemo had, in the past, made his feelings about this clear.
There’s someone else (I forget who) who doesn’t run RISC OS on native ARM.

It is an option that people use, so even though it’s a bit of a drag with respect to modernity, I don’t think we’re in a position to ditch the RiscPC class just yet, as that’s our RISC-OS-on-something-else option.
Our only RISC-OS-on-something-else option.

We need a better emulator ¹. But “Who?” (GOTO 10) ;)

¹ And woe is you if said emulator doesn’t run on Windows (back to XP), Linux (all forty seven distros), MacOS, and Sarah’s granny’s smart-toaster. And that one weird person that will want it for Minix.

Aug 31, 2021 8:53pm

Stuart Swales (8827) 1357 posts

There’s someone else (I forget who) who doesn’t run RISC OS on native ARM.

I think there are a number of devs who will spin up a RPCEmu instance to quickly recompile/test something rather than switching their ARM hardware on (or it may not even be to hand).

Feb 26, 2022 7:58pm

Jeffrey Lee (213) 6048 posts

If anyone’s wondering what happened to this, I implemented it on this branch of the kernel and then spent a while updating FPEmulator and VFPSupport to use it (and then further updates and testing to fulfil the goal of getting FPA+VFP working with the SMP module). The documentation is currently incomplete, but this post has a summary of the entry state for the handlers.

My goal for the next few weeks (/months) is to get this tidied up and submitted. There are a couple of pre-requisites which need merging first, so it might take a while.

Feb 26, 2022 11:08pm

Paolo Fabio Zaino (28) 1882 posts

@ Jeffrey

If anyone’s wondering what happened to this, I implemented it on this branch of the kernel and then spent a while updating FPEmulator and VFPSupport to use it (and then further updates and testing to fulfil the goal of getting FPA+VFP working with the SMP module).

THANK YOU soooo much! I was hoping you would start implementing support for the VFP (and you know why), this is AWESOME news! :D

I know you’re busy, so my apologies for this request, but as in the past could I have a build of it please?

I really want to port Genann to support your SMP work, but, as you know well, I am “locked” to use Floating point (this is normal in ML).

Thanks in advance for your time, answer and work!

Feb 27, 2022 7:08am

Jon Abbott (1421) 2651 posts

Thanks for the update Jeffrey. A couple of questions:

When are you planning to submit the change to the main trunk?
When will the documentation be available?
Are the changes applicable to all CPUs, i.e. ARMv4 onward?

Once the documentation is available, I’ll have a better idea of what (if any) changes I need to make to the Abort handlers in ADFFS and if I need to consider cutting support for earlier OS versions.

Feb 27, 2022 1:00pm

Jeffrey Lee (213) 6048 posts

I know you’re busy, so my apologies for this request, but as in the past could I have a build of it please?

Sure. There are a couple of tweaks I want to make, and then I’ll email you a new build.

When are you planning to submit the change to the main trunk?

There are a couple of pre-requisites which need finalising and merging first, so it’ll take at least a few weeks before the OS_ClaimProcessorVector changes go in.

When will the documentation be available?

Either today, or some time next week. I’ll post here once it’s available.

Are the changes applicable to all CPUs?

Yes, the new API works on all machines currently supported by RISC OS 5.

The old API is still there and can still be used, so there’s no immediate requirement for software to switch over to it.

Mar 2, 2022 11:44am

David Pitt (3386) 1248 posts

A Pi ROM build of the WIP: A useful SMP threading system came a bit unstuck. All of the seven SMP modules including SyncLib 0.05 were added to the current Dev beta and built with DDE30d.

Kernel (Sources.Kernel)...
amu -E  install_rom INSTDIR=ADFS::ROOL.$.Rpi-SMP.BCM2835.RiscOS.Install.ROOL.BCM2835.RISC_OS COMPONENT=Kernel TARGET=Kernel ASFLAGS="-PD \"CMOS_Override SETS \\\"= FileLangCMOS,fsnumber_SDFS,CDROMFSCMOS,&C0\\\"\""
do mkdir -p bin
SetEval KernelBase "4" + STR ( 227858432 + ( HALSize LEFT ( LEN HALSize - 1 ) ) * 1024 )
Do link -aif -base <KernelBase> -RW-base 0xff000000 -bin -d -o bin.Kernel_aif GetAll.o SEH.o support.o aborttrap.o atarm.o atcontext.o atinstr.o aterrors.o atmem.o C:SyncLib.o.SyncLib
ARM Linker: (Warning) Attribute conflict between AREA SEH(C$$code) and image code.
ARM Linker: (attribute difference = {NO_SW_STACK_CHECK}).
ARM Linker: (Error) Relocated value too big for instruction sequence.
ARM Linker: (at 0x24 in barrier(Asm$$Code): offset/value = 0x2fb4b58 bytes)
ARM Linker: (Error) Relocated value too big for instruction sequence.
ARM Linker: (at 0x28 in barrier(Asm$$Code): offset/value = 0x2fb4b58 bytes)
ARM Linker: (Error) Relocated value too big for instruction sequence.
ARM Linker: (at 0x30 in barrier(Asm$$Code): offset/value = 0x2fb4b4c bytes)
ARM Linker: (Error) Relocated value too big for instruction sequence.
ARM Linker: (at 0x34 in barrier(Asm$$Code): offset/value = 0x2fb4b4c bytes)
ARM Linker: garbage output file bin.Kernel_aif removed
ARM Linker: finished,  5 informational, 1 warning and 4 error messages.
AMU: *** exit (1) ***

It is work in progress but I report this in case it is unexpected.

Oddly, or not, a Titanium build of the same did complete and did run but the SMP module reported -1 cores. This may be because the WIP has not got as far as the Titanium.

Mar 2, 2022 1:19pm

Jeffrey Lee (213) 6048 posts

Thanks for the info. I’ve just pushed a fork of SyncLib which should fix the issue (I’ve been using that fork for a while, I just didn’t realise that it might have been necessary for things to build correctly)

The Titanium and PineA64 HALs need updating to support the SMP module, but the other components should all work (if not, then it’s a bug).

Mar 2, 2022 2:21pm

David Pitt (3386) 1248 posts

I’ve just pushed a fork of SyncLib which should fix the issue

Many thanks, it has fixed the issue. The Pi’s four cores can now be seen.

Mar 4, 2022 8:35pm

Jeffrey Lee (213) 6048 posts

Documentation:

OS_ClaimProcessorVector

    In:     r0 = vector and flags
                    bit     meaning
                    0-6     vector number
                            0 = 'Branch through 0' vector
                            1 = Undefined instruction
                            2 = SWI
                            3 = Prefetch abort
                            4 = Data abort
                            5 = Address exception (only on ARM 2 & 3)
                            6 = IRQ
                            7+ = reserved for future use
                    7       0 = no flags in bits 9-31
                            1 = flags in bits 9-31
                    8       0 = release
                            1 = claim
                    9       0 = old API
                            1 = new API
                    10-31   reserved (set to 0)

Old API:
    In:     r1 = replacement value
            r2 = value which should currently be on vector (only needed for release)

    Out:    r1 = value which has been replaced (only returned on claim)

New API:
    In:     r1 = handler address
            r2 = handler R12 value

    Out:    All regs preserved

Old versions of the kernel ignored bits 9-31 of R0. To provide room for future expansion, bit 7 is now interpreted as a flag to say that bits 9-31 are present. On old kernels setting this bit will cause the OS_ClaimProcessorVector call to fail (the kernel will think you’re trying to claim/release an invalid vector number).

This means that in order to select the new API, bits 7 and 9 must both be set.
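As a minimal sketch of the bit layout described above (written in Python purely to show the arithmetic), the R0 value could be assembled as follows. The helper name `cpv_r0` and the constant names are invented for this example and are not part of any RISC OS header; the restriction to vectors 1-4 and 6 is taken from the New API description.

```python
# Illustrative model of the R0 flags word; all names are hypothetical.
FLAGS_PRESENT = 1 << 7    # bit 7: bits 9-31 carry flags
CLAIM = 1 << 8            # bit 8: 1 = claim, 0 = release
NEW_API = 1 << 9          # bit 9: 1 = new API, 0 = old API

VEC_UNDEFINED, VEC_SWI, VEC_PREFETCH, VEC_DATA, VEC_IRQ = 1, 2, 3, 4, 6
NEW_API_VECTORS = {1, 2, 3, 4, 6}


def cpv_r0(vector, claim, new_api):
    """Return the R0 value for claiming or releasing `vector`."""
    if not 0 <= vector <= 6:
        raise ValueError("vector numbers 7+ are reserved")
    r0 = vector
    if claim:
        r0 |= CLAIM
    if new_api:
        if vector not in NEW_API_VECTORS:
            raise ValueError("the new API only covers vectors 1-4 and 6")
        # Bit 7 must accompany bit 9, otherwise bits 9-31 are not looked at.
        r0 |= FLAGS_PRESENT | NEW_API
    return r0


print(hex(cpv_r0(VEC_DATA, claim=True, new_api=True)))    # 0x384
print(hex(cpv_r0(VEC_DATA, claim=True, new_api=False)))   # 0x104
```

With bit 7 set, the low eight bits of the first value are &84, which is presumably what an old kernel would reject as an invalid vector number.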

Old API

The old API can be used for vector numbers 0-6 inclusive.

On entry, the register state is the same as when the CPU took the exception, except that CPSR_fs may have been corrupted by a previous handler in the chain.

To claim the exception, the appropriate “exception return” operation should be performed in order to return to the foreground, with whatever register values are deemed necessary.

To pass on the call to the next handler, the handler must preserve all registers (except CPSR_fs) and branch to the address of the previous handler (which was returned in R1 by the OS_ClaimProcessorVector claim call).

On future SMP versions of RISC OS, it’s expected that old-API handlers will only be called for code running on the primary core.

New API

The new API can only be used for vector numbers 1-4 and 6.

All the old handlers will be called before any of the new handlers.

All vector types follow the same pattern for entry/exit:

Entry:

  R0 = pointer to vector-specific register dump
  R12 = "handler R12 value" that was passed to OS_ClaimProcessorVector
  R13 = full-descending stack
  R14 = return address
  Processor is in the relevant exception handler CPU mode
  IRQ+FIQ state is unchanged from exception entry

Exit:

  R0 = Handler result:
       0 = Claim the exception
       1 = Pass on to next handler
       Other values are interpreted as an error block pointer
  R1-R12, R14, CPSR_sf can be corrupted
  The CP15 registers which hold the exception state can be corrupted (i.e. by triggering a recursive exception)

Handlers must pass control back to the kernel by returning to the return address in R14.

When claiming exceptions, handlers can (and usually must) update the R0-R14 & SPSR values in the register dump. It’s these values that the kernel will restore when returning from the exception. R0-R13 will be the new R0-R13 values for the exception handler mode, the SPSR will be restored to the CPSR, and execution will continue from the R14 value. The other registers stored in the register dump (e.g. CP15 DFAR) are not restored.

If a handler returns a non-serious error (bit 31 of error number not set), behaviour is as if the handler passed on the exception to the next handler in the list. However it’s possible that future versions of the kernel, or special builds, will log this error somewhere for diagnostics.

If a handler returns a serious error (bit 31 of error number set), this will cause the kernel to raise the error. The exact behaviour depends on the exception type.
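The exit rules above can be modelled with a toy dispatcher. This is only a sketch of the documented behaviour, not kernel code: the function names are hypothetical, and the `error_numbers` dictionary stands in for reading the error number from the first word of an error block.

```python
# Toy model of how new-API handler results are interpreted; not kernel code.
# All names here are hypothetical and exist only for this illustration.
CLAIM, PASS_ON = 0, 1
SERIOUS = 1 << 31          # bit 31 of the error number marks a serious error


def dispatch(handlers, dump, error_numbers):
    """Call handlers in order until one claims or returns a serious error.

    handlers:      callables taking the register dump, returning an R0 value
    error_numbers: maps an error block "pointer" to its error number
    Returns ("claimed", None), ("raise", pointer) or ("unclaimed", None).
    """
    for handler in handlers:
        result = handler(dump)
        if result == CLAIM:
            return ("claimed", None)
        if result == PASS_ON:
            continue
        # Any other value is treated as a pointer to an error block.
        if error_numbers[result] & SERIOUS:
            return ("raise", result)
        # Non-serious error: behaves exactly like PASS_ON.
    return ("unclaimed", None)


def fix_up(dump):
    dump["R0"] = 42        # edited dump values are what the kernel restores
    return CLAIM


def grumble(dump):
    return 0x8000          # pretend address of a non-serious error block


errors = {0x8000: 0x1234, 0x9000: 0x80000002}
dump = {"R0": 0}
print(dispatch([grumble, fix_up], dump, errors), dump["R0"])
# ('claimed', None) 42
```

For the SWI and IRQ vectors the documentation says serious errors are currently ignored, so there the "raise" branch would behave like a pass-on instead.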

On future SMP versions of RISC OS, new-API handlers will be called for exceptions that occur on any CPU core, not just the primary core.

Undefined instruction handler

The register dump is a 16 word structure containing R0-R14_und and SPSR_und
- These register values reflect the CPU state when the CPU took the exception
If a serious error is returned, the kernel will raise it using the same mechanism that it uses for standard “Undefined instruction” errors. I.e.:
- SeriousErrorV 0 (Collect) will be called
- The IRQ, SVC, ABT and UND mode stacks will be flattened and IRQsema reset
- The registers will be copied from the abort handler register dump into the OS_ChangeEnvironment Exception registers block
- SeriousErrorV 1 (Recover) and 2 (Report) will be called
- OS_GenerateError will be called to raise the error
If none of the handlers claim the exception or raise a serious error, control will pass to the Undefined Instruction environment handler
- The kernel will restore all the registers from the register dump, so that the environment handler is entered in the same state as used by previous OS versions

SWI handler

The register dump is a 17 word structure containing R0-R14_svc, SPSR_svc and the SWI number
- These register values reflect the CPU state when the CPU took the exception
- The SWI number is only taken from the SWI instruction; i.e. OS_CallASWI will be reported as OS_CallASWI.
Currently, returning serious errors isn’t supported; the error will be ignored and control will pass to the next handler

Prefetch abort handler

The register dump is a 19 word structure containing R0-R14_abt, SPSR_abt, IFAR, IFSR, and AIFSR
- These register values reflect the CPU state when the CPU took the exception
- For CPUs which lack the CP15 IFAR register, the stored value will be R14-4, which will be correct for everything except Jazelle
- For CPUs which lack IFSR or AIFSR, the stored values will be undefined
If a serious error is returned, the kernel will raise it using the same mechanism that it uses for standard “Prefetch abort” errors. I.e.:
- OS_ReadSysInfo 7 will be updated with the PC, PSR & IFAR+4 values taken from the register dump
- SeriousErrorV 0 (Collect) will be called
- The IRQ, SVC, ABT and UND mode stacks will be flattened and IRQsema reset
- The registers will be copied from the abort handler register dump into the OS_ChangeEnvironment Exception registers block
- SeriousErrorV 1 (Recover) and 2 (Report) will be called
- OS_GenerateError will be called to raise the error
If none of the handlers claim the exception or raise a serious error, and OS_AbortTrap and lazy task swapping are unable to deal with the exception, control will pass to the Prefetch Abort environment handler
- The kernel will restore all the registers from the register dump (including the CP15 registers), so that the environment handler is entered in the same state as used by previous OS versions

Data abort handler

The register dump is a 19 word structure containing R0-R14_abt, SPSR_abt, DFAR, DFSR, and ADFSR
- These register values reflect the CPU state when the CPU took the exception
- For CPUs which lack the CP15 ADFSR register, the stored value will be undefined
If a serious error is returned, the kernel will raise it using the same mechanism that it uses for standard “Data abort” errors. I.e.:
- OS_ReadSysInfo 7 will be updated with the PC, PSR & DFAR values taken from the register dump
- SeriousErrorV 0 (Collect) will be called
- The IRQ, SVC, ABT and UND mode stacks will be flattened and IRQsema reset
- The registers will be copied from the abort handler register dump into the OS_ChangeEnvironment Exception registers block
- SeriousErrorV 1 (Recover) and 2 (Report) will be called
- OS_GenerateError will be called to raise the error
If none of the handlers claim the exception or raise a serious error, and OS_AbortTrap and lazy task swapping are unable to deal with the exception, control will pass to the Data Abort environment handler
- The kernel will restore all the registers from the register dump (including the CP15 registers), so that the environment handler is entered in the same state as used by previous OS versions
- The only exception is that on ARMv3 CPUs the DFAR & DFSR registers can’t be restored because they’re read-only. This will cause the environment handler to see the wrong values if any recursive data aborts were triggered

IRQs

The register dump is a 16 word structure containing R0-R14_irq and SPSR_irq
- These register values reflect the CPU state when the CPU took the exception
Currently, returning serious errors isn’t supported; the error will be ignored and control will pass to the next handler
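To make the register dump sizes easier to compare, here is a small Python model of the five layouts. It assumes the fields are stored as consecutive words in the order the text lists them, which the documentation implies but does not spell out; the dictionary keys and the helper name are made up for this example.

```python
# Word layouts taken from the descriptions above. The assumption that fields
# are consecutive words in the listed order is mine; names are illustrative.
CORE = [f"R{n}" for n in range(15)] + ["SPSR"]   # R0-R14 plus SPSR = 16 words

DUMP_LAYOUT = {
    "undefined": CORE,
    "swi":       CORE + ["SWI number"],
    "prefetch":  CORE + ["IFAR", "IFSR", "AIFSR"],
    "data":      CORE + ["DFAR", "DFSR", "ADFSR"],
    "irq":       CORE,
}


def byte_offset(vector, field):
    """Byte offset of `field` from the dump pointer passed in R0."""
    return 4 * DUMP_LAYOUT[vector].index(field)


for name, fields in DUMP_LAYOUT.items():
    print(f"{name:10} {len(fields)} words")

print(byte_offset("data", "DFAR"))   # 64
```

Under that assumption a data abort handler would find DFAR 64 bytes into the dump, directly after the saved SPSR.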

Jan 1, 2023 2:38am

Timothy Baldwin (184) 242 posts

The implementation requires interrupts to be disabled on exit from the handler to protect SPSR_svc from modification by interrupt handlers but the documentation does not state this.

I propose the following to abstract away the processor modes:

Export via OS_ReadSysInfo 6 aborttrap_reg_read etc from Kernel.aborttrap.s.atcontext. This would need extending to cope with different processor modes.
Add a byte to the structure to indicate the mode of exception for the above routines.
Aborttrap internally extends the register dump with SPSR_svc, R13_svc and R14_svc; add these registers. Alternatively, reserve space and provide routines to switch modes and save and restore the registers.

A compromise between the original faulting mode register and the current interface would be to store non-FIQ R0 to R12, R13 to R15 from the faulting code, and R13 from the exception handling mode (or just define or export a stack frame size?). However, that would complicate the SVC and IRQ handlers needlessly.

The problem of wrapping SWI calls needs work, perhaps by endorsing relocating the stack to introduce an additional stack frame, this is complicated by the presence of the exception handler block. Perhaps a routine to do that should be exported.

I am however concerned that this approach is slow, with a lock on every SWI instruction (and said lock implemented with a lock instead of using LDREX and STREX directly).

I’ve measured RISC OS taking 251906 SWI calls to boot to the desktop with unmodified HardDisc4, taking about 0.25 seconds with a Neoverse N1 in Graviton 2. That is 1,000,000 SWI calls per second, compiling RISC OS is about 300,000 SWI calls per second.

I’ve benchmarked contended atomic increments at 20 nanoseconds using ARMv8 32-bit instructions and 38 nanoseconds using ARMv7 instructions, so that would suggest a 4% slowdown on a load as SWI-call heavy as booting.
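The estimate can be reproduced from the figures quoted above. This Python sketch assumes every SWI pays the full contended cost and that nothing else changes; the function and variable names are illustrative.

```python
# Back-of-envelope check using only the numbers quoted in the post.
boot_swis = 251906        # SWI calls to reach the desktop
boot_seconds = 0.25       # approximate boot time
swi_rate = boot_swis / boot_seconds


def slowdown(rate_per_second, cost_ns):
    """Fraction of each second spent on the extra atomic operation."""
    return rate_per_second * cost_ns * 1e-9


print(f"{swi_rate:,.0f} SWIs/s")                        # 1,007,624 SWIs/s
print(f"ARMv8, boot:  {slowdown(swi_rate, 20):.1%}")    # 2.0%
print(f"ARMv7, boot:  {slowdown(swi_rate, 38):.1%}")    # 3.8%
print(f"ARMv7, build: {slowdown(300_000, 38):.1%}")     # 1.1%
```

The 4% figure therefore corresponds to the ARMv7 timing; with the ARMv8 instructions the same load comes out at about 2%.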

I tried adding an atomic counter increment into the SWI handler:

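10      mov     r10, #&2000             ; address of the test counter
        ldaex   r14, [r10]              ; load-acquire exclusive: read the counter
        add     r14, r14, #1            ; increment it
        stlex   r10, r14, [r10]         ; store-release exclusive; r10 = 0 on success
        cmp     r10, #0
        bne     %BT10                   ; reservation lost: reload the address and retry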

With that change I timed the “export_libs”, “resources” and “rom” phases of an IOMD ROM build across 4 cores running on RISC OS on Linux 6.1.1 on an AWS c6g.xlarge, using the time command of GNU bash.

With the atomic counter private to each core, the times were:

real	0m32.945s	0m32.692s	0m32.799s
user	0m58.478s	0m58.774s	0m58.243s
sys	0m28.289s	0m28.001s	0m28.311s

With the atomic counter shared:

real	0m33.081s	0m33.022s	0m33.035s
user	0m59.892s	0m59.662s	0m59.477s
sys	0m28.519s	0m28.591s	0m28.274s

That is a 2% slow down (user cpu time), just from the inter-core latency.
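The percentage can be checked from the two sets of timings. The sketch below averages the three runs of each configuration and compares the means; averaging is my choice of summary, and the helper names are illustrative.

```python
from statistics import mean


def seconds(text):
    """Convert bash `time` output such as '0m58.478s' to seconds."""
    minutes, rest = text.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


private = {
    "real": ["0m32.945s", "0m32.692s", "0m32.799s"],
    "user": ["0m58.478s", "0m58.774s", "0m58.243s"],
    "sys":  ["0m28.289s", "0m28.001s", "0m28.311s"],
}
shared = {
    "real": ["0m33.081s", "0m33.022s", "0m33.035s"],
    "user": ["0m59.892s", "0m59.662s", "0m59.477s"],
    "sys":  ["0m28.519s", "0m28.591s", "0m28.274s"],
}


def relative_change(before, after):
    """Relative difference between the mean of two lists of timings."""
    b = mean(seconds(t) for t in before)
    a = mean(seconds(t) for t in after)
    return (a - b) / b


for key in ("real", "user", "sys"):
    print(f"{key}: {relative_change(private[key], shared[key]):+.1%}")
# real: +0.7%
# user: +2.0%
# sys: +0.9%
```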

I’ve uploaded these changes to the Atomic-SWI-Test tag of my git repository.

This performance loss can be removed by removing the mutex and instead providing synchronisation by OS_ClaimProcessorVector using an inter-processor interrupt to execute a memory barrier on all cores, which would synchronise with vector dispatch due to the fact that interrupts are disabled whilst it is running.

To be continued…

Jan 1, 2023 12:36pm

Jeffrey Lee (213) 6048 posts

The implementation requires interrupts to be disabled on exit from the handler to protect SPSR_svc from modification by interrupt handlers but the documentation does not state this.

Documentation error – thanks. I was meant to say that CPSR_sf can be corrupted, but wrote _cf instead.

I propose the following to abstract away the processor modes:

That’s something I hadn’t considered doing. I guess it makes sense, to allow easier use of higher-level languages, or to abstract over the differences between whether the AArch32 exception is handled in AArch32 or AArch64 (or some other architecture for emulators). But obviously any register access via a function is going to be slower than direct access ;-)

I am however concerned that this approach is slow, with a lock on every SWI instruction (and said lock implemented with a lock instead of using LDREX and STREX directly).

For single-core machines that could be solved by disabling the spinlocks (which would also get rid of the exception handler block)

For the multicore work I’ve been doing, optimisation is still very much in the “I’ll do it later” category, mainly because I haven’t yet hit the milestone of regular C apps being able to use threads via a standard API (e.g. C11 threads). (Last time I was working on it, I ran into a roadblock with the way I was handling OS_CallBack/etc. when the primary core is in the idle thread). But once that’s done, and I’ve got easy access to a reliable high-resolution time source (soon!) I’ll be in a much better position to start identifying and fixing performance issues or other faults, and getting the code to a mergeable state. At the moment one of the big problems is that half the code is in the SMP module which does some nasty things to hook itself into the running OS/kernel, so there are limits to what can be done with that implementation (e.g. the kernel doesn’t know that the other cores exist, so there’s no mechanism for it to send an interrupt or message to them)

Jan 1, 2023 5:50pm

Simon Willcocks (1499) 513 posts

How about we don’t let user programs just take over the whole processor? Jeez.

Jan 1, 2023 7:54pm

Rick Murray (539) 13839 posts

Well, that kind of is the entire situation.

Arthur evolved from the BBC MOS, and RISC OS evolved from Arthur, but in its heart it’s extremely similar. A single process, single context operating system.
That’s why all the weirdness with switching tasks in the other thread.
There is, really, only one “task”, program, process. Call it what you will, it’s what it is.
The Wimp uses smoke and mirrors to make it look like many tasks, but they’re all mapped in at &8000 and they’re the one as far as the OS is concerned.
To the point that outside of the Wimp’s control, other tasks are simply inaccessible. They don’t exist. Not until the Wimp shuffles the cards and makes a different one exist.

As Charles says in the other thread, pretty much all of the ways of starting an application boil down to variations of ways of calling FSControl 4 to get FileSwitch to actually start up a program, the commands just have different behaviours for desktop/non-desktop use, or are legacy things (does anybody use *Go ¹ these days?). These variations are important as the most basic way with *Run is immediate, which doesn’t work with the desktop world unless you’re actively in the desktop world.
Again, RISC OS only understands a single process, so outside of the desktop that’s all you get.
Which is part of why the desktop manages the tasks (and that, my friend, is an entirely different discussion).

Couple that with the infamous SWI OS_EnterOS to happily promote anything to kernel level privilege (and full access to the entire machine) and… yeah… it was good in 1987.
In 2023, people run in horror.

Take all of the above and throw in that the system happily runs unvetted third party add-ons (modules) with most of the calls happening in a privileged mode, and you’ll see that the situation isn’t fixable. It would take a ground up rewrite and redesign to resolve the massive architectural problems. This is why I am very against the idea of any potential rewrite ² maintaining anything that resembles the current API. The current API is functionally broken. It’s something that was acceptable at the end of the 80s home computer era, not something applicable to today’s connected world. Recreating the API will imply recreating the mistakes of the past.

Anyway, as Jeffrey says below, OS_ClaimProcessorVector could take over, but it’s not used like that…and I say above in many more words, if a user mode program hijacking the machine is what concerns you… hold my beer. ;)

¹ Thinking about it, *Go doesn’t actually start a program or process, it just passes control to something loaded into memory accessible by the current program. It’s basically equivalent to CALL.

² Not that I believe, for one moment, that such a thing will ever actually happen.

Jan 1, 2023 8:03pm

Jeffrey Lee (213) 6048 posts

How about we don’t let user programs just take over the whole processor? Jeez.

OS_ClaimProcessorVector isn’t that bad, AFAIK. Obviously yes it could be used to take over the system, but in terms of actual use (within the OS) there are only a handful of things that use it – possibly just FPEmulator & VFPSupport (to hook onto the undefined instruction vector) and the PCI module (to trap potential data aborts while scanning the bus). And in the wild I can’t imagine that much user software uses it.

If you’ve got a version of the OS that can work out whether a SWI is coming from a user program or a trusted OS component, then you could probably quite easily lock OS_ClaimProcessorVector away so that only trusted code can use it, without breaking anything important (except the BASIC test code I’ve written)

Jan 1, 2023 8:12pm

Rick Murray (539) 13839 posts

or a trusted OS component

Define “trusted OS component”?

Or, rather, define “trusted OS component when EnterOS exists and one presumably would like to retain the ability to load updated/enhanced modules” ¹.

¹ So restricting the test to “code in a ROM address” can’t be used as it prevents updated modules being used.

Jan 1, 2023 11:49pm

Ronald (387) 195 posts

The Wimp uses smoke and mirrors

You are using both Wimp and Desktop terms; does one equal the other? Does one or the other, or both, do time sharing? Sometimes at the start of documentation things like that are stated, and then it is known what the reference is throughout.

Jan 2, 2023 1:13pm

Rick Murray (539) 13839 posts

You are using both Wimp and Desktop terms; does one equal the other?

The Wimp is the window manager that provides the desktop environment.

When I use the term “desktop”, I’m referring to it conceptually. The part of RISC OS with windows and menus and multiple applications running at the same time.
When talking about the “Wimp”, it’s more a low level nuts and bolts thing.

This, of course, isn’t helped by there being a module called “Desktop” (that starts up the environment, pops up the banner (in the old days), and gets the autobooted ROM apps going). But since that’s just a startup kind of thing, one can generally say that referring to “the desktop” means the multitasking user interface as a whole.

Does one or the other, or both, do time sharing?

Not a valid comparison. The desktop (high level, metaphor) works via the Wimp (low level, SWI calls), so they’re different views of the same thing.

Oh, and it’s the Wimp that handles switching tasks.

Sometimes at the start of documentation things like that are stated, and then it is known what the reference is throughout.

PRM book 3, first section “The desktop”, first chapter “The Window Manager”, first sentence: This chapter describes the Window Manager. It provides the facilities you need to write applications that work in the Desktop windowing environment that RISC OS provides.

:-)

Jan 4, 2023 11:06am

Jon Abbott (1421) 2651 posts

you could probably quite easily lock OS_ClaimProcessorVector away so that only trusted code can use it, without breaking anything important

If one of the goals is to lock down OS_ClaimProcessorVector to trusted code, we’re going to need code-signing. There would also need to be a way to allow untrusted code for developers and people transitioning.

As you say, there’s probably not much 3rd party software that relies on taking over the hardware vectors. I just had a look at ADFFS – it takes over all, except PreFetch and Address Exception when 26bit is running. I suspect it’s mostly going to be debuggers and instruction emulation that would be impacted or could make use of any changes.

ADFFS couldn’t make use of the new method as it’s too late in the call chain. ADFFS acts as a Hypervisor and needs to be first in the chain in front of FPEmulator and OS. Provided there’s a method for trusted-code on the old method, I don’t see any issues.

Jan 5, 2023 12:05am

David J. Ruck (33) 1635 posts

My !SWIstat SWI call monitoring application has to hook in to SWIV using OS_ClaimProcessorVector; I’d like to keep that working.

Jan 5, 2023 12:35am

Jeffrey Lee (213) 6048 posts

For clarity: I’m not planning on locking down OS_ClaimProcessorVector. I just mentioned locking it down as it’s something Simon could opt to do in his reimplementation of the kernel.

Jan 9, 2023 1:53am

Timothy Baldwin (184) 242 posts

I don’t see any invocation of ISB before calling the vector routines; one is needed between synchronising the caches and executing new code. OS_SynchroniseCodeAreas does this for the processor on which it runs, but it is needed on every processor.

How about we don’t let user programs just take over the whole processor? Jeez.

Well, that’s one reason for OS_ClaimProcessorVector to define a user-mode-only behaviour, but the one I’m most interested in is my Linux port, which runs in user mode. Also, the existence of OS_EnterOS/OS_LeaveOS does make things more complicated…

The following discussion applies to undefined instruction, data abort and prefetch abort handlers only.

I intend to mostly remove my para-virtualization of CPU modes, leaving only the switching of user /supervisor R13 and R14.

This means the aborted code’s R13 and R14 need to go in the structure. Placing them after R12 seems logical, and the R0 to R15 then CPSR format is used by Linux, BSD, Windows and in parts of RISC OS.

This makes the native return sequence :

LDR   r1, [sp, #16*4]
MSR   SPSR_csxf, r1
LDMFD sp, {r0-r15}^

This almost matches the Linux ABI; alas, Linux puts IFSR / DFSR below R0, which rather messes up our layout.

In my opinion it is better to abstract the remaining differences by providing routines to access the structure, rather than trying to make the structure the same.

Jeffrey’s Proposal            Privileged AArch32 modes available   User mode only
R0                            R0                                   R0
R1                            R1                                   R1
R2                            R2                                   R2
R3                            R3                                   R3
R4                            R4                                   R4
R5                            R5                                   R5
R6                            R6                                   R6
R7                            R7                                   R7
R8 (not FIQ)                  R8 (not FIQ)                         R8 (not FIQ)
R9 (not FIQ)                  R9 (not FIQ)                         R9 (not FIQ)
R10 (not FIQ)                 R10 (not FIQ)                        R10 (not FIQ)
R11 (not FIQ)                 R11 (not FIQ)                        R11 (not FIQ)
R12 (not FIQ)                 R12 (not FIQ)                        R12 (not FIQ)
Handler R13                                                        Aborted R13
Handler R14 / Aborted PC      Unused                               Aborted R14
Handler SPSR / Aborted CPSR   Handler R14 / Aborted PC             Handler R14 / Aborted PC
IFAR / DFAR                   Handler SPSR / Aborted CPSR          Handler SPSR / Aborted CPSR
IFSR / DFSR                   IFAR / DFAR                          IFAR / DFAR
AIFSR / ADFSR                 IFSR / DFSR                          IFSR / DFSR
Unused                        AIFSR / ADFSR                        Unused
Unused                        Handling mode + Saved mode           Zero = user mode
Unused                        Routine pointer                      Routine pointer
Unused                        Space for saved SPSR                 Unused
Unused                        Space for saved R13                  Saved fake SVC R13?
Unused                        Space for saved R14                  Unused
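As a cross-check of the table against the return sequence, the three layouts can be written out one word per entry. This is a sketch built on my reading of the table: each row is taken as one word, and a row with fewer than three cells is taken to mean the last cell spans the remaining columns. The dictionary keys, the "(blank)" placeholder and the helper name are invented for the example.

```python
# One list entry per word of the structure, following my reading of the table.
COMMON = [f"R{n}" for n in range(13)]        # R0-R12 are shared by all three
PC = "Handler R14 / Aborted PC"
PSR = "Handler SPSR / Aborted CPSR"
FAULT = ["IFAR / DFAR", "IFSR / DFSR"]

LAYOUTS = {
    "jeffrey": COMMON + ["Handler R13", PC, PSR] + FAULT + ["AIFSR / ADFSR"],
    "privileged": COMMON + ["(blank)", "Unused", PC, PSR] + FAULT + [
        "AIFSR / ADFSR", "Handling mode + Saved mode", "Routine pointer",
        "Saved SPSR", "Saved R13", "Saved R14"],
    "user": COMMON + ["Aborted R13", "Aborted R14", PC, PSR] + FAULT + [
        "Unused", "Zero = user mode", "Routine pointer",
        "Unused", "Saved fake SVC R13?", "Unused"],
}


def word_index(layout, field):
    """Word offset of `field` within the named layout."""
    return LAYOUTS[layout].index(field)


for name in LAYOUTS:
    print(name, len(LAYOUTS[name]), "words, PSR at word", word_index(name, PSR))
# jeffrey 19 words, PSR at word 15
# privileged 25 words, PSR at word 16
# user 25 words, PSR at word 16
```

On this reading the two proposed layouts put the saved PSR at word 16, which agrees with the `LDR r1, [sp, #16*4]` in the return sequence, while the existing 19-word dump has it at word 15.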

The last 3 entries are not set by dispatch code but a routine will be provided to switch to SVC mode and save them:

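NewCPV_switch_to_SVC
        MOV     r1, lr                  ; keep the return address (lr is banked per mode)
        SetMode SVC32_Mode
        MOV     r2, #SVC32_Mode
        ADD     r4, r0, #NewCPV_saved   ; r4 -> save area within the register dump (r0)
        STRB    r2, [r4]                ; record the saved mode in the first byte
        MRS     r2, SPSR
        STMIB   r4, {r2, r13, r14}      ; save SPSR_svc, R13_svc and R14_svc
        MOV     lr, r1
        MOV     pc, lr

NewCPV_switch_from_SVC
        MOV     r1, lr                  ; keep the return address
        LDMIA   r4, {r2, r3, r13, r14}  ; r2 = mode word, r3 = saved SPSR, then R13/R14
        MSR     SPSR, r3
        MOV     r3, #-1
        STRB    r3, [r4]                ; overwrite the saved-mode byte with &FF
        MRS     r3, CPSR
        BIC     r3, r3, #M32_bits       ; clear the mode bits
        ORR     r3, r3, r2, LSR #24     ; take the mode from the top byte of the mode word
        MSR     CPSR_c, r3              ; switch to that mode
        MOV     lr, r1
        MOV     pc, lr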

Similar could be done to save and restore the aborting mode r13 and r14. On a RISC OS port that runs all AArch32 code in user mode, these 4 routines would do nothing.
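The mode handling in those two routines can be modelled in a few lines of Python. This reflects my reading of the code rather than anything stated in the post: the "Handling mode + Saved mode" word is assumed to hold the handling mode in its top byte (hence `LSR #24`) and the saved-mode marker in its low byte (written by `STRB` on a little-endian system). The mode numbers are the standard ARM values; the function names are invented.

```python
# Model of the mode-word handling; an interpretation, not the author's code.
M32_BITS = 0x1F                          # CPSR mode field
SVC32, ABT32, UND32 = 0x13, 0x17, 0x1B   # standard ARM mode numbers


def pack_mode_word(handling_mode, saved_mode):
    """Handling mode in the top byte, saved-mode marker in the low byte."""
    return ((handling_mode & 0xFF) << 24) | (saved_mode & 0xFF)


def switch_mode(cpsr, mode_word):
    """Mirror BIC r3, r3, #M32_bits followed by ORR r3, r3, r2, LSR #24."""
    return (cpsr & ~M32_BITS & 0xFFFFFFFF) | (mode_word >> 24)


word = pack_mode_word(ABT32, SVC32)
print(hex(word))                             # 0x17000013
print(hex(switch_mode(0x600000D3, word)))    # 0x600000d7
```

Here a CPSR in SVC mode with IRQs and FIQs disabled (&600000D3) is switched back to abort mode, leaving the flag and interrupt-disable bits untouched.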

Some things that are implied, but should be made explicit:

Exceptions will not be delivered to lower privileged CPU modes, so no handlers will be called in user mode for exceptions in SVC, IRQ, FIQ, ABT, and UND modes, and no OS_ClaimProcessorVector handlers will be called from HYP or MON modes.
The undefined instruction vector defined here will only be entered for AArch32 instructions (traditional ARM and Thumb).
The memory abort handlers defined here may not be called for non-AArch32 instructions.
OS_AbortTrap handlers are likely not to be called for faults from higher OS privilege levels, and will not be called for DMA. (RISC OS only has one privilege level).
If there is limited support for running privileged AArch32 code (e.g. only some cores or software emulation), handlers may be entered in both user and privileged modes.

Is there any problem with any exception handler entered in user mode being prohibited from switching stacks by calling OS_LeaveOS or otherwise?
