The main goals are: using more than 1 core, threading, SWI reentrancy, memory, interrupts.
There will also be some degree of support for using threads from modules (in an SMP manner), but this is likely to be in a more restrictive environment than that available for applications (e.g. limited SWIs available).
Older multi-core systems/solutions (e.g. the ARMv6K-based ARM11 MPCore CPU) are highly unlikely to be supported.
Getting the multi-core support working reliably is likely to involve making several missteps along the way, so the changes which allow for the extra cores to be utilised are likely to only be merged into the OS towards the end of the project. However other improvements can be drip-fed into the main sources as and when they’re finalised.
Primary core – the CPU core which boots first
Auxiliary cores – the extra CPU cores which the multi-core changes aim to unlock the potential of
New HAL APIs are required to allow the OS to perform the following tasks:
To support the above, the following new HAL entry points have been (re)defined:
See Kernel/Docs/SMP/HAL and Kernel/Docs/SMP/IRQ for more in-depth information about these calls and any changes to the specification of existing HAL APIs (i.e. expected SMP safety of different HAL entry points)
Several HALs have already been updated to support the above specification revisions, with the code merged into the main OS sources (BCM2835, OMAP4, OMAP5, iMx6). Out of the other HALs in the main OS sources, the Titanium and PineA64 HALs are yet to be updated.
Mutexes, semaphores, spinlocks, and other mechanisms are critically important to being able to make components MP-safe. If code crashes while a lock is held, there needs to be some way of unlocking it (and potentially taking other recovery actions) in order to avoid a deadlock when something tries to use it next.
“Stack-based exception handlers” have been proposed as a solution for this (forum thread, merge request), which allow privileged-mode code to push special exception handler nodes onto the stack. Whenever the OS resets the stacks (e.g. after an unhandled data abort) it will invoke those handlers to allow them to perform any necessary actions such as unlocking spinlocks.
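The idea can be illustrated with a minimal sketch in portable C. This is not the proposed implementation: every name here (`demo_handler`, `demo_push`, `demo_pop`, `demo_unwind_all`) is hypothetical, the chain is a single global rather than per-stack, and the "stack reset" is simulated by an ordinary function call.

```c
#include <stddef.h>

/* Hypothetical handler node, intended to live in the caller's stack frame.
   The layout used by the real proposal may differ. */
typedef struct demo_handler {
    struct demo_handler *prev;   /* next-older node on the chain */
    void (*fn)(void *ctx);       /* recovery action, e.g. release a lock */
    void *ctx;                   /* argument passed to fn */
} demo_handler;

/* Most recently pushed node. A single global is an assumption made for
   brevity; a real design would track this per stack. */
static demo_handler *demo_chain = NULL;

/* Register a recovery action before entering a critical region. */
static inline void demo_push(demo_handler *h, void (*fn)(void *), void *ctx)
{
    h->fn = fn;
    h->ctx = ctx;
    h->prev = demo_chain;
    demo_chain = h;
}

/* Normal exit path: the region completed, so drop the handler unrun.
   Assumes h is the newest node on the chain. */
static inline void demo_pop(demo_handler *h)
{
    demo_chain = h->prev;
}

/* Stand-in for the OS resetting the stack after an unhandled abort:
   run every remaining handler, newest first. Returns how many ran. */
static inline int demo_unwind_all(void)
{
    int count = 0;
    while (demo_chain != NULL) {
        demo_handler *h = demo_chain;
        demo_chain = h->prev;    /* unlink first, in case fn itself fails */
        h->fn(h->ctx);
        count++;
    }
    return count;
}

/* Example recovery action: mark a lock word as free. */
static inline void demo_release(void *ctx)
{
    *(int *)ctx = 0;
}
```

Code that finishes normally calls `demo_pop` itself, so only handlers belonging to regions that were still active at the time of the crash get invoked.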
More iteration & dogfooding is needed before the code is ready to be merged, including checking how feasible it is to allow handlers to halt unwinding half-way (e.g. to shield foreground threads from crashes in interrupt handlers, as per RISC OS Select).
The current OS_ClaimProcessorVector API has flaws that cause problems for both single-core and multi-core use. A new version of the API has been proposed and implemented but not yet merged into the main sources. See this forum post for details of the new API.
Software which is using the old API will only have its handlers called for aborts/exceptions which occur on the primary core. Software using the new API will be called for aborts/exceptions that occur on any core on which the OS is running.
To finish off these changes, the API design feedback needs reviewing, and any appropriate implementation changes need to be made.
To allow the implementation to be merged into the main sources, along with the pending changes for other components which have been modified to use the new API, it may be necessary to make the use of spinlocks optional in order to avoid any negative performance impacts on single-core machines (and the current single-core version of the OS).
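One way of making spinlocks optional is a build-time switch, sketched below in portable C11. This is only an illustration under assumptions: `DEMO_SMP` and the `demo_lock_*` names are hypothetical, and the single-core variant assumes the protected data is already guarded by other means (e.g. by running with interrupts disabled). Whether the OS would use a build-time or run-time switch is not stated above.

```c
#include <stdatomic.h>

/* DEMO_SMP is a hypothetical build switch: 1 for multi-core builds,
   0 for single-core builds where the lock compiles away to nothing. */
#ifndef DEMO_SMP
#define DEMO_SMP 0
#endif

#if DEMO_SMP

typedef atomic_flag demo_lock;
#define DEMO_LOCK_INIT ATOMIC_FLAG_INIT

static inline void demo_lock_acquire(demo_lock *l)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire)) {
    }
}

static inline void demo_lock_release(demo_lock *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

#else

/* Single-core build: no atomic operations or barriers are emitted.
   Assumes the caller already excludes interrupts where needed. */
typedef int demo_lock;
#define DEMO_LOCK_INIT 0

static inline void demo_lock_acquire(demo_lock *l) { (void)l; }
static inline void demo_lock_release(demo_lock *l) { (void)l; }

#endif

/* Example of shared state guarded by the optional lock. */
static demo_lock demo_counter_lock = DEMO_LOCK_INIT;
static int demo_counter;

static inline int demo_increment(void)
{
    demo_lock_acquire(&demo_counter_lock);
    int value = ++demo_counter;
    demo_lock_release(&demo_counter_lock);
    return value;
}
```

The calling code is identical in both configurations, so components can be written once and built either way.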
Floating point support is a fundamental part of the execution environment which lots of software relies upon. Therefore MP-safe versions of FPEmulator and VFPSupport are important stepping stones to allowing everyday code to run across multiple cores.
Building on top of the new OS_ClaimProcessorVector API, MP-safe versions of these modules have been produced, but until the new OS_ClaimProcessorVector API has been finalised (and preferably merged) these modules can’t have their changes merged:
The SMP module defines and implements the low-level threading APIs (threads, mutexes, condition variables, etc.) which the OS exposes to applications & modules. Currently it’s also responsible for bring-up of the aux cores (and the kernel which they run), process management, and thread scheduling.
The initial version of the module was produced in 2017, providing a minimal threading environment for the aux cores. All development since then has taken place in the SMPthread branch, with the changes collected in this pending merge request. This new version is significantly improved, but still incomplete. Missing features still need implementing, along with lots of code tidying, testing, hardening & bug fixing. The module also hooks itself into the kernel/OS in a very unsatisfying way – ideally some of the code from the SMP module (e.g. thread-safe IRQ & SWI dispatch) should be merged into/implemented within the kernel.
This can be broken down into several smaller chunks of work:
Core code has been implemented and merged in as CLib 6.14.
Compiler extensions are required to support both _Atomic and _Atomic() (currently only _Atomic() is supported), and to ensure that doubleword types are doubleword aligned.
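For reference, the two spellings are both standard C11, as the short sketch below shows. The `demo_*` names are hypothetical, and the alignment check reflects what a hosted C11 compiler on a typical 64-bit target reports rather than the behaviour of the RISC OS compiler.

```c
#include <stdatomic.h>
#include <stdint.h>

/* The two C11 spellings of an atomic type. */
_Atomic uint32_t demo_counter_q;    /* _Atomic as a type qualifier */
_Atomic(uint32_t) demo_counter_s;   /* _Atomic() as a type specifier */

/* A doubleword (64-bit) atomic. On ARM the doubleword exclusive
   load/store instructions need a doubleword-aligned address, which is
   why the alignment of such types matters. */
_Atomic(uint64_t) demo_wide;

/* Compile-time check that the compiler aligns the doubleword atomic
   to 8 bytes. */
_Static_assert(_Alignof(_Atomic(uint64_t)) == 8,
               "64-bit atomics must be doubleword aligned");

/* Increment both counters; returns the new value of the second one. */
static inline uint32_t demo_bump(void)
{
    atomic_fetch_add(&demo_counter_q, 1);
    return atomic_fetch_add(&demo_counter_s, 1) + 1;
}
```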
The recent addition to the compiler of the ACLE __ARM_ARCH predefined macro could allow atomics.h to be improved to use inline assembler instead of library calls when targeting new CPUs.
Work has started on implementing the C11 thread APIs as a wrapper around the SMP module threading SWIs, but is incomplete.
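As a reminder of what application code written against the C11 thread APIs looks like, here is a small sketch using only the standard `<threads.h>` calls. The `demo_*` names are hypothetical, and nothing here shows how CLib maps these calls onto the SMP module SWIs.

```c
#include <threads.h>

enum { DEMO_THREADS = 4, DEMO_ITERATIONS = 1000 };

static mtx_t demo_mutex;
static long demo_total;

/* Thread entry point: add to the shared total under the mutex. */
static int demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < DEMO_ITERATIONS; i++) {
        mtx_lock(&demo_mutex);
        demo_total++;
        mtx_unlock(&demo_mutex);
    }
    return 0;
}

/* Start the workers, wait for them, and return the final total.
   Returns -1 if the mutex or any thread could not be created. */
static inline long demo_run(void)
{
    thrd_t threads[DEMO_THREADS];
    int started = 0;

    demo_total = 0;
    if (mtx_init(&demo_mutex, mtx_plain) != thrd_success)
        return -1;

    while (started < DEMO_THREADS &&
           thrd_create(&threads[started], demo_worker, NULL) == thrd_success)
        started++;

    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++)
        thrd_join(threads[i], NULL);

    mtx_destroy(&demo_mutex);
    return started == DEMO_THREADS ? demo_total : -1;
}
```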
Not yet started. It’s not yet been decided whether this will take the form of a callback-based thread scheduler implemented within CLib (no extra dependencies required), or by producing a single-core version of the SMP module (which already implements a callback-based scheduler for running the scheduler on the primary core)
Not yet started. On ARMv7+ this is best done by using the CP15 thread ID registers. Supporting the _Thread_local storage specifier will require the compiler to be extended.
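The behaviour that _Thread_local has to provide can be shown with a short standard C11 sketch. The `demo_*` names are hypothetical, and how the compiler and library locate each thread's copy (for example via the thread ID registers mentioned above) is outside the scope of the example.

```c
#include <threads.h>

/* Each thread gets its own independent copy of this variable. */
static _Thread_local int demo_tls = 0;

/* Runs in a second thread: writes only that thread's copy. */
static int demo_tls_worker(void *arg)
{
    demo_tls = *(int *)arg;
    return demo_tls;
}

/* Sets the calling thread's copy to 7, lets another thread set its own
   copy to 42, then returns the caller's copy, which should be unchanged.
   Returns -1 if the thread could not be created or misbehaved. */
static inline int demo_tls_check(void)
{
    thrd_t thread;
    int value = 42;
    int result = 0;

    demo_tls = 7;
    if (thrd_create(&thread, demo_tls_worker, &value) != thrd_success)
        return -1;
    thrd_join(thread, &result);

    return result == 42 ? demo_tls : -1;
}
```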
Library entry points need checking and making thread-safe where appropriate.
UnixLib already contains a pthreads implementation that runs on single-core machines. This will need modifying to detect the SMP module and use it for threading, and UnixLib as a whole will need reviewing to ensure it’s MP-safe where appropriate.
(Many more things need listing)
Phase | Status | Completion | Latest updates |
---|---|---|---|
Conceptual design | In progress | 0% | 26-Dec-2021 Document created |
Mock ups/visualisation | - | - | - |
Prototype coding | - | - | - |
Final implementation | - | - | - |
Testing/integration | - | - | - |
v1.00 – 26-Dec-2021
v1.01 – 19-Jan-2023