The release in 2011 of the ARMv8 architecture added a whole new instruction set called AArch64 with a wider 64 bit register bank, and renamed what was previously referred to as the ARM instruction set (which RISC OS uses) as AArch32. The AArch64 instructions are not binary compatible with the older ones.
To date, the ARMv8 targets RISC OS supports have been fortunate in still containing an implementation of AArch32 for backwards compatibility, allowing it to run while entirely ignoring the AArch64 aspects. As of 2020 it was becoming clear that Arm intended to wind down ARMv7 and earlier (fewer than 50% of the cores available to license would run that way) to focus on their ARMv8 offerings, many of which dropped AArch32 support entirely.
In April 2021 the ARMv9 architecture was announced with AArch32 relegated to being a license option, in much the same way that 26 bit mode became an option in ARMv4 rarely taken up.
This proposal identifies aspects of RISC OS that will require attention in order to migrate away from AArch32, to help make design decisions, and ultimately to map out a route to implementing the changes needed to ensure there are chips to run on in future.
DDI0487 – The ARMv8 architecture reference manual (A profile)
DDI0608 – The ARMv9 supplement to DDI0487
No more big 32-bit cores for RISC OS from 2022
What would AArch64 BASIC look like?
ARMv8 support
Secondary relevance: Could you run on an Apple MacMini and RISC OS on hypervisor
AArch32 is the family of instructions with 32 bit wide integer registers addressing up to 2^32 bytes of memory, and also includes the 26 bit addressing mode.
AArch64 is the family of instructions with 64 bit wide integer registers addressing up to 2^64 bytes of memory.
Since RISC OS is coming to the 64 bit scene rather late, it has the advantage of being able to look at how other mainstream operating systems have approached the problem (though some solutions may not be practical on an Arm processor).
In the remainder of this analysis we assume approach (4) is the primary solution, though to assist development elements of (3) and (5) may be used as temporary measures. This is not unlike the approach taken when getting the first 32 bit versions of RISC OS to work – initially they kept the old memory map with 28MB application limit, and once everything stabilised those limits were raised and more and more of the OS was moved higher up to free up the large application slot we enjoy today.
The opcode under AArch32 has space for a 24 bit immediate value which encodes the call number, for a total of 16M possible unique values, though in practice some of the number space is used to encode other information, such as bit 17 holding the error-returning ‘X’ flag.
The opcode under AArch64 has a reduced size 16 bit immediate value, for a total of 64k possible unique values, which clearly is insufficient to directly encode all of the currently circulated allocations.
Despite the reduced number space, this does encompass the high value OS SWI block from &00 to &1FF. Since the advent of split I+D caches with the StrongARM it has not been convenient to dynamically generate supervisor call opcodes without incurring a cache flush penalty, and the strong recommendation has been to call OS_CallASWI or OS_CallASWIR12 instead. This indirect method, in which the call number is passed in a register and despatched via an OS SWI, could be used to allow existing SWI allocations to be retained regardless of the instruction set in use.
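The number-space argument above can be sketched as follows. This is a minimal illustration in Python, used here only as executable pseudocode. The helper names `fits_immediate` and `plan_call` are hypothetical and not part of any RISC OS API. The field widths and the position of the ‘X’ flag are those stated in the text.

```python
# Field widths as described in the text above.
AARCH32_IMM_BITS = 24   # immediate field of the AArch32 supervisor call opcode
AARCH64_IMM_BITS = 16   # immediate field of the AArch64 supervisor call opcode
X_BIT = 1 << 17         # 'X' flag within a RISC OS SWI number


def fits_immediate(swi_number, imm_bits):
    """True if the SWI number can be encoded directly in an immediate of imm_bits."""
    return 0 <= swi_number < (1 << imm_bits)


def plan_call(swi_number):
    """Hypothetical policy: encode directly when the number fits in the
    AArch64 immediate, otherwise pass it in a register to an indirect
    despatcher such as OS_CallASWI."""
    if fits_immediate(swi_number, AARCH64_IMM_BITS):
        return "direct"
    return "indirect"


# Example numbers: &00 (in the OS block) and &400C0 (start of the Wimp chunk).
print(plan_call(0x00))           # fits in 16 bits
print(plan_call(0x400C0))        # needs the indirect route
print(plan_call(0x00 | X_BIT))   # bit 17 alone already exceeds 16 bits
```

Note that because the ‘X’ flag sits at bit 17, even an OS block call in its X form does not fit a 16 bit immediate unaltered, which is one more reason the indirect route is attractive.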
Programmers in C will be used to using the _swix() and _swi() functions which are already indirect calling methods, and programmers in BASIC use SYS which the interpreter can change to an indirect call as required.
Whereas currently in AArch32 R0-R15 are 32 bits wide, and therefore can address any location in the 4GB of logical address space, with AArch64 the general purpose register bank can be viewed either as X0-X30 (64 bits each) or W0-W30 (32 bits each). The AArch64 program counter is not directly visible.
Therefore, the AArch32 register bank could be viewed as a subset of the AArch64 register bank, in that parameters passed to a SWI executed as an AArch32 instruction could be losslessly passed to an AArch64 RISC OS kernel for handling by sign or zero extending as appropriate.
Note that in a mixed AArch32/AArch64 system it is not guaranteed that the narrower 32 bit registers (containing the SWI’s parameters) are zero extended when entering the AArch64 exception handler. The AArch64.CallSupervisor pseudocode in the ARMv8 architecture reference manual is the clearest place to follow this: through AArch64.TakeException you get to AArch64.MaybeZeroRegisterUppers, where the loop to clear the upper halves is conditional on ConstrainUnpredictableBool.
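Because the upper halves cannot be relied upon, a 64 bit handler would have to widen each incoming parameter explicitly from its low 32 bits. A minimal sketch of the two widening operations follows. The function names are hypothetical, and Python integers stand in for register contents.

```python
MASK32 = 0xFFFFFFFF
UPPER32 = 0xFFFFFFFF00000000


def zero_extend32(x_reg):
    """Keep the low 32 bits and clear the upper half, whatever it held."""
    return x_reg & MASK32


def sign_extend32(x_reg):
    """Keep the low 32 bits and copy bit 31 into the upper half."""
    low = x_reg & MASK32
    if low & 0x80000000:
        return low | UPPER32
    return low


# A register whose upper half holds stale data on entry to the handler.
stale = 0xDEADBEEF00008000
print(hex(zero_extend32(stale)))       # address-like parameter
print(hex(sign_extend32(0xFFFFFFFF)))  # parameter holding -1
```

Which of the two applies depends on the parameter: addresses and unsigned counts would be zero extended, while signed quantities (such as -1 used as a sentinel) would be sign extended.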
Assuming a register widening solution is adopted, the only places where the size of a pointer is of concern is where the pointer is passed in via a parameter block held in memory. This section surveys the core module SWIs for places where this technique is used in order to see how widespread a problem it might be.
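To illustrate why in-memory parameter blocks are the problem case, consider a purely hypothetical block of two words holding a pointer and a length. The sketch below uses Python's `struct` module. The block layout and the function names are assumptions made for illustration and do not describe any particular SWI.

```python
import struct


def pack_block32(pointer, length):
    """Pack a hypothetical (pointer, length) block as two little-endian
    32 bit words. Returns None if the pointer does not fit in a word."""
    try:
        return struct.pack("<II", pointer, length)
    except struct.error:
        return None


def pack_block64(pointer, length):
    """The same block with the pointer widened to 64 bits. The block
    grows and the offset of the length field moves."""
    return struct.pack("<QI", pointer, length)


low = pack_block32(0x8000, 16)            # pointer below 4GB: fits
high = pack_block32(0x120000000, 16)      # pointer above 4GB: does not fit
wide = pack_block64(0x120000000, 16)
print(len(low), high, len(wide))
```

A register can simply be widened, but a block like this changes size and field offsets when its pointer is widened, so every caller and callee sharing the layout is affected.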
The following lists modules which have been checked but whose SWIs don’t have any potential pointer issues. Modules which don’t implement any SWIs, such as application modules, are not listed here.
* AcornHTTP
* AcornSSL
* ADFS
* ATAPI
* BCMSupport
* BlendTable
* BootFX
* Buffer Manager
* CDFS, CDFSDriver
* ColourTrans
* CompressJPEG
* DDEUtils
* Debugger
* DeviceFS
* DHCP
* Dialler
* DOSFS
* DragASprite
* DrawFile
* FilerAction
* Filter Manager
* Font Manager
* Free
* Freeway
* FrontEnd
* FSLock
* GPIO
* Hourglass
* IIC
* Internet (Socket)
* InverseTable
* Joystick
* JPEG
* MakePSFont
* NetFS
* NetMonitor
* NetPrint
* NetTime
* NFS
* Parallel Device Driver
* PDriver
* PDumper
* Portable Manager
* RamFS
* RedrawManager
* ResourceFS
* RTC
* RTSupport
* ScreenBlanker
* ScreenFX
* ScreenModes
* SCSIFS, SCSIDriver
* SDFS
* ShareFS
* ShellCLI
* SMP
* Sound (Level 1), Sound (Level 2)
* Sound Control
* Squash
* SuperSample
* TaskWindow
* Toolbox
* URI
* VCHIQ
* VFPSupport
* ZLib
Assuming a register widening solution is adopted as for SWIs, the only places where the size of a pointer is of concern is where the pointer is passed in via a parameter block held in memory. This section surveys the core module service calls for places where this technique is used in order to see how widespread a problem it might be.
List of AArch32 affected service calls
The following lists service call ranges which have been checked but which don’t have any potential pointer issues. Modules which don’t implement any service calls are not listed here.
* ADFS (&10800)
* SCSI (&20100)
* Wimp (&400C0)
* NetPrint (&40200)
* Toolbox (&44EC0)
* SDIODriver (&81040)
* IIC (&81100)
* Window (&82880)
* URL (&83E00)
The SWIs OS_File and OS_GBPB include some subreasons which deal with load and execution addresses. These are currently 32b quantities, albeit deprecated in use. Various places store these as 32b quantities for example: in the extended attributes of a ZIP file, in file server messages, in the directory entries of FileCore discs.
The SWI OS_FSControl 12 (Add FS) and OS_FSControl 35 (Add image FS) pass a pointer to a FileSwitch FS Information Block which includes 32b offsets to functions to implement a filing system. Provided modules are not expanded beyond their existing maximum size of 64MB these 32b offsets will suffice because they are relative to the module base address.
The SWI OS_SpriteOp doesn’t make use of absolute addresses in memory. Provided sprites are not expanded beyond their existing maximum size of 2GB these 32b offsets will suffice because they are relative to the sprite area base.
The 4 word MessageTrans block is opaque to the caller, so while it may contain a pointer, its layout could be changed without impacting clients.
The SWI ResourceFS_RegisterFiles includes a block with a 32b offset to the next item to add in the chain, however that still allows blocks to be kept ±2GB apart.
Devices registered via the list in R1 to DeviceFS_Register include a 32b offset to the device name as the first word of the buffer. This limits the string to be within ±2GB of the block.
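The reason base-relative offsets survive the move to a wider address space can be sketched as below. The function name `resolve_offset` is hypothetical. The `signed` parameter distinguishes offsets that may point either side of the block (the ±2GB cases above) from those that only point forwards from a base.

```python
def resolve_offset(base_address, offset_word, signed=True):
    """Combine a full-width base address with a stored 32 bit offset.

    The stored word stays 32 bits wide. Only the base it is added to
    needs to be a 64 bit quantity.
    """
    word = offset_word & 0xFFFFFFFF
    if signed and (word & 0x80000000):
        word -= 1 << 32          # interpret as a negative offset
    return base_address + word


base = 0x120000000               # example base above the 4GB boundary
print(hex(resolve_offset(base, 0x100)))        # forwards by &100
print(hex(resolve_offset(base, 0xFFFFFFF0)))   # backwards by 16
```

The stored data is unchanged in both cases. The only constraint is the one already noted: the target must lie within range of the base.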
Toolbox Res files include 32b offsets (to the body, strings, etc) some of which are relocated when the Res file is loaded into absolute addresses.
Mbuf Manager works with mbctl and mbuf structures, these contain both 32b function pointers and linked list pointers.
The AIF format has always included a flags word, with a small number of valid values allocated by Acorn or Arm for their use. Following dialogue with Arm there are still plenty of spare flag bits. Therefore, a flag bit can be allocated to denote the code was intended to be run on a 64 bit version of the OS and rejected on 32 bit versions.
In addition, the first few words are expected to be a limited subset of AArch32 instructions (B, BL, MOV) to add confidence to the decision.
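A loader check combining the two tests might look like the sketch below. The flag bit value is hypothetical (no bit has been allocated), the function names are invented for illustration, and the instruction matching is deliberately loose: it tests the AArch32 encoding fields for B, BL and MOV but does not exclude every other encoding that shares those bits.

```python
AIF_FLAG_64BIT_HYPOTHETICAL = 1 << 31   # placeholder, not an allocated bit


def is_b_or_bl(insn):
    """AArch32 B/BL: bits 27-25 are 101, condition field is not 1111."""
    return (insn & 0x0E000000) == 0x0A000000 and (insn >> 28) != 0xF


def is_mov(insn):
    """AArch32 MOV: bits 27-26 are 00 and opcode bits 24-21 are 1101
    (loose match, see the note above)."""
    return (insn & 0x0DE00000) == 0x01A00000 and (insn >> 28) != 0xF


def classify(flags_word, leading_words):
    """Return 'aarch64', 'aarch32' or 'unknown' for an image header."""
    if flags_word & AIF_FLAG_64BIT_HYPOTHETICAL:
        return "aarch64"
    if all(is_b_or_bl(w) or is_mov(w) for w in leading_words):
        return "aarch32"
    return "unknown"


# &E1A00000 is MOV R0,R0 and &EB000010 is a BL, both AArch32.
print(classify(0, [0xE1A00000, 0xEB000010]))
print(classify(AIF_FLAG_64BIT_HYPOTHETICAL, [0xE1A00000]))
```

The flag is checked first because it is the authoritative marker. The instruction test only adds confidence for images that do not carry it.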
Plain binaries which are *Run and rely on the 32b load and execution addresses held in the RISC OS file attributes would either require a change to the FileCore logical format to support longer attributes, or be unsupported (the use of load/execution addresses has been deprecated for some time), or be limited to loading into the low 4GB of the memory map as presently.
Utilities run in User mode after being loaded into the RMA. The kernel currently runs utilities without type checks since in User mode any undefined instruction can only cause inconsequential damage. An optional “32OK” signature appended to the image is used by Aemulor to suppress its emulation, so precedent exists for adding “64OK” to denote the change of instruction set. Heuristics to detect AArch32 opcodes are likely to lead to false matches due to the AArch64 opcodes overlapping.
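As a rough sketch of signature-based detection, assuming for simplicity that the signature occupies the final four bytes of the image (the exact on-disc layout is not specified here, and the function name is hypothetical):

```python
def utility_kind(image):
    """Classify a utility image by a trailing four byte signature.

    Assumes the signature is simply the last four bytes of the file.
    '64OK' is the proposed marker and is not yet a defined format.
    """
    if image.endswith(b"64OK"):
        return "aarch64"
    if image.endswith(b"32OK"):
        return "aarch32"
    return "unmarked"


print(utility_kind(bytes(8) + b"32OK"))
print(utility_kind(bytes(8) + b"64OK"))
print(utility_kind(bytes(8)))
```

Unlike opcode heuristics, an explicit marker of this kind cannot be confused by the overlap between the two instruction encodings, though an unmarked image still leaves the kernel to choose a default.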
Since RISC OS 5 the module header has included a flags word to provide for future changes. Therefore, a flag bit can be allocated to denote the code was intended to be run on a 64 bit version of the OS and rejected on 32 bit versions.
In addition, the first few words are expected to be a limited subset of AArch32 instructions (B, BL, MOV) to add confidence to the decision.
Podule loaders have four AArch32 instructions and an optional “32OK” signature at the start. Since AArch64 instructions are also 32b in size, and the signature could be changed, this can be accommodated should there ever be an ARMv8+ machine with a podule bus.
The BASIC interpreter could need to address memory above the 4GB boundary, for example through DIM or interacting with SWIs through the SYS keyword, but with its current 4 byte integer variables would not be able to do so.
Other dialects have already introduced 64 bit integers, so ARM BBC BASIC may be able to copy that syntax for declarations and indirection.
Phase | Status | Completion | Latest updates |
---|---|---|---|
Conceptual design | In progress | 10% | 23-Jul-2022 Document updated (see history) |
Mock ups/visualisation | - | - | - |
Prototype coding | - | - | - |
Final implementation | - | - | - |
Testing/integration | - | - | - |
v1.00 – 10-Apr-2021
v1.01 – 01-Aug-2021
v1.02 – 18-Apr-2022
v1.03 – 19-May-2022
v1.04 – 11-Jun-2022
v1.05 – 23-Jul-2022
v1.06 – 24-Sep-2023