Dynamic Area alignment
nemo (145) 2546 posts |
I know DAs get little love here, but that’s primarily due to the spectre of ‘no maximum’ areas needlessly depleting address space. That’s a known and largely solved problem. Smaller areas have always been useful.

It would be helpful to be able to specify the required alignment of a DA when creating it. There are a number of reasons one might want to constrain the alignment of what is already page aligned – cyclic buffers, for example, are more efficient when appropriately aligned.

Alignment of DAs can currently be forced manually by creating an unmapped DA of twice the size minus a page, aligning the base address up to the required granularity, removing that DA and then creating a new one of the required size at the aligned base address. This is slightly inefficient in that it cannot fit a DA into an aligned hole in the address space if that hole is less than twice the size. One can complicate the strategy to mitigate that, but it’s not ideal.

It would be good to be able to specify the required alignment on creation, and let the Kernel place it appropriately. It is possible to do so without defining any new DA flags or APIs by finessing the interpretation of the supplied base address in OS_DynamicArea,0 (Create Area). The specified base address is normally -1, meaning ‘anywhere will do’; anything else is taken as the required start address. Specifying an address within an existing area elicits the ‘Overlapping Areas’ error. It is therefore impossible to create an area with a base address within the Application Slot or (proper) zero page.

I propose interpreting a base address that is a power of two >= 2×pagesize and < end_of_app_slot as a required alignment. If the API extension is not available, ‘Overlapping Areas’ will be returned; otherwise the Kernel shall allocate a DA at that alignment and return its actual base address.
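The arithmetic behind the manual work-around and the proposed reading of the base address can be sketched as follows. This is a minimal model in Python, used purely for illustration; the function names, the 4KB page size and the `app_slot_end` parameter are assumptions made for the sketch, not part of any RISC OS API.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def align_up(addr, alignment):
    """Round addr up to the next multiple of a power-of-two alignment."""
    return (addr + alignment - 1) & ~(alignment - 1)


def probe_span(size, alignment, page_size=PAGE_SIZE):
    """Smallest probe area that must contain an aligned block of `size`.

    The probe base is page aligned, so in the worst case it sits one page
    past an aligned address and alignment - page_size bytes are wasted.
    """
    return size + alignment - page_size


def is_alignment_request(r3, app_slot_end, page_size=PAGE_SIZE):
    """Proposed rule: a power of two >= 2*pagesize and < end of app slot."""
    r3 &= 0xFFFFFFFF
    is_pow2 = r3 != 0 and (r3 & (r3 - 1)) == 0
    return is_pow2 and 2 * page_size <= r3 < app_slot_end
```

With the alignment equal to the size, `probe_span` gives the "twice the size minus a page" figure quoted above. Any value that fails `is_alignment_request` would keep its existing meaning (-1 for anywhere, otherwise a fixed base address).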
The caller can fall back to the above work-around if the API extension is not available (which is what they would have had to do anyway), and the extension can be easily retro-fitted to older OSes by the same method.

Example: A 32KB cyclic buffer, for decompression algorithms for example

  size%=32*1024
  REM These 7 lines required to align the buffer appropriately
  span%=size%*3-4096 :REM we need a 32KB buffer with 64KB alignment
  SYS&66,0,-1,span%,-1,&483,span%,0,0,"Find some address space"TO,da%,,base%
  need%=size%*2-1
  need%=(base%+need%)ANDNOTneed%
  SYS&66,1,da%
  SYS&66,0,-1,size%,need%,128,size%,0,0,"Cyclic buffer"TO,da%,,base%
  IFbase%<>need%:SYS&66,1,da%:ERROR1,"How did that happen?"

  REM Now cyclic buffering is very efficient:
  STRB R0, [R8],#1    ; store byte in buffer and increment
  BIC  R8, R8, #size% ; wrap ptr back to start

The proposed change would allow all that to be done in one call:

  SYS&66,0,-1,size%,size%*2,128,size%,0,0,"Aligned buffer"TO,da%,,base%

The |
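To see why the 32KB buffer wants 64KB alignment, the STRB/BIC pair can be modelled in a few lines. This is a sketch in Python for illustration only; the function names and the example base addresses are made up. The BIC clears the bit equal to the buffer size, so the wrap only works if that bit is already clear in the base address.

```python
SIZE = 32 * 1024  # buffer size, as in the example above


def store_and_wrap(ptr, size=SIZE):
    """Model STRB R0,[R8],#1 then BIC R8,R8,#size: return the next pointer."""
    ptr = (ptr + 1) & 0xFFFFFFFF   # post-increment after the store
    return ptr & ~size & 0xFFFFFFFF  # clear the 'size' bit to wrap


def stays_in_buffer(base, size=SIZE):
    """Walk the pointer round the buffer twice, checking it never escapes."""
    ptr = base
    for _ in range(2 * size):
        if not base <= ptr < base + size:
            return False
        ptr = store_and_wrap(ptr, size)
    return True
```

A base such as &20010000 (64KB aligned) keeps the pointer inside the buffer indefinitely, whereas &20008000 (only 32KB aligned, with the size bit set) sends it below the base on the first store.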
Jeffrey Lee (213) 6048 posts |
Aligned areas would be useful – a while ago I had to workaround a problem with the PCI module where it wasn’t allocating aligned blocks of physical memory correctly because the DA wasn’t logically aligned.
That feels a bit nasty. However, the only danger I can think of is RiscPC-era software which tries to create a dynamic area at a specific power-of-two address that, under RISC OS 5, would lie within application space. Under current versions of RISC OS 5 the software will fail (hopefully safely) with an error. But with your proposal it will succeed in creating the area, just not at the address the software expects, probably causing it to fail in a more dangerous manner. Enforcing a maximum alignment of 16MB (or 32MB?) would protect against that.

Alternatively, R3 could be set to required_alignment-1 (for power-of-two alignments). A quick check against git suggests that RISC OS 3.6 through 5.XX have always complained if the base address in R3 isn’t page aligned. If that holds true for other OS versions then that feels like a safer way of extending the API. |
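The required_alignment-1 idea can be sketched like this. It is a minimal model in Python for illustration only; the function name, the return convention and the 4KB page size are assumptions, not an existing interface. The point is that alignment-1 always has its low bits set, so it can never be mistaken for a page-aligned base address.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def classify_r3(r3, page_size=PAGE_SIZE):
    """Classify R3 under the 'alignment minus one' scheme.

    Returns ("anywhere", None), ("fixed", address) or ("aligned", alignment).
    Raises ValueError for values that fit none of these.
    """
    r3 &= 0xFFFFFFFF
    if r3 == 0xFFFFFFFF:            # -1: anywhere will do
        return ("anywhere", None)
    if r3 % page_size == 0:         # page aligned: an explicit base address
        return ("fixed", r3)
    alignment = r3 + 1
    if alignment & r3 == 0 and alignment > page_size:
        return ("aligned", alignment)  # r3 was a power of two minus one
    raise ValueError("R3 is neither page aligned nor alignment-1")
```

For example, &FFFF would request 64KB alignment, while &8000 keeps its existing meaning of a fixed base address.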
nemo (145) 2546 posts |
Naturally here in RO4 land I was thinking in terms of 28MB, sorry for being ambiguous.
I think a maximum 16MB alignment would be generous. Beyond that is Civil Engineering territory.
I’m liking the alignment being an immediate constant (what may be more useful in general is an alignment and offset, though I don’t have a usage scenario for that – the lower bits may be better served encoding the offset). I suspect if one needs more than 24 zero bits at the bottom of a DA address, one would be content to use a less convenient constant, rather than impose the inconvenience on the usual cases.

eg &8000, &10000, &20000, &40000…&01000000 are alignments of that size, else

So maximum alignment would be

  %00000001 oooooooo ooooMMMM MMMMMMMM

where o is offset (whole pages) and M+1 is a multiple of 16MB, so that’s 64GB… I’m inclined to not worry about this limitation.

Edit: There is a problem with &1C00000 under RO4 which I’m trying to grok…

So… <sighs heavily> you were right to worry. RO4 allows one to create DAs where there is already RAM in use:

So that’s not very good. Amending my suggestion to a maximum 8MB alignment with a convenient constant:

  %00000000 1ooooooo ooooMMMM MMMMMMMM

o is offset as before, M+1 is multiple of 8MB, hence maximum 32GB alignment.

Actually, it’s more convenient having established R3’s an alignment to

  CMP   R3, #&00800000
  MOVCS R11, R3, LSL#20
  BICCS R3, R3, R11, LSR#20
  ADDCS R3, R3, R11

so M can contain the power-of-two multiple of 8MB and the higher bits of the offset. |
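For anyone wanting to try values by hand, the four-instruction sequence can be transliterated directly. This is a sketch in Python for illustration only; the function name is made up, and it models nothing beyond the 32-bit arithmetic of the instructions as written.

```python
MASK32 = 0xFFFFFFFF


def fold_r3(r3):
    """Literal model of the CMP / MOVCS / BICCS / ADDCS sequence."""
    r3 &= MASK32
    if r3 >= 0x00800000:                # CMP R3,#&00800000 (CS = unsigned >=)
        r11 = (r3 << 20) & MASK32       # MOVCS R11,R3,LSL#20
        r3 &= ~(r11 >> 20) & MASK32     # BICCS R3,R3,R11,LSR#20
        r3 = (r3 + r11) & MASK32        # ADDCS R3,R3,R11
    return r3
```

In effect the low twelve bits (the M field) are cleared from R3 and added back in at bits 20-31. Values below &00800000 pass through unchanged.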
Jeffrey Lee (213) 6048 posts |
A less contrived way of encoding offsets:
That way code won’t have to remember to jump between two different encoding schemes depending on how much alignment they require, and you’ll be happy because the alignment can be represented in an immediate constant, and RISC OS 4 (hopefully) won’t explode because the R3 value will never be a multiple of the page size. Also, since the minimum sensible alignment would be 4K, we could stipulate that it’s only bits 12+ that contain the offset, and bits 5-11 are reserved for future expansion. |
nemo (145) 2546 posts |
Indeed. To summarise:

  R3 = %oooo oooo oooo oooo oooo 0000 000p pppp

where p (13-31) is the power of two of the desired alignment
and o are the upper bits of the offset from that alignment
0s, and p<13, are reserved values

How does that look? |
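The summarised layout could be packed and unpacked along these lines. This is a minimal sketch in Python for illustration only; the function names are hypothetical, and the check that the offset is smaller than the alignment is an assumption of the sketch rather than something stated in the thread.

```python
MIN_P, MAX_P = 13, 31  # permitted powers of two, per the summary above


def encode_r3(p, offset=0):
    """Pack alignment 2**p and a whole-page offset into an R3 value."""
    if not MIN_P <= p <= MAX_P:
        raise ValueError("p must be in the range 13-31")
    if offset & 0xFFF:
        raise ValueError("offset must be a whole number of pages")
    if not 0 <= offset < (1 << p):   # assumption: offset lies below alignment
        raise ValueError("offset must be smaller than the alignment")
    return offset | p                # bits 12-31 offset, bits 0-4 p


def decode_r3(r3):
    """Return (alignment, offset), or None if R3 is not an alignment request."""
    r3 &= 0xFFFFFFFF
    p = r3 & 0x1F                    # bits 0-4
    reserved = (r3 >> 5) & 0x7F      # bits 5-11 must be zero
    if reserved or p < MIN_P:
        return None
    return (1 << p, r3 & 0xFFFFF000)
```

Because p is never zero, an encoded value is never page aligned, and -1 is rejected because its reserved bits are set, so neither existing meaning of R3 collides with the new one. For instance `encode_r3(16)` gives &10 for plain 64KB alignment.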
Jeffrey Lee (213) 6048 posts |
Yep, that’s what I had in mind. |
nemo (145) 2546 posts |
I have a compatibility module working under RO4. |