Restrictions on cacheable doubly-mapped areas
Jeffrey Lee (213) 6048 posts |
Considering that doubly-mapped areas are supposedly for OS use only, I’m not sure whether this will affect anyone, but I figured it was worth raising here just to be sure: I’m going to be restricting doubly-mapped areas so that on ARMv6 and below they can only be created as non-cacheable. Meanwhile, on ARMv7+, I’ll be fixing cacheable doubly-mapped areas to perform their cache maintenance properly (currently resizing a cacheable doubly-mapped area will do a full TLB+cache flush, when really it only needs to flush the affected address ranges).

This is just one of the many parts of my “fix cache maintenance” task. Writes to cacheable doubly-mapped areas prior to ARMv6 are never going to work properly, because all of those CPUs use virtually tagged caches, so it would be wrong to support making them cacheable there (the Ursula cacheable screen memory implementation worked by only making the lower mapping of screen memory cacheable and leaving the upper mapping as-is, leading to potential coherency issues). On ARMv6, cacheable doubly-mapped areas have more support, but only if you comply with “page colouring” constraints, which is something RISC OS can’t currently guarantee (it would require OS_ChangeDynamicArea to limit the DA size to a multiple of 16K – probably not too hard to implement, but there’s not much point at the moment). It’s only really ARMv7+ which has proper support for cacheable doubly-mapped areas, although there are still a couple of caveats with regard to the instruction cache – so for now I might just take the easy way out and only allow doubly-mapped areas to be cacheable if they’re also non-executable. That’s not a hard problem to fix, but it would involve updating OS_SynchroniseCodeAreas so that it can detect when it’s being called on a doubly-mapped area and knows to perform the cache maintenance for both mappings.

So if these changes are going to adversely affect anyone, speak now or forever hold your peace!

I’m not sure when these changes will end up hitting CVS – as I’ve said, there are many parts to the “fix cache maintenance” task, and a lot of them deliver no functional gain on their own. So it may only be in a month or two, when everything is done and tested, that all the changes hit CVS in one big lump.

(As an aside, I’ve actually started using a local git repo to keep track of all this work. It’s really useful to have one branch per sub-task, and to be able to start new branches on demand whenever I realise that a task X is actually dependent on not-yet-started task Y. Plus it makes it easier to drip-feed any useful tweaks and fixes back into CVS.)
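To make the OS_SynchroniseCodeAreas point concrete, here is a minimal C sketch of range-based synchronisation that also maintains the alias of a doubly-mapped area. All the names (da_t, find_doubly_mapped_da, clean_dcache_range, invalidate_icache_range) are hypothetical, not the actual kernel interfaces – it only illustrates repeating the maintenance on the second mapping.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical description of a doubly-mapped dynamic area: the same
 * physical pages appear at two virtual base addresses. */
typedef struct {
    uintptr_t low_base;   /* start of the lower mapping  */
    uintptr_t high_base;  /* start of the upper mapping  */
    size_t    size;
} da_t;

/* Assumed to exist elsewhere: per-range cache maintenance, and a lookup
 * that returns the DA if 'addr' lies in either mapping (else NULL). */
extern void clean_dcache_range(uintptr_t start, uintptr_t end);
extern void invalidate_icache_range(uintptr_t start, uintptr_t end);
extern const da_t *find_doubly_mapped_da(uintptr_t addr);

void synchronise_code_range(uintptr_t start, uintptr_t end)
{
    /* Maintain the range the caller asked for... */
    clean_dcache_range(start, end);
    invalidate_icache_range(start, end);

    /* ...and, if it lies inside a doubly-mapped area, repeat the
     * maintenance on the alias so both mappings stay coherent. */
    const da_t *da = find_doubly_mapped_da(start);
    if (da != NULL) {
        uintptr_t delta = da->high_base - da->low_base;
        uintptr_t alias_start, alias_end;
        if (start >= da->high_base) {
            alias_start = start - delta;  /* caller used the upper mapping */
            alias_end   = end   - delta;
        } else {
            alias_start = start + delta;  /* caller used the lower mapping */
            alias_end   = end   + delta;
        }
        clean_dcache_range(alias_start, alias_end);
        invalidate_icache_range(alias_start, alias_end);
    }
}
```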
Jon Abbott (1421) 2651 posts |
I use a doubly-mapped cacheable DA for a circular sound buffer in ADFFS; it’s hit quite heavily at each sound IRQ, but I don’t think this change will affect it other than substantially slowing down sound processing. Let me know which build it ends up in and I’ll test. If preferred, I can recode and double-map it directly in L2PT, as I shouldn’t really be relying on an “internal use only” feature.
It’s fine for what it was intended for – speeding the desktop up. Not so great for games!
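For readers unfamiliar with the trick: the appeal of a doubly-mapped area for a circular buffer is that a write which runs past the end of one mapping simply continues into the other mapping of the same physical pages, so no write ever has to be split. A minimal C sketch of the idea, assuming a buffer mapped twice back-to-back (names and layout are illustrative only, not ADFFS’s actual code):

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Circular buffer backed by two consecutive virtual mappings of the
 * same physical pages: 'base' points at the first mapping, and the
 * second mapping starts at base + size. */
typedef struct {
    uint8_t *base;   /* start of first mapping               */
    size_t   size;   /* buffer capacity (and mapping length) */
    size_t   wr;     /* write offset, 0..size-1              */
} dblmap_ring_t;

/* Write 'len' bytes (len <= size) with a single memcpy: if the copy
 * runs past the end of the first mapping it lands in the alias, which
 * is the same physical memory as the start of the buffer. */
static void ring_write(dblmap_ring_t *r, const void *src, size_t len)
{
    memcpy(r->base + r->wr, src, len);
    r->wr = (r->wr + len) % r->size;
}
```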
Jeffrey Lee (213) 6048 posts |
That sounds like a bit of an odd setup – you’re using the double mapping to allow the buffer to wrap back round to the start? I would have either gone with a standard circular buffer (splitting each read/write request in order to wrap it manually) or a ‘bip-buffer’, which avoids having to split write operations at the cost of a bit more memory overhead (you basically need to allocate desired_capacity + largest_fragment for the buffer).
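For illustration, here is a rough C sketch of the “desired_capacity + largest_fragment” scheme described above: the writer places each fragment contiguously, wraps back to the start once the write position passes the nominal capacity, and records a watermark so the reader knows where the valid data ends. Occupancy/overflow checks and the reader side are omitted for brevity, and all names are hypothetical:

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint8_t *data;
    size_t   capacity;   /* nominal buffer size                 */
    size_t   max_frag;   /* largest single write we will accept */
    size_t   wr;         /* write offset                        */
    size_t   rd;         /* read offset (reader side omitted)   */
    size_t   watermark;  /* end of valid data before the wrap   */
} bip_t;

static int bip_init(bip_t *b, size_t capacity, size_t max_frag)
{
    b->data = malloc(capacity + max_frag);  /* extra slack for one fragment */
    if (!b->data) return -1;
    b->capacity  = capacity;
    b->max_frag  = max_frag;
    b->wr = b->rd = 0;
    b->watermark = capacity;
    return 0;
}

/* Write one fragment (len <= max_frag) without ever splitting it:
 * because the allocation has max_frag bytes of slack past 'capacity',
 * a fragment started below 'capacity' always fits contiguously. */
static void bip_write(bip_t *b, const void *src, size_t len)
{
    if (b->wr >= b->capacity) {   /* past the nominal end: wrap first */
        b->watermark = b->wr;     /* remember where valid data stops  */
        b->wr = 0;
    }
    memcpy(b->data + b->wr, src, len);
    b->wr += len;
}
```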
Jon Abbott (1421) 2651 posts |
It’s the client Channel Handler writing to the circular buffer; it’s called multiple times at each sound IRQ, until the circular buffer holds more sample time than is needed to fill the actual sound buffer.
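A hedged sketch of that fill strategy, with hypothetical names (not ADFFS’s actual code):

```c
/* buffered_sample_time() returns how much audio (in samples) is
 * currently queued in the circular buffer; call_channel_handler()
 * asks the client Channel Handler to append another fragment. */
extern unsigned buffered_sample_time(void);
extern void call_channel_handler(void);

void sound_irq_fill(unsigned samples_needed)
{
    /* Keep calling the Channel Handler until the circular buffer holds
     * at least as much sample time as the hardware sound buffer needs. */
    while (buffered_sample_time() < samples_needed)
        call_channel_handler();
}
```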