Unmapping IO memory

13 posts, 2 voices

Feb 20, 2019 1:08pm Jon Abbott (1421) 2651 posts	Is there a way to unmap IO memory? OS_Memory 13 maps it in, but I can’t see any obvious way of unmapping it. I’ve noticed that if I map the GPU screen memory enough times via OS_Memory 13, the screen eventually ends up corrupt or blank. I’m not sure what the actual issue is yet; possibly logical address space is being exhausted because the previous mapping isn’t being removed?

Feb 20, 2019 1:31pm Jeffrey Lee (213) 6048 posts	“Is there a way to unmap IO memory?” No. (Unless you’re using OS_Memory 14/15, but you wouldn’t want to use those for your use case.) “I’m not sure what the actual issue is yet; possibly logical address space is being exhausted because the previous mapping isn’t being removed?” The kernel will reuse an existing mapping if it can find one, including if you’re trying to map in a smaller section of a larger region that’s already been mapped. So if you were to make one request to map in all of the GPU’s memory, then any future requests wouldn’t need to map in anything extra. Unfortunately this falls down a bit as soon as doubly-mapped memory comes into the picture – instead of looking for a mapped range with (base R1, size R2), it needs to look for two mappings of that range which are right next to each other. So it effectively needs an exact match on the base and size of an existing mapping in order to reuse that mapping for the new call. And since the GPU is in charge of allocating the framebuffer on the Pi, it’ll likely move around quite a bit from one mode to the next. I should probably look into seeing how hard it would be to disable the double-mapping of screen memory for drivers like the Pi (external framestore + no hardware scrolling). Software shouldn’t be relying on the double-mapped nature of the memory, so it’ll hopefully just be a few bits of kernel code to unpick regarding address calculations. Then BCMVideo can just map the entirety of GPU memory on startup, and there won’t be any need to create any additional mappings for the framebuffer or any overlays.
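
The reuse rule described in this post can be pictured with a small model. This is only a sketch of the behaviour as described here, not the kernel's code: `IOMapping`, `find_reusable` and the example addresses are hypothetical, and the real lookup also has to take the mapping flags into account.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IOMapping:
    """Hypothetical record of one existing IO mapping."""
    phys_base: int
    size: int
    double_mapped: bool


def find_reusable(mappings: List[IOMapping], phys_base: int, size: int,
                  double_mapped: bool) -> Optional[IOMapping]:
    """Return an existing mapping that could satisfy the request, or None."""
    for m in mappings:
        if m.double_mapped != double_mapped:
            continue
        if double_mapped:
            # Two adjacent copies of the range are needed, so in effect
            # only an exact base and size match can be reused.
            if m.phys_base == phys_base and m.size == size:
                return m
        elif (m.phys_base <= phys_base
              and phys_base + size <= m.phys_base + m.size):
            # A single mapping can serve any sub-range it fully contains.
            return m
    return None
```

With one large single mapping in place, a request for any sub-range finds it; the same request with double mapping only matches when the base and size are identical, which is why a framebuffer that moves between modes keeps producing new mappings.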

Feb 20, 2019 1:55pm Jon Abbott (1421) 2651 posts	I’m mapping the video memory at every emulated mode change, as cacheable/bufferable but not double mapped. The size changes on each mode change as I’m only mapping in enough for three frame buffers. I believe I coded it that way because I’d previously found it impossible to get the logical address of the GPU framebuffer (OS_Memory 0 doesn’t work for IO memory?). Your advice was to remap the memory to get the logical address. As it’s switching between my GraphicsV driver and the OS one, I can’t think of any way to get the logical framebuffer address other than to remap it on every mode change. I suppose I could change it to use the GPU framebuffer size, if it’s big enough, to avoid size changes. “I should probably look into seeing how hard it would be to disable the double-mapping of screen memory for drivers like the Pi (external framestore + no hardware scrolling).” What’s double mapping the memory? Surely not BCMVideo as it would be pointless – the GPU is working on physical memory and doesn’t touch the MMU as far as I know.

Feb 20, 2019 3:24pm Jeffrey Lee (213) 6048 posts	If it’s cacheable, are you taking care to flush the CPU cache before each mode change? You won’t want write-back data being written to the framebuffer after the GPU has released it and started using the memory for something else. “I’d previously found it impossible to get the logical address of the GPU framebuffer (OS_Memory 0 doesn’t work for IO memory?)” Correct. Since you’re not double-mapping the memory, you could have ADFFS read the base & size of the GPU memory on startup (via the mailbox interface) and then map in the entire area. That’ll avoid the OS creating extra mappings for your future requests. “What’s double mapping the memory? Surely not BCMVideo as it would be pointless – the GPU is working on physical memory and doesn’t touch the MMU as far as I know.” It’s the kernel that does it.

Feb 20, 2019 3:50pm Jon Abbott (1421) 2651 posts	“are you taking care to flush the CPU cache before each mode change?” It’s cleaned at the end of every blit and the blitter is shut down during a mode change, so yes, the CPU/GPU are in sync. “read the base & size of the GPU memory on startup (via the mailbox interface) and then map in the entire area. That’ll avoid the OS creating extra mappings for your future requests.” This would work if the framebuffer base/size are fixed from boot and never change. Do any of the GPU OS builds change the GPU framebuffer base/size? “It’s the kernel that does it.” I understand the legacy reasons for video memory being double mapped, which was initially a quirk of the way physical RAM started at &2000000 on MEMC. This was later mirrored on the RiscPC via the MMU, for games I guess, but do we still need it? Unless there’s some underlying reason to double map video memory, such as the OS using it, I wouldn’t have thought any 32-bit apps/games would be relying on it, as triple buffering is pretty much required for tear-free screen updates.

Feb 20, 2019 8:35pm Jeffrey Lee (213) 6048 posts	“read the base & size of the GPU memory on startup (via the mailbox interface) and then map in the entire area. That’ll avoid the OS creating extra mappings for your future requests.” “This would work if the framebuffer base/size are fixed from boot and never change. Do any of the GPU OS builds change the GPU framebuffer base/size?” Yes, the framebuffer base and size will change – but it’ll always be located within the memory that the firmware reserves for the GPU on startup. So if you make a request to map all of the GPU memory on startup, and you’re not using double-mapping, any subsequent requests you make will just re-use the existing mapping. “but do we still need it?” That’s the unanswered question!
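
As an illustration of why a single whole-GPU mapping is enough: if the framebuffer always lies inside the reserved GPU region, its logical address is a fixed offset from the logical base of that one mapping. The sketch below assumes a plain linear (not double-mapped) mapping; the function name, parameters and addresses are hypothetical and are not part of any RISC OS API.

```python
def framebuffer_logical(gpu_phys_base: int, gpu_size: int,
                        gpu_logical_base: int,
                        fb_phys_base: int, fb_size: int) -> int:
    """Translate a framebuffer physical address into the logical address
    it would have inside one linear mapping of all GPU memory."""
    gpu_end = gpu_phys_base + gpu_size
    if fb_phys_base < gpu_phys_base or fb_phys_base + fb_size > gpu_end:
        # The framebuffer is not covered by the existing mapping.
        raise ValueError("framebuffer lies outside the mapped GPU region")
    return gpu_logical_base + (fb_phys_base - gpu_phys_base)
```

In practice the driver would still ask the OS for the address with the same flags as the original request; the arithmetic only shows what a reused mapping amounts to.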

Feb 22, 2019 1:42pm Jon Abbott (1421) 2651 posts	“Yes, the framebuffer base and size will change” The problem I’ve got is that GraphicsV 8 could indicate the framebuffer base/size returned by GraphicsV 9 can change, which means I have to map the memory based on the current values returned by GraphicsV 9 when my driver takes over the mode. “if you make a request to map all of the GPU memory on startup, and you’re not using double-mapping, any subsequent requests you make will just re-use the existing mapping.” I’m not sure I can do that, due to the base/size potentially changing. Can I assume that GraphicsV 9 base/size don’t change for ROM based GraphicsV drivers? If so, there will be three mappings of the GPU RAM: the double mapping from the existing driver and my mapping. That’s not going to cause any issues, is it? And will the kernel remove my mapping if the existing GraphicsV driver were to remap its allocation? “That’s the unanswered question!” Perhaps we should take the Page Zero approach and see what breaks via nightly build? I’m fairly certain it’s only going to affect games that rely on memory wrapping, or buggy software that’s writing directly to screen memory and overrunning. EDIT: “if you make a request to map all of the GPU memory on startup, and you’re not using double-mapping, any subsequent requests you make will just re-use the existing mapping.” I’ve just tried this approach, but it doesn’t work. I’m guessing the physical base address is changing when I change mode on the GPU via GraphicsV 2.

Feb 22, 2019 6:13pm Jeffrey Lee (213) 6048 posts	To confirm: 1. You’re using the get VC memory mailbox message to read the GPU memory. 2. You’re mapping in that memory using OS_Memory 13 (not using double mapping). 3. When you want to get the framebuffer address on a mode change, you’re calling OS_Memory 13 again with the new physical base & size, but with the same flags as in step 2. 4. You’re getting a result which is outside the range that was returned by the call in step 2. Although now I think about it, the GPU cache mode of the range returned in step 1 might not match the cache mode that it decides to use for the framebuffer in step 3 (the top two bits of the physical address control whether the GPU’s L2 cache is used). So that might be why it’s not working for you – and unless you start guessing about what cache mode the GPU is going to be using for the framebuffer (or you waste address space by mapping in all possible variants), this approach isn’t really workable. “Can I assume that GraphicsV 9 base/size don’t change for ROM based GraphicsV drivers?” No. (example: BCMVideo) “If so, there will be three mappings of the GPU RAM. The double mapping from the existing driver and my mapping. That’s not going to cause any issues is it?” Since your driver is creating a cacheable mapping (of BCMVideo’s framebuffer), there are potential issues if a GraphicsV 13 call makes it through to BCMVideo, since the OS/drivers don’t do any cache maintenance when accelerated ops are performed. But other than screen corruption nothing should catch fire. “And will the kernel remove my mapping if the existing GraphicsV driver were to remap its allocation?” No – the kernel maps it in using OS_Memory 13 (which, as we’ve discussed, has no corresponding “unmap” call). “That’s the unanswered question!” “Perhaps we should take the Page Zero approach and see what breaks via nightly build? I’m fairly certain it’s only going to affect games that rely on memory wrapping, or buggy software that’s writing directly to screen memory and overrunning.” Yeah – if it works for me with a brief bit of testing then I’ll just check it in and see if anyone else can break it.

Feb 23, 2019 12:06pm Jon Abbott (1421) 2651 posts	“To confirm:” I pass GraphicsV calls on to the existing driver unless it’s a mode my driver is going to take over. It then calls GraphicsV 9 to get the original driver’s physical framebuffer base/size and then OS_Memory 13 to remap it with caching/buffering and no double map, and to get a logical address for the blitter to write to. At no point do I touch hardware directly; that’s handled by the existing driver. “Since your driver is creating a cacheable mapping (of BCMVideo’s framebuffer), there are potential issues if a GraphicsV 13 call makes it through to BCMVideo, since the OS/drivers don’t do any cache maintenance when accelerated ops are performed.” Nothing will get through to BCMVideo, as my driver will have stopped passing GraphicsV calls to it. When you go back to a mode my driver isn’t implementing, I start passing GraphicsV to the existing driver, which will carry on using its existing framebuffer allocation. Everything works, except for the logical memory leak that occurs on every driver switch, because I can’t release my IO mapping.

Feb 25, 2019 1:32pm Jeffrey Lee (213) 6048 posts	“Although now I think about it, the GPU cache mode of the range returned in step 1 might not match the cache mode that it decides to use for the framebuffer in step 3 (the top two bits of the physical address control whether the GPU’s L2 cache is used).” Disregard that – while looking at the code yesterday I was reminded that the ARM physical addresses aren’t affected by the GPU cache mode. RAM accesses by the ARM are always performed using whatever the default GPU cache mode is. DMA controllers (and perhaps a couple of other bits?) can use the top two bits of the address to control which cache mode is used, so programming the cache mode correctly is relevant for those, but the logic for that is all within the BCM-specific drivers, so that higher-level drivers/software only have to deal with regular ARM physical addresses. “That’s the unanswered question!” “Perhaps we should take the Page Zero approach and see what breaks via nightly build? I’m fairly certain it’s only going to affect games that rely on memory wrapping, or buggy software that’s writing directly to screen memory and overrunning.” Today’s ROM contains a change to disable double-mapping of VRAM for drivers which use GraphicsV 9 and don’t support hardware scrolling. Also, BCMVideo will now map in all of the GPU’s memory on startup, so when the OS maps in the framebuffer or when BCMVideo maps in overlays there shouldn’t be any new IO mappings created. If you use similar code to lines 524-554 to map in the memory as cacheable then that should avoid ADFFS creating extra mappings on mode change. (Just don’t copy the register mixup bug that’s in there – RSBHI r1, r2 should be RSBHI r2, r1)
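
To make the “top two bits” remark concrete, here is a minimal sketch of how an address seen by a DMA controller could be separated into its cache-mode selector and the regular ARM physical address. It assumes 32-bit addresses with the selector in bits 30 and 31, as the post describes; the function name and example value are hypothetical, and which selector value means which cache mode is deliberately not modelled.

```python
CACHE_MODE_SHIFT = 30
ARM_PHYS_MASK = (1 << CACHE_MODE_SHIFT) - 1  # low 30 bits


def split_dma_address(dma_addr: int) -> tuple:
    """Split a 32-bit DMA-side address into (cache-mode selector,
    ARM physical address)."""
    selector = (dma_addr >> CACHE_MODE_SHIFT) & 0x3
    return selector, dma_addr & ARM_PHYS_MASK
```

Higher-level code, as the post says, should only ever see the second value; the selector is a concern of the BCM-specific drivers.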

Feb 25, 2019 2:07pm Jon Abbott (1421) 2651 posts	“If you use similar code to lines 524-554 to map in the memory as cacheable then that should avoid ADFFS creating extra mappings on mode change.” Thanks, I’ll give it a try once the build is available. This does however only resolve the leak when ADFFS is used with BCMVideo. I do not have other platforms to test on, but I suspect it will also leak logical memory space under other GPU drivers such as OMAP/Iyonix/Titanium etc.

Jul 27, 2021 3:18pm Jon Abbott (1421) 2651 posts	I’ve finally got around to revisiting this issue, but I’m no further forward. I can’t use Jeffrey’s advice above as it’s specific to the BCMVideo driver – I need one of the following: (1) GraphicsV 9 extended to provide the logical address of the framebuffer; (2) a way to translate a physical IO address to a logical address; (3) the ability to release permanently mapped IO memory. Without one of these, there is currently no way of getting the GraphicsV framebuffer logical address or preventing memory exhaustion.

Jul 29, 2021 5:08am Jon Abbott (1421) 2651 posts	I think I’m going to code up a repro for this and see just how quickly memory space is exhausted, depending on how the framebuffer mapping is requested. The way I currently have it coded it seems to happen quite quickly, possibly after 20-30 mode changes, but as I’ve never actually counted, that’s just a guess. Is there a method to read the current IO memory mappings? I’d be interested to see if normal day-to-day use can cause IO memory to leak as devices are switched and the memory mapping is not released.
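
Before writing a repro on real hardware, the leak can be estimated with a back-of-envelope model. Everything below is an assumption made for illustration: the size of the IO address space, the 1MB mapping granule, and the idea that each non-reusable request permanently consumes its rounded-up size are hypothetical figures, not values taken from the kernel.

```python
from typing import Iterable, Optional


def mappings_until_exhausted(io_space_bytes: int,
                             request_sizes: Iterable[int],
                             granule: int = 1 << 20) -> Optional[int]:
    """Count how many never-released mappings fit before the next request
    would overflow the space. Returns None if every request fits."""
    used = 0
    count = 0
    for size in request_sizes:
        # Round each request up to a whole number of granules.
        rounded = -(-size // granule) * granule
        if used + rounded > io_space_bytes:
            return count
        used += rounded
        count += 1
    return None
```

Feeding in the framebuffer sizes actually requested on each mode change, with the real space and granule sizes once known, would show whether 20-30 mode changes is a plausible point of exhaustion.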
