SMP-friendly DMA
Jeffrey Lee (213) 6048 posts |
For the past few days I’ve been thinking about how best to handle the cache maintenance requirements of DMA to cacheable locations – because on modern CPUs, and particularly with SMP, making arbitrary pages non-cacheable is a big no-no. I think I’ve arrived at a solution that I’m happy with, so the question is, is there anything I’ve missed?

OS_Memory 19

In:
Out:
The idea is that this will act as a replacement for OS_Memory 0 (for the task of making pages uncacheable for DMA). But rather than require the caller to use a specific data structure for input/output, function pointers are used to allow the kernel to read the input list and produce the output list. One of the reasons behind this choice is that the output list may end up being a different length to the input list – for a DMA write to RAM, any areas which aren’t cache line aligned will need breaking into two or more parts (bounce buffer at start, RAM in middle, bounce buffer at end). So to avoid the kernel allocating memory which the caller may or may not have a use for, this API cuts out the middle man by letting the caller control the output format directly (e.g. the caller will typically want the output in a format which the DMA controller can use).

Input function

In:
Out:
Or, R0 = error to abort the operation

The input function will be called multiple times, until it either returns an error or returns with R1 equal to zero. It’s expected that the memory regions returned will be in the same sequence as will be used for the DMA transfer, although this isn’t strictly necessary.

Output function

In:
Out:
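As a rough illustration of the contract so far – the kernel repeatedly calling the caller's input function until it returns an error or a zero length, and splitting any areas that aren't cache line aligned into a bounce-buffer head/tail plus an aligned middle – here is a hedged plain-C sketch. The real interface is register-based (R0/R1/R3 etc.), so every name below, and the 64-byte cache line size, is an invented assumption for illustration only:

```c
/* Plain-C sketch of the input/output function contract described in this
 * post. The kernel pulls regions from the input function until it returns
 * a zero length (R1 = 0) or an error, and splits each region for a DMA
 * write so that partial cache lines at the edges are flagged for bounce
 * buffers (bit 0 of the flags, as with R3). All names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE  64u /* illustrative; real code would ask the kernel */
#define FLAG_BOUNCE 1u  /* "not suitable for DMA, use a bounce buffer"  */

typedef struct { uint32_t addr, len; } dma_region;

/* Input fn: fills in a region; len == 0 ends the list; non-NULL aborts */
typedef const char *(*dma_input_fn)(dma_region *out, void *state);
/* Output fn: receives each contiguous block plus flags */
typedef void (*dma_output_fn)(uint32_t addr, uint32_t len,
                              uint32_t flags, void *state);

static void emit(dma_output_fn out, void *st,
                 uint32_t addr, uint32_t len, uint32_t flags)
{
    if (len != 0)
        out(addr, len, flags, st);
}

/* Split one region for a DMA write to RAM: bounce head, aligned middle,
   bounce tail (the middle is the only piece that is DMA'd directly). */
static void split_for_write(uint32_t addr, uint32_t len,
                            dma_output_fn out, void *st)
{
    uint32_t first = (addr + CACHE_LINE - 1) & ~(CACHE_LINE - 1); /* up   */
    uint32_t last  = (addr + len) & ~(CACHE_LINE - 1);            /* down */

    if (first >= last) {            /* no whole cache line: bounce it all */
        emit(out, st, addr, len, FLAG_BOUNCE);
        return;
    }
    emit(out, st, addr, first - addr, FLAG_BOUNCE);
    emit(out, st, first, last - first, 0);
    emit(out, st, last, (addr + len) - last, FLAG_BOUNCE);
}

/* Kernel-side loop: pull regions until the input fn errors or terminates */
static const char *process_regions(dma_input_fn in, void *in_st,
                                   dma_output_fn out, void *out_st)
{
    for (;;) {
        dma_region r;
        const char *err = in(&r, in_st);
        if (err) return err;        /* input function aborted the op */
        if (r.len == 0) return NULL;
        split_for_write(r.addr, r.len, out, out_st);
    }
}
```

Note that this sketch leaves out two things the post says the kernel actually does: translating logical addresses to physical, and joining successive blocks that happen to be physically contiguous.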
The kernel will call the output function for each contiguous block of physical memory. The address ranges will be in the same sequence as returned by the input function, but they may have been split (e.g. to cope with discontiguous physical pages or bounce buffer usage) or joined (e.g. if two successive input blocks happen to be physically contiguous).

If bit 0 of R3 is set, it indicates that the address range described by the region is not suitable for DMA and a bounce buffer must be used. Typically this will only be the case for DMA writes to partial cache lines, but in the future it could be extended to allow for other situations, e.g. memory which the kernel/HAL knows can’t be used for DMA.

Cache maintenance implementation

When the SWI is called, the kernel will perform the appropriate cache maintenance for the different memory regions. Depending on the flags in R0, there are essentially four variants of the routine:
Technically, the start of a DMA write only needs to invalidate the cache, since there’s no need to write back any dirty cache lines. But if the DMA is cancelled before it starts, or if the DMA terminates early, then this has the potential to destroy the old buffer contents. So for safety the cache is flushed instead. This behaviour also allows for read-write type DMA.

DMA to overlapping regions

If multiple DMA operations target the same area of memory, then the most sensible way of dealing with it would be to act as if each cache line has its own read-write lock that allows multiple clients to hold a read lock but only one client to hold a write lock (and no read locks while a write lock is held). However, it is effectively a programming error if this situation is encountered – the initiator of the DMA or the owner of the memory should be the one to make sure any concurrent DMA accesses are safe, the same way that it should make sure CPU access to DMA regions is safe. So the initial implementation isn’t expected to perform any memory locking, but the option of adding it is there if in the future we find that it’s necessary.

Potential tweaks/improvements
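If per-cache-line locking were ever added, the read-write lock behaviour described above could be sketched as follows. This is a hedged, single-threaded illustration with invented names – a real SMP implementation would need atomics or spinlocks around the counters:

```c
/* Sketch of per-cache-line read-write locking: each line has a counter,
 * where 0 = free, >0 = that many readers (DMA reads), -1 = one writer
 * (DMA write). Multiple readers may coexist; a writer is exclusive.
 * Not real kernel code, and not SMP-safe as written. */
#include <assert.h>

#define NLINES 1024
static int line_lock[NLINES]; /* index = (phys addr / line size) % NLINES */

static int try_read_lock(unsigned line)
{
    if (line_lock[line] < 0) return 0;   /* writer active: refuse */
    line_lock[line]++;
    return 1;
}

static int try_write_lock(unsigned line)
{
    if (line_lock[line] != 0) return 0;  /* readers or writer active */
    line_lock[line] = -1;
    return 1;
}

static void unlock_read(unsigned line)  { line_lock[line]--; }
static void unlock_write(unsigned line) { line_lock[line] = 0; }
```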
I believe DMAManager, SATADriver and AHCIDriver are the only components which make use of OS_Memory 0 for controlling the cacheability of pages, so they’ll all want to be updated to use the new call. Also, as with OS_Memory 0, it’s going to be the caller’s responsibility to listen out for Service_PagesUnsafe. Does this sound OK to everyone? Anything that I might have missed? |
Clive Semmens (2335) 3276 posts |
I used to bend my brain around questions like these, but that was ten years ago. I don’t think anything I’d contribute now would be worth the rent on the pixels it was printed on. Sorry 8~( |
Jeffrey Lee (213) 6048 posts |
The design I went with is pretty similar to the above. API-wise, the differences are that the input and output functions are both allowed to use R9 as a way of keeping track of their state. The input function is also capable of returning the “use bounce buffer” flag – which is useful for ensuring that SATA transfers are halfword aligned.

One thing I didn’t foresee, though, is that using callback functions, deferred allocation of bounce buffers, etc. makes Service_PagesUnsafe a lot harder to deal with. Because the callback functions might trigger Service_PagesUnsafe, the kernel can’t cache any logical → physical translation across calls to the input/output functions (or it must be capable of invalidating the cached translation if Service_PagesUnsafe is triggered). Likewise, if the input/output functions perform memory allocation in a manner which might trigger Service_PagesUnsafe, they must also be capable of either updating or discarding the results that have been received so far.

Updating the results (either during the DMA op, or during the OS_Memory call) is also going to be tricky, because the physical contiguity of regions may have changed – so any scatter list which is being constructed may need to grow, requiring more memory allocations, triggering more Service_PagesUnsafe, etc. So the easiest/best option is likely to be to throw everything away and call OS_Memory again. |
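The “throw everything away and retry” strategy from the last post could be sketched like this – a hedged plain-C mock in which the flag, the names and the scatter-list build are all invented stand-ins for the real Service_PagesUnsafe machinery:

```c
/* Sketch of discard-and-retry: if (a mock) Service_PagesUnsafe fires while
 * the scatter list is being built, the partial results can't be trusted
 * (logical -> physical translations may be stale), so they are thrown away
 * and the whole OS_Memory-style call is made again from scratch. */
#include <assert.h>
#include <stdbool.h>

static volatile bool pages_unsafe; /* set by the (mock) service call */

/* Stand-in for the scatter-list build; returns false if the page state
   changed underneath it, i.e. the results must be discarded. */
static bool build_scatter_list(int *entries)
{
    pages_unsafe = false;
    *entries = 0;
    for (int i = 0; i < 4; i++) {
        (*entries)++;
        if (pages_unsafe)          /* translation may now be stale */
            return false;
    }
    return true;
}

static bool build_with_retry(int *entries, int max_tries)
{
    while (max_tries--) {
        if (build_scatter_list(entries))
            return true;           /* built without interference */
        /* else: discard everything and start again */
    }
    return false;
}
```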