SMP IRQ handling

4 posts, 2 voices

Jan 21, 2017 5:46pm Jeffrey Lee (213) 6048 posts	I’m trying to work out the best way of enhancing the OS’s IRQ handling to allow it to deal with IRQs in a SMP world.

As a bit of background, multi-core chips typically operate as follows:

1. The OS can route each external interrupt to whichever core(s) it desires (and control whether it’s routed to the IRQ or FIQ pin of the core)
   - If one interrupt is enabled for multiple cores, there’ll usually be some kind of hardware mechanism within the interrupt controller to make sure that multiple cores don’t attempt to service the interrupt in parallel. E.g. although both cores may enter their IRQ vector, it will be the first core which reads the IRQ status register which “wins”, and the other core will either be told it was a spurious interrupt or will be given another pending interrupt to deal with.
2. Some of the IRQ controller registers will be banked so that they’re only accessible from one core (e.g. a core will only be able to read/write its own interrupt configuration)

It’s the second point that’s the tricky one – in a perfect SMP system the programmer should have no need to know or care what core his thread is running on, and the OS should be free to move threads from one core to another at will. But if you need to be on a specific core to interact with the IRQ configuration of that core then that makes things a lot more difficult.

The current version of my SMP test module doesn’t attempt to solve this problem – it specifies that the HAL IRQ calls should only affect the interrupt configuration of the core which they’re executed on. This then places a heavy burden on hardware driver developers to make sure that all their IRQ management is done from threads which are locked to specific cores via their affinity mask. This may be tricky to achieve if e.g. a public SWI (which could be called by other software from any thread) needs to alter the IRQ configuration, and could easily lead to bugs when porting drivers from other platforms.

There’s also another problem the SMP module doesn’t deal with, which is what to do about IRQs that can’t be routed to a particular core. At the moment there’s no way for the HAL to specify which core(s) an interrupt can be routed to, or even which interrupts can be configured as FIQs.

So in order for the SMP work to progress I’m trying to find a solution to these problems.

Problem one: Banked IRQ registers

I think it should be possible to solve this using a bit of smoke and mirrors, by storing a soft copy of the interrupt configuration in RAM. E.g. when HAL_IRQEnable is called, the HAL would perform the following steps:

1. Work out which core the interrupt is to target (e.g. this could be encoded in the device number)
2. Update the softcopy of the interrupt configuration for that core to mark the interrupt as enabled
3. (If we’re on the correct core) update the IRQ controller and exit
4. Else, send a message to the target core to get it to update its IRQ mapping (e.g. ring one of the doorbells)

When a core receives an “update IRQ mapping” interrupt it merely needs to compare the IRQ mapping softcopy against the state of the IRQ controller and apply any pending changes (this can easily be replaced with more optimal solutions, e.g. have the message indicate which IRQs need updating). Ideally the HAL should make sure this interrupt is given priority over all others.

Regardless of whether we were on the correct core or not, the HAL_IRQEnable call can return immediately.
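
As a rough illustration of the softcopy idea, here is a minimal sketch in C. It only simulates the hardware: the banked registers and the doorbell are plain arrays, the core count and the 32-interrupt limit are assumptions, and every name (sketch_irq_enable, sketch_doorbell_handler and so on) is made up for the example rather than taken from any HAL. Working out the target core is left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES 4   /* assumption: four cores, up to 32 interrupts each */

/* Softcopy: the enable mask each core *should* have. It lives in shared
   RAM, so any core may update any entry. */
static _Atomic uint32_t softcopy[NUM_CORES];

/* Stand-in for the banked enable registers. On real hardware only core N
   could touch entry N; here it is a plain array so the sketch can run. */
static uint32_t hw_enabled[NUM_CORES];

/* Stand-in for a per-core doorbell/mailbox interrupt. */
static atomic_bool doorbell[NUM_CORES];

/* Runs on 'core' when its doorbell fires: bring the hardware into line
   with the softcopy. The doorbell is cleared first, so a request that
   arrives while we are copying rings it again rather than getting lost. */
void sketch_doorbell_handler(int core)
{
    atomic_store(&doorbell[core], false);
    hw_enabled[core] = atomic_load(&softcopy[core]);
}

/* Called on 'this_core' to enable 'irq' on 'target_core'.
   Returns straight away in both cases. */
void sketch_irq_enable(int this_core, int target_core, unsigned irq)
{
    assert(target_core >= 0 && target_core < NUM_CORES && irq < 32);
    uint32_t bit = UINT32_C(1) << irq;

    atomic_fetch_or(&softcopy[target_core], bit);    /* record the request */
    if (this_core == target_core)
        hw_enabled[target_core] |= bit;              /* own core: update directly */
    else
        atomic_store(&doorbell[target_core], true);  /* other core: ring its doorbell */
}
```

Note that sketch_irq_enable never waits for the target core to act on the doorbell.
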
This is because it doesn’t really matter if the call doesn’t update the hardware immediately – the target core will be in one of two states:

- It will have interrupts enabled, in which case it will receive the “update IRQ mapping” message straight away, allowing it to start responding to interrupts from the device “immediately” (there’ll be a small delay from having to service the mailbox message, but other than that it will be fine)
- It will have interrupts disabled, in which case it will receive the “update IRQ mapping” message as soon as they’re re-enabled. But this won’t add any (significant) delay to when it’s able to start receiving interrupts from the device, because even if the caller of HAL_IRQEnable was able to directly adjust the IRQ configuration of the core, the receipt of any interrupts would have still been delayed until the target had re-enabled IRQs.

Note that we can’t make HAL_IRQEnable wait until the target core has acted on the message because that would require a lot of work to make sure there’s no danger of deadlock. E.g. if core A is trying to claim a spinlock which is held by core B (which would involve sitting in a loop with IRQs disabled), and core B makes a HAL_IRQEnable call which would affect core A, then core A will never respond to the IRQ from core B.

Looking at the different HAL IRQ calls, this would affect them as follows:

- HAL_IRQEnable, HAL_IRQDisable: Post a message to the other core, as above (although, technically HAL_IRQDisable should only be called by the kernel when it receives an unhandled IRQ – so perhaps that call only needs to be capable of affecting the current core)
- HAL_IRQSource, HAL_IRQClear: These calls only make sense when called from an IRQ handler, so they only need to be able to affect the current core
- HAL_IRQStatus: We should probably try and deprecate this call. We’ve already run into hardware which can’t implement it faithfully (e.g. the Pi), and things will get even more complicated (read: impossible) if we need to read a banked register which belongs to another core. A quick search through the source (Pi source, admittedly) suggests that it might only be ADFS which relies on this call, to allow it to perform IDE transfers in a polling mode. Really it’s something that could be replaced with a dedicated IRQ handler which sets a flag in memory whenever the IRQ arrives.

For FIQs, things are a bit trickier, since the OS doesn’t implement an interrupt dispatcher for FIQs – the FIQ goes straight into the installed handler code. So there’s nowhere to insert the doorbell/message handling to allow deferred banked register updates to be dealt with. So for FIQs we may have to keep with the current setup where the HAL FIQ calls are only required to be able to affect the FIQ state of the current core. But since drivers rarely use FIQs, this shouldn’t be such a burden for them to deal with on their own.

Problem two: Dictating IRQ routing

I think that there are a few options available to us with how we dictate the IRQ routing to the HAL (in terms of which cores each interrupt is routed to):

1. Don’t let the OS dictate the IRQ routing; make each HAL use a fixed routing
2. Add new HAL calls to control which core(s) each IRQ is routed to (e.g. HAL_IRQSetCores, HAL_IRQGetCores). Note that for FIQs no function is necessary since we’ve already decided above that FIQ calls should only affect the current core.
3. Extend the ‘device number’ that’s passed into each HAL call to allow it to specify which core(s) it should target

Options two and three are the most interesting (option one is probably too inflexible).

With option two, things will work nicely for situations where the device numbers are homogeneous (i.e. a specific device will appear as the same IRQ number to all cores), but will lead to complications for devices where this isn’t the case. E.g. multi-core ARM chips tend to have at least one “private timer” per core, and because they’re completely private to the associated core, the IRQ controller will typically refer to them all by the same device number. The HAL would then have to deal with this by mapping that one device number to N different numbers by mixing in the core number.

With option three, we could easily say that e.g. bits 24-30 of the ‘device number’ should be used to specify the core that a call to HAL_IRQEnable (or similar) should target. However it would then raise the question of how a driver should decide which core it should route its interrupt to – with option two the routing could be left to the OS via some kind of automatic load-balancing, but if the core number (or core mask) needs to be specified with every HAL call then that would be impossible. Having extra bits which need to be masked out of device numbers at different points would also risk bugs creeping into the system. So I think option two would be the best approach.

Problem three: Describing IRQ routing options

When describing the routing options for an interrupt, I think there are two key things to consider:

- Which core(s) the interrupt can be routed to (e.g. whether the interrupt is private to just one core or global)
- Whether the interrupt can be enabled for multiple cores at the same time

To communicate this information to the OS, it may be easiest to add a new HAL call, e.g.

    __value_in_regs struct { int irq, fiq; } HAL_GetAllowedIRQRouting(int device)

For interrupt device number ‘device’, it will return in R0 the mask of which cores can be configured to receive that interrupt as an IRQ, and in R1 the mask of which cores can receive it as an FIQ. If an interrupt can be assigned to multiple cores at once then bit 31 of the corresponding register will be set (analogous to the ‘shared’ flag in bit 31 of the device number); if the interrupt can only be assigned to one core at a time then bit 31 will be clear.
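
To make the mask layout concrete, here is a small sketch in C of how a caller might interpret the two values. It assumes the layout described above (one bit per core in the low bits, bit 31 as the "multiple cores at once" flag) and uses unsigned 32-bit values instead of int to keep the bit 31 test tidy. The type and helper names (irq_routing, route_allows_core, route_is_valid) are invented for the example and are not part of any proposed API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 31: the interrupt may be enabled on several cores at once. */
#define ROUTE_MULTI (UINT32_C(1) << 31)

/* Hypothetical mirror of the two values the proposed call would return. */
typedef struct { uint32_t irq, fiq; } irq_routing;

/* Can the interrupt be routed to 'core' at all? */
bool route_allows_core(uint32_t mask, unsigned core)
{
    return core < 31 && ((mask >> core) & 1u) != 0;
}

/* Is 'wanted' (a plain core mask, without the flag bit) a legal set of
   targets for an interrupt whose allowed-routing mask is 'mask'? */
bool route_is_valid(uint32_t mask, uint32_t wanted)
{
    assert((wanted & ROUTE_MULTI) == 0);
    if (wanted == 0 || (wanted & ~mask) != 0)
        return false;                   /* empty, or names a forbidden core */
    bool single = (wanted & (wanted - 1)) == 0;
    return single || (mask & ROUTE_MULTI) != 0;
}
```

For example, a mask of 0x8000000F would mean "any of cores 0 to 3, several at once if desired", while 0x0000000F would mean "any one of cores 0 to 3, but only one at a time".
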
Note that the Pi 2 and 3 essentially use two interrupt controllers which are chained together. The original Pi 1 interrupt controller is still present, and it feeds its two IRQ and FIQ output signals into a new interrupt controller which allows those signals to be routed to one of the four cores (and adds in the new interrupt sources like the mailboxes and private timers). Since RISC OS doesn’t really understand nested/chained interrupt controllers, we’ll probably want to have the HAL downplay the capabilities of the Pi 2/3 interrupt controller. I.e. have all the Pi 1 interrupts claim that they can only be routed to the primary core, and only support full routing control for the new Pi 2/3 interrupts.

Problem four: How to make sure your IRQ handler isn’t running

This is more of a driver design problem than a HAL/kernel problem, but it’s worth discussing it here.

In a single-core world, there are a few basic approaches you can take to make sure your IRQ handler isn’t running (so you can e.g. alter device configuration or some control variables in memory):

1. Disable IRQs in the PSR
2. Disable IRQs in the device’s interrupt mask register
3. Disable IRQs in the IRQ controller (HAL_IRQDisable) – not the best approach if the IRQ line is shared
4. Deregister your interrupt handler

Assuming your code hasn’t been re-entered (i.e. you aren’t doing the above from within your IRQ handler), you can be sure that once you’ve done any of the above your IRQ handler will no longer be running.

But in a multi-core world it’s not so simple.

- Option one will only affect the current core, so is useless for the general case where you don’t know or care which core(s) your IRQ handler is assigned to
- Options two and three will stop future interrupts from being acted on, but won’t guarantee that an in-progress call to your interrupt handler has completed
- Option four has the most chance of success – it should be possible for the kernel to block until it knows your interrupt handler has finished (n.b. the current SMP module won’t do this). Or if this can’t be done without the danger of deadlock (e.g. if we need to support calls to OS_ReleaseDeviceVector from within IRQ handlers), maybe there could be the option for the kernel to set a flag when the handler has fully finished and been deregistered

Of course most of the time you’ll just want to temporarily pause the interrupt handler so that you can update some shared state, in which case a spinlock should be sufficient. But if you want to make sure your interrupt handler has exited and will never be called again then the only option that’s likely to work will be option four, where you deregister the handler and rely on the OS to let you know when it’s fully finished.

Can anyone see any problems with the above, or any situations that I’ve missed? Any bits that are too hard to follow?
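
For the "temporarily pause the handler" case mentioned under problem four, a minimal sketch of the spinlock approach in C11 might look like the following. Everything here is illustrative: the shared state (a period and two counters) is invented, and sketch_irqs_off/sketch_irqs_restore are empty placeholders for whatever the platform uses to disable and restore IRQs on the current core. The foreground disables local IRQs before taking the lock because otherwise the handler could fire on the same core while the lock is held and spin forever.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_flag state_lock = ATOMIC_FLAG_INIT;
static int period = 1;   /* shared: handler acts once every 'period' interrupts */
static int count;        /* shared: interrupts seen since the last action */
static int actions;      /* how many times the handler has acted */

static void sketch_lock(void)
{
    /* Spin until we are the one who set the flag. */
    while (atomic_flag_test_and_set_explicit(&state_lock, memory_order_acquire)) { }
}

static void sketch_unlock(void)
{
    atomic_flag_clear_explicit(&state_lock, memory_order_release);
}

/* Placeholders: real code would disable/restore IRQs on the current core. */
static int sketch_irqs_off(void) { return 0; }
static void sketch_irqs_restore(int old) { (void)old; }

/* The interrupt handler; may run on any core. */
void sketch_irq_handler(void)
{
    sketch_lock();
    if (++count >= period) {
        count = 0;
        actions++;
    }
    sketch_unlock();
}

/* Foreground code: update the shared state while the handler is held off. */
void sketch_set_period(int ticks)
{
    assert(ticks > 0);
    int old = sketch_irqs_off();   /* stop the handler pre-empting us on this core */
    sketch_lock();                 /* wait out a handler running on another core */
    period = ticks;
    count = 0;
    sketch_unlock();
    sketch_irqs_restore(old);
}
```

This only pauses the handler while the lock is held; it says nothing about whether the handler will be called again later, which is the deregistration problem described above.
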

Jan 21, 2017 6:03pm Clive Semmens (2335) 3276 posts	Good luck! You’re a braver man than I am, Gunga Din.

May 29, 2017 3:43pm Jeffrey Lee (213) 6048 posts	Has anyone encountered a device which uses a shared IRQ line, but doesn’t allow masking of interrupts within the device?

After taking a bit of a hiatus I’ve now returned to this topic, and have run into a bit of a blocker regarding HAL_IRQEnable and HAL_IRQDisable. Basically the problem is that using them on shared IRQ lines is inherently unsafe in multi-core (and some multi-threaded) environments. So I’m wondering whether it would be feasible to ban the use of HAL_IRQEnable/HAL_IRQDisable in all except the following circumstances:

- After calling OS_ClaimDeviceVector, a driver must call HAL_IRQEnable to enable receipt of the IRQ (really it should be the kernel which does this, but there’s not much point fixing that now)
- The kernel is allowed to call HAL_IRQDisable to silence an unhandled IRQ
- Device drivers can call HAL_IRQEnable and HAL_IRQDisable at other times only if it’s a non-shared IRQ. For shared IRQs other methods of controlling IRQ receipt must be used (e.g. masking interrupts within the peripheral). This point covers use-cases such as nested interrupts; if an interrupt handler wants to run with IRQs enabled then it needs some way of temporarily blocking interrupts from its device, otherwise clearing PSR.I may lead to infinite recursion and a stack overflow

Of course for nested interrupts it is worth mentioning that most modern SoCs support nested interrupts via an interrupt prioritisation scheme, which if used should be an improvement over the way RISC OS currently does things. But adapting drivers to work with hardware-assisted nested interrupts would require a fair amount of work (potentially requiring the driver to take different code paths depending on the OS/hardware it’s running on), and wouldn’t solve all of the problems we’re facing here (e.g. it could be something in the foreground which wants to use HAL_IRQDisable to disable interrupts from a device)

Aug 3, 2017 1:02pm Jeffrey Lee (213) 6048 posts	To give this thread some closure, this doc covers the approach I’ve gone with. To keep HAL changes as simple as possible I’ve weakened the requirements a bit compared to what was described here (e.g. HALs are not expected to support one core controlling the state of a private interrupt that belongs to another core)