ARM generic timer
Jon Abbott (1421) 2651 posts |
Sounds sensible to me, they’re both likely to be wanted at some point. |
nemo (145) 2546 posts |
Annoyingly, I wrote this some time ago and then lost it under a pile of windows and forgot to send it. Belatedly:
That document says:
It also says:

unsigned int HAL_TimerGranularity(int timer) [ entry no 14 ]

It does not say that the granularity or period of Timer 0 are any particular values. Timer 0 is used to provide the ‘centisecond clock’; it is not defined to have a granularity of 100Hz or a period of 1. Indeed, on some theoretical platform that may be impossible. Given all that, and the fact that I don’t think OS_Hardware appeared until RO6, you shouldn’t have to worry about that Iyonix documentation, I’d have thought.
So that needs fixing… but I’d argue that it needs fixing anyway. Having said all that, I agree that IOMD HAL is entitled to provide a synthesised 100Hz Timer 0 derived from the hardware timer that could be presented as Timer 2. |
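A minimal sketch (not from the thread) of how a program might query that granularity through OS_Hardware 0, which calls a HAL routine by entry number; entry 14 is HAL_TimerGranularity as quoted above. The timer_granularity wrapper is invented, and the register conventions should be double-checked against the HAL documentation before relying on them.

    /* Hedged sketch only: query a timer's granularity via OS_Hardware 0. */
    #include "kernel.h"
    #include "swis.h"

    #ifndef OS_Hardware
    #define OS_Hardware 0x7A            /* in case swis.h predates it   */
    #endif

    #define HALEntry_TimerGranularity 14

    unsigned int timer_granularity(int timer)
    {
        _kernel_swi_regs r;
        r.r[0] = timer;                       /* argument to the HAL routine */
        r.r[8] = HALEntry_TimerGranularity;   /* HAL entry number            */
        r.r[9] = 0;                           /* reason 0: call HAL routine  */
        _kernel_swi(OS_Hardware, &r, &r);
        return (unsigned int)r.r[0];          /* granularity in Hz           */
    }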
Dave Higton (1515) 3525 posts |
In the old days there was the Device Claim protocol, which works at application level. Doesn’t that work on RISC OS 5? Edit: Or is it plain not good enough? |
Jeffrey Lee (213) 6048 posts |
AIUI the device claim protocol was implemented using wimp messages, which obviously means that it’s only suitable for use by wimp tasks. So not the most useful protocol, I’d say.

However, one of the earlier ideas for managing access to the HAL timers was to use service calls to claim/release them. I can’t remember if it was mentioned on the forums or if it was just in emails between me and Ben, but I think the idea was that it would be very similar to the device claim protocol: a program would issue an “I want to use this timer” service call, and anything which is already using that timer would claim the call and return a “no, I’m using it” message. Then the original program would presumably try claiming the next timer in the list. (I think I had some concerns about the idea when Ben initially proposed it, but can’t remember what they were. Perhaps it’s because the service call based nature would mean that applications wouldn’t be able to use it?)

But even if that protocol existed, it would do f-all to help for platforms like the Pi where there’s only a couple of timers available. The kernel uses timer 0, some other software (perhaps in the OS, perhaps not) decides it wants to use timer 1, and then when the user tries to load another piece of software which wants/needs a third timer they’re out of luck. Since we’re mostly dealing with software running on a single core, a single flexible timer should be sufficient for most of our needs.

However, even with the ARM generic timer, I think we may still run into some issues – the timer is (always?) driven from the CPU clock, so it’ll change frequency whenever we switch CPU speed. And if the PLL needs to be re-locked, it’ll be hard to judge how much time is lost or gained during the operation – the HAL will probably have to use an independent timer as a reference time source. Something for me to tackle in the new year, hopefully. |
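A minimal sketch of the caller’s side of that claim-by-service-call idea, assuming a convention similar to other claim-style services; the service name, its number, and the “claim by zeroing R1” behaviour are all invented here, since no such call exists.

    /* Hedged sketch only: Service_ClaimHALTimer is made up. */
    #include <stdbool.h>
    #include "kernel.h"
    #include "swis.h"

    #define Service_ClaimHALTimer 0x81400   /* invented service number */

    /* Issue "I want to use this timer"; true if nobody objected. */
    static bool claim_timer(int timer)
    {
        _kernel_swi_regs r;
        r.r[0] = timer;                     /* timer we'd like to use  */
        r.r[1] = Service_ClaimHALTimer;     /* service call number     */
        _kernel_swi(OS_ServiceCall, &r, &r);
        /* An existing user claims the call (R1 becomes 0) to say
         * "no, I'm using it"; unclaimed means the timer is free.      */
        return r.r[1] != 0;
    }

    /* Timer 0 belongs to the kernel, so start looking from timer 1. */
    static int find_free_timer(int num_timers)
    {
        for (int t = 1; t < num_timers; t++)
            if (claim_timer(t))
                return t;
        return -1;                          /* everything is in use    */
    }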
Dave Higton (1515) 3525 posts |
Yes, I realised that later, when I started to write a module, which would not need a runnable part except for that reason.
That’s great for modules, but what if an app wants to claim and use a device? Maybe, if it’s been claimed via a service call, the OS could respond to Wimp DeviceClaim messages. The other direction (an app has it reserved, so the service call should also say so) may be problematic on grounds of speed.
True, but it’s like all devices: you can’t use a device unless you have that device. For example, you can’t use serial communication unless you have a serial port, can’t use Bluetooth unless you have a Bluetooth host and driver; etc. etc. In the current context, you can’t do MIDI unless you have a MIDI interface. Not a problem. Nothing to lose any sleep over.
Yes, I’ve been wondering about that too. I think there is often a clock source available for the devices that doesn’t change with CPU clock multiplier setting. It will take careful study of the chip’s TRM. (And there’s another problem, in some cases.) But, if it exists, the HAL should use it, and the timer granularity should reflect it correctly. |
Steve Pampling (1551) 8170 posts |
The simple side of me says why not have a master that regulates the timing of a module that offers multiple second tier timers?
Which, for some reason, doesn’t make the clock/time display vary in speed, so it is possible to set an interval timer that doesn’t vary with CPU clock speed. |
Rick Murray (539) 13840 posts |
Write a wrapper in a module? What happens if the OS needs the device claimant to “do something” and the application is paged out?
Are you sure about that? How do you know that, when your clock speed switches down to idle, your machine isn’t slowing down the entire universe so you perceive time to pass at the same speed as before? |
Steve Pampling (1551) 8170 posts |
Feature only available in Intel based systems. |
Tristan M. (2946) 1039 posts |
Now I have to look back at what I did. It’s been a while. It’s not what I’d call finished, but it functions in some fashion and uses the generic timer. |
Jeffrey Lee (213) 6048 posts |
Luckily, some experimentation suggests that this isn’t the case. Titanium, OMAP5, Pi 2, and Pi 3 all appear to keep the timer frequency stable when the CPU speed is changed. I guess I’d better start writing a driver then! |
Jeffrey Lee (213) 6048 posts |
I’ve now got drivers for the ARM generic timer (Titanium + OMAP5 + Pi 2/3), the Pi 1 timer, and the Cortex-A9 global timer (OMAP4 & iMX6). Next on my list will be OMAP3.

Unfortunately the Cortex-A9 timers do change frequency in response to changes to CPU speed, so to avoid incremental clock drift I’ll have to come up with a different approach – probably using a clock-invariant timer (like the current HAL timers) to generate the 64bit time value, and only use the A9 timers for scheduling interrupts. When the CPU speed changes it should be pretty straightforward to fire off a message to each core to tell it to reprogram its pending timer interrupt – the core sending the message doesn’t even have to wait for a reply, it can just be a pending interrupt which the receiving core can deal with at its leisure. For OMAP4 I could probably just use two of the SoC timers (to allow one scheduled interrupt per core), but the iMX6 SoC doesn’t have enough timers for that to be possible, so using a SoC timer for the 64bit clock and the per-core A9 timers for scheduled interrupts is likely to be the next best thing.

The other thing I’ve been thinking about is how to deal with the clock fine-tuning that the RTC module performs. Originally I was hoping that the adjustment could be kept to just the 100Hz ticker (which will be synthesised from the new 64bit timer), as that would allow the new API to use the 64bit time value directly (quick access to the timer since there’ll be no unit conversion, and good timestamps for profiling code due to the high frequency). But then I realised that if code used a mix of the 100Hz ticker API and the new 64bit API then it would see time advance at two different rates. And of course there’d be no built-in way to correct for any drift/inaccuracy in the 64bit timer.

So now I’m thinking the clock fine tuning should be applied to the 64bit timer, and that the application-level API should use a fixed frequency (e.g. 1MHz). This’ll slow things down a bit by adding extra conversion code to all the calls, but it’ll avoid programs seeing conflicting clock rates, and the fixed frequency should reduce the number of inaccuracies that occur as a result of programs implementing their own code for converting from the hardware-level tick rate to more useful values (e.g. Titanium / OMAP5 have an awkward timer rate of something like 6147451Hz). For platforms where the timer frequency is higher than 1MHz, code will lose out on the extra fidelity that offers when using the timers to profile code, but maybe we can counter that by having an extra call which returns the 1MHz time and a fractional part (of unspecified precision). |
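As a rough illustration (not actual OS code) of the conversion cost being discussed, here is how a precomputed 32.32 fixed-point scale factor could turn ticks at an awkward native rate such as 6147451Hz into fixed 1MHz values; the function names are made up.

    /* Hedged sketch only: native ticks -> microseconds via 32.32 scaling. */
    #include <stdint.h>

    static uint64_t scale_frac;             /* (1000000 << 32) / native_hz */

    void init_tick_scaling(uint32_t native_hz)
    {
        scale_frac = ((uint64_t)1000000 << 32) / native_hz;
    }

    /* Convert native ticks to microseconds (1MHz units). */
    uint64_t ticks_to_us(uint64_t ticks)
    {
        /* Split the multiply so it can't overflow 64 bits for realistic
         * uptimes; the low half keeps 32 fractional bits until the shift. */
        uint64_t hi = ticks >> 32;
        uint64_t lo = ticks & 0xffffffffu;
        return hi * scale_frac + ((lo * scale_frac) >> 32);
    }

Going the other way (microseconds back to ticks, for programming a comparator) would use the reciprocal factor, and the “fractional part” call mentioned above could simply expose some of the bits that the final shift discards.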
Dave Higton (1515) 3525 posts |
Does this include making FastTickerV work? |
Jeffrey Lee (213) 6048 posts |
I’m planning on implementing a high-precision OS_CallAt (i.e. OS_CallAfter, but using an absolute microsecond time value instead of a relative centisecond one). For 99% of cases this will be better than FastTickerV, since it’ll allow programs to control the ticker rate (unlike FastTickerV where you’re just given what the OS feels like), and it’ll allow uninteresting ticks to be skipped (saving CPU cycles & energy consumption).

Of course, if someone felt that FastTickerV was necessary then there’s nothing stopping them from implementing it in a softloadable module by building on top of OS_CallAt (well, apart from the OS_ReadSysInfo 6 item that reports the ticker rate – which will need a bit of work to resolve the conflict with RISC OS 5’s usage). |
Dave Higton (1515) 3525 posts |
OS_CallAt is interesting but it worries me. What happens if a task sets a CallAt for a time that has just passed, perhaps because something else took too much processing time, or even the task itself took too much time – maybe the previous CallAt was late, and that plus the reasonable processing time took it over the interval? |
Jeffrey Lee (213) 6048 posts |
Late events will be processed as soon as possible. It’s no worse than TickerV / FastTickerV, where there’ll be an unknown amount of time between the interrupt firing and your handler being called (whether due to other handlers taking time, or the system being busy doing something else). However the good thing is that your routine will be able to read the current time on entry and work out how long the delay was (unlike OS_ReadMonotonicTime, which is both too low-resolution, and will be incorrect if timer interrupts have been missed).

For the other problem – of long routines taking too long and completely bogging down the system – I’m not sure yet what the best approach will be. Routines could be smart about when they reschedule the event – e.g. if you want a 1kHz event, at the end of your routine you can read the current time and then round that up to the next millisecond in order to work out when the event should be scheduled. So a routine which you want to execute every millisecond, but takes 1.5 milliseconds to execute, will automatically adjust itself so that it runs every 2 milliseconds instead. This is roughly what happens with TickerV / FastTickerV if the system becomes overloaded and timer interrupts get missed.

Since we’re adding a high-resolution timer, the OS could also take an active role in things – measuring the percentage of CPU time that’s going into each routine (and how much is left for the foreground) and artificially limiting the frequency of routines until the foreground is able to execute. But that’ll require a lot of work to get right (punish the bad routines without hurting the good ones). |
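A minimal sketch of that round-up rescheduling, written against the proposed (not yet existing) OS_CallAt; the os_call_at, read_time_us and do_periodic_work names are hypothetical stand-ins.

    /* Hedged sketch only: self-adjusting 1kHz callback on top of OS_CallAt. */
    #include <stdint.h>

    #define PERIOD_US 1000                  /* aim for a 1kHz callback */

    extern uint64_t read_time_us(void);                         /* hypothetical */
    extern void os_call_at(uint64_t when_us,
                           void (*handler)(void *), void *ws);  /* hypothetical */
    extern void do_periodic_work(void);                         /* hypothetical */

    static void tick_handler(void *ws)
    {
        do_periodic_work();

        /* Round the current time up to the next whole period: if the work
         * overran, late ticks are skipped rather than allowed to pile up. */
        uint64_t now  = read_time_us();
        uint64_t next = ((now / PERIOD_US) + 1) * PERIOD_US;
        os_call_at(next, tick_handler, ws);
    }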
Steve Pampling (1551) 8170 posts |
Hmmm, “nice” |
Jeffrey Lee (213) 6048 posts |
OMAP3, Iyonix and IOMD timers are now working.

For Iyonix I’m basically using the approach I outlined here – instead of constantly changing the reload register value, I’m leaving it set at maximum and performing reloads manually by writing the desired value to the timer register. When reprogramming the timer, I use a limit of &7fffffff for how far in the future an interrupt can be scheduled (i.e. the maximum value I’ll write to the timer register). This simplifies the code for reading the current time – I can just sign-extend the timer register value to 64bits and then add it on to a biased reference time value that’s updated in the interrupt handler. Positive timer register values mean the interrupt hasn’t happened yet, negative means it has happened (and indicates how long ago that was). And reading the current timer value just before setting the new one should make the time loss during the reprogramming fairly predictable – although I haven’t tested it under load yet to try and measure exactly how much drift there really is.

Although the IOMD timers operate slightly differently, I realised that I could use the same approach there – write the desired timer value to the input latch (which is used for reloading the timer), manually trigger a reload of the timer by writing to the GO register, and then reset the input latch to the maximum reload value. It’s even possible to get the same two-instruction “read current, set new” critical section of the reprogramming sequence. The only thing to watch out for is that writing to the GO register doesn’t seem to cause the timer to be reloaded immediately – it seems like it waits for the next 2MHz tick. So after writing to GO I’ve got to wait a few instructions before writing the max-reload value to the input latch, otherwise it sometimes reloads with the max-reload value instead of the value that was present in the latch earlier.

Under light load it looks like it runs slow by 2-3 ticks per interrupt; over the course of a day this will easily add up to several seconds, so there’ll definitely need to be some correction applied. Again, I still need to test it under heavy load to see if this time loss really is predictable. But if it can be made to work accurately enough then it will be a nice single-timer solution for IOMD & Iyonix which will leave the other timer free for use by user software. |
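A minimal sketch of the sign-extend-and-add idea, assuming a hypothetical read_timer_reg accessor; the exact bias kept by the interrupt handler depends on the real hardware’s counting direction, so this only illustrates the shape of the code rather than a faithful Iyonix/IOMD driver.

    /* Hedged sketch only: 64bit time from a sign-extended timer register. */
    #include <stdint.h>

    extern uint32_t read_timer_reg(void);    /* hypothetical HAL accessor   */

    static volatile uint64_t reference_bias; /* updated by the IRQ handler  */

    uint64_t read_time64(void)
    {
        /* Because interrupts are never scheduled more than &7fffffff ticks
         * ahead, the register can be treated as a signed offset: positive
         * means the interrupt hasn't fired yet, negative means it has (and
         * says how long ago). A real implementation must also guard against
         * the IRQ handler updating the bias between these two reads.       */
        int32_t offset = (int32_t)read_timer_reg();
        return reference_bias + (uint64_t)(int64_t)offset;
    }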
Jeffrey Lee (213) 6048 posts |
It’s probably about time I tried to move this forward. Here’s my current proposal:

OS API

There will be a few new OS SWIs which applications can use:

HAL API

HALs will implement support for the 64bit timer via a HAL device. Main properties are:

Since the OS presents the timer as being 1MHz, but the HAL presents it at native frequency, the OS will maintain a scale factor which can be used to convert between the two units. Also, since the OS is going to be converting things, it might not actually be necessary for the HAL interface to be 64bit. So I might try narrowing it down to 32bit if it makes things easier/quicker.

Kernel integration

OS integration

The RTC module fiddles with the period of timer 0. This will need changing so that it instead fiddles with the ticks-to-MHz value that the OS uses for converting between MHz 64bit time values and the raw timer values. This will likely require an extra kernel SWI, along with extra code in the time conversion functions (to represent the fact that the rate adjustment is applied at a given point in time, not at time zero). Plus whenever a change is made the timer interrupt will need reprogramming to take into account the new conversion factor. In a multi-core world, this will also involve firing a message to the other cores so that they can reprogram their own timer interrupts.

Also note that by fiddling with the ticks-to-MHz conversion value, the time fiddling that the RTC module performs will affect all OS-level time APIs (perhaps also OS_Sleep). I think this is a better approach than having it only affect the monotonic timer, since it’ll ensure that programs see consistent times / delta times no matter what API they use.

Current status

I’ve got timer device implementations for IOMD, Iyonix, OMAP3, Pi 1, and the ARM generic timer (OMAP5, Pi 2+, Titanium, and probably all new/future machines like the Pinebook – although I still need to make some improvements to interrupt handling in the Titanium HAL before it can be used there). OMAP4 & iMX6 are the odd ones out; the intent is to use the Cortex-A9 global timer, since that’s synchronised between the cores and provides per-core comparators/interrupts. But the timer rate is also dependent on the CPU clock speed, so there’ll have to be some communication between the Portable module / CPU clock HAL device and the kernel / HAL timer device so that the MHz-to-ticks rate can be adjusted whenever the CPU speed changes (and I’m yet to decide exactly how that’ll work).

Any thoughts/comments/concerns on the above? |
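A minimal sketch of how the rate adjustment could be applied at a given point in time rather than at time zero; all names are invented, and a real implementation would split the multiply (as in the earlier conversion sketch) to avoid overflow over long intervals.

    /* Hedged sketch only: rebasing the ticks-to-MHz conversion on a rate change. */
    #include <stdint.h>

    typedef struct {
        uint64_t base_us;     /* 1MHz time at the last rate change     */
        uint64_t base_ticks;  /* raw timer value at the last change    */
        uint64_t scale_frac;  /* (1000000 << 32) / effective_hz        */
    } timebase_t;

    static timebase_t tb;

    /* Convert a raw tick count to microseconds at the current rate. */
    static uint64_t ticks_to_us(uint64_t ticks)
    {
        uint64_t delta = ticks - tb.base_ticks;
        return tb.base_us + ((delta * tb.scale_frac) >> 32);
    }

    /* Called when the RTC module nudges the clock rate: rebase so the
     * reported time is continuous across the change, then use the new
     * scale factor from this point onwards. The pending timer interrupt
     * (on every core) then needs reprogramming with the new conversion. */
    void set_effective_rate(uint64_t now_ticks, uint32_t effective_hz)
    {
        tb.base_us    = ticks_to_us(now_ticks);
        tb.base_ticks = now_ticks;
        tb.scale_frac = ((uint64_t)1000000 << 32) / effective_hz;
    }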
nemo (145) 2546 posts |
All nice. But regarding sleep, whereas CallAt et al naturally support multiple independent callers, is the intention that Sleep and SleepUntil maintain a minimum wake time – so that Task A requesting sleep until 5pm is not gazumped by Task B then requesting SleepUntil next Thursday? So Sleep/SleepUntil will be “sleep until <time> at the latest”? Will it maintain all desired wake times (so the above example will wake at 5pm and next Thursday), or are all requests lost when the system does wake? |
John Sandgrounder (1650) 574 posts |
If Task A sleeps the CPU, then task B will not be running. Or, have I missed something? |
nemo (145) 2546 posts |
Yes I’ve not explained myself well. How does Task A, that needs to be awake at 5pm, and Task B, that needs to be awake at 4pm, both express that need without stopping the other? In other words, does !AntiSocial sleeping until next week stop !Alarm from sounding an alarm in the morning? SleepUntil ought to mean “you may sleep until X if there’s nothing else on” and not “you will sleep until X regardless of everything else”. |
Rick Murray (539) 13840 posts |
Would it be possible to have a call like MonotonicTime but clocking at 1kHz, for times when the centisecond ticker is too slow, and a million ticks per second is just insane (not to mention potential issues with handling 64 bit values in BASIC)? |
Jeffrey Lee (213) 6048 posts |
That’s the plan.
I guess it’s worth considering. If the MonotonicTime64 SWI accepted some kind of scale/divisor value as input then it might be possible to slot it into the ticks-to-MHz conversion algorithm in a fairly optimal manner. |
Sprow (202) 1158 posts |
Rather than proliferating similar-but-different SWI names (I already find the various OS_Call* ones confusing!), how about a magic word selector like we have for OS_ReadUnsigned? So R1=‘WIDE’ (since that can’t be a valid address to call) or R3=‘WIDE’. So you want a call after? OS_CallAfter is as far as you need to look, then choose whether you want centi or microseconds. OS_ReadMonotonicTime64 seems clear from the name alone (and OS_ReadMonotonicTime doesn’t take any args). |
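A minimal sketch of what the ‘WIDE’ selector might look like from C; the wide form of OS_CallAfter and its register usage are pure guesswork, and in practice the called code would be an assembler veneer rather than a plain C function.

    /* Hedged sketch only: the 'WIDE' magic-word idea is a proposal, not an API. */
    #include "kernel.h"
    #include "swis.h"

    extern void handler(void);              /* really an assembler veneer */

    void call_after_examples(void *workspace)
    {
        _kernel_swi_regs r;

        /* Classic OS_CallAfter: R0 = delay in centiseconds. */
        r.r[0] = 100;                       /* one second              */
        r.r[1] = (int)handler;
        r.r[2] = (int)workspace;
        _kernel_swi(OS_CallAfter, &r, &r);

        /* Hypothetical wide form: R3 = 'WIDE' (&45444957) selects a
         * microsecond delay in R0 instead of centiseconds.           */
        r.r[0] = 2500;                      /* 2.5 milliseconds        */
        r.r[1] = (int)handler;
        r.r[2] = (int)workspace;
        r.r[3] = 0x45444957;                /* 'WIDE'                  */
        _kernel_swi(OS_CallAfter, &r, &r);
    }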
nemo (145) 2546 posts |
+1 what he said |