ARM generic timer

89 posts, 19 voices

Pages: 1 2 3 4

Jun 17, 2018 1:34pm Jon Abbott (1421) 2651 posts	we could easily pass in both the current time and the scheduled time, and let the event decide which ones it wants to make use of Sounds sensible to me, they’re both likely to be wanted at some point.

Jun 17, 2018 4:12pm nemo (145) 2546 posts	Annoyingly, I wrote this some time ago and then lost it under a pile of windows and forgot to send it. Belatedly: For third-party software, the HAL counter / timer APIs are included in the set of HAL entry points that were publicly documented by Castle around the release of the Iyonix. So any software which uses HAL_TimerReadCountdown (for timer 0) or HAL_CounterRead to get sub-centisecond timing values is going to get confused when the timer starts to get used for more than just the 100Hz ticker. That document says: Timers are numbered from 0 upwards. Timer 0 must exist. RISC OS uses timer 0 for its centisecond clock… It also says: unsigned int HAL_TimerGranularity(int timer) [ entry no 14 ] Returns basic granularity of timer n in ticks per second. It does not say that the granularity or period of Timer 0 are any particular values. Timer 0 is used to provide the ‘centisecond clock’, it is not defined to have a granularity of 100Hz or a period of 1. Indeed, on some theoretical platform that may be impossible. Given all that, and the fact that I don’t think OS_Hardware appeared until RO6, you shouldn’t have to worry about that Iyonix documentation, I’d have thought. The USB drivers contain code which does this `/* conversion to ns, assume counter is for 1 cs / ns_factor = 10000000 / max_count;` So that needs fixing… but I’d argue that it needs fixing anyway*. Having said all that, I agree that IOMD HAL is entitled to provide a synthesised 100Hz Timer 0 derived from the hardware timer that could be presented as Timer 2.

Dec 9, 2018 8:21pm Dave Higton (1515) 3525 posts	with the drawback that RISC OS 5 doesn’t have a way for programs to coordinate access to the extra timers In the old days there was the Device Claim protocol, which works at application level. Doesn’t that work on RISC OS 5? e: Or is it plain not good enough?

Dec 10, 2018 2:17pm Jeffrey Lee (213) 6048 posts	AIUI the device claim protocol was implemented using wimp messages, which obviously means that it’s only suitable for use by wimp tasks. So not the most useful protocol, I’d say. However, one of the earlier ideas for managing access to the HAL timers was to use service calls to claim/release them. I can’t remember if it was mentioned on the forums or if it was just in emails between me and Ben, but I think the idea was that it would be very similar to the device claim protocol: A program would issue a “I want to use this timer” service call, and anything which is already using that timer would claim the call and return a “no, I’m using it” message. Then the original program would presumably try claiming the next timer in the list. (I think I had some concerns about the idea when Ben initially proposed it, but can’t remember what they were. Perhaps it’s because the service call based nature would mean that applications wouldn’t be able to use it?) But even if that protocol existed, it would do f-all to help for platforms like the Pi where there’s only a couple of timers available. The kernel uses timer 0, some other software (perhaps in the OS, perhaps not) decides it wants to use timer 1, and then when the user tries to load another piece of software which wants/needs a third timer they’re out of luck. Since we’re mostly dealing with software running on a single core, a single flexible timer should be sufficient for most of our needs. However, even with the ARM generic timer, I think we may still run into some issues – the timer is (always?) driven from the CPU clock, so it’ll change frequency whenever we switch CPU speed. And if the PLL needs to be re-locked, it’ll be hard to judge how much time is lost or gained during the operation – the HAL will probably have to use an independent timer as a reference time source. Something for me to tackle in the new year, hopefully.

Dec 10, 2018 4:52pm Dave Higton (1515) 3525 posts	AIUI the device claim protocol was implemented using wimp messages, which obviously means that it’s only suitable for use by wimp tasks. So not the most useful protocol, I’d say. Yes, I realised that later, when I started to write a module, which would not need a runnable part except for that reason. A program would issue a “I want to use this timer” service call That’s great for modules, but what if an app wants to claim and use a device? Maybe, if it’s been claimed via a service call, the OS could respond to Wimp DeviceClaim messages. The other direction (an app has it reserved, so the service call should also say so) may be problematic on grounds of speed. when the user tries to load another piece of software which wants/needs a third timer they’re out of luck. True, but it’s like all devices: you can’t use a device unless you have that device. For example, you can’t use serial communication unless you have a serial port, can’t use Bluetooth unless you have a Bluetooth host and driver; etc. etc. In the current context, you can’t do MIDI unless you have a MIDI interface. Not a problem. Nothing to lose any sleep over. we may still run into some issues – the timer is (always?) driven from the CPU clock, so it’ll change frequency whenever we switch CPU speed. Yes, I’ve been wondering about that too. I think there is often a clock source available for the devices that doesn’t change with CPU clock multiplier setting. It will take careful study of the chip’s TRM. (And there’s another problem, in some cases.) But, if it exists, the HAL should use it, and the timer granularity should reflect it correctly.

Dec 10, 2018 5:50pm Steve Pampling (1551) 8170 posts	But even if that protocol existed, it would do f-all to help for platforms like the Pi where there’s only a couple of timers available. The kernel uses timer 0, The simple side of me says why not have a master that regulates the timing of a module that offers multiple second tier timers? However, even with the ARM generic timer, I think we may still run into some issues – the timer is (always?) driven from the CPU clock, so it’ll change frequency whenever we switch CPU speed. Which for some reason doesn’t make the clock/time display vary in speed so it is possible to set an interval timer that doesn’t vary with CPU clock speed.

Dec 10, 2018 6:37pm Rick Murray (539) 13840 posts	That’s great for modules, but what if an app wants to claim and use a device? Write a wrapper in a module? What happens if the OS needs the device claimant to “do something” and the application is paged out? It’s the same problem as service calls and events – they aren’t reasonably useful to foreground applications unless said application completely takes over the machine (no TaskWindow). Which for some reason doesn’t make the clock/time display vary in speed Are you sure about that? How do you know that, when your clock speed switches down to idle, your machine isn’t slowing down the entire universe so you perceive time to pass at the same speed as before?

Dec 10, 2018 7:31pm Steve Pampling (1551) 8170 posts	How do you know that, when your clock speed switches down to idle, your machine isn’t slowing down the entire universe so you perceive time to pass at the same speed as before? Feature only available in Intel based systems.

Dec 11, 2018 3:46am Tristan M. (2946) 1039 posts	Now I have to look back at what I did. It’s been a while. It’s not what I’d call finished, but it functions in some fashion and uses the generic timer. https://raw.githubusercontent.com/experimentech/RISC_OS_AWH3Dev/master/RISC_OS_Dev/mixed/RiscOS/Sources/HAL/AWH3/Timers.s

Feb 11, 2019 10:16pm Jeffrey Lee (213) 6048 posts	However, even with the ARM generic timer, I think we may still run into some issues – the timer is (always?) driven from the CPU clock, so it’ll change frequency whenever we switch CPU speed. Luckily, some experimentation suggests that this isn’t the case. Titanium, OMAP5, Pi 2, and Pi 3 all appear to keep the timer frequency stable when the CPU speed is changed. I guess I’d better start writing a driver then!

Feb 25, 2019 2:23pm Jeffrey Lee (213) 6048 posts	I’ve now got drivers for the ARM generic timer (Titanium + OMAP5 + Pi 2/3), the Pi 1 timer, and the Cortex-A9 global timer (OMAP4 & iMX6). Next on my list will be OMAP3. Unfortunately the Cortex-A9 timers do change frequency in response to changes to CPU speed, so to avoid incremental clock drift I’ll have to come up with a different approach – probably using a clock-invariant timer (like the current HAL timers) to generate the 64bit time value, and only use the A9 timers for scheduling interrupts. When the CPU speed changes it should be pretty straightforward to fire off a message to each core to tell it to reprogram its pending timer interrupt – the core sending the message doesn’t even have to wait for a reply, it can just be a pending interrupt which the receiving core can deal with at its leisure. For OMAP4 I could probably just use two of the SoC timers (to allow one scheduled interrupt per core), but the iMX6 SoC doesn’t have enough timers for that to be possible, so using a SoC timer for the 64bit clock and the per-core A9 timers for scheduled interrupts is likely to be the next best thing. The other thing I’ve been thinking about is how to deal with the clock fine-tuning that the RTC module performs. Originally I was hoping that the adjustment could be kept to just the 100Hz ticker (which will be synthesised from the new 64bit timer), as that would allow the new API to use the 64bit time value directly (quick access to the timer since there’ll be no unit conversion, and good timestamps for profiling code due to the high frequency). But then I realised that if code used a mix of the 100Hz ticker API and the new 64bit API then it would see time advance at two different rates. And of course there’d be no built-in way to correct for any drift/inaccuracy in the 64bit timer. So now I’m thinking the clock fine tuning should be applied to the 64bit timer, and that the application-level API should use a fixed frequency (e.g. 1MHz). This’ll slow things down a bit by adding extra conversion code to all the calls, but it’ll avoid programs seeing conflicting clock rates, and the fixed frequency should reduce the number of inaccuracies that occur as a result of programs implementing their own code for converting from the hardware-level tick rate to more useful values (e.g. Titanium / OMAP5 have an awkward timer rate of something like 6147451Hz). For platforms where the timer frequency is higher than 1MHz, code will lose out on the extra fidelity that offers when using the timers to profile code, but maybe we can counter that by having an extra call which returns the 1MHz time and a fractional part (of unspecified precision)

Feb 25, 2019 9:43pm Dave Higton (1515) 3525 posts	Does this include making FastTickerV work?

Feb 25, 2019 10:14pm Jeffrey Lee (213) 6048 posts	I’m planning on implementing a high-precision OS_CallAt (i.e. OS_CallAfter, but using an absolute microsecond time value instead of a relative centisecond one). For 99% of cases this will be better than FastTickerV, since it’ll allow programs to control the ticker rate (unlike FastTickerV where you’re just given what the OS feels like), and it’ll allow uninteresting ticks to be skipped (saving CPU cycles & energy consumption). Of course, if someone felt that FastTickerV was necessary then there’s nothing stopping them from implementing it in a softloadable module by building ontop of OS_CallAt (well, apart from the OS_ReadSysInfo 6 item that reports the ticker rate – which will need a bit of work to resolve the conflict with RISC OS 5’s usage)

Feb 25, 2019 10:36pm Dave Higton (1515) 3525 posts	OS_CallAt is interesting but it worries me. What happens if a task sets a CallAt a time that has just passed, perhaps because something else took too much processing time, or even the task itself took too much time – maybe the previous CallAt was late, and that plus the reasonable processing time took it over the interval?

Feb 26, 2019 12:05am Jeffrey Lee (213) 6048 posts	Late events will be processed as soon as possible. It’s no worse than TickerV / FastTickerV, where there’ll be an unknown amount of time between the interrupt firing and your handler being called (whether due to other handlers taking time, or the system being busy doing something else). However the good thing is that your routine will be able to read the current time on entry and work out how long the delay was (unlike OS_ReadMonotonicTime, which is both too low-resolution, and will be incorrect if timer interrupts have been missed). For the other problem – of long routines taking too long and completely bogging down the system – I’m not sure yet what the best approach will be. Routines could be smart with when they reschedule the event – e.g. if you want a 1kHz event, at the end of your routine, you can read the current time and then round that up to the next millisecond in order to work out when the event should be scheduled. So a routine which you want to execute every millisecond, but takes 1.5 milliseconds to execute, will automatically adjust itself so that it runs every 2 milliseconds instead. This is roughly what happens with TickerV / FastTickerV if the system becomes overloaded and timer interrupts get missed. Since we’re adding a high-resolution timer, the OS could also take an active role in things – measuring the percentage of CPU time that’s going into each routine (and how much is left for the foreground) and artificially limiting the frequency of routines until the foreground is able to execute. But that’ll require a lot of work to get right (punish the bad routines without hurting the good ones).

Feb 26, 2019 7:22am Steve Pampling (1551) 8170 posts	artificially limiting the frequency of routines until the foreground is able to execute. But that’ll require a lot of work to get right (punish the bad routines without hurting the good ones). Hmmm, “nice”

Mar 1, 2019 11:58pm Jeffrey Lee (213) 6048 posts	OMAP3, Iyonix and IOMD timers are now working. For Iyonix I’m basically using the approach I outlined here – instead of constantly changing the reload register value, I’m leaving it set at maximum and performing reloads manually by writing the desired value to the timer register. When reprogramming the timer, I use a limit of &7fffffff for how far in the future an interrupt can be scheduled (i.e. the maximum value I’ll write to the timer register). This simplifies the code for reading the current time – I can just sign-extend the timer register value to 64bits and then add it on to a biased reference time value that’s updated in the interrupt handler. Positive timer register values mean the interrupt hasn’t happened yet, negative means it has happened (and indicates how long ago that was). And reading the current timer value just before setting the new one should make the time loss during the reprogramming fairly predictable – although I haven’t tested it under load yet to try and measure exactly how much drift there really is. Although the IOMD timers operate slightly differently, I realised that I could use the same approach there – write the desired timer value to the input latch (which is used for reloading the timer), manually trigger a reload of the timer by writing to the GO register, and then reset the input latch to the maximum reload value. It’s even possible to get the same two-instruction “read current, set new” critical section of the reprogramming sequence. The only thing to watch out for is that writing to the GO register doesn’t seem to cause the timer to be reloaded immediately – it seems like it waits for the next 2MHz tick. So after writing to GO I’ve got to wait a few instructions before writing the max-reload value to the input latch, otherwise it sometimes reloads with the max-reload value instead of the value that was present in the latch earlier. Under light load it looks like it runs slow by 2-3 ticks per interrupt; over the course of a day this will easily add up to several seconds, so there’ll definitely need to be some correction applied. Again, I still need to test it under heavy load to see if this time loss really is predictable. But if it can be made to work accurately enough then it will be a nice single-timer solution for IOMD & Iyonix which will leave the other timer free for use by user software.

Aug 22, 2019 12:38pm Jeffrey Lee (213) 6048 posts	It’s probably about time I tried to move this forward. Here’s my current proposal: OS API There will be a few new OS SWIs which applications can use: SWI OS_ReadMonotonicTime64 Returns a 1MHz 64bit time in R0+R1 SWI OS_CallAt, OS_RemoveCallAt Allow scheduling & removal of events (similar to OS_CallAfter / OS_RemoveTickerEvent), except you specify a 1MHz 64bit timestamp instead of a centisecond delta time. Possibly we’d also want an OS_CallEvery64 for easy scheduling of periodic events SWI OS_Sleep, or maybe OS_SleepUntil Sleep the CPU for a certain period of time, or until a certain time (using 1MHz 64bit time values) I believe SleepUntil is usually considered the better API since it minimises over-sleeping, but programmers are lazy and so Sleep is the more commonly used one This will act as a replacement for calling HAL_CounterDelay, the intent being that the OS can decide on the best approach to use depending on how long a delay is required (spin loop like HAL_CounterDelay, or low-power WFI, or yield the thread in threaded environments) HAL API HALs will implement support for the 64bit timer via a HAL device. Main properties are: TimerFrequency – Field giving the frequency of the timer in Hz TimerEventPeriod – If the timer generates SEV-style CPU events, how many ticks between each event. Zero if not supported. (The OS could use this for implementing timeouts in WFE loops) TimerGetTimestamp – Returns the current 64bit time value (at native frequency) TimerGetDevice – Returns IRQ number of the timer, for the calling CPU core TimerScheduleInterrupt – Schedules an interrupt to occur at the given 64bit timestamp, for the calling CPU core. If the requested time is too far in the future, an earlier interrupt can be scheduled instead (the OS will check the current time in the IRQ handler and re-schedule the interrupt as appropriate) Since the OS presents the timer as being 1MHz, but the HAL presents it at native frequency, the OS will maintain a scale factor which can be used to convert between the two units. Also, since the OS is going to be converting things, it might not actually be necessary for the HAL interface to be 64bit. So I might try narrowing it down to 32bit if it makes things easier/quicker. Kernel integration The kernel will be changed so that the monotonic timer, TickerV, and other related things will be built ontop of the 64bit timer This will theoretically free up HAL timer 0 so that other software can use it – however because lots of existing software relies on HAL timer 0 / the HAL_Counter API, the OS will intercept calls to those HAL entry points and emulate them: From the perspective of modules & applications, all HAL timer numbers will be offset by one, so HAL timer 0 is the emulated timer, and timers 1+ are the real timers (mapping to timers 0+ in the HAL) HAL_TimerDevice for timer 0 will return -1 HAL_TimerGranularity for timer 0, and HAL_CounterRate, will return 1MHz HAL_TimerMaxPeriod + HAL_TimerPeriod for timer 0, and HAL_CounterPeriod, will return 10000 (i.e. 1MHz/100) HAL_TimerSetPeriod for timer 0 will be ignored HAL_TimerReadCountdown for timer 0, and HAL_CounterRead, will return the (1MHz) time of the next TickerV tick minus the current (1MHz) time, modulo 10000 HAL_CounterDelay will pass through to the HAL, and must still be implemented there (there’s little point replicating the API in the new timer HAL device). So new + old code can continue to use it (although OS code should probably migrate to the new OS_Sleep SWI) Currently the kernel programs timer 0 and sets up the monotonic timer before the call to HAL_InitDevices. This will need to be changed; after the call to HAL_InitDevices it’ll check for the 64bit timer device and decide whether to use the new or old monotonic timer/TickerV implementation (and whether to wrap the HAL timer APIs returned by OS_Hardware, etc.). Luckily the call to HAL_InitDevices occurs early enough in the kernel initialisation that there shouldn’t be any side-effects from delaying the monotonic timer initialisation. For compatibility with old HALs it can still program timer 0 early on (so that HAL_CounterDelay works reliably) OS integration The RTC module fiddles with the period of timer 0. This will need changing so that it instead fiddles with the ticks-to-MHz value that the OS uses for converting between MHz 64bit time values and the raw timer values. This will likely require an extra kernel SWI, along with extra code in the time conversion functions (to represent the fact that the rate adjustment is applied at a given point in time, not at time zero). Plus whenever a change is made the timer interrupt will need reprogramming to take into account the new conversion factor. In a multi-core world, this will also involve firing a message to the other cores so that they can reprogram their own timer interrupts. Also note that by fiddling with the ticks-to-MHz conversion value, the time fiddling that the RTC module performs will affect all OS-level time APIs (perhaps also OS_Sleep). I think this is a better approach than having it only affect the monotonic timer, since it’ll ensure that programs see consistent times / delta times no matter what API they use. Current status I’ve got timer device implementations for IOMD, Iyonix, OMAP3, Pi 1, and the ARM generic timer (OMAP5, Pi 2+, Titanium, and probably all new/future machines like the Pinebook – although I still need to make some improvements to interrupt handling in the Titanium HAL before it can be used there). OMAP4 & iMX6 are the odd ones out; the intent is to use the Cortex-A9 global timer, since that’s synchronised between the cores and provides per-core comparators/interrupts. But the timer rate is also dependent on the CPU clock speed, so there’ll have to be some communication between the Portable module / CPU clock HAL device and the kernel / HAL timer device so that the MHz-to-ticks rate can be adjusted whenever the CPU speed changes (and I’m yet to decide exactly how that’ll work). Any thoughts/comments/concerns on the above?

Aug 22, 2019 5:27pm nemo (145) 2546 posts	All nice. But regarding sleep, whereas CallAt et al naturally support multiple independent callers, is the intention that Sleep and SleepUntil maintain a minimum wake time – so that Task A requesting sleep until 5pm is not gazumped by Task B then requesting SleepUntil next Thursday? So Sleep/SleepUntil will be “sleep until <time> at the latest”? Will it maintain all desired wake times (so the above example will wake at 5pm and next Thursday), or are all requests lost when the system does wake?

Aug 22, 2019 6:36pm John Sandgrounder (1650) 574 posts	SWI OS_Sleep, or maybe OS_SleepUntil - Sleep the CPU for a certain period of time so that Task A requesting sleep until 5pm is not gazumped by Task B If Task A sleeps the CPU, then task B will not be running. Or, have I missed something?

Aug 22, 2019 7:39pm nemo (145) 2546 posts	Yes I’ve not explained myself well. How does Task A, that needs to be awake at 5pm, and Task B, that needs to be awake at 4pm, both express that need without stopping the other? In other words, does !AntiSocial sleeping until next week stop !Alarm from sounding an alarm in the morning? SleepUntil ought to mean “you may sleep until X if there’s nothing else on” and not “you will sleep until X regardless of everything else”.

Aug 22, 2019 10:08pm Rick Murray (539) 13840 posts	Would it be possible to have a call like MonotonicTime but clocking at 1kHz, for times when the centisecond ticker is too slow, and a million ticks per second is just insane (not to mention potential issues with handling 64 bit values in BASIC)?

Aug 22, 2019 11:42pm Jeffrey Lee (213) 6048 posts	Will it maintain all desired wake times That’s the plan. Would it be possible to have a call like MonotonicTime but clocking at 1kHz, for times when the centisecond ticker is too slow, and a million ticks per second is just insane (not to mention potential issues with handling 64 bit values in BASIC)? I guess it’s worth considering. If the MonotonicTime64 SWI accepted some kind of scale/divisor value as input then it might be possible to slot it into the ticks-to-MHz conversion algorithm in a fairly optimal manner.

Aug 23, 2019 8:51am Sprow (202) 1158 posts	SWI OS_CallAt, OS_RemoveCallAt, possibly we’d also want an OS_CallEvery64 Rather than proliferating similar-but-different SWI names (I already find the various OS_Call* ones confusing!), how about a magic word selector like we have for OS_ReadUnsigned? So R1=‘WIDE’ (since that can’t be a valid address to call) or R3=‘WIDE’. So you want a call after? OS_CallAfter is as far as you need to look, then choose whether you want centi or microseconds. OS_ReadMonotonicTime64 seems clear from the name alone (and OS_ReadMonotonicTime doesn’t take any args).

Aug 23, 2019 10:50am nemo (145) 2546 posts	+1 what he said