RISC OS Open: Forum: OS_Byte 13 event state wraps

Aug 19, 2016 7:01pm

Jon Abbott (1421) 2651 posts

Tested on Pi3 and SA build date 19-08-16 using event 4

If the event is already disabled, OS_Byte 13,4 turns the event on indefinitely, it’s then impossible to turn the event off.

Repro:

SYS "OS_Byte",13,4 TO ,A%:PRINT A%
SYS "OS_Byte",13,4 TO ,A%:PRINT A%

The final state should be 0, but is instead 255.

Aug 20, 2016 1:04pm

Jeffrey Lee (213) 6048 posts

255 is treated as a failsafe value – if there are so many systems interested in an event that the 8-bit counter for that event is in danger of wrapping then then the counter will lock itself at the maximum value, with neither OS_Byte 13 nor 14 having an effect. This allows you to do things like call OS_Byte 14 300 times and then call OS_Byte 13 299 times without the event being becoming disabled at any point. Of course if you called OS_Byte 13 once more then the event would still be enabled, but having the event enabled when it isn’t needed is safer than having it disabled when it is needed.

The bug here is that you’re causing the counter to wrap by decrementing it too many times – although the OS could trap that and clamp it to zero, really it’s a bug in your code, because you’re not balancing the event enable & disable calls. And when faced with imbalanced calls, maybe the counter being locked to 255 is a good thing – it prevents the buggy code from doing any further damage (if it was clamped to zero, it might mean that the event gets disabled while someone else is still using it)

Of course, other people may disagree with me on this!

Aug 20, 2016 2:04pm

Jon Abbott (1421) 2651 posts

the counter will lock itself at the maximum value, with neither OS_Byte 13 nor 14 having an effect

If you’re decreasing an event below zero, you’d expect the event to shut off, not lock on to the point you have to restart the system. It wouldn’t be so bad if OS_Byte 14 corrected the situation. Up until RO5, the behaviour was to leave the event disabled, locking it on makes no logical sense.

This change in behaviour is not mentioned on the OS_Byte 13 wiki page. Whoever implemented the change failed to update the documentation to state going negative will require a reboot to correct.

Aug 20, 2016 2:57pm

Jeffrey Lee (213) 6048 posts

It was here where the behaviour was changed.

Another bug fix is that once the event semaphores had saturated at 255,
OS_Byte 13 was still happy to decrement the semaphore, so for example 256
enables followed by 255 disables would have disabled the event.

If you read the PRM, you’ll see that OS_Byte 14 and OS_Byte 13 exist in order to reduce CPU load, by allowing the OS to avoid distributing events that it knows that no-one is interested in. You can’t rely on the behaviour of OS_Byte 13 to disable event generation – apart from the failsafe behaviour there’s also the possibility that another piece of software (e.g. running under IRQ) could come along and enable the event that you just disabled.

Also if you read the full commit message for the above change then you’ll note that it’s only event numbers 0-31 which have counters available – event numbers outside that range have never been influenced by OS_Byte 13/14. OS_Byte 13 is just a hint to allow reduction of CPU load, not a commandment that the event must be silenced.

May 17, 2018 7:36am

Jon Abbott (1421) 2651 posts

Got caught out by this issue again, having found it to be the cause of machine locks. I don’t use events on RO5 preferring RTSupport instead, so the locks are being caused by other software not initialising before claiming EventV. I’m not sure if there was/is any developer guidance on how to claim EventV, but I’ve seen many poorly written code sequences that claim EventV, setup variables etc, then enable the event – a sure recipe for a page zero access, abort or lock!

Ignoring poorly written software for the minute, I fully understand the reasoning behind locking the event semaphore to 255 if an event is turned on too many times. I don’t however understand why its allowed to wrap below zero, it just shouts bug to me.

May 17, 2018 9:43am

Jeffrey Lee (213) 6048 posts

I don’t however understand why its allowed to wrap below zero, it just shouts bug to me.

I suspect the justification for that is that (for RISC OS 5) the only time when it could possibly wrap below zero is if buggy code has called OS_Byte 13 too many times. In which case it’s probably safer to leave the event on than to disable it.

(Or maybe it’s just an edge case which the programmer didn’t consider at the time)

May 17, 2018 10:55am

nemo (145) 2546 posts

If there’s more disables than enables than the one thing that is certain is that something has gone wrong. Leaving the event enabled is therefore the only sensible response – the counting has failed, stop counting.

It is, as Jeffrey has explained, only an optimisation anyway, and as we all know there’s only two rules for optimisation:

Don’t do it
(For experts only) Don’t do it yet

May 18, 2018 6:24am

Jon Abbott (1421) 2651 posts

I find the whole change odd, if the bug fix was to handle more than 255, I personally would have increased the semaphores to 32bit internally and only put 255 in the OS Vars if it went higher. I would have left the previous behaviour of ignoring a decrease below zero.

If there’s more disables than enables than the one thing that is certain is that something has gone wrong

The key thing with semaphores is the caller must track how many times they’ve increased the semaphore. When a shutdown occurs either on purpose or due to an Error, Exit etc the respective number of calls should be made to reduce the semaphore. I’ve yet to see a single piece of code do that, they all assume they’ve called OS_Byte 14 once, so issue one OS_Byte 13 regardless.

An issue I can see with global semaphores is when processes are terminated, unless the OS is tracking them on a per process basis they’re immediately indeterminate once a process is terminated.

Another issue is programmers that try to force an event off by repeatedly calling OS_Byte 13 (eg Deeva), instead of implementing internal semaphores to track if they’re capturing an event, or where they assume events aren’t occurring (eg Ibix the Viking, Hoverbod)

It’s a pity the original PRM didn’t include a code example, stating your code must track how many times its increased a semaphore and reduce accordingly when it terminates.

May 18, 2018 11:00am

Rick Murray (539) 13840 posts

I’ve yet to see a single piece of code do that, they all assume they’ve called OS_Byte 14 once, so issue one OS_Byte 13 regardless.

?

Is it wrong to assume that if you turn an event on once, you turn it off once?

May 19, 2018 1:21pm

nemo (145) 2546 posts

programmers that try to force an event off by repeatedly calling OS_Byte 13 (eg Deeva)

Broken.

assume events aren’t occurring (eg Ibix the Viking, Hoverbod)

Broken.

Both these scenarios are simple bugs. They were wrong when they were released, they were wrong on RISC OS, they were wrong on Arthur, they were wrong on MOS (Beeb¹).

I would have left the previous behaviour of ignoring a decrease below zero.

The only way you can decrease below zero is that something is trying to disable what it did not enable. By definition that can cause other systems to malfunction (as it could disable the event the other system is relying on). Consequently disabling the optimisation is the only sensible recourse. As for 32bit semaphores, we never have anything like 100 claimants as it is. That’s not the problem.

¹ The BBC AUG says:

Making a copy of [EVNTV] before changing this vector…allows more than one event handling routine to be used at one time.

In other words, there are ALWAYS other EventV claimants, any of which might have enabled any of the events. Assuming anything else is dead wrong and always has been.

May 19, 2018 3:45pm

nemo (145) 2546 posts

32bit semaphores

On my machine, which has been running for some days now, I have:

Event Enables Name
  4      1    VSync
 11      3    Key transition
 19      3    Internet socket
 21      2    Misc (RO4 scroll wheel)

By these claimants:

EventV
  DeepKeys+5D0
  TaskWindow+14BC
  WindowManager+A0D0
  ShareFS+C98
  Freeway+304
  ZeroConf+268
  WindowScroll+3F8
  ScreenBlanker+788
  UtilityModule+1C5D4 "EventProcessScrolls"
  Kernel+126C (default handler)

EventV is a poor way of sharing a vector, and by far the worst usage is Vsync (I’m not bothered about WindowScroll getting called every time I press a key, I am irritated that ShareFS is getting called 50 times a second).

All these things should have their own vector, which is what VectorExtend enables. I have been delaying the (long overdue) release because of RO6.2’s gargantuan SWI and vector handlers… but I think I will do without 6.20 compatibility and wait until someone asks really nicely.

May 19, 2018 5:56pm

Rick Murray (539) 13840 posts

I’m not sure if there was/is any developer guidance on how to claim EventV,

Common sense?

but I’ve seen many poorly written code sequences that claim EventV, setup variables etc, then enable the event

One shouldn’t claim anything that results in the OS calling into your code (which you should assume could happen the instant you enable the event) until you’re in a state where that will work.

I don’t however understand why its allowed to wrap below zero, it just shouts bug to me.

The bug is surely something turning it off too many times? If this is the case, we will have – in the process – reached a point where it has been turned off behind the back of something that was still using it.

In other words, there are ALWAYS other EventV claimants

Indeed. EventV is “all the other stuff”. Buffers full/empty, ADC conversion, VSync, Escape, Econet, mouse/key transition, a new bar (and not one that serves Stella Artois), MIDI, Internet, battery… With that in mind, it is foolish to think that there aren’t other services using some of these events.

Just wrote a quick program to report the number of claims, and on my machine it’s eight for the key transition and four for the Internet event. Surprised nothing is hanging on to VSync.

EventV is a poor way of sharing a vector,

Too many (disparate) things all lumped together.

I am irritated that ShareFS is getting called 50 times a second

It could have been worse, it could have been sitting on TickerV. ;-)

Surely using VSync means that ShareFS’s responsiveness depends upon the monitor refresh rate?!?

May 19, 2018 6:21pm

nemo (145) 2546 posts

One shouldn’t claim anything that results in the OS calling into your code (which you should assume could happen the instant you enable the event)

Let’s be very clear: Your code will get called regardless of whether you enabled any particular event.

Your code is on EventV, which is called if anyone has enabled any event.

Your code will therefore be called, by definition, under completely unpredictable circumstances.

May 19, 2018 6:27pm

nemo (145) 2546 posts

Surely using VSync means that ShareFS’s responsiveness…

You misunderstand. ShareFS wants to know about an Internet Socket Event, so it claims EventV and enables Event 19.

Somebody else wants VSync, so claims EventV and enables Event 4.

Now, whenever there is an internet socket event or a VSync, both claimants get called.

This was a regrettable compromise on the Beeb, sharing one of the very few vectors between a couple of Sideways ROMs. On RISC OS, with ~200 modules running, it’s very silly indeed.

It is analogous to Service Calls, which are just about the worst API in RISC OS – again, inherited from the BBC Micro which might, if your spent money and got your soldering iron out, have ten sideways ROMs listening for twenty service calls. I have about 200 modules listening to well over 200 service calls.

In RISC OS the Service Call mechanism is the primary victim of the lack of vectors. Even the extensive RISC OS 4 service call ‘optimisation’ could only damp the flames down a bit (which is why ImageFileRender_Artworks is called every time there’s a MimeMap request).

May 19, 2018 7:19pm

Colin (478) 2433 posts

The UpCallV is bad as well I doubt it was envisaged to be the ‘best’ way to use USB when it was conceived. Most of the other stuff on the vector is notification of occasional events. But then again I don’t think that USB should be on a vector at all.

May 20, 2018 10:48am

nemo (145) 2546 posts

Actually I disagree, I think every USB device should have its own vector, plus an overall USB admin vector.

USB device inserted, a vector is dynamically allocated for it and announced on the fixed USB vector. Thereafter all device-specific events occur on the device’s own vector.

But I like vectors. They are what makes RISC OS so flamboyantly insecure.

May 20, 2018 10:57am

Colin (478) 2433 posts

Don’t mind that, just don’t like using vectors for USB transfer callbacks.

May 21, 2018 11:51am

nemo (145) 2546 posts

No, agreed. Vectors for transactions, direct (vector-negotiated) API for data transfer.

OS_Byte 13 event state wraps

Reply

Search forums

Social

ROOL Store

Donate! Why?

RISC OS IPR

Description

Voices

Options

Aug 19, 2016 7:01pm Jon Abbott (1421) 2651 posts	Tested on Pi3 and SA build date 19-08-16 using event 4 If the event is already disabled, OS_Byte 13,4 turns the event on indefinitely, it’s then impossible to turn the event off. Repro: `SYS "OS_Byte",13,4 TO ,A%:PRINT A% SYS "OS_Byte",13,4 TO ,A%:PRINT A%` The final state should be 0, but is instead 255.

Aug 20, 2016 1:04pm Jeffrey Lee (213) 6048 posts	255 is treated as a failsafe value – if there are so many systems interested in an event that the 8-bit counter for that event is in danger of wrapping then then the counter will lock itself at the maximum value, with neither OS_Byte 13 nor 14 having an effect. This allows you to do things like call OS_Byte 14 300 times and then call OS_Byte 13 299 times without the event being becoming disabled at any point. Of course if you called OS_Byte 13 once more then the event would still be enabled, but having the event enabled when it isn’t needed is safer than having it disabled when it is needed. The bug here is that you’re causing the counter to wrap by decrementing it too many times – although the OS could trap that and clamp it to zero, really it’s a bug in your code, because you’re not balancing the event enable & disable calls. And when faced with imbalanced calls, maybe the counter being locked to 255 is a good thing – it prevents the buggy code from doing any further damage (if it was clamped to zero, it might mean that the event gets disabled while someone else is still using it) Of course, other people may disagree with me on this!

Aug 20, 2016 2:04pm Jon Abbott (1421) 2651 posts	the counter will lock itself at the maximum value, with neither OS_Byte 13 nor 14 having an effect If you’re decreasing an event below zero, you’d expect the event to shut off, not lock on to the point you have to restart the system. It wouldn’t be so bad if OS_Byte 14 corrected the situation. Up until RO5, the behaviour was to leave the event disabled, locking it on makes no logical sense. This change in behaviour is not mentioned on the OS_Byte 13 wiki page. Whoever implemented the change failed to update the documentation to state going negative will require a reboot to correct.

Aug 20, 2016 2:57pm Jeffrey Lee (213) 6048 posts	It was here where the behaviour was changed. Another bug fix is that once the event semaphores had saturated at 255, OS_Byte 13 was still happy to decrement the semaphore, so for example 256 enables followed by 255 disables would have disabled the event. If you read the PRM, you’ll see that OS_Byte 14 and OS_Byte 13 exist in order to reduce CPU load, by allowing the OS to avoid distributing events that it knows that no-one is interested in. You can’t rely on the behaviour of OS_Byte 13 to disable event generation – apart from the failsafe behaviour there’s also the possibility that another piece of software (e.g. running under IRQ) could come along and enable the event that you just disabled. Also if you read the full commit message for the above change then you’ll note that it’s only event numbers 0-31 which have counters available – event numbers outside that range have never been influenced by OS_Byte 13/14. OS_Byte 13 is just a hint to allow reduction of CPU load, not a commandment that the event must be silenced.

May 17, 2018 7:36am Jon Abbott (1421) 2651 posts	Got caught out by this issue again, having found it to be the cause of machine locks. I don’t use events on RO5 preferring RTSupport instead, so the locks are being caused by other software not initialising before claiming EventV. I’m not sure if there was/is any developer guidance on how to claim EventV, but I’ve seen many poorly written code sequences that claim EventV, setup variables etc, then enable the event – a sure recipe for a page zero access, abort or lock! Ignoring poorly written software for the minute, I fully understand the reasoning behind locking the event semaphore to 255 if an event is turned on too many times. I don’t however understand why its allowed to wrap below zero, it just shouts bug to me.

May 17, 2018 9:43am Jeffrey Lee (213) 6048 posts	I don’t however understand why its allowed to wrap below zero, it just shouts bug to me. I suspect the justification for that is that (for RISC OS 5) the only time when it could possibly wrap below zero is if buggy code has called OS_Byte 13 too many times. In which case it’s probably safer to leave the event on than to disable it. (Or maybe it’s just an edge case which the programmer didn’t consider at the time)

May 17, 2018 10:55am nemo (145) 2546 posts	If there’s more disables than enables than the one thing that is certain is that something has gone wrong. Leaving the event enabled is therefore the only sensible response – the counting has failed, stop counting. It is, as Jeffrey has explained, only an optimisation anyway, and as we all know there’s only two rules for optimisation: Don’t do it (For experts only) Don’t do it yet

May 18, 2018 6:24am Jon Abbott (1421) 2651 posts	I find the whole change odd, if the bug fix was to handle more than 255, I personally would have increased the semaphores to 32bit internally and only put 255 in the OS Vars if it went higher. I would have left the previous behaviour of ignoring a decrease below zero. If there’s more disables than enables than the one thing that is certain is that something has gone wrong The key thing with semaphores is the caller must track how many times they’ve increased the semaphore. When a shutdown occurs either on purpose or due to an Error, Exit etc the respective number of calls should be made to reduce the semaphore. I’ve yet to see a single piece of code do that, they all assume they’ve called OS_Byte 14 once, so issue one OS_Byte 13 regardless. An issue I can see with global semaphores is when processes are terminated, unless the OS is tracking them on a per process basis they’re immediately indeterminate once a process is terminated. Another issue is programmers that try to force an event off by repeatedly calling OS_Byte 13 (eg Deeva), instead of implementing internal semaphores to track if they’re capturing an event, or where they assume events aren’t occurring (eg Ibix the Viking, Hoverbod) It’s a pity the original PRM didn’t include a code example, stating your code must track how many times its increased a semaphore and reduce accordingly when it terminates.

May 18, 2018 11:00am Rick Murray (539) 13840 posts	I’ve yet to see a single piece of code do that, they all assume they’ve called OS_Byte 14 once, so issue one OS_Byte 13 regardless. ? Is it wrong to assume that if you turn an event on once, you turn it off once?

May 19, 2018 1:21pm nemo (145) 2546 posts	programmers that try to force an event off by repeatedly calling OS_Byte 13 (eg Deeva) Broken. assume events aren’t occurring (eg Ibix the Viking, Hoverbod) Broken. Both these scenarios are simple bugs. They were wrong when they were released, they were wrong on RISC OS, they were wrong on Arthur, they were wrong on MOS (Beeb¹). I would have left the previous behaviour of ignoring a decrease below zero. The only way you can decrease below zero is that something is trying to disable what it did not enable. By definition that can cause other systems to malfunction (as it could disable the event the other system is relying on). Consequently disabling the optimisation is the only sensible recourse. As for 32bit semaphores, we never have anything like 100 claimants as it is. That’s not the problem. ¹ The BBC AUG says: Making a copy of [EVNTV] before changing this vector…allows more than one event handling routine to be used at one time. In other words, there are ALWAYS other EventV claimants, any of which might have enabled any of the events. Assuming anything else is dead wrong and always has been.

May 19, 2018 3:45pm nemo (145) 2546 posts	32bit semaphores On my machine, which has been running for some days now, I have: Event Enables Name 4 1 VSync 11 3 Key transition 19 3 Internet socket 21 2 Misc (RO4 scroll wheel) By these claimants: EventV DeepKeys+5D0 TaskWindow+14BC WindowManager+A0D0 ShareFS+C98 Freeway+304 ZeroConf+268 WindowScroll+3F8 ScreenBlanker+788 UtilityModule+1C5D4 "EventProcessScrolls" Kernel+126C (default handler) EventV is a poor way of sharing a vector, and by far the worst usage is Vsync (I’m not bothered about WindowScroll getting called every time I press a key, I am irritated that ShareFS is getting called 50 times a second). All these things should have their own vector, which is what VectorExtend enables. I have been delaying the (long overdue) release because of RO6.2’s gargantuan SWI and vector handlers… but I think I will do without 6.20 compatibility and wait until someone asks really nicely.

May 19, 2018 5:56pm Rick Murray (539) 13840 posts	I’m not sure if there was/is any developer guidance on how to claim EventV, Common sense? but I’ve seen many poorly written code sequences that claim EventV, setup variables etc, then enable the event One shouldn’t claim anything that results in the OS calling into your code (which you should assume could happen the instant you enable the event) until you’re in a state where that will work. I don’t however understand why its allowed to wrap below zero, it just shouts bug to me. The bug is surely something turning it off too many times? If this is the case, we will have – in the process – reached a point where it has been turned off behind the back of something that was still using it. In other words, there are ALWAYS other EventV claimants Indeed. EventV is “all the other stuff”. Buffers full/empty, ADC conversion, VSync, Escape, Econet, mouse/key transition, a new bar (and not one that serves Stella Artois), MIDI, Internet, battery… With that in mind, it is foolish to think that there aren’t other services using some of these events. Just wrote a quick program to report the number of claims, and on my machine it’s eight for the key transition and four for the Internet event. Surprised nothing is hanging on to VSync. EventV is a poor way of sharing a vector, Too many (disparate) things all lumped together. I am irritated that ShareFS is getting called 50 times a second It could have been worse, it could have been sitting on TickerV. ;-) Surely using VSync means that ShareFS’s responsiveness depends upon the monitor refresh rate?!?

May 19, 2018 6:21pm nemo (145) 2546 posts	One shouldn’t claim anything that results in the OS calling into your code (which you should assume could happen the instant you enable the event) Let’s be very clear: Your code will get called regardless of whether you enabled any particular event. Your code is on EventV, which is called if anyone has enabled any event. Your code will therefore be called, by definition, under completely unpredictable circumstances.

May 19, 2018 6:27pm nemo (145) 2546 posts	Surely using VSync means that ShareFS’s responsiveness… You misunderstand. ShareFS wants to know about an Internet Socket Event, so it claims EventV and enables Event 19. Somebody else wants VSync, so claims EventV and enables Event 4. Now, whenever there is an internet socket event or a VSync, both claimants get called. This was a regrettable compromise on the Beeb, sharing one of the very few vectors between a couple of Sideways ROMs. On RISC OS, with ~200 modules running, it’s very silly indeed. It is analogous to Service Calls, which are just about the worst API in RISC OS – again, inherited from the BBC Micro which might, if your spent money and got your soldering iron out, have ten sideways ROMs listening for twenty service calls. I have about 200 modules listening to well over 200 service calls. In RISC OS the Service Call mechanism is the primary victim of the lack of vectors. Even the extensive RISC OS 4 service call ‘optimisation’ could only damp the flames down a bit (which is why ImageFileRender_Artworks is called every time there’s a MimeMap request).

May 19, 2018 7:19pm Colin (478) 2433 posts	The UpCallV is bad as well I doubt it was envisaged to be the ‘best’ way to use USB when it was conceived. Most of the other stuff on the vector is notification of occasional events. But then again I don’t think that USB should be on a vector at all.

May 20, 2018 10:48am nemo (145) 2546 posts	Actually I disagree, I think every USB device should have its own vector, plus an overall USB admin vector. USB device inserted, a vector is dynamically allocated for it and announced on the fixed USB vector. Thereafter all device-specific events occur on the device’s own vector. But I like vectors. They are what makes RISC OS so flamboyantly insecure.

May 20, 2018 10:57am Colin (478) 2433 posts	Don’t mind that, just don’t like using vectors for USB transfer callbacks.

May 21, 2018 11:51am nemo (145) 2546 posts	No, agreed. Vectors for transactions, direct (vector-negotiated) API for data transfer.