RISC OS Open: Forum: Wimp message to closing task crashes the Wimp

Feb 22, 2019 10:37am

Dave Higton (1515) 3534 posts

Definitely my bug that causes this, but I think the consequences are rather severe.

My central heating app has some web pages. The Overview page shows the state of pretty much everything, and has lots of clickable links to allow things to be altered. The information and commands are exchanged with the main controller app via Wimp messages. The “hot water” message has four possible arguments: -1 to read, and 0, 1 and 2 to turn off, turn on, or toggle. The read is to get the information to build up the web page. Clicking a link sends the message with the toggle argument, then reloads the page so that the new state is displayed.

My bug was that the controller always sent a reply message, whichever argument was sent. Clearly a reply is wanted for read, but not for the others. So a Wimp reply was sent to a task (the web page, which is BASIC cgi-bin in WebJames) that was closing down or had just closed down.
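A minimal sketch of the fix described here, written in Python rather than the BASIC the apps use, and purely as a model. The argument values come from the post; the names `handle_hot_water`, `READ`, `OFF`, `ON`, `TOGGLE` and the return convention are assumptions for illustration, not the controller's real code. The point is only that a reply is produced for a read and for nothing else.

```python
# Hypothetical model of the controller's "hot water" handler.
# Argument values are from the post; every name here is assumed.
READ, OFF, ON, TOGGLE = -1, 0, 1, 2


def handle_hot_water(arg, state):
    """Return (new_state, reply); reply is None when no reply should be sent."""
    if arg == READ:
        return state, {"hot_water": state}   # only a read gets a reply
    if arg == OFF:
        return False, None
    if arg == ON:
        return True, None
    if arg == TOGGLE:
        return not state, None
    raise ValueError(f"unknown hot water argument: {arg}")
```

With this shape the caller sends a Wimp reply only when `reply` is not `None`, so a page that toggles and then exits is never sent anything.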

The end result, sometimes, was that the entire desktop would exit, and there would be a black screen with the command prompt. That’s beyond what any of my apps could achieve.

So, is it possible that there is a bug in the Wimp, perhaps related to the order in which things are removed when a task ends?

Raspberry Pi 2 of some kind (it’s buried in the airing cupboard and runs headless), RO 5.24 of 16 Apr 2018.

Feb 22, 2019 1:48pm

Jon Abbott (1421) 2651 posts

The end result, sometimes, was that the entire desktop would exit, and there would be a black screen with the command prompt.

That implies the desktop exit handler was called.

The question is how WebJames handles the reply it wasn’t expecting.

Feb 23, 2019 3:50pm

John Sandgrounder (1650) 574 posts

@Dave,

Can you confirm that your script is run as a Wimp Task? So that replies should go to your task and not WebJames.

Feb 23, 2019 8:02pm

John Sandgrounder (1650) 574 posts

This is interesting.

I use the same method for controlling a couple of Applications of mine. WebJames BASIC scripts in cgi-bin use Wimp_SendMessage to communicate with the Application and the Application uses Wimp_SendMessage to reply (sometimes, but not always, having used Wimp_TransferBlock to transfer the required information).

I have edited the BASIC scripts to close down prematurely, but I have not been able to replicate this bug. I am using a Pi 2 Cortex A53.

Feb 23, 2019 11:02pm

Dave Higton (1515) 3534 posts

Can you confirm that your script is run as a Wimp Task? So that replies should go to your task and not WebJames.

As you’ve probably seen by now, it’s not quite like that. My task calls PROCCGI_Init and PROCCGI_MultiTask from WebJames’s cgi-lib. The result is that my task is visible under its own name, and that’s how the other app gets to know its task handle (it does a TaskManager search for the cgi-bin task’s name) so that it knows how to respond.

So WebJames must in some way be kicking the task off, but the task appears under its own name and Wimp messages can be exchanged between it and an ordinary BASIC task.

Feb 23, 2019 11:05pm

Dave Higton (1515) 3534 posts

I should add that my apps only crashed sometimes. Repeatability was an issue, which is one reason why it took me a long time to find my error.

Feb 23, 2019 11:59pm

John Sandgrounder (1650) 574 posts

it’s not quite like that. My task calls PROCCGI_Init and PROCCGI_MultiTask from WebJames’s cgi-lib.

As you say, WebJames is kicking the task off, but after those two calls your cgi-bin BASIC is left running as its own Wimp Task (hence you can see it in the Task Manager window).

that’s how the other app gets to know its task handle (it does a TaskManager search for the cgi-bin task’s name) so that it knows how to respond.

When your Application receives the Wimp Message from your cgi-bin task, the message will include the sending task handle, so there should be no need for a task manager search to know where to send the reply. Perhaps it is that unnecessary search (as the task closes) which is giving you trouble.

Another thing which may be worth checking is that you have not chosen message numbers which are already in use elsewhere on the system.

Feb 25, 2019 9:17am

Dave Higton (1515) 3534 posts

When each of the tasks involved starts up, it searches the Task Manager for the handles of the tasks it knows to communicate with, and stores them. From then on it listens for messages of tasks closing and starting, updating itself accordingly. There are tests to ensure it doesn’t send to a task that has closed down, but of course a reply to a closing task would be sent before the task closing message arrived.
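The race in that last sentence can be modelled in a few lines. This is a sketch under assumptions, not the real code: `TaskRegistry`, its method names, the task name and the handle value `0x4B4` are all hypothetical, and the task starting and closing messages are represented as plain method calls.

```python
class TaskRegistry:
    """Hypothetical model of the stored task handles described above."""

    def __init__(self):
        self.handles = {}                    # task name -> task handle

    def on_task_started(self, name, handle):
        self.handles[name] = handle

    def on_task_closed(self, handle):
        for name, known in list(self.handles.items()):
            if known == handle:
                del self.handles[name]

    def is_alive(self, handle):
        return handle in self.handles.values()


registry = TaskRegistry()
registry.on_task_started("Overview", 0x4B4)  # made-up name and handle

# The request is answered while the registry still lists the sender.
reply_allowed = registry.is_alive(0x4B4)     # the check passes

# Only afterwards is the task closing message processed.
registry.on_task_closed(0x4B4)
still_alive = registry.is_alive(0x4B4)       # too late: the reply has gone
```

The check is correct at the moment it runs; it simply cannot see a close-down that has not been announced yet.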

Feb 25, 2019 12:31pm

John Sandgrounder (1650) 574 posts

That sounds OK.

I would still use a reply handle when you have it. (If only to eliminate other problems. I have still not been able to replicate your bug.)

Feb 25, 2019 1:15pm

Dave Higton (1515) 3534 posts

I always use a reply handle. I’m puzzled as to why you would imagine anything else.

Feb 25, 2019 1:23pm

John Sandgrounder (1650) 574 posts

My test page (a WebJames style BASIC in cgi-bin) runs as follows:
– Uses a Wimp Broadcast – asking my Application to reply with its task handle
– Sends a message to the Application, asking for the data (126K of it)
– but then closes after a random timeout – before getting the response
– the generated web page then has a Meta Refresh of 0 and repeats the request

This has been running continually for an hour now and although Firefox on the PC locks up occasionally, the Pi keeps running whilst tracking my car using !RiscOSM (and 25 other GPS trackers) at the same time.

Feb 25, 2019 3:10pm

John Sandgrounder (1650) 574 posts

I always use a reply handle. I’m puzzled as to why you would imagine anything else.

The result is that my task is visible under its own name, and that’s how the other app gets to know its task handle (it does a TaskManager search for the cgi-bin task’s name) so that it knows how to respond.

Feb 25, 2019 4:52pm

nemo (145) 2554 posts

The end result, sometimes, was that the entire desktop would exit, and there would be a black screen with the command prompt. That’s beyond what any of my apps could achieve.

It’s very easy to achieve, just broadcast a Quit message. You’ll get no warnings about unsaved data, everything will immediately quit. If you’re very lucky you might get an error message from a particularly badly written task. Most do exactly what they’re told.

It is extremely easy to broadcast a Quit message by mistake, it just needs a buffer full of zeros (other than the first word) and a typical ‘reply’ function that takes its destination task ID from the buffer. As long as the first word is a multiple of 4 between 20 and 252, goodbye desktop and everything sailing in it. :-(
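To make that concrete, here is a small Python model of the failure (a sketch, not Wimp code). It relies on the standard message block header: size at +0, sender task handle at +4, my_ref at +8, your_ref at +12 and the message action at +16, with action 0 being Quit and destination 0 meaning broadcast. The helper names `make_block`, `typical_reply` and `is_quit_broadcast` are invented for illustration, the size range is the one quoted above, and the sender handle `0x4B4` is made up.

```python
import struct

MESSAGE_QUIT = 0      # Message_Quit action code
BROADCAST = 0         # destination task handle 0 means "all tasks"


def make_block(size, sender=0, my_ref=0, your_ref=0, action=0):
    """Pack just the five-word header of a message block (little-endian)."""
    return struct.pack("<5i", size, sender, my_ref, your_ref, action)


def typical_reply(block):
    """Model of a naive reply routine: the destination is the word at +4.

    Returns (destination, action) as they would be handed to the Wimp.
    """
    _size, sender, _my_ref, _your_ref, action = struct.unpack_from("<5i", block)
    return sender, action


def is_quit_broadcast(block):
    """True if replying with this block would broadcast Quit."""
    size = struct.unpack_from("<i", block)[0]
    dest, action = typical_reply(block)
    valid_size = size % 4 == 0 and 20 <= size <= 252   # range quoted above
    return valid_size and dest == BROADCAST and action == MESSAGE_QUIT


stale = make_block(size=20)                        # zeros apart from first word
genuine = make_block(size=24, sender=0x4B4, action=0x89AC03)
```

Here `is_quit_broadcast(stale)` is true and `is_quit_broadcast(genuine)` is false: the only difference between a harmless reply and a desktop-wide Quit is whether the buffer was actually filled in.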

There is very little of RISC OS that is defensively programmed. Mostly it’s Jenga.

Feb 25, 2019 5:02pm

nemo (145) 2554 posts

It would be possible for the Wimp to disallow Quit from a task that hasn’t just sent a PreQuit.

Similarly, it would be possible to disallow Wimp_TransferBlock for writing except during the handling of a RamTransmit (and limited to the allowed range) and to always disallow reading.

There’s all sorts of things the Wimp could do. <shrugs>

Feb 25, 2019 6:03pm

John Sandgrounder (1650) 574 posts

It’s very easy to achieve, just broadcast a Quit message.

Wow! Thanks Nemo. I just tried it. It certainly looks like the answer to this bug.

Somebody in Acorn didn’t think that through – using zero for Broadcast and zero for Quit

Feb 26, 2019 3:13pm

John Sandgrounder (1650) 574 posts

Similarly, it would be possible to disallow Wimp_TransferBlock for writing except during the handling of a RamTransmit (and limited to the allowed range) and to always disallow reading.

But those two facilities are a major part of the reason we chose RISCOS for our applications.

Feb 26, 2019 4:06pm

Rick Murray (539) 13850 posts

and to always disallow reading.

Isn’t that how Zap (etc) read application workspace?

Somebody in Acorn didn’t think that through – using zero for Broadcast and zero for Quit

Alternatively: not my fault your code is buggy… :-)

Feb 26, 2019 6:40pm

Steve Pampling (1551) 8172 posts

and to always disallow reading.

Isn’t that how Zap (etc) read application workspace?

I believe somebody once said

Alternatively: not my fault your code is buggy… :-)

Or to put it in terms that Nemo would probably agree with:

There’s all sorts of things the Wimp could do.

i.e. The Wimp could be made more robust and be made to protect against “unusual” backdoor access.
Like maybe someone could do what Acorn described decades ago and protect against dodgy zero page access – ah, already done that one.

TBH, if Nemo does what he seems to be thinking of doing with the Wimp, keyboard input and allied bits, I suspect the Wimp will be more robust simply because, if I’m any judge of character, it’s an all or nothing when he works on something.

Feb 28, 2019 8:06pm

nemo (145) 2554 posts

Which would be flattering if not for my long catalogue of “nothing”. :-/

Feb 28, 2019 8:59pm

Dave Higton (1515) 3534 posts

Interesting discussion.

@John: my apologies, I originally gave the wrong half of the description of how the task in question finds who’s sending messages to it. In general, tasks search the Task Manager when they start up to find the handles of the tasks they need to know about, and store them for instant comparison. Subsequently those stored values are updated from task quitting and task starting messages. In the particular case of the Overview page app, of course it’s only there briefly, so the way the main task gets to know the Overview’s task handle is by a task starting message.

When a task handle is unknown, I store zero there. That would be the value I check against when I decide whether (and, in some cases, how) to reply. So if my app were sent a message, from task handle 0 in block + 4, and with my chosen message number (&89AC03), it would send a message &89AC03 to task handle 0, i.e. broadcast it.

I still don’t see any mechanism for my code to broadcast a message 0, though.

Feb 28, 2019 11:58pm

John Sandgrounder (1650) 574 posts

No need for any apologies. :)

Good luck with the hunt for your message zero.

In the meantime a check for a zero message (and zero task handle) just before each send would be a safeguard.
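One possible shape for such a safeguard, sketched in Python as a model rather than as BASIC for the real apps. `checked_send`, `RefusedSend` and the `send` callback are hypothetical names; `send` stands in for whatever routine finally calls Wimp_SendMessage. Note that this version is stricter than the check suggested above: it refuses a zero message number or a zero task handle on their own, not only the two together, unless a broadcast is asked for explicitly.

```python
class RefusedSend(Exception):
    """Raised when a send looks like an accidental broadcast or Quit."""


def checked_send(dest_handle, action, send, allow_broadcast=False):
    """Call send(dest_handle, action) only if the pair looks deliberate."""
    if action == 0:
        raise RefusedSend("message number 0 is Quit")
    if dest_handle == 0 and not allow_broadcast:
        raise RefusedSend("task handle 0 would broadcast")
    send(dest_handle, action)


sent = []
# Made-up destination handle; &89AC03 is the message number from the thread.
checked_send(0x4B4, 0x89AC03, lambda dest, act: sent.append((dest, act)))
```

Raising an error instead of silently dropping the message has the side benefit of showing which code path produced the zero, which may help with the hunt.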
