RFC: Proposal for enhanced Wimp_SendMessage
Rick Murray (539) 13840 posts |
Reasoning

The original protocol for message sending was devised in the mid ’80s and uses a 256 byte block, header included, which means the useful size of the block is, at most, 239 characters (plus terminator).

Implementation problems

The most suitable implementation would be for the Wimp to accept Wimp version 5.22 (or whatever) and simply switch to a larger block of data in the Wimp_Poll return, assuming that a modern app will understand. This creates significant issues. Those of you who read my wittering will know that I would like to create a Unicode-capable Wimp that runs “new” apps with a new API and older apps with the existing API, but there are many issues regarding information transfer between the two. It is the same sort of problem here.

Outline

A special message will be reserved, with its own message action code (&D0B16, shown at +16 below). The message body will be as follows:

+0 Length of block (to satisfy existing API)
+4 (task handle of sender)
+8 (my_ref)
+12 your_ref (0 for original message)
+16 &D0B16
+20 Message action
+24 Pointer to data block
+28 Flags

The message action, normally at +16, is now in +20. The following word points to a block of data claimed in the RMA that should be of an agreed standardised size (between 1KiB and 4KiB – must be capable of holding an entire path; comments?).

The Flags should be zero at this time. As this proposal develops, they will be expanded to provide guidelines such as “non-Latin UTF” or “this is a dynamic area” (instead of RMA), or perhaps even data size markers (1KiB, 2KiB, etc). The proposal will hereby state that you should not make assumptions in the absence of flags, but you can assume if a flag is set. :-)

Only messages that require the extra space will use this API. It is not a complete replacement for the existing API; thus messages such as Message_SlotSize (uses only two words) would simply not make use of this protocol.
Example #1

Message_FilerOpenDir:

+0 32
+16 &D0B16
+20 &400
+24 Pointer to data
+28 0

The data will be:

+0 Filing system number
+4 Bit 0 set -> do not canonicalise
+8 Full name of directory

Note that due to the two information words, a 1KiB block will not be able to represent every possible path. Would 2KiB block size be better? This problem is exacerbated with Message_FilerOpenDirAt. Behaviour is broadly similar for other filesystem-related messages, such as Message_DataLoad.

Example #2

Message_HelpReply:

+0 32
+16 &D0B16
+20 &503
+24 Pointer to data
+28 0

The data will be:

+0 Null-terminated help message

This will provide enhanced help to a more capable help engine, for example to permit styles and effects to be used in the help (either based upon StrongHelp (preferred) or very basic HTML), but also to permit this in conjunction with support for reasonable length messages for non-Latin languages. To this end, such text probably ought to begin with “{utf}” as well to guide the help display as to how to render the text?

This is given in reply to Message_HelpRequest, and is sent via User_Message_Recorded. If the message is not Ack’d, then the sender can either try again with a simpler message and the standard Message_HelpReply, or respond with a standard “Your help application does not support blah-de-blah” response (via Message_HelpReply).

Okay. Over to you. Comments? |
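For readers who find byte layouts easier to follow in code, here is a minimal Python sketch of the proposed 32-byte block and the Message_FilerOpenDir data block from Example #1. It only models the layout with `struct` and does not call the Wimp. The function names, the Latin-1 encoding, the null terminator on the path and the zero padding are assumptions made for illustration, not part of the proposal.

```python
import struct

MESSAGE_DOBIG = 0xD0B16         # action code proposed in the post (+16)
MESSAGE_FILER_OPEN_DIR = 0x400  # original action code, now carried at +20


def pack_big_header(sender, my_ref, your_ref, action, data_ptr, flags=0):
    """Build the 32-byte block as eight little-endian 32-bit words."""
    return struct.pack("<8I", 32, sender, my_ref, your_ref,
                       MESSAGE_DOBIG, action, data_ptr, flags)


def pack_open_dir_data(fs_number, no_canonicalise, path, block_size=1024):
    """Build the data block: two information words, then the path.

    Assumes a null-terminated Latin-1 path and a fixed, zero-padded block.
    """
    body = struct.pack("<2I", fs_number, 1 if no_canonicalise else 0)
    body += path.encode("latin-1") + b"\0"
    if len(body) > block_size:
        raise ValueError("path does not fit in the agreed block size")
    return body.ljust(block_size, b"\0")
```

Under these assumptions a 1KiB block leaves 1024 - 8 - 1 = 1015 characters for the path itself, which is the shortfall the note in Example #1 refers to.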
Gulli (1646) 42 posts |
I’m far from being the most qualified person to comment on RISC OS development, but my experience is that it’s usually better to define new API signatures when adding major new functionality to an API (I’d qualify UTF-8 support as a major change). It’s far less likely to introduce errors in the existing API, the documentation is clearer, and the new signatures will allow more changes and fixes to be implemented without causing problems. |
Jeffrey Lee (213) 6048 posts |
Replacing one arbitrary limit with another arbitrary limit probably isn’t a good way of doing things. The only reason Wimp messages currently have a fixed size is because the contents of the message block get copied into the recipient’s buffer. But if your message body is in the RMA, is there any need to copy it to each recipient? (Answer: maybe. I guess it depends on who owns the RMA pointer, and how long-lived it needs to be.) Either way, it would be much better to have a field which indicates the size of the message, rather than assuming everyone is working to the same hardcoded limit. |
Edward Nokes (1656) 3 posts |
Hi, Rick. Interesting idea. I think your goal is good but the implementation could be fitted much more closely to the existing API.

Peppering the RMA with temporary blocks that are owned by applications seems like a bad idea. Easy to forget to free them afterwards, leaks memory if a task dies whilst sending messages, gets in the way of memory protection, makes pre-emption / multithreading harder, etc. Also, using a single message code for all long messages breaks the wimp’s optimisation of only delivering messages for which a task has registered an interest.

Software would have to be rewritten anyway to make use of long messages, so might as well bite the bullet and fix it at the Wimp level. Having long messages without a filer that understands them is largely pointless for most use cases, so might as well require an OS upgrade.

The existing API extends reasonably. The main concerns are: (1) sending a long message; (2) degrading to a short message if needed; (3) receiving a long message. Working on the assumptions that most messages are short, and that we will update core software to understand long messages where needed, and that older software will generally not be adversely impacted if it misses a long message. The last is ‘mostly’ valid because most long messages will be for filenames/pathnames and these are already broken.

As a quick sketch…

Wimp_SendMessage could accept longer messages easily – just set the block size > 256 bytes. Existing Wimps will return a unique error message (WimpBadMessageSize) if > 256. A new Wimp could happily accept > 256 and send. All message formats are unchanged.

Wimp_Poll – add flag to indicate that the task understands how to receive long messages. If set, provide the length of your buffer when calling the SWI – eg. 256, 1k, 4k, whatever. For almost no effort, you can now receive longer messages, and none of the rest of your code needs changing.
Furthermore, if set, a new wimp should be allowed to return with R1 pointing at a different buffer than specified on entry. This would generally be the wimp’s internal message buffer (not an application owned RMA buffer) – avoiding the wimp repeatedly copying the same broadcast message into every task’s receive buffer, keeping the cache cleaner and speeding up message passing. This ought to be user-mode-read-only memory. This also means that a very large broadcast message (eg. long filename) can be passed around all tasks without them all needing their own large receive buffers. This also means that the common case (opening files after a double click / drag) is not very dependent on the receiver having a large buffer even for very long names, because the receiver can receive the entire message via a read-only wimp buffer, and dynamically allocate enough workspace to form a response. Hence protocols are no longer restricted to a one-size-fits-all maximum. The Wimp knows when it can safely deallocate its message buffer, and no memory will ever be leaked.

Use case 1 – recorded broadcast message, trying to locate a partner task to help with something (eg. double click a file, early parts of data transfer protocols). Sender attempts to send a long message. Wimp delivers it to tasks that can receive long messages. If one of those is interested, it acks, and we’re done. If the message bounces or we get a WimpBadMessageSize error, then sender tries to degrade to a shorter message (eg. Message_HelpReply), and tries again (NOTE), and we’re done. If degradation is impossible (eg. most filer messages) then give up; the situation is no worse than without this API.

NOTE: if we get a bounced (type 19) long message and are able to degrade to a short message, then the degraded message should only go to non-long-message-aware receivers, to avoid it being received twice. A flag in Wimp_SendMessage r0 can inform the Wimp of this.
Use case 2 – initiate a message exchange with a single task (eg. later parts of data transfer protocols). Sender attempts to send a long message. Either this works, and we’re done, or old wimps give a WimpBadMessageSize error, or new wimps give an error if recipient doesn’t understand long messages, or the recipient just ignores the long message and it bounces. In the latter three cases, degrade if possible, otherwise give up – again, the situation is no worse than without this API.

Use case 3 – continue a recorded message exchange with a single task (eg. drag and drop, clipboard, dataload/save). Similar to above, but there’s sometimes a spanner in the works. The wimp checks the block size quite late in the SendMessage code, after cancelling other messages in the chain using myref/yourref. Hence trying to send a long message on an old wimp would return an error but would still acknowledge the message. In this case, would need to use Wimp_ReadSysInfo to check that the wimp supports long messages before trying it. Otherwise, as per case 2. This is a non-issue if the sender knows that he can always degrade to sending a short reply, since the double acknowledgement won’t matter (eg. Rick’s example of sending a long Message_HelpReply then falling back to a truncated short one).

For most of the examples I’ve looked at, meaningful degradation from a long to short message is impossible (eg. shouldn’t truncate a pathname), so sending a long message involves nothing more than putting a longer size in the message block and calling Wimp_SendMessage (checking ReadSysInfo first where necessary). Tasks that don’t understand long messages don’t get them. Receiving a long message just means telling the Wimp about your larger buffer / understanding of common buffer. Hence ought to be very easy to upgrade most apps to understand long messages and long filenames.

Some long messages can be safely truncated (eg. Help and Notify messages) when passed to a recipient that doesn’t understand long messages. Long message senders could flag this to Wimp_SendMessage, and the Wimp could do it automatically, avoiding the need to degrade manually. Or the Wimp could maintain a list of such messages. Or a filter patch could mimic this behaviour for non-aware recipients.

Toolbox etc. should be able to use pre-/post-poll filters to deal with long filenames even if the underlying application is unaware. Older software can be bodged to deal with longer pathnames via a filter that sets path macros and rewrites data/filer messages, although this would be leaky. |
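The send-then-degrade flow in use cases 1 and 2 can be modelled in a few lines of Python. This is only a simulation under stated assumptions: `FakeWimp`, `BadMessageSize`, `send_with_degrade` and the `skip_long_aware` argument are hypothetical names standing in for the Wimp, the WimpBadMessageSize error, the sender's logic and the suggested Wimp_SendMessage flag. A message is reduced to its length, and an "ack" to the name of the first interested receiver.

```python
SHORT_LIMIT = 256  # largest block an existing Wimp accepts


class BadMessageSize(Exception):
    """Stands in for the WimpBadMessageSize error named in the post."""


class FakeWimp:
    def __init__(self, supports_long, receivers):
        self.supports_long = supports_long
        self.receivers = receivers  # tuples: (name, understands_long, interested)

    def send(self, message, skip_long_aware=False):
        """Return the name of the task that acked, or None if it bounced."""
        is_long = len(message) > SHORT_LIMIT
        if is_long and not self.supports_long:
            raise BadMessageSize(len(message))
        for name, understands_long, interested in self.receivers:
            if is_long and not understands_long:
                continue  # non-aware tasks never see long messages
            if skip_long_aware and understands_long:
                continue  # degraded retry: do not deliver twice
            if interested:
                return name
        return None


def send_with_degrade(wimp, long_msg, degrade=None):
    """Try the long message first, then a degraded short one if possible."""
    bounced = False
    try:
        acked_by = wimp.send(long_msg)
        if acked_by is not None:
            return acked_by, "long"
        bounced = True  # long-aware tasks have already seen it
    except BadMessageSize:
        pass  # old Wimp: nobody received anything
    short_msg = degrade(long_msg) if degrade else None
    if short_msg is None:
        return None, "gave up"  # no worse off than without the API
    acked_by = wimp.send(short_msg, skip_long_aware=bounced)
    if acked_by is None:
        return None, "gave up"
    return acked_by, "short"
```

A real degrade step would also have to rewrite the size word and re-terminate the text; truncating the bytes, as a test might do with `lambda m: m[:256]`, is a simplification.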
Paul Sprangers (346) 524 posts |
The only thing that I understand from everything above is the Message action code &D0B16 (= DO BIG!) – and even that doesn’t fully make sense to me. Shouldn’t it be &D0B19? |
Steve Pampling (1551) 8170 posts |
Only if you’re in the habit of writing DO BIg rather than DO BIG. |
Colin (478) 2433 posts |
Agreed, but it could be worked around using Wimp_TransferBlock; then the data remains in application space.
I don’t see the advantage over having a new set of message action codes: the program still has to cope with the existing and the new. If you wanted a new action code number related to the old one, just set bit 31. I think messing with the existing messages is a bad idea. I think we just need new protocols, i.e. for a new data load protocol, long pathnames could be sent in multiple messages. A program would then try to start the new protocol if the path was > 203 bytes; otherwise it uses the old. |
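As a rough illustration of the multiple-message idea, the Python sketch below splits a long pathname into pieces small enough for ordinary messages and reassembles them on the other side. The function names, the chunk size, the Latin-1 encoding and the null terminator are all assumptions; the post does not define the message format for such a protocol.

```python
def choose_protocol(path, old_limit):
    """Use the old protocol whenever the path fits, as the post suggests."""
    return "old" if len(path) <= old_limit else "new"


def split_path(path, chunk_size):
    """Cut a null-terminated path into pieces of at most chunk_size bytes."""
    data = path.encode("latin-1") + b"\0"
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def join_chunks(chunks):
    """Reassemble the pieces and drop the terminator."""
    data = b"".join(chunks)
    return data[:data.index(b"\0")].decode("latin-1")
```

Because each piece fits in an existing message block, nothing here needs a larger poll buffer; the cost is the extra round trips and the new message names.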
Rick Murray (539) 13840 posts |
The problem is that there are numerous protocols that could benefit from having more than 230-240 characters. The reason for RFC was exactly for this sort of discussion. Keep the ideas coming! :-) |
Colin (478) 2433 posts |
You can’t make any of the current messages that don’t work with > 256 bytes work. You have to write new protocols. The >200 data transfer limit is already solved: Message_RAMFetch already does it. For example, you could use a Message_DataSaveLongAck reply for long filenames in the datasave protocol. If the Ack isn’t understood by the caller, the protocol fails – the caller can’t cope with long paths. If it does understand, you do a RAMFetch for the filename. Similarly you can fit in a Message_DataLoadLong. |
John Sandgrounder (1650) 574 posts |
Surely this proposal misses the point of the message: a short, quick communication. If the receiving task then needs large amounts of data, it can transfer it using Wimp_TransferBlock (the address and length of the data having been sent in the message). I have been sending large amounts of data between tasks since the days of RISC OS 2. I cannot see the need for larger message blocks. |
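The approach described here, a short message carrying only an address and a length followed by a block transfer, can be modelled as below. This is a toy simulation: each task's memory is a `bytearray`, and `transfer_block` is a hypothetical stand-in for Wimp_TransferBlock, not its real calling convention. The two-word message layout is likewise assumed for illustration.

```python
import struct


def transfer_block(memories, src_task, src_addr, dst_task, dst_addr, length):
    """Stand-in for Wimp_TransferBlock: copy bytes between two tasks."""
    src, dst = memories[src_task], memories[dst_task]
    if src_addr + length > len(src) or dst_addr + length > len(dst):
        raise ValueError("transfer would run outside a task's memory")
    dst[dst_addr:dst_addr + length] = src[src_addr:src_addr + length]


def make_pointer_message(address, length):
    """Message body: where the data lives in the sender, and how much."""
    return struct.pack("<2I", address, length)


def fetch_payload(memories, sender, receiver, message, dst_addr=0):
    """Receiver side: read the two words, then pull the data across."""
    address, length = struct.unpack("<2I", message)
    transfer_block(memories, sender, address, receiver, dst_addr, length)
    return bytes(memories[receiver][dst_addr:dst_addr + length])
```

The message stays eight bytes long however large the payload is, which is the point being made; the receiver needs to have room for `length` bytes before it pulls the data.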
Rick Murray (539) 13840 posts |
Colin:
Well, isn’t this always the way? We can do a lot with RISC OS, but there is always the need to not arbitrarily mess up older applications. My idea for a Unicode Wimp falls bang-smack into this category. It is a bit of a bodge of having Latin1 apps running alongside Unicode apps. Why? Because the number of Unicode applications at the moment is, well, exactly zero. I believe that they can come, if a Unicode compliant Wimp can be demonstrated to work. But it isn’t going to work in any way, shape, or form if it comes with the footnote of “oh, by the way, all your old software won’t work” [1]. This is where we sit at the moment. There has been minimal work on Unicode support (just try it in !Edit) because we can choose UTF8 (LatinX apps won’t work correctly [1]) or we can choose LatinX (UTF8 not available).
The “DOBIG” is a new protocol, with +20 being the old-style message code that this is a ‘big’ version of. This is to allow existing #define values to be used, instead of specifying and registering as many “new” protocols as necessary to duplicate functionality of the older protocols, only bigger.
Solved for that specific functionality. Indeed, it isn’t unlike that which I am proposing, however instead of transferring files, it will be transferring a buffer.
John:
For the sort of sizes of buffer we are considering (1KiB-4KiB(ish)), I really don’t think it is going to make any sort of impact on speed. Certainly, writing to a shared buffer area (DA, RMA) is going to be quicker than Wimp_TransferBlock and all the palaver the Wimp goes through to get data from one task into another task.
Then please permit me to draggy-drop this file with a 250 character canonicalised name to an app. Oh, wait, I can’t. I’m not going to bang the drum for Unicode and such (yet) when the system cannot handle legitimate filenames if they exceed the limits imposed by the message protocols. ;-)
Edward:
Hi! Nice first post. I think what the Wimp really needs (and I am surprised doesn’t have) is a flags word set during Wimp_Initialise. The current versions of the Wimp use the somewhat ugly method of determining capabilities by looking at the “known Wimp version”. Certainly, this failing is why the poll mask has been hijacked for flags like “needs FP state saved” which are more likely to be application specific than this-poll-now specific.
I suppose the most applicable way to deal with this situation is that if a message (DataLoad or whatever) can be expressed using a regular message, then it must be. This will permit long filenames to be passed in long messages without worrying about older less capable applications not being able to cope, for if the filename is too long to be expressed in a short message, the app wouldn’t have coped anyway. To give an example: I created an insanely long directory structure in the emulator. As you can see from the top, there is only so deep you can go before double-clicking directory names just stops working. I made a reference Meh$Dir to point further down the structure than the Filer would go, and trying to open that is either ignored (the first attempt) or causes an abort (the second attempt). This is a valid filename. An insane one, certainly, but a valid one. I’m glad XP copes better, because on the Windows side, it decodes as D:\RPCEmu\hostfs\12345678901234567890123456789012345678901234567890\12345678901234567890123456789012345678901234567890\12345678901234567890123456789012345678901234567890\12345678901234567890123456789012345678901234567890\1234567890\1234567890\1234\1234567, which is 255 characters (any more and XP’s Explorer would probably fail too). :-)
[1] This likely won’t affect English speakers that much, but it will play havoc with languages using accented characters. |
Colin (478) 2433 posts |
I’m not arguing against the need for a change to allow long pathnames – it’s long overdue. I just think we already have a mechanism to do it. You want a “1Kish to 4kish” wimp poll buffer, so which is it? You can’t know; you’d be guessing. Using multiple messages or Wimp_TransferBlock takes all of the guesswork out of it: with either of these you can have a 5k pathname if you like. It’s how RISC OS works now and has for years. There’s no need to change the ROM. The problem is not the size of the poll block, it’s the protocols. You just need to modify the protocols, have the new protocols adopted, and give the new messages new names, so that programs are explicit in what they are doing and existing mechanisms for limiting which messages a task sees are unaffected. |
Steve Revill (20) 1361 posts |
I’m not a big fan of the initial proposal – it screams ‘kludge’ in so many ways. I’ve not thought about this in any depth yet but I would say Edward’s proposal seems to be closer to the mark. It’s certainly crazy to replace one hard-wired limit with another (which is still tiny in the global scale of things); if you’re going to extend the Wimp’s messaging protocol, you should probably attempt to make a once-and-for-all change. |
Rick Murray (539) 13840 posts |
Yes, it may be a bit of a kludge (of which there are plenty in the Wimp API); I’m not so certain that TransferBlock is something we should make regular. It involves a bunch of task swaps and, more critically, it lets a task rummage around in another task’s workspace with no opportunity for the task to veto it… |
Colin (478) 2433 posts |
As I see it, 4 options are proposed:
1) Increase the Wimp_Poll block size to some size big enough to transfer the largest data block you are ever likely to want.
2) Keep the poll block size and transfer the data via multiple messages.
3) Keep the poll block size and transfer the data via Wimp_TransferBlock.
4) Keep the poll block size but allow messages to be any size. To do this the wimp copies the data for a Wimp_SendMessage to an internal RMA buffer and returns a pointer to this internal buffer in Wimp_Poll – requires the internal buffer to be variable size, otherwise you run into the same problem as 1.
All options also require changes to current protocols. 3 and 4 are semantically similar: you either read the data from the sending task or you read the data from the RMA block created by the Wimp. |
Dave Higton (1515) 3526 posts |
We have an interesting situation w.r.t. compatibility. A task can in theory send any size of message it wants to, but a receiving task doesn’t know how many octets are in the message, and currently assumes that the message will in no circumstances be bigger than 256 octets. Perhaps we need a task to register with a flag that says it can receive messages bigger than 256 octets. Perhaps, for these tasks, Wimp_Poll[Idle] should also return the message length in R2 so that the receiving task can ensure that it has reserved a big enough buffer. I know the message length is already in block + 0, but, by the time the message has come into the buffer, the buffer may already have been overrun. Presumably the Wimp should not pass large messages on to tasks that cannot receive them. The question then is how to let the sending task know so that it can fall back, report an error, or take whatever other action is necessary. |
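The receiving side of this suggestion can be sketched in Python as follows. It is a model only: `deliver` plays the part of a Wimp that knows whether the task registered as long-capable and reports the length separately (as R2 would), and `receive` is a task that grows its buffer before anything is copied. Both names, and the idea of returning `None` for a withheld message, are assumptions for illustration.

```python
import struct

OLD_LIMIT = 256  # size every existing task assumes for its poll block


def deliver(message, accepts_long):
    """Return (length, message) for the task, or None if it is withheld."""
    length = struct.unpack_from("<I", message, 0)[0]  # size word at +0
    if length != len(message):
        raise ValueError("size word does not match the block")
    if length > OLD_LIMIT and not accepts_long:
        return None  # never hand a long message to an unaware task
    return length, message


def receive(delivery, buffer):
    """Task side: make room first, then copy, so nothing is overrun."""
    length, message = delivery
    if length > len(buffer):
        buffer.extend(bytes(length - len(buffer)))
    buffer[:length] = message
    return length
```

Knowing the length before the copy is what separates this from reading the size word at block + 0, which is only readable once the data is already in the buffer.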
Colin (478) 2433 posts |
That is covered by Edward for case 4. A similar scheme would work for case 1. Cases 2 and 3 don’t require any changes or a new wimp version.
When you get the message, it’s yours until the next Wimp poll, so a size in the poll block is adequate.
That’s why changes to the protocols are needed – whether the wimp is changed or not. |