"Edit may have gone wrong"
John Sandgrounder (1650) 574 posts |
The error messages are a statement of the obvious. No may about it. I have an HTTP server running with its log files stored in RAM Disc. The RAM Disc is shared unprotected. If I look at the files, using !Edit’ on the server system, I see no problems. But, if I look at the files using !Edit on another system with shared access to the files, then !Edit evenually crashes with message “Edit may have gone wrong”. Both systems are on Raspberry Pi running the 5.25 build produced by Colin on 11th May 2018 to fix the issue of key repeats. I see no other problems with this build and the size of the log files does not seem to be relevant although some of them do eventually get quite large. The server is a Pi 2.2 and the other is a Pi 1B+. |
Jeffrey Lee (213) 6048 posts |
If it’s a shared folder, I’d say it’s more likely to be ShareFS than Edit which is going wrong. You might want to give the debugger exception dumps a try – when Edit/ShareFS crashes you should get an annotated stack dump, similar to what’s shown in the ZeroPain logs. |
Timothy Baldwin (184) 242 posts |
Edit, along with many RISC OS programs first reads the size of a file, allocates memory, then reads the entire file. If the file increases in size between reading the size and reading contents, then memory used by the memory allocator will be overwritten. On the Linux port, attempting to load Edit’s wimpslot into Edit reliably causes causes Edit to quit after reporting “EFAULT” (from the filing system with unfinished error handling) and “Flex Memory Error”. If the HTTP server runs as a wimp task then it is safe locally as it can’t lengthen the log files whilst Edit is opening it |
John Sandgrounder (1650) 574 posts |
That sounds very plausible. Perhaps Edit (along with many RISC OS programs) should should read only that part of the file specified by the length amd then check the file size again and report an error if the length has changed. |
Rick Murray (539) 13850 posts |
Ah, that again. :-) Remember, RISC OS is predominantly a single process single user system where FileCore doesn’t like being called when something else is happening, so the chance of a file changing between the two atomic operations of “how big is this file” and “read this file” are slim. That said, two questions come to mind:
1 OS_GBPB |
Chris Mahoney (1684) 2165 posts |
That footnote was not necessary :) |
Timothy Baldwin (184) 242 posts |
This lack of concurrency on a file server can make the problem worse, for example:
Also FileSwitch can return a file length that is up-to 2 seconds out of date.
Using OS_File, txtedit_newwithoptions calls txtfile_insert which calls OS_File 255.
In the Linux port I implemented wimp slots using files. There is another way this can fail deterministically, first note closing files on at least FileCore, ArcFS 2 and TBAFS can increase their apparent size:
OS_File calls Service_CloseFile which may therefore result in the file growing. However if bit 20 (fsinfo_dontuseload) of the Filing system information word is set FileSwitch appears to use the old size to load the file; the use of a 2 second old size value may avoid the problem as well as cause it. |
Rick Murray (539) 13850 posts |
Jeez. People actually use a SWI that tells the OS to load a file of unknown length 1 into a bit of memory with no bounds checking!? What could possibly go wrong? It’s more work, certainly, to Open File, HeeBeeGeeBee it, Close File. But it’s a pretty solid way to get exactly what you expect to get. Maybe us grey haired gits that grew up hacking PostMan on Econet systems 2 are more used to that sort of problem? 1 As thus demonstrates, that you just looked at the size of the file doesn’t necessarily mean that file is that size. ;-) 2 Nutshell version: PostMan had a data file that was owner readable and world writable, so when somebody wanted to send you mail, the file would be opened for update and their message pasted to the end of the file. As this happened on the FileStore, MDFS, whatever, it happened behind the back of the computer running the mail software, so you absolutely couldn’t make any assumptions unless you opened the file read/write (which locked it from being accessed by others, but meant nobody could send you mail during that time). |
nemo (145) 2554 posts |
I’m actually in total agreement with Rick (see, it does happen). I’ve remarked about this vulnerability before, yet have to admit that despite that I’m guilty of the OS_File,5, OS_File,255 pattern many, many times over. Regardless of the correctness of the OS_Find, OS_Args, OS_ArgyBargy, OS_Find approach, it is verbose, and a new OS_File would be very attractive – a bounded version of File,255. ie OS_File,254,“filename”,load,,required_length would either load the file or fail with “Wrong length”. Trivial to implement, easily retrofitted via a tiny module. I commend it to the house. Actually, it would be a good idea to fix the unrelated GST asymmetry at the same time (in the same module). [I realise 12,14 and 16 have the same vulnerability… but who uses those?! There’ll have to be new equivalents for them too I suppose] |
nemo (145) 2554 posts |
Actually, chaps… what may be better is to use R3 b30 of the existing calls to say that you’re supplying the length in R5 (a pity, but R4 is used for 12 & 14). Then older OSes will silently ignore the length (and you’re no worse off), and we don’t need four new OS_Files elsewhere. As before, a tiny module brings the new functionality to older OSes, where desired. |
Timothy Baldwin (184) 242 posts |
RMLoad, OS_SpriteOp load/merge, FileSwitch (run absolute, run utility, copy files ), BASIC, MessageTrans, ToolBox, !Draw, !Paint, Zap, StrongED and Fireworkz all do so. Wimp, FilerAction, Obey and PipeDream do not. Desktop is excused as it explicitly reads from ResourceFS. To my knowledge, that’s all the native whole file loaders I have. Curiously the changelog for the Wimp suggests that it was changed to use OS_File, but I find no trace of that code. |
nemo (145) 2554 posts |
SaferOSFile is now available for Beta test. It has been observed to work twice and not catch fire on two different OSes, so that’s a good start. But anything that futzes with OS_File could potentially ruin your day, so save your buffers folks. Once a few people have tried it and ripped it apart, I’ll do a proper release. SaferOSFile adds a new flag to R3 in OS_File 255,12,14 and 16. R3b30 means “check the file length in R5”. If the file length is not R5, you get a nice safe Buffer overflow error, instead of remote code execution or other exciting developments. Please note this does not fix any existing code. It provides a safe API with only a tiny change to an established pattern. i.e. SYS"OS_File",5,A$TOO%,,,,L% IF(O%AND1)=0:ERROR1,"That's not a file" D%=FNalloc(L%) SYS"OS_File",255,A$,D%,1<<30,,L% : REM <<< NEW!!! It’s only the “ I’m open to advice as to whether keeping a file open read-only is sufficient on Unix – would it be better to open it for update? |
Steve Pampling (1551) 8172 posts |
Jeez. People actually use a SWI that tells the OS to load a file of unknown length 1 into a bit of memory with no bounds checking!? Even I, with a severely limited knowledge of programming, have said before that the OS needs to check input parameters. A few years back I made a strong point when many zero page issues brought the lack of checks to light. The historic expectation that a programmer would be silly enough to expect the OS to catch his/her mistake is just that – an item of history – but there’s an old saying that those who will not learn from history are condemned to repeat it. Could I suggest that where people identify elements of the OS that fail to check, and prevent misuse, input parameters then look for a means to alter the OS behaviour to prevent the abuse. |
nemo (145) 2554 posts |
I rashly started a module called SaferAPIs some time ago that aimed to provide the parameter checking missing from so much of RISC OS. I should perhaps return to it. |
Steve Pampling (1551) 8172 posts |
Oooh, the whole berg?
It would be appreciated by every non-Boris. |
Rick Murray (539) 13850 posts |
Yup. Most of those look to use a simple OS_File call to load in data (having previously checked the file size and verified/allocated memory); however it’s worth noting that most of this stuff happens in SVC mode which doesn’t mean that file size changes cannot happen (networked filesystem), just that they’d be ridiculously rare. A size change would need to take place at the exact instant between reading the file size, allocating memory for it, and loading it in. I had a look at what the OS_File does, and wow – what a convoluted mess. ;-)
And this shows how simple it would be to switch between blindly loading the file, and using OS_GBPB for a safer load. I propose rolling nemo’s suggestion directly into RISC OS, so if bit 30 is set, then it’ll GBPB it with the length in R5. Won’t fix existing code, but it’s a start for making newer code safer.
Wimp06 → SWIWimp_OpenTemplate It’ll try to allocate RMA to load in the template, and if that fails (“ERROR in template cache claim”), it’ll use GBPB to cache directly from file.
Not something RISC OS is noted for doing. ;-)
I strongly disagree – it is NOT a historic expectation. If anything, more modern OS’ do a lot more to catch and correct for program errors. I lament this, as the more the OS does, the crapper software becomes.
It’s interesting to observe how it is handled elsewhere. TurboC/C++++, for example, writes four NULLs to the start of program space, followed by the TurboC copyright banner. On program exit (for Small and Medium memory models only), it will check that this NULL word is still NULL. If it is not, it’ll print out a “Null pointer assignment” error message. ; Magic symbol used by the debug info to locate the data segment public DATASEG@ DATASEG@ label byte ; The CopyRight string must NOT be moved or changed without ; changing the null pointer check logic CopyRight db 4 dup(0) db 'Turbo C++ - Copyright 1990 Borland Intl.',0 lgth_CopyRight equ $ - CopyRight IF LDATA EQ false IFNDEF __TINY__ CheckSum equ 00CA5h NullCheck db 'Null pointer assignment', 13, 10 lgth_NullCheck equ $ - NullCheck ENDIF ENDIF It’s worth noting that this does nothing to prevent null pointer assignements, it just flags that one happened at some time during the execution of the software.
Ooooh boy. Where do we begin? Let’s start with an obvious one. Do you recall that there’s effectively no arbitrary limitation to filename length, path length, nesting, blah blah? When asking what technical limitations, Jeffrey told me that it wasn’t an issue that anybody wanted people to assume, so the limitations wouldn’t be disclosed? I replied that I was going to both assume and hardcode a maximum path size of 256 bytes. I wonder how many applications assume the command line is 256 bytes? As far as I can see (searching kernel for “CLIsize”) there’s no ReadSysInfo command to tell what size to expect for the command line, without physically scanning it and allocating memory on the fly. And then, there’s a problem with BASIC… SYS "OS_GetEnv" TO cli$ PRINT LEN(cli$) PRINT CHR$(34)+cli$+CHR$(34) SYS "OS_GetEnv" TO cli% p% = 0 REPEAT IF (cli%?p% <> 0) THEN p% += 1 UNTIL (cli%?p% = 0) PRINT p% I just threw that together. Now give it an Obey file to do something like: <Obey$Dir>.CLItest 1234567890 And it’ll reply: 83 "BASIC -quit "SDFS::RISCOSPi.$.Coding.Projects.JimTime.BASIC1" 12345678901234567890 " 83 Now copy-paste the numbers to be really long, and it’ll now show: 0 "" 502 To show that, with a long command line, BASIC has simply silently swallowed the entire string. No error, nothing. So, like I said, where do we begin? Perhaps by identifying problems as they arise, but…
…this requires fixes to be rolled back into the OS instead of having add-on patches. There’s nothing wrong with nemo’s patch, and indeed it’s necessary for older OSs. However since RO5 is being developed and source is available, there’s no sensible reason not to implement such functionality in the proper place, in FileSwitch… |
nemo (145) 2554 posts |
To agree concisely with Rick’s point: I am concerned only with the API… What? should be rigidly defined, but How? and Where? are implementation details. There is an argument for having detectable bugfixes where the fix changes behaviour (ie I require bug X fixed, so I will check the flag/version-number/module-name) but tighter checks are an optional enhancement. |
Rick Murray (539) 13850 posts |
Twice in as many days?!? Would you like to go lie down? Shall I go put the kettle on?
Exactly. Just to make the point (because I’m at home on a Saturday and not feeling like writing single sentence replies): We have a little blue mouse pointer on the screen. It is controlled by a physical mouse with at least three buttons. The “API” is the mouse behaviour – dragging the mouse down and left and clicking the left button will take the pointer to the bottom left corner of the screen, and the click – most likely over a filing system icon, will open the filer for that filing system. This is the mouse’s API as seen from the point of view of the end user. This is the expected behaviour. That some hardware has a quadratic (Acorn) mouse, that some have a serial mouse, that many of the newer machines will have a USB mouse… whether it is attached by wires, 2.4GHz radio link, or infrared… those are all implementation details and not relevant when looking at it in the context of “moving this lump around the table moves the pointer on the screen”. It would be unacceptable for the mouse to wrap around to some other part of the screen, or to move backwards on a Tuesday, or for the buttons to require more clicks in the evening than in the morning. These would be unexpected behaviours. There is an argument for dynamic mouse acceleration, but that’s usually a user option. Ditto for pointer trails for visually impaired users, on-screen target lines, whether the on-screen object activates on mouse click or release, etc etc.
<cough> |
Rick Murray (539) 13850 posts |
I noticed a UI oddity: ==> Help on keyword WimpButtonType *Configure WimpButtonType sets whether the back, close, iconise and toggle-size icons act when you click on them or when you release the mouse button afterwards. Syntax: *Configure WimpButtonType Click|Release This is probably what is meant in Settings as “Button clicks take instant effect”. But, wait, why it is specifically four bits of window furniture? What about the Cancel and Save icons? Radio buttons? Etc? It would be useful if this behaviour could be extended to icons in general. |
nemo (145) 2554 posts |
I suspect the thinking is that Templates can be changed, whereas Furniture is forever. [insert Ikea joke here] |