Basic and ZeroPain
Martin Avison (27) 1494 posts |
I have spent some time recently investigating ZeroPain reports (see thread ) for a large Basic program, and I thought I would share some of my findings.
1. The ZeroPain report shows register values, and nothing Basic related (like the line number). I wrote a program to extract information from ZP reports, and then find the Basic line number and PROC/FN name.
2. In my case, for extra fun, the executed Basic is crunched, so I then had to manually work out the real source file, line number, and PROC/FN name.
3. Basic evaluates
4. The majority of the problems in the program were caused when processing lists which are linked by a ‘next address’ in the first word, with -1 indicating end of list. Constructs like are failed by ZeroPain if expr2 depends on P% being valid, because when P% gets to -1 the evaluation of expr2 can result in data being read from page zero. Before ZeroPain this data read by expr2 was just ignored.
5. I can see no simple solution to the WHILE, apart from adding… and replacing the WHILE condition with just FNtest. Can anyone suggest a better (or faster) solution?
6. There are differences between how an RPi and Iyonix behave with ZeroPain.
7. Most of the bugs are very long standing (before my time, honest!)
8. I can see no way that the ZeroPage reads could be avoided automatically (except by emulation). |
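As a rough illustration of points 4 and 5, the sketch below builds a small list and walks it with the loop condition moved into FNtest. The list layout (next address in word 0, a key in word 1), the names list%, item% and target%, and the use of P%!4<>target% as a stand-in for expr2 are all assumptions made for the example; this is not the code from the program under discussion.

```
REM Sketch only: hypothetical layout, word 0 = next address (-1 ends), word 1 = key
DIM list% 3*8
FOR I%=0 TO 2
  item%=list%+I%*8
  IF I%<2 THEN !item%=item%+8 ELSE !item%=-1
  item%!4=I%*10
NEXT
target%=99 : REM deliberately absent, so the walk reaches the -1 terminator

REM Form that reads low memory: AND evaluates both sides, so when P%=-1
REM the stand-in for expr2 (P%!4) reads from address 3.
REM WHILE P%<>-1 AND P%!4<>target% : P%=!P% : ENDWHILE

P%=list%
WHILE FNtest
  P%=!P%
ENDWHILE
IF P%=-1 THEN PRINT "Not found" ELSE PRINT "Found at &";~P%
END

DEF FNtest
REM Return before touching memory if P% is the end-of-list marker
IF P%=-1 THEN =FALSE
=P%!4<>target%
```

The function call costs time on every iteration, which is presumably why a faster alternative is being asked for.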
Jeffrey Lee (213) 6048 posts |
I had lots of fun with fixing null pointer dereferences in the printer manager sources that were caused by lack of lazy evaluation. Plus use of the $ operator to construct strings from null pointers. Adding lazy evaluation support would be nice, but it would have to be some kind of opt-in thing to avoid compatibility issues (and even then you might end up with software which runs on the new version of BASIC but not on the old one).
"6. There are differences between how an RPi and Iyonix behave with ZeroPain."
That’ll be down to the Pi version of BASIC avoiding using unaligned loads (while the Iyonix one does an unaligned LDM). Fixing ZeroPain to have some level of support for unaligned loads is still on my todo list (along with reporting the BASIC line number) – hopefully I’ll get it done within the next few days. |
Jon Abbott (1421) 2651 posts |
Can you not initially evaluate the expression outside of the loop and only update it if P% is valid? P% = liststart% |
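One possible reading of this suggestion, as a sketch: the result of the expression is held in a flag that is only refreshed while P% is a real address. The one-item list, the key at offset 4, target%, more% and the stand-in for expr2 are assumptions for the example, not the original code, which is not reproduced here.

```
REM Sketch only: hypothetical one-item list, word 0 = next address, word 1 = key
DIM liststart% 7
!liststart%=-1 : liststart%!4=10
target%=99

P%=liststart%
more%=FALSE
IF P%<>-1 THEN more%=(P%!4<>target%)
WHILE more%
  P%=!P%
  REM Only re-evaluate the stand-in for expr2 while P% is a real address
  IF P%<>-1 THEN more%=(P%!4<>target%) ELSE more%=FALSE
ENDWHILE
IF P%=-1 THEN PRINT "Not found" ELSE PRINT "Found at &";~P%
```

Note that in this form the expression has to be written twice, once before the loop and once inside it.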
Steve Drain (222) 1620 posts |
I have done a few linked lists and see this as an algorithmic problem – I would only use one condition for the WHILE, which is to move through the list from first to last item. Then an exit from the list would be made using expr2.
However, BASIC does not let you do that, although you can guess what does. ;-) Even so, there is a way, by wrapping the code in a procedure.
Does that replicate what you have in mind? |
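A minimal sketch of the procedure-wrapping idea, assuming a hypothetical list layout (next address in word 0, a key in word 1) and hypothetical names liststart%, target% and PROCfind. The WHILE has the single condition P%<>-1, and the early exit is made with ENDPROC once the test on the current item succeeds.

```
REM Sketch only: hypothetical one-item list, word 0 = next address, word 1 = key
DIM liststart% 7
!liststart%=-1 : liststart%!4=10
target%=10

PROCfind
IF P%=-1 THEN PRINT "Not found" ELSE PRINT "Found at &";~P%
END

DEF PROCfind
P%=liststart%
WHILE P%<>-1
  REM P% is known to be valid here, so the test is safe and written once
  IF P%!4=target% THEN ENDPROC
  P%=!P%
ENDWHILE
ENDPROC
```

Because the item test only runs inside the loop body, it is never evaluated with P%=-1.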
Steve Drain (222) 1620 posts |
This topic has raised its head from time to time over many, many years. Two, rather clumsy, ways to cope with this in BASIC as it stands are, for AND:
and OR:
Even more clumsy for AND:
If I were to implement lazy evaluation in Basalt, it would probably have this syntax:
which would return FALSE on the first non-true condition.
which would return TRUE on the first true condition. This could also be implemented in BASIC without conflicting with existing code, although it would not solve the problem highlighted by Martin. |
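For readers unfamiliar with the usual single-line idioms, here is a sketch of how short-circuit AND and OR can be approximated in BASIC as it stands. These are common forms and may or may not be the ones meant above; a%, b% and PROCact are hypothetical names used only for the example.

```
REM Sketch only: a%, b% and PROCact are hypothetical
a%=0 : b%=5

REM AND: the second test is only evaluated if the first is true,
REM so the division cannot fail when a%=0
IF a%<>0 THEN IF b%/a%>1 THEN PROCact("and")

REM OR: the second test is only evaluated if the first is false
IF a%=0 THEN PROCact("or") ELSE IF b%/a%>1 THEN PROCact("or")
END

DEF PROCact(t$)
PRINT "Condition met: ";t$
ENDPROC
```

The OR form has to repeat the action, which is part of what makes these idioms clumsy, and neither helps when the condition belongs to a WHILE rather than an IF.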
Steve Drain (222) 1620 posts |
That cannot work – AND followed by ( is valid syntax – which is probably why I never implemented it. Alternatives are not obvious, but there is the rather awkward
or variants. |
Steve Drain (222) 1620 posts |
BASIC V has always prevented the writing of strings to addresses below &8000, with error
Perhaps that could be extended to reading, and to the other indirection operators. It adds an instruction, which would once have been important, but hardly now. |
Jeffrey Lee (213) 6048 posts |
"BASIC V has always prevented the writing of strings to addresses below &8000, with error"
Ah, I didn’t know about that. Extending that error to cover reading strings from low memory would certainly make sense – but probably with a cutoff of 256 rather than &8000 (old versions of the OS did store things like the command line in the first 4K of workspace). Not sure about the other indirection operators though – it currently doesn’t error for writes using them.
Maybe the solution, without breaking compatibility with older OS versions, would be to detect any read/write of <&8000 and BL to another routine which does a detailed check of the address (OS_ValidateAddress? OS_Memory 24?) before deciding whether to throw an error or not.
Obviously these changes won’t do anything to help people’s broken code, but since BASIC is an interpreted language it makes sense for the interpreter to show a friendly error message when the program tries to do something silly instead of triggering a crash inside the interpreter. |
Martin Avison (27) 1494 posts |
Thanks for the comments. While (!) all solutions to the WHILE will work, I think I prefer Steve’s because it seems to express what is going on more clearly, and it only has expr2 defined once (I hate duplication, as it can lead to problems later). Out of interest, I checked if there were any speed differences between the original code and my solution, Jon’s and Steve’s. The original code is fastest (but wrong!). My solution was about 22% slower, Jon’s about 36% slower, but Steve’s only 1% slower. But the slowest was about 1/4 second for 100,000 entries, so probably not significant. Using EXIT would be very nice, but I wanted to use just Basic. |
Steffen Huber (91) 1953 posts |
DEC Pascal solved the “full eval vs. short circuit eval” rather neatly by introducing the keywords and_then and or_else (obviously inspired by Ada). But I guess introducing new reserved words/tokens in BBC BASIC is a headache. |
Steve Drain (222) 1620 posts |
Nice to know. ;-) Here is a further alternative to test:
but I think that might be slower and I do not like the logic as much.
A pity, that. ;-( It is a coincidence, but I have been revisiting loops in Basalt over the last few days. I have rewritten EXIT to do what I always planned and it will now exit from a particular type of loop at a given nesting level, properly discarding all other stacked information, but not beyond PROC/FN. I have also rewritten the code for DO..LOOP, which was only experimental up to now. It does not match the requirement, but this is possible in a list context:
|
Steve Drain (222) 1620 posts |
It is not for Basalt, and would not be for BASIC if anyone were to want to. However, they would have to be 2-byte tokens, as WHILE is: &C8&95. In a program AND THEN and OR ELSE are distinct syntaxes in BASIC and would tokenise correctly, so they could be used. AND_THEN is not, by the way. ;-) If the context is purely IF…THEN…ELSE… then fairly efficient code can already be written for short circuiting – see above, or I could expand on that. The problem is when the context is more general, as with the WHILE in Martin’s example. |
GavinWraith (26) 1563 posts |
AFAIK the original Dartmouth BASIC of Kemeny and Kurtz (1964) did not have AND or OR. The only way of achieving laziness was to use IF. Judging by the confusion about evaluation strategies evident in the multiplicity of early LISPs around that time, I think it is fair to say that it was not until long after the creation of BASIC that the penny dropped. |
Martin Avison (27) 1494 posts |
It is slower than your ENDPROC example – about 10% worse than my original. And I agree that the logic is not as clear! Recent Basic modules can be built either for a particular machine (in ROM) or a standalone version that works on any machine softloaded, all without the BasicTrans complications. I realise that softloading more than once is dangerous, but I think it may be time to (a) encourage use on older machines, and (b) consider again changes that could be made to improve Basic. |
Rick Murray (539) 13840 posts |
I think there ought to be a conditional in the assembly that will select old behaviour for old machines and new improved for new machines. The inclusion of a routine to select in all versions sounds a lot like strangling ourselves with broken backwards compatibility. |
Jeffrey Lee (213) 6048 posts |
I’m not sure having a conditional is the right way to go, primarily because I’d like us to be in the situation where the kernel is the only component which knows at build time where the kernel workspace/processor vectors are. Everything else should behave itself and use the proper APIs to work it out (currently FPEmulator is the only thing violating this rule – it’ll take a bit of messing about to get rid of its global workspace pointer).
On RISC OS 5, the overhead of the validation code will be minimal – if you’re hitting the validation code, it’s because you’re trying to access an area which probably doesn’t exist, so you’re most likely going to hit the error case. Softload versions for older OSes will see more of an impact, because they’re more likely to have valid data there, but the impact can be minimised by caching the address range validity on module/interpreter startup.
If, for compatibility, we have an abort handler or dummy page to deal with errant page zero accesses then it would be nice for BASIC to be able to detect that and adjust its error checking accordingly – which wouldn’t be possible if the logic was hardcoded into the build of the module.
So in my eyes the options are “do nothing”, “block all string reads below 256”, or “add zero page address validation to all memory read/write operators”. |
GavinWraith (26) 1563 posts |
May I presume that no current version of BASIC raises an error when an address in the zero page is the target of a poke? Would it be sensible to have such a feature, or should such checking be left to the programmer? Different compromises between security, speed and nannying are possible; but I cannot see BASIC as a good candidate for sandboxing. |
Rick Murray (539) 13840 posts |
FTFY. Remember, BASIC can nanny [1] because if you really need to access low memory you can jump into assembler, switch to SVC mode and do the access. If you can’t do that much, you really shouldn’t be poking around there (or any other part of OS workspace).
[1] Disclosure: I’m the one that reckons stability ought to be improved by making much of the system parts of memory inaccessible to USR mode code. |
Steve Drain (222) 1620 posts |
The last would be best, if more work, and offer the chance for updating/debugging older programs when the errors pop up. It would reduce the ‘Abort on data transfer’ errors, wouldn’t it? Meanwhile, word, byte and float writes below &8000 could probably be modified to generate an error like the string writes. |
Steve Drain (222) 1620 posts |
For lazy evaluation I suggested the impossible syntax:
I have been cogitating and come up with ANDLOG for AND logical, not bitwise:
There can be ORLOG, too. How do those feel? Implementing them in Basalt has been trivial, as it would be in native BASIC. In case there are objections to the conjunction of two keywords, I would point to SUMLEN. ;-) |
Ronald May (387) 407 posts |
"I have been cogitating and come up with ANDLOG for AND logical, not bitwise:"
sounds like LOGarithm |
Jon Abbott (1421) 2651 posts |
"Ah, I didn’t know about that. Extending that error to cover reading strings from low memory would certainly make sense – but probably with a cutoff of 256 rather than &8000 (old versions of the OS did store things like the command line in the first 4K of workspace). Not sure about the other indirection operators though – it currently doesn’t error for writes using them."
Wouldn’t the cutoff be &4000? And do you need to worry about older OS versions, considering RO5 moved the environment high? I can’t say I’ve come across any software that doesn’t get the address legally via OS_GetEnv. One problem I did come across is USR software that modifies the environment string, which obviously fails as it’s read-only now.
Detecting and doing a detailed check is problematic, as any read in BASIC beyond bad programming will be down to accidental reads as noted in the OP. In this scenario the software would have failed randomly or worked by pure luck, as what’s returned would have been indeterminate. It’s for this very reason that I chose not to provide an automatic fix in ADFFS for inadvertent Page Zero reads and instead look in detail at the code and try to figure out what it was doing – and correct it at runtime. This won’t be possible in the case of BASIC, so reporting the issue with a sensible error report, beyond simply Aborting, would be a good start, so the authors can recode to avoid the inadvertent Page Zero read.
Does BASIC need extending? Workarounds are fairly easy to implement. |
Rick Murray (539) 13840 posts |
BASIC shouldn’t cause a crash like that. With the following code, one would expect expr2 being zero to return zero (FALSE), not crash attempting to actually read from address zero…
You also are assuming that the software in question is being maintained and that somebody is going to perform the necessary modifications. |
Steve Pampling (1551) 8170 posts |
Who knew how it did that apart from Sophie (and colleagues)? Zero page change is one of those under-the-bonnet changes that needs a matching under-the-bonnet fix such that the same BASIC code works on pre and post ZP change systems.
Always lots of fun trying to change a crunched/compressed BASIC file. |
Steve Drain (222) 1620 posts |
There are workarounds for many things, but not all. [1]
Workarounds can be clumsy and not natural to implement.
Workarounds are useless unless the programmer has come across them. [2]
Workarounds are almost inevitably slower than native BASIC implementations. [3]
Does BASIC need extending? No, programs can still be written. Is there an advantage to extending BASIC? Certainly, witnessed by the multitude of requests for BASIC to do more over the decades. A challenge: do you have a workaround for a single-line IF depending on lazy evaluation of multiple OR conditions? ;-)
[1] An almost intractable problem is an indeterminate number of parameters to a routine or dimensions to an array. I have a method for the first and SW provided one for the second, but I doubt they are widely known, even if seldom required.
[2] Only in the last month or so a programmer has expressed surprise that there is a method of creating named variables from BASIC, a workaround that dates back to RO2.
[3] Just look at the variety of results that Martin found with the workarounds for his WHILE problem. That there is a workaround with very little penalty is unusual, and it is clearly not well known. With Basalt, the speed advantages of some keywords can be a couple of orders of magnitude. |