Basic ERL after an Abort
Martin Avison (27) 1494 posts |
Have you ever been infuriated when a program gives an Abort on Data Transfer and a Basic error line … only to discover that it just refers to the last line of the program? If so, please read on…

The Problem

There is a significant problem with Basic when external code is called using CALL, USR, SYS, or a *command (either directly or using OSCLI), and that code Aborts with Data Transfer, Undefined Instruction or Instruction Fetch. The Error Line (ERL) then given by Basic is often the last line of the program, rather than the Basic line containing the CALL, USR, SYS or *command.

The useless ERL is very confusing, and does not help diagnosis, particularly when it is the only report from a user. It only tells the developer that there is a severe error somewhere in the program, which is little help. I know this lack of information has caused me, and many other developers, to waste countless hours just identifying where in the program the problem is, before the actual problem itself can be tackled. The larger the program, and the more external calls, the more wasted time – even with debugging aids (eg Reporter!)

A very simple example program purely to demonstrate the problem is:
which gives an Abort on Data Transfer with an ERL of 50, not 30.

There is a worse case when a command is executed from disc (either using OSCLI or without) and the command Aborts. A message ‘Attempt to use badly nested error handler (or corrupt r13)’ is given, then the Abort message (without any ERL), and then the program ends. This also applies to any aborting Module command which has changed R13_usr.

The Cause

The problem is caused when the Basic LINE pointer in R12 (which points to the current line) is corrupted, because Basic uses that pointer to find the program line number (ERL) after an error. If R12 is outside the program, ERL is currently set to zero or the last line of the program.

After an Abort in a SYS, or in a module *Command (issued direct or via OSCLI), R12 is always corrupted. After an Abort in code executed from a CALL or USR, R12 is corrupted only if the called code changed it. However, I can find no recommendations in the Basic manual for register usage in code called by CALL or USR.

If R12 < PAGE then ERL is always set to 0 (and omitted by the default error handler).
If R12 > TOP then ERL is always set to the last line of the program.
If r13_usr has been corrupted, the Basic error handler is reset to the default, causing the program to end abruptly without any line number at all.

Possible Solutions

There would seem to be two possible ways to avoid the R12 corruption and provide a useful ERL:

1. Change the Kernel Abort Handlers to restore or avoid corrupting R12. This would seem to be very complex to achieve, although it would probably have the least performance overhead.
2. Change Basic to enable it to recover R12 after an Abort. R12 could be saved on entry to the external code, and restored after an Abort by the Basic error handling code. The possible options seem to be to save R12 either on the Basic stack (which is already done for CALL and USR), or somewhere in &8500-86FF.
The overhead would be a single extra memory save for each CALL, USR or SYS, which seems very acceptable for the potential gains.

When r13_usr has also been corrupted, it would seem possible to restore r13 from the last error handler stack pointer in ARGSP-120, which then gives an error with an ERL, and enables the program to execute the current error handler. This is exactly what Basic does after an error anyway, if it thinks r13 was valid!

Proof of Concept

To see if it is practical to change Basic to avoid these problems, I have created a patched version of Basic v1.48 which corrects the ERL for all aborts without any observed detrimental effects. If it can be agreed where R12 (corrected from ERL) should be stored, then I will change the Basic source to mimic the patched code, and run further tests. My suggestion is to free some valuable storage by removing from Basic all code associated with the TWIN (and possibly ARMBE) editors, which is surely unnecessary today?

I will be interested in any comments, either of support, or suggesting better ways to fix the problem, or things that I have missed or got wrong. Indeed, if anyone who knows the Basic source would like to cast their eyes over the details of my work, please get in touch. |
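As a rough illustration of the rules under The Cause and of solution 2, here is a minimal Python model. It is not Basic's code: the Program class, the addresses, the line lengths and the function names are all hypothetical. Only the three rules (R12 < PAGE gives 0, R12 > TOP gives the last line, otherwise the line R12 points into) and the save-and-restore idea come from the post.

```python
from bisect import bisect_right


class Program:
    """Toy model of a Basic program in memory (hypothetical layout)."""

    def __init__(self, page, lines):
        # lines: list of (line_number, length_in_bytes), stored consecutively
        # from PAGE upwards.
        self.page = page
        self.numbers = []
        self.starts = []
        addr = page
        for number, length in lines:
            self.numbers.append(number)
            self.starts.append(addr)
            addr += length
        self.top = addr  # first address past the program

    def erl_from_r12(self, r12):
        """Derive ERL from the LINE pointer, following the rules in the post."""
        if r12 < self.page:
            return 0                    # below the program: ERL is 0
        if r12 > self.top:
            return self.numbers[-1]     # above the program: last line
        # Otherwise: the line whose start address is at or before r12.
        return self.numbers[bisect_right(self.starts, r12) - 1]


def call_external(program, line_ptr, aborts, save_line):
    """Model an external call that may abort; return the ERL reported.

    save_line=True models the proposed fix: LINE is saved before the call
    and restored by the error handler after an abort.
    """
    saved = line_ptr if save_line else None
    if not aborts:
        return None                     # no error, so no ERL is reported
    r12 = 0xFFFF0000                    # stand-in for a corrupted R12
    if saved is not None:
        r12 = saved                     # error handler restores saved LINE
    return program.erl_from_r12(r12)


# Five 16-byte lines numbered 10 to 50; the external call is on line 30.
prog = Program(0x8F00, [(10, 16), (20, 16), (30, 16), (40, 16), (50, 16)])
line30 = prog.starts[2]
print(call_external(prog, line30, aborts=True, save_line=False))  # 50
print(call_external(prog, line30, aborts=True, save_line=True))   # 30
```

In this model the unpatched path reports 50 and the patched path reports 30, which matches the symptom described above; the only extra work on the patched path is the single save before the call.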
Chris Hall (132) 3554 posts |
A very good idea. I had wondered why the last line number was returned. I also found that a quick way to identify the last line of a programme was to do a LIST with CTRL and SHIFT held down and then press Escape, but that now seems to return the last line listed as the error line. |
nemo (145) 2546 posts |
In all my large BASIC programs I use a dynamic LIBRARY system and make sure the last module starts with ‘REM some unknown place’ so that BASIC reports ‘Data abort at line 10 in some unknown place’. Which is meagre compensation, I know, but it saves staring at the wrong line.
I’ve never understood why BASIC is so lax about its Handlers. It would be easy to store the stack pointer on entry to any code or, indeed, a last-good-R12 (as memory corruption can also cause aborts). If one was going to do that, I might be tempted to register the Handlers with the ARGP pointer as part of getting BASIC away from having to be based at &8000. It would be rather nice to be able to have ARGP somewhere else, sometimes. |
Steve Drain (222) 1620 posts |
It is a bit of a PITA, isn’t it?
There are some hints in the CALL branches routines section. ‘R12 = LINE (for error reporting)’ or variants are there for some, but not all, routines that require R12. It is certainly not clearly laid down, but I quickly found out that it was essential for Basalt. I have it explicitly documented in my private SH manual for the Basalt code: ‘The LINE pointer in R12 must not be overwritten because it is used for error reporting, which is not always under the control of Basalt code’ Perhaps I should have included that in the BASIC SH manual. In a few keywords it was necessary to manipulate R12 to get an error reported at the most appropriate point.
That sounds very interesting. How extensive is the patch?
Do you mean R12? The error routine calculates the ERL, doesn’t it? This will only be available in new versions of the OS, I suppose, otherwise we come up against the barrier of softloaded BASIC. Nevertheless, it is a change that does not have to be backward compatible to be very useful. There are 5 words at the top of the arguments block that have not been specified AFAIK, although I have come across their unofficial use in the past.
I am slightly confused as to why you are referring to this as storage. If you need room for the necessary code that will surely just be a matter of re-assembly after the source changes. There is only a vestige of TWIN/ARMBE in the application BASIC arguments space, and part of that is already overwritten. |
jgharston (196) 8 posts |
In my PDP-11 version of BBC BASIC I update a LINE variable every time the program steps to the start of a new line, and on an error copy LINE to ERL, but that was mainly because on the PDP-11 I’ve only got six registers to play with. startup: |
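A minimal Python sketch of the approach described in the post above, for comparison with saving R12 only around external calls. The interpreter loop, the Abort exception and the sample program are hypothetical; the only parts taken from the post are that LINE is updated each time execution reaches a new line and is copied to ERL on an error.

```python
class Abort(Exception):
    """Stand-in for an abort raised inside external code."""


def run(program):
    """Execute (line_number, action) pairs; return ERL on error, else None."""
    line = 0                        # the LINE variable, held in memory
    for number, action in program:
        line = number               # updated at the start of every line
        try:
            action()
        except Abort:
            erl = line              # on an error, copy LINE to ERL
            return erl
    return None


def bad_call():
    raise Abort("Abort on Data Transfer")


program = [(10, lambda: None), (20, lambda: None), (30, bad_call),
           (40, lambda: None), (50, lambda: None)]
print(run(program))  # 30
```

Because LINE lives in memory, it does not depend on any register surviving the abort. The cost in this model is one store for every line executed, rather than one for each external call.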
jgharston (196) 8 posts |
> If it can be agreed where the ERL should be stored
You mean LINE, not ERL. Basic.Hdr.Workspace has a line that says: so that’s logically where to place extra workspace data, which by my quick mental arithmetic would put it at &86E8. |
Martin Avison (27) 1494 posts |
Sounds a much larger change than I was contemplating.
Surprisingly small – only about 20 (well chosen) instructions!
Yup. Now corrected.
There must be some solution to that, surely?
The storage I was referring to was the ‘vestige in the arguments space’ of which the first byte is used for something else. It will also remove more code from the module than I plan on adding, but that is incidental!
A far larger change than I was planning, and a far larger overhead.
Yup. Done that. Looks safe to me! And I have not been able to break it (yet). |
Rick Murray (539) 13840 posts |
Yes. Really, BASIC should not be making any assumptions about the state of registers on return (i.e. BASIC should save state). While it is one thing to say to the user “don’t do this…”, what we see here is a fault condition where the OS “does stuff”. R12 is a private word pointer for modules on SWI call, for instance. There is nothing a user can do, for if an abort is taken, you’ll get tossed back to BASIC with a messed-up R12 whether you saved it or not. It’s BASIC’s responsibility. Likewise USR mode R13 getting munged. If BASIC maintains its own copy, then this should be restored on exit from the assembler code and, at most, it should WARN you that R13 was messed up (indicating potential stack/unstack discrepancy); though this thankfully hasn’t happened to me yet, for one of the first things I do is stash R14 into the stack for later retrieval, so if the stack is messed up… ;-) |
Steve Drain (222) 1620 posts |
It has occurred to me that packaging could be an answer to installing an up to date soft loaded BASIC just once. It would work for those who are happy to let packaging update their !System, and those who will update their !System manually when it is required by a new program, but it would still leave those who do neither out in the cold. I am not sure I can side with the group who would ‘force’ the package solution.
MEMM. The words above that are the ones JGH and you and I are suggesting could be used, each in a different way. Be aware that there has been some juggling of the locations in that area of the arguments over the long life of BASIC, so I would go for the highest word. |
Martin Avison (27) 1494 posts |
Oddly enough, that is exactly what I am using for testing! |
Martin Avison (27) 1494 posts |
Changes which should preserve the Basic ERL after an abort have today been included in the Development ROMs. I hope this is seen as an improvement, and welcome to others! |