BBC BASIC - 64 bit integer support, long string support

145 posts, 21 voices


Jan 30, 2024 6:44pm Stuart Swales (8827) 1357 posts	I think there should be a few more bits for error correction ;-)

Jan 31, 2024 6:21am Clive Semmens (2335) 3276 posts	Until I scrapped my RiscPC I could have printed out 256! for you, nae bother, then its pi th root to ten thousand decimal places… I’m not sure how useful that would be. I’ve not bothered to update my programs from the 26-bit world.

Jan 31, 2024 9:12am David J. Ruck (33) 1636 posts	Won’t they run under Aemulor? But you could always feed them through my !ARMalyser; it probably only needs a couple of flag-preserving instructions altered to get them OK for 32-bit.

Jan 31, 2024 9:22am Clive Semmens (2335) 3276 posts	I could fix them myself, easily – but unless someone expresses an interest in them, I cannae be bothered. “I’m not sure how useful [they] would be.” I’d actually be more interested in rewriting them in AArch64. And not interested enough to bother with that, either. More interesting fish to fry.

Jan 31, 2024 10:03am Jean-Michel BRUCK (3009) 362 posts	If you need to write 256! or pi th root to ten thousand decimal places… PariGP does it very well. 256! => 507 digits \p 10000 realprecision = 10009 significant digits (10000 digits displayed) Pi^(1/4) for the most curious… https://jeanmichelb.riscos.fr/MathUk.html

Jan 31, 2024 1:11pm Clive Semmens (2335) 3276 posts	“PariGP does it very well.” Which makes it even less likely I’ll bother to update my old 26-bit assembler routines! I’m interested to know the algorithm you’d use to calculate (256!)^(1/pi) (which was what I meant) though…

Jan 31, 2024 4:28pm Jean-Michel BRUCK (3009) 362 posts	Thanks for the clarification of the calculation, mathematical writing is more universal. (256!)^(1/Pi)= 2.3009717058286548507941211587893666081 E161 The same calculation with a precision of 10000, => 168 digits = 2300971705828654850794121158789366609023877923961811218803685877317 3399935491923421444185602221009491225772101615482650473218513092284 4000264589131900902203053407.185643898872800151078601069———-etc time 1,591ms (ARMX6) On the other hand, I am unable to tell you which algorithm is used. I wanted a program that would allow me to do calculations and draw curves. I created a HMI that allows me to use PariGP easily. The mathematical part is done by very competent people and it is written in C (We can use the functions in our own programs…)

Jan 31, 2024 5:57pm Clive Semmens (2335) 3276 posts	“The mathematical part is done by very competent people and it is written in C” I think I can fairly claim to be (or have been…) competent – 1st class honours in Maths, with a special interest in number theory, and came into documenting the ARM instruction set architecture and assembly language from a programming background. But I’ve never been a serious C programmer – apart from BBC Basic & ARM assembly language all my programming experience is in older languages like FORTRAN, COBOL & various obsolete assembly languages. I’m very happy to defer to someone else to do this stuff now!

Mar 20, 2024 2:15am tymaja (278) 174 posts	I know the feeling re: using BASIC + Asm – it is just too good a development environment, and it feels like a step back going to many ‘higher level’ languages! Regarding 64-bit integers and typos (%%): one way around this could be to disallow the creation of variables that differ only by the number of %s, so integer% and integer%% cannot both be created at the same time. It could help, but could become complicated because variables reside in memory between RUNning a program, and it isn’t possible (I think) to ‘delete’ individual variables from BASIC (aside from ‘NEW’ or ‘LOCAL’). Perhaps simpler would be to highlight such ambiguous naming in LVAR, perhaps by colouring the ambiguous names?

Jun 21, 2024 6:25am tymaja (278) 174 posts	Done a bit more digging into ARM BBC BASIC. It will probably be easier, in the short term (weeks), to add ‘longer string support’ than 64-bit integer support. The 32-bit integer support is probably best left ‘as-is’; changing it would break a lot of software. So, % seems the way to go there. It is fairly easy to ‘hook’ into the code – I have added % support (and set it as a new variable type), have added storage for A%-Z% ‘static’ int64s, and have added display of those to LVAR 😃. Currently I display those variables as &00000000:00000000 in LVAR, because otherwise there are too many zeros! However, this really is just experimentation. BASIC itself uses 32-bit data paths for integer work (for obvious reasons!), but it means that a LOT of work is needed, with one example of many being EXPR; even with a 64-bit integer variable type, anything that goes through EXPR will end up as a 32-bit signed int, which presents challenges. Essentially, the entire ‘core’ of BASIC needs to be upgraded to 64-bit data paths, to properly handle the existence of 64-bit integers. Such an upgrade would also need to think about signed and unsigned support, as the signed imm-32 support in BASIC, as it exists, can be challenging (especially the >> operator). If ARM64 was actually just ARM32 with wider registers, it would be a lot easier to move to 64-bit, and then just use 32 or 64 bit as needed. Long strings seem easier in some ways. The 5-byte string variable data would need changing to 6 (or 8!) bytes. The biggest challenge I see is that the design of BASIC itself is such that it has the ARGP register, with the variables descending down from there. The VCACHE then sits at &0 (relative to ARGP), and is big enough (&1000+) that any data beyond that can’t be accessed easily at a word-aligned level. Shifting VCACHE causes a whole range of strange errors! ARGP itself is fixed to &8700, and uses all the data from &8000-&8700.
Moving ARGP higher is possible, but there are some limits here too; &8700 fits into a ‘rotated immediate shifter operand’, but if you raise it too high, you end up with addresses beyond &10000, and then face challenges if you try to load &10100 as an immediate. This is relevant to strings, because of the string accumulator, 256 bytes, sitting below the ‘word aligned access’ region of ARGP (-4 to -&200ish). The ERRORS and OUTPUT memory areas, adjacent to STRACC, should really be increased (I haven’t understood those fully yet, but at least one of them can receive results from the string accumulator, which would write all over the entire workspace if the string was too long for the OUTPUT accumulator at least!). A challenge that exists here is that BASIC currently uses &8000-&8700 for the main workspace, and a decent amount for the VCACHE above ARGP. Increasing STRACC to 64K would require at least 64K in OUTPUT, and probably 128K in OUTPUT, and maybe 64K in ERRORS also. If a version of BASIC was made using a fixed 65535-length string ability, it would need at least 128K of extra workspace for each instance of BASIC, meaning that all the small BASIC WIMP apps would take up an extra 128K when running. On the flipside, extending a string length above 255 means adding another byte to the string descriptor data, at which point the most difficult part of 64K string support has been addressed! I don’t think any of the above are ‘deal-breakers’. In the short term, increasing string lengths to, say, 1024, would be more manageable … and the work to do it would be the same as 64K, and if done properly, enabling 64K string support could be done by changing a compiler variable at compilation time. Would 1024 byte string support be seen as a good step forward? Would be keen to know thoughts on this! In the meantime, I have done a few interesting things. I noticed that the ‘area from -&200 to &0 below ARGP’ was kind of crowded.
Moving ARGP up higher was reasonably easy, and allowed room for static 64-bit ints, and longer string accumulators. Not that they are used as yet, but at least it doesn’t crash! Still, there was ‘crowding’, so today I managed to make VCACHE mobile (without causing Bad Address Offset or corruption errors), and I then managed to move the position of the ‘word-length’ workspace relative to ARGP, and got it to work at &200 so far. This could be a minor breakthrough in some ways, because the overcrowding in the ‘space below ARGP’ vanishes, and everything still works (including running complex WIMP apps), and using the modified BASIC module softloaded (so far) causes no instability issues! I think there is life in the BBC BASIC code yet. It is a robust piece of code (despite appearing spaghetti-like in places). The stack usage is interesting – but actually allows for ‘dumping the whole stack and starting anew’ without losing the current program, or causing any data aborts etc, as one example. Moving it to 64-bit would certainly present challenges, not least from the architectural changes in AArch64 (such as the -256 to 255 range for most immediate loads/stores; the exception is ‘unsigned immediate’, which, being unsigned, presents an issue with negative offsets from ARGP (…)), although that seems less of an issue today!
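Editorial sketch of the ‘rotated immediate’ limit mentioned above, in Python rather than ARM assembler. It assumes the classic 32-bit ARM data-processing immediate encoding (an 8-bit constant rotated right by an even amount); the function name `is_arm_immediate` is made up for this illustration and is not taken from the BASIC source.

```python
def is_arm_immediate(value: int) -> bool:
    """Return True if value fits a 32-bit ARM data-processing immediate,
    i.e. an 8-bit constant rotated right by an even amount (0, 2, ..., 30).
    The function name is hypothetical, chosen for this sketch."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        rotated = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False


# &8700 is &87 shifted left by 8 bits, so it fits in one instruction.
print(hex(0x8700), is_arm_immediate(0x8700))
# &10100 has set bits nine positions apart, so no 8-bit window covers it.
print(hex(0x10100), is_arm_immediate(0x10100))
```

A value that fails this check has to be built with more than one instruction or loaded from memory, which is presumably the ‘challenge’ referred to for addresses such as &10100.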

Jun 21, 2024 6:27am tymaja (278) 174 posts	Edit: %% (two percent symbols in a row) was changed to % (just one percent symbol) above for some reason, so hopefully it still makes sense!

Jul 24, 2024 4:06am tymaja (278) 174 posts	Looking at this a bit further: long string support is definitely the easier of the two ‘upgrades’. The string identifier is 5 bytes long (word, byte), and is unaligned, so it could be possible to extend it to 6 bytes long, to have two bytes for the string length. The STRACC, OUTPUT and ERRORS accumulators could either be left where they are, or moved to a (dynamic?) area, where they could be dynamically changed in size during program execution if needed (I would favour something along the lines of SWI “XBasic_Op”, which could be used as an X SWI, allowing programs to maintain backward compatibility by handling this and limiting strings to 255 bytes if the op is not featured)… Do we have a SWI allocation for SWI “Basic_”? If not, we should probably allocate one (I will explore further re: allocating Basic_ without allocating any specific ‘operation’ strings beyond that)… The above is the ‘easy’ upgrade… Next up: 64-bit integer support… A pretty big challenge here is that the numerical code in BASIC is based on a 32-bit integer accumulator, with the IACC being moved around between registers, pushed onto the stack, pulled from the stack. If we were to use two 32-bit registers for the IACC, problems arise when moving an IACC-register pair to another register (as it would need to be two registers), and BASIC juggles the registers (and the stack) in a very skilful way, using pretty much all the available resources.
I can imagine a challenging situation if IACC expands to 64 bits on the stack, but some parts of BASIC still use IACC as a 32-bit register, etc; load/store of 4 or 8 bytes would be a challenge, leading to stack corruption if any errors are made… (however, BASIC VI does use 8 bytes for storing FPs, so that does offer some hope that an 8-byte integer can be set up). Assuming the above can be sorted, another issue arises. BBC BASIC V uses a 40-bit FP format, which is useful because you can store 32 bits of mantissa, and BASIC uses floating point to do a fair few of the integer operations (in a clever way, such that it is almost negligible in terms of performance loss compared to using dedicated 32-bit ints for % variables … and may even offer a speedup and reduction in code size compared to having alternate arithmetic pathways for int and float). BBC BASIC VI uses FPA (or modern ARM FP), which has plenty of bits for the FP mantissa, so there are no issues with integer accuracy loss with large integers if they are converted to and from FP. However … if we upgrade the integer subsystem in BASIC to 64 bits, we would need to have a floating point format with 64 bits of mantissa (ahem, FPA10, at least internally), otherwise there will be issues with floating point ‘corruption’ when large 64-bit ints get ‘compressed’ into the ~53ish bits of ‘64-bit FP’ mantissa… Even the latest AArch64 CPUs only offer 64-bit floating point, which is not good enough if BASIC was to be just ‘upgraded’ to 64-bit ints.
So – with 64-bit ints, comes a major rewrite of the numerical pathways throughout the code, probably to reduce FP dependence as much as possible (around ints) – the two choices would be to have 80+ bit FP in software, or a major rewrite of a lot of the code to support 64-bit ints, keeping the ints fully isolated from the FP pathways – which would allow the use of 64-bit FPs using the on-chip hardware FP… It is a challenging situation (I have made a little progress on some of the easier stuff, though)
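Editorial sketch of the mantissa problem described above, in Python, whose `float` is an IEEE 754 double with a 53-bit significand. It only illustrates the round-trip loss; it says nothing about how BASIC itself converts values, and the helper name `roundtrips_via_double` is made up for this example.

```python
import sys


def roundtrips_via_double(n: int) -> bool:
    """True if the integer n survives conversion to a double and back.
    Hypothetical helper for this sketch, not part of any BASIC source."""
    return int(float(n)) == n


# A double carries 53 significant bits.
print("significand bits:", sys.float_info.mant_dig)

# Every 32-bit integer fits comfortably, so the round trip is exact.
print(roundtrips_via_double(0x7FFFFFFF), roundtrips_via_double(-0x80000000))

# 64-bit integers above 2**53 can lose their low bits on the way through.
print(roundtrips_via_double(2**53), roundtrips_via_double(2**53 + 1))
print(roundtrips_via_double(2**63 - 1))
```

This is the same property that makes a 32-bit mantissa sufficient for 32-bit integers: the float format needs at least as many significand bits as the integer has value bits.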

Jul 24, 2024 6:51am David J. Ruck (33) 1636 posts	I would opt for having 64 bit ints as a completely separate path with all operations performed in the integer domain, leaving 32 bit as is to use any FP5 tricks.

Jul 24, 2024 9:44am nemo (145) 2552 posts	For reference: Loading a 3MB text file into a string. Manipulating it in the usual way. As for 64b ints, BBC Basic relies throughout on being able to round-trip integers through floats – not just in the expression evaluator but in other parts of the interpreter. For example there is no integer CASE – all numbers are treated as floats. This is why Basic floats are 40b – they have a 32b mantissa for this reason. So “sixty four bit integers” is vastly more work than you could imagine, and much harder than long strings, which I’ve already done. If you’re interested in things I’ve done to basic, see https://nemoBasic.20000.org/

Jul 24, 2024 10:18am Rick Murray (539) 13850 posts	“I would opt for having 64 bit ints as a completely separate path” I completely agree. I think there’s a real risk of running into many edge cases when treating things as 64 bit, such as: does setting a variable to &FFFFFFFF equate to TRUE? What happens if you add a little? Simply masking off the extra won’t work, and there may be all sorts of weirdness with address calculations that would need consideration if 64 bit. Not to mention a whole world of potential bugs and quirks due to changing a fundamental part of how BASIC works. Probably best to consider them as something different. “If you’re interested in things I’ve done to basic” Problem is, if the source mods aren’t being fed back to the master copy of BASIC… (and, whew, that’s a lot of bugs nailed to the wall)
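Editorial sketch of the &FFFFFFFF edge case raised above, in Python. It relies on the fact that TRUE is -1 in BBC BASIC, and models a variable as a two's complement integer of a given width; the helper `as_signed` is invented for this illustration.

```python
TRUE = -1  # BBC BASIC's TRUE is the integer -1 (all bits set)


def as_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement integer.
    Hypothetical helper for this sketch."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


# In a 32-bit variable, &FFFFFFFF is -1, which is TRUE.
print(as_signed(0xFFFFFFFF, 32) == TRUE)
# In a 64-bit variable the same bit pattern is 4294967295, which is not.
print(as_signed(0xFFFFFFFF, 64) == TRUE)

# "Add a little": 32 bits wraps round to 0, 64 bits carries into bit 32.
print(as_signed(0xFFFFFFFF + 1, 32), as_signed(0xFFFFFFFF + 1, 64))
```

Whether a 64-bit BASIC would sign-extend a 32-bit hex literal is a design decision the thread leaves open; the sketch only shows why the two widths disagree.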

Jul 24, 2024 11:14am Stuart Swales (8827) 1357 posts	“So ‘sixty four bit integers’ is vastly more work than you could imagine” Ta for pointing it out – should put that idea to bed unless total rewrite. “weirdness with address calculations that would need consideration if 64 bit” You don’t have to use all of the address space, you know ;-)

Jul 25, 2024 11:04am tymaja (278) 174 posts	nemoBASIC looks very interesting! (The thing with the stack is very true – ARM BBC BASIC actually throws the stack away and starts fresh regularly.) Also, when you look at the code, you can figure out how to make it do strange things, such as what happens when you do A%=%1010010001011 (and include more than 32 1s and 0s), or PRINT %1100010101012110 (the second one is particularly interesting). I extended my long string support in BASIC to use a word for string length now; still looking into 64 bit ints!
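Editorial sketch of the string descriptor sizes discussed in this thread (5 bytes today, 6 with a two-byte length, 8 with a word length), using Python's `struct` module. The field order (address word first, then length) follows the ‘(word, byte)’ description earlier in the thread; the dictionary keys and the `pack_descriptor` helper are assumptions made for illustration, not BASIC's actual code.

```python
import struct

# Little-endian, unpadded layouts: a 4-byte address followed by the length.
# The names are hypothetical labels for this sketch.
LAYOUTS = {
    "byte length": "<IB",      # current 5-byte descriptor, max 255
    "halfword length": "<IH",  # 6 bytes, max 65535
    "word length": "<II",      # 8 bytes
}


def pack_descriptor(layout: str, address: int, length: int) -> bytes:
    """Pack an (address, length) pair using one of the layouts above."""
    return struct.pack(LAYOUTS[layout], address, length)


for name, fmt in LAYOUTS.items():
    print(name, struct.calcsize(fmt), "bytes")

# A 1024-character string does not fit the one-byte length field.
try:
    pack_descriptor("byte length", 0x9000, 1024)
except struct.error as exc:
    print("byte length cannot hold 1024:", exc)

print(pack_descriptor("halfword length", 0x9000, 1024).hex())
```

The 8-byte form has the side benefit of being a whole number of words, which may be why ‘6 (or 8!)’ was floated earlier in the thread.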

Jul 25, 2024 1:32pm Rick Murray (539) 13850 posts	“to use a word for string length now” What happens if you PRINT# such a string, and can you INPUT# either type? “you can figure out how to make it do strange things” The lack of formal definition and loose parsing means you get away with some horrible things, like PROC/FN with multiple entry points, or FNs that can return different types of variable. By the way, is any version of BASIC smart enough yet to recognise SYS XOS_GenerateError and not actually bother calling the SWI in that case?

Jul 25, 2024 3:12pm nemo (145) 2552 posts	“PRINT#” New field type: But BPUT is vastly more powerful too: I’ve forgotten to update the help text – `BPUT(#F%,...)` serialises to file too. “XOS_GenerateError” The I CANNOT BELIEVE IT IS STILL BROKEN SWI handler in RO5 has a special-case for GenerateError despite the fact that its existence proves the R0-corruption was always indefensible. And that’s without the breakage of CallAVector and BreakPt. However, nemoBasic has `A$=BGET(ptr%,0)` or `A$=STRING$(ptr%)` for that. There may be a third way too. 🤔 “smart enough” It needed someone smart enough to think of it. So thank you!

Jul 25, 2024 3:20pm nemo (145) 2552 posts	A word on extending Basic: Please take backwards compatibility very seriously. An operating system that can’t run the software it was designed to run is a chocolate teapot. Alternate Basics can do what they like, but be very wary of changing anything in the built-in Basic unless you can prove it alters nothing for existing programs. That includes memory usage (ie WimpSlot) as well as the memory map and existing workspace layout presented to machine code via `CALL` and `USR`. There are many major and commercial programs that are a mix of Basic and machine code and you must not affect them in any way.

Jul 25, 2024 3:32pm nemo (145) 2552 posts	BTW, three new things going on here:

Jul 25, 2024 3:55pm nemo (145) 2552 posts	And finally, “you get away with some horrible things” How about Duff’s Device? `CASE duff OF IF0THEN WHEN0 ENDIF REM stuff that happens for 0 REM etc etc IF0THEN WHEN1 ENDIF REM stuff that happens for 0 and 1 IF0THEN WHEN2 ENDIF REM stuff that happens for 0, 1 and 2 IF0THEN WHEN3 ENDIF REM stuff that happens for 0, 1, 2 and 3 REM and so on ENDCASE` If you’re staring at that in horror, bear in mind that `IF` has no context – so `ENDIF` has nothing to check or throw away. `ENDIF` is always a NOP in BBC Basic – it’s simply a label for `THEN` and `ELSE` to look for. Although that works fine, I formalised it in nemoBasic with `ANDWHEN`[value] and `ANDOTHERWISE` which allow execution to flow from clause to clause in a `CASE`.

Jul 25, 2024 6:57pm Rick Murray (539) 13850 posts	“If you’re staring at that in horror” Umm… If I came across code like that, I’d nope out of there so damn fast. Just because it can be done doesn’t mean it should be done. Oh, now, BPUT OF LEN is a great addition. “So thank you!” 😊 “Please take backwards compatibility very seriously.” It’s okay to add things; bits have been added to BASIC in each Acorn machine release. But existing stuff MUST still work. No ifs, buts or maybes. To put this into context, regular BASIC (not the FP builds) still emulates calling addresses like &FFF1 for the MOS services that never existed on RISC OS (or Arthur) but might have been used in existing BASIC programs.

Jul 25, 2024 7:25pm nemo (145) 2552 posts	“CALL” Yup. In fact it only does that if there are no `CALL` parameters (not supported on Beeb). This has always been a slight vulnerability (if you were unlucky with your mc% address) so nemoBasic supports `CALLmc%;` (note semicolon) to disable the emulation without params. The vast majority of my new features are backwards compatible because they use new syntax that wasn’t previously possible. But a few things are definitely different, such as the fix for bitshift priority. This fixes the Most Perplexing Basic Bug™ I know of: `x=0 IF x = 1<<31 THEN PRINT "What?" ENDIF` Run that program, try to understand it. Then change the first line to `x=1` and explain. So I’m happy with that fix despite it being strictly incompatible. Meh, this isn’t `BASIC`, it’s `nemoBasic`.

Jul 25, 2024 11:05pm Colin Ferris (399) 1818 posts	Err – is there a copy of NemoBASIC to download :-)