What would AArch64 BASIC look like?
Julie Stamp (8365) 474 posts |
Sorry I wasn’t referring to your work which I think is great, I was musing about how one would build a compiler. I wonder if anybody has ever written a gcc frontend for a language with either dynamic scope or multiple entry-points? |
Steve Drain (222) 1620 posts |
There was a typo in the list of possible statements. Look for |
Richard Russell (1920) 95 posts |
BBC BASIC for Windows has had arbitrary (32-bit) length strings since version 6.00 (so for nearly six years), and BBC BASIC for SDL 2.0 has had them from the beginning. See for example the string allocation routine allocs() at line 729 here |
Steve Pampling (1551) 8154 posts |
So Richard, you’d be happy if RODL were consulting/talking with Sophie about enhancements? |
Terje Slettebø (285) 275 posts |
Hi guys. I’m late to this thread as well. I’ve read through it, and maybe I’ve missed it, but I think a reasonable starting point would be to let AArch64 BASIC be compatible in syntax and semantics with AArch32 in general, even though integer variables would be 64-bit. I mean, how often do existing programs depend on the actual variable size? The indirection operators would still work as normal, i.e. “!” would still address 4-byte words. We would need to come up with another indirection symbol for 8-byte double-words. Do you have any examples from existing code that wouldn’t work with 64-bit integer variables? If we could change existing code to not care about the size of integer variables, as long as they are at least 32 bits, then we wouldn’t have to come up with a new kind of 64-bit integer variable type. If this has already been discussed and dismissed as impractical, then I’d appreciate a pointer. |
Julie Stamp (8365) 474 posts |
What will happen if you use indirection on a 32-bit machine? E.g., what is the result of doing
Also, are you proposing to sign-extend, or zero-extend? From Alarm.bas.Main,
which would need sign-extend when assigning to |
GavinWraith (26) 1563 posts |
This was a point I had to consider in RiscLua, which was devised to look as much as possible like BBC BASIC, while standard Lua uses 64-bit integers. |
Chris Hall (132) 3554 posts |
Surely !a% would give a 32 bit result whereas £a% (or whatever new indirection operator) would give a 64 bit result. Or even use DEEK and DOKE. |
Rick Murray (539) 13805 posts |
I would suggest keeping ‘%’ as a 32 bit value, and sign extending if necessary when passing via SYS calls. Why? Because I can imagine a lot of stuff will carry an implicit expectation that the largest a hex value for an integer can be is eight characters, as it is a 32 bit value (something that has been the case since the 6502 versions). Changing that to something larger risks breaking stuff in odd ways. It’s the same logic as C that left long as it was and used long long for 64 bit integers. |
Stuart Swales (1481) 351 posts |
Don’t know if this is actively being worked on but would suggest keeping an eye on how Richard Russell has gone about this [re BBC BASIC for Windows]. @Rick: it’s mostly Windows that went the long long route for 64-bit integer; most sane platforms went to LP64 (pointers and longs both 64-bit). Anyhow, int64_t, uintptr_t and friends. |
Terje Slettebø (285) 275 posts |
Hi Julie.
This would set A% to &FFFFFFFF
This would invert the bits, so would print 0.
Sign-extending. The idea is to do like you’d do when recompiling a C program written on a 32-bit computer to run on a 64-bit computer: The “int” type would be twice as large, but unless you depend on that fact, programs should work as they are. Naturally, the crux of the matter is this: whether real-life code depends on this.
Right, so the question is, might sign-extending lead to different problems? |
Terje Slettebø (285) 275 posts |
@Rick
This depends on the platform. It’s perfectly valid for a compiler to treat “int” as 32-bit and “long” as 64-bit, or even both as 64-bit. “int” is supposed to be “the natural word size”, i.e. the word size that the processor uses natively. For AArch32 that’s 32-bit, and for AArch64 that’s 64-bit. Thus, I think it would be reasonable that var% is a 64-bit variable for a 64-bit ARM BASIC. If it breaks too much code, we may want to reconsider, but I’d definitely try this out first, rather than freezing the var% type at 32 bit. |
Julie Stamp (8365) 474 posts |
Sorry the I would have thought that then
Yes, definitely. Look at this
Let’s say Edit: Ok, it wasn’t so difficult, we get
In particular, it wouldn’t be the expected 1 for a zero-length string.
[1] Again, everything in this post is running on a 32-bit OS. |
Chris Hall (132) 3554 posts |
When you are using an indirection operator, it returns what is stored at that address. It doesn’t matter (on your postulated 64 bit BASIC) if the address is a 64 bit value. The indirection operator ‘!’ returns the next four bytes, little endian. That is a 32 bit result. The indirection operator ‘?’ returns an 8 bit result. You would need a new indirection operator (say £) to return a 64 bit result. Likewise you would need a new integer variable type to hold a 64 bit value. a% would still only hold a 32 bit value. BASIC already allows real and integer values to be handled automatically in its arithmetic. Now you get into a minefield. |
Steve Pampling (1551) 8154 posts |
Some (many?) keyboard layouts don’t include £ so I think you need to look elsewhere |
Terje Slettebø (285) 275 posts |
You’re right, it was a typo on my part, I meant that A% would contain the 64-bit representation of -1, i.e. &FFFFFFFFFFFFFFFF.
I haven’t taken that into account, and it may not be a good idea to try something like that, because of the interaction with the OS calls, as you point out. My posting was based on a 64-bit BASIC running on a 64-bit OS. I don’t think it makes much sense to make a 64-bit BASIC for a 32-bit OS, so the following assumes a 64-bit BASIC on a 64-bit OS.

Even though the variables would be 64-bit, the indirection operators would be unchanged. If we changed the semantics of “!”, that would break all the code using it. Let’s take your example again, removing the bitwise not:

    A% = -1
    PRINT !A%

The first would set it to &FFFFFFFFFFFFFFFF. The next tries to read a word at this address, i.e. the top of the 64-bit address range, so I’d think that would fail.

Let’s explore the indirection operators some more, assuming addr% points to a valid address. Consider this 32-bit code:

    !addr% = &FFFFFFFF // -1
    a% = ?addr% // Read the lower byte
    PRINT ?addr% // This prints 255, which means that "?" zero-extends.

Moving to a 64-bit BASIC and OS, let’s invent a hypothetical indirection operator for doubleword, e.g. “#”, and see how it may interact with variables and the other indirection operators:

    A% = -1 // &FFFFFFFFFFFFFFFF
    #addr% = A% // Write the doubleword to addr%
    B% = ?addr% // &FF
    C% = !addr% // &FFFFFFFF
    D% = #addr% // &FFFFFFFFFFFFFFFF

In other words, we could keep the zero-extending semantics of “?” also for “!”. When printing C%, it would print the number 2^32-1, not -1.
If we were to keep a 32-bit int calling convention for SYS, then we’d definitely need to use zero-extending (both for passing and returning values), or we’d end up in the kind of mess you describe. However, if we run on a 64-bit OS, then this code should not cause any problems: We pass in a 64-bit address, and we get one back. Since A% and B% in this case are 64-bit integers, they’ll retain the full address. One problem with inventing a new 64-bit integer type for BASIC is that every existing program will then work on only half registers, so we need to take care when operating on them, making sure to sign-extend when needed in the interpreter. Moreover, having two integer types will also make the interpreter potentially much more complicated, since all integer operations and functions will need to work with both, as well as being able to convert between them. |
Steffen Huber (91) 1948 posts |
I wonder if it is a really good idea to aim for a grand unified BASIC variant for 64bit that is 100% backwards compatible. We already have the “classic” BASIC V, the “FP” BASIC VI and of course “BASICVFP”. So “BASIC64” would be the way to go. Once you started going down that alley, you wouldn’t need to worry too much about arcane behaviour being kept 100% compatible, and could start to add long strings and records and proper memory management and… Has anyone looked into porting Brandy recently? I think BASIC64 would need a good bit of prototyping, and I guess using Brandy for experimentation would lead to results much quicker than trying to experiment with our native BASIC interpreter. I remember someone saying that some parts of RISC OS are pretty standard ARM code, but that the BASIC interpreter is “a work of art”. I’m sure Steve will be around in a minute and will tell us that he already has extended Basalt to include those 64bit changes :-) |
Steve Drain (222) 1620 posts |
Not a hope. Basalt is in hibernation. In any case, it is so tightly bound up with the BASIC code that there is nothing to be done. On the other hand, this last week’s discussion has made me, also, think about Brandy as a possible route forward. It is beyond me to do anything with it, but looking back at the original documentation it might be reasonable. Careful with “BASIC64”, that is the command to invoke BASIC VI FP. ;-) |
Steve Drain (222) 1620 posts |
‘£’ is &A3 which is the keyword Addendum
|
Steve Drain (222) 1620 posts |
If it is a work of art it is Cubist, but it is equally very valuable. |
Steve Fryatt (216) 2103 posts |
BB4W uses |
Steffen Huber (91) 1948 posts |
Is the tokenization of BBC BASIC V/VI really something to be kept intact for a 64bit variant? It doesn’t look like there will be a road to extend BASIC for 64bit data types and keep it 100% compatible. So why not allocate a new filetype for new extended 64bit BASIC, and keep the old 32bit BASIC as-is? |
Steve Drain (222) 1620 posts |
I had pretty well ignored the implications of AArch64 until this, so I had a quick look. One question arises immediately. A lot of the recent posts seem to have assumed that moving BASIC on would inevitably mean 64-bit integers, but by my reading 32-bit integers can continue to be used just as they are now with the W register designations rather than the X ones. There is the lack of conditional execution and of multiple loads and stores, but otherwise it does not look to be a huge barrier. I have determined to start writing ARM assembler to at least avoid conditional execution from now on, much as it pains me ;-) Using 64-bit integers would then be an additional feature, for which a syntax would have to be settled. ‘]’ seems reasonable, and for most keywords there could be automatic casting – with errors for out of range – as there is now for integers and floats. As I wrote, but did not release, a version of Basalt with 64-bit integers, I claim my prize. Of course, the way they were implemented was completely different – they were passed by reference and required keywords for operators – so this is no basis for any future development of BASIC. ;-) |
Steve Drain (222) 1620 posts |
Another thought for transliterating straightforward ARM to AArch64, as would be needed for the BASIC module. Replace stacking multiple registers with LDM/STM with copying to/from the unused registers. It hardly makes best use of the processor, but could be pretty simple to implement. I am sure it must have been suggested before. ;-) |
Stuart Swales (1481) 351 posts |
That only works for one procedure call, Steve! Stuff has to be stacked, just with LDP/STP. And if it’s an odd number of registers to be stacked in current code, just specify one of the unused registers! And yes, 32-bit integers are just as ‘native’ as 64-bit. Retaining 32-bit as the natural BASIC/Basalt integer size would just leave the programmer porting their BASIC code to worry about those variables that hold pointers, and the size needed to store them. |