What would AArch64 BASIC look like?
Steve Fryatt (216) 2105 posts |
Because we’re not talking about a 64-bit integer variable suffix; we’re talking about a 64-bit integer indirection operator. And Richard addressed the reason why it’s OK in an earlier post, as he keeps pointing out. The question of 64-bit integer variables is another one entirely, if one decides to implement both 32-bit and 64-bit. BB4W has So you might do something like
in our hypothetical1 64-bit BASIC.

1 Hypothetical on RISC OS; it already exists elsewhere. |
Richard Russell (1920) 95 posts |
It doesn’t, but |
Richard Russell (1920) 95 posts |
For me, it’s maximising the compatibility that’s important. I do agree that porting my interpreter to RISC OS isn’t an attractive solution, not least because some of the ways in which my BASICs have diverged from Sophie’s over the years would have to be reversed. For example it would certainly have to be modified to load and save ‘Acorn format’ tokenised programs, including big-endian line numbers (ugh!) and two-byte tokens. It would also need to load and save Acorn-format data files, with their reversed strings etc. Not necessarily difficult changes, but somebody would have to implement them.

Then there are some of the more subtle ways in which my BASICs differ from Sophie’s, for example I’ve merged ON ERROR LOCAL and LOCAL ERROR into one operation (LOCAL ERROR is accepted but does nothing). More significantly there’s the different behaviour on integer overflow (Sophie’s BASICs sometimes ‘wrap’ but mine always report ‘Number too big’). Plus there’s the issue of my ‘suffixless’ variables being numeric variants rather than floats, something which I think is a major advantage but some RISC OS users might find hard to swallow. There are plenty more minor differences when one looks closely, some of which could be deal-breakers.

So in an ideal world I would be hoping for an AArch64 interpreter written in assembly language for speed and compactness, generally working in accordance with what RISC OS users have come to expect, but with the greatest possible degree of compatibility with my interpreters given those constraints. But if nobody steps up to the plate, porting my BASIC (with modifications) does at least provide a fallback which would – I hope everybody agrees – be better than nothing. |
Steve Pampling (1551) 8170 posts |
As a dabbler in programming and a full-time (paid) support person working on networks, I feel it should be pointed out that (lifting the text directly from a document1) So, essentially, x86 is the one that is out of step, and the item missing from the text above is that ARM deployments outnumber x86 deployments, so x86 instances are definitely in the minority.

1 Serialization in Object-Oriented Programming Languages – Konrad Grochowski, Michał Breiter and Robert Nowak |
Richard Russell (1920) 95 posts |
ARM can be big-endian but AFAIK all mainstream ARM platforms (e.g. iOS, Android, even RISC OS) are little-endian, and ARM C compilers by default assume little-endian operation (for example it’s necessary to use the -mbig-endian switch to force gcc to generate code for an ARM running in big-endian mode). But more to the point, the 6502 is little-endian so why did Sophie originally choose to use big-endian line numbers on that platform? I’ve asked before, but never received a satisfactory explanation. |
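For concreteness, the two byte orders under discussion can be shown with a small Python sketch (Python rather than BASIC, purely for illustration). The tokenised Acorn format stores the high byte of the line number first; the 6502, Z80 and (in practice) ARM are natively little-endian:

```python
import struct

line_number = 266  # &010A

# Big-endian: high byte first, as in Acorn-format tokenised programs.
big = struct.pack('>H', line_number)     # b'\x01\x0a'

# Little-endian: low byte first, the native order of the 6502, Z80
# and (in practice) ARM.
little = struct.pack('<H', line_number)  # b'\x0a\x01'
```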
Rick Murray (539) 13840 posts |
Fifty hours a week is part-time dabbling? I usually1 work 35 hours a week. That’s considered full time.
The 6502 might be little-endian, but line numbers are 16-bit values and the registers are eight bits wide, so I don’t think it matters which way around the bytes are read.

1 A bit less right now “because Covid” |
Steve Pampling (1551) 8170 posts |
Probably network use and mainstream Unix (AIX, HP-UX, Solaris) would be the likely answer. Designing the ARM to run as either is probably for flexibility, leaving any switchover in usage to be a soft change – theoretically RO and apps could be switched… |
Richard Russell (1920) 95 posts |
Maybe not on the 6502, but the Z80 could read all 16 bits in one instruction, which Sophie must have been aware of. Given a free choice, little-endian to match the native byte order of both the 6502 and Z80 would have been more logical IMO. It was enough of an issue that I chose to use a different program format for the Z80 version, and I have stuck with it ever since. |
Richard Russell (1920) 95 posts |
That time machine must have come in handy: BBC BASIC 1981, HP-UX 1984, AIX 1986, Solaris 1992. I don’t think network use is relevant, but in any case I’m pretty sure Econet was little-endian. |
Steffen Huber (91) 1953 posts |
At least AIX on POWER and Solaris on SPARC are big-endian, and always have been. |
Richard Russell (1920) 95 posts |
PowerPC: 1992, SPARC: 1987. |
Steve Drain (222) 1620 posts |
I agree in principle, but as Richard points out, there are probably insurmountable differences, so inter-operation is not very likely. A shame. However, the closer the two are, the easier it is to convert one to the other.
That seems to be an excellent aim.
There’s the rub. There are so many things I might add to BASIC before even thinking of AArch64, but as many before me have found, there is a problem with the source. If I were to undertake it I would want to re-factor and fully comment the source. That would break the chain of development and would almost certainly not be accepted. A couple of years back, with Martin Avison’s encouragement, I attempted a partial job that would retain byte compatibility with the assembled code. It was not very satisfying work and I stopped because of the tedium. Alongside it I did some re-factoring to my own taste, and I would still enjoy that.
Indeed. |
Richard Russell (1920) 95 posts |
Indeed, both key features of ARM32, and both abandoned in AArch64 (bringing this back on-topic). |
Steve Drain (222) 1620 posts |
RISC OS BASIC has only one byte for the line length and that seems to be tightly coded into the source. I suppose longer lines might be split with some clever logic, though. You might recollect that I chose ‘~’ to split lines in Basalt, but they were still limited in length. I kept ‘\’ as the breakout character to allow entry into the program, principally for structures. |
David Feugey (2125) 2709 posts |
I can’t find it yet. Just the BBCSDL project. I’ll check later.
Oh yes. And even if it’s sometimes different. |
Richard Russell (1920) 95 posts |
That’s the point. The physical line length is limited because of this, but by using the \ character logical lines can be any length in my BASICs (except for single-line IF … THEN … ELSE statements in which the ELSE must be on the first physical line).
|
Richard Russell (1920) 95 posts |
I’ve added the Console Mode editions to the BBCSDL project (because they share source files). |
Steve Drain (222) 1620 posts |
Got it! |
Chris Gransden (337) 1207 posts |
It does build OK for RISC OS, but it always gives the error ‘Couldn’t allocate memory’ when run. Maybe the UnixLib mmap emulation is not up to it. |
Richard Russell (1920) 95 posts |
It may well be that none of the |
jan de boer (472) 78 posts |
Line numbers being MSB first: the reason could have been the program-termination combination &0D,&FF. A line number with LSB first would be cumbersome when determining whether the end of the program had been reached (two byte checks needed), while MSB first makes it possible to use all line numbers up to &FEFF with one check. |
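jan’s single-check argument can be sketched in Python (an illustration, not anyone’s actual interpreter code), assuming the Acorn line layout of <&0D><line-hi><line-lo><length><tokens>, with the length byte counting the whole line:

```python
def acorn_lines(program: bytes):
    """Yield (line_number, tokens) from an Acorn-format tokenised
    BASIC program; stop at the &0D,&FF terminator."""
    i = 0
    while True:
        assert program[i] == 0x0D           # every line starts with CR
        hi = program[i + 1]
        if hi == 0xFF:                      # one check: MSB first means
            return                          # &FF can never start a real
        lo = program[i + 2]                 # line number (max &FEFF)
        length = program[i + 3]
        yield (hi << 8) | lo, program[i + 4:i + length]
        i += length                         # length covers the whole line
```

With the low byte first, the &FF test alone would not be conclusive and the high byte would have to be examined as well, which is the “two byte checks” jan mentions.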
Rick Murray (539) 13840 posts |
One can say much the same for C programs built using the DDE, so it’s really more a good argument against our continued dependence on a ~35-year-old emulated floating-point system.
Strings, as natively handled by BASIC, are limited in size. Some might say too limited. (And that’s aside from the general shonkiness that functions like strncpy may, if the size limit is reached, return invalid unterminated strings.) The thing is, strings are harder than they look…
GavinWraith (26) 1563 posts |
Amen to that. There is an unavoidable principle in computer science: how you represent something (i.e. what datatype you should use) depends on what you want to do with it. In arguing about programming languages most people, I am sorry to say, do not ask themselves such questions, being blinded by what they already assume and mistake for universal solutions. By ‘string’ what do you mean? Each programming language uses that word to mean something different. There are a number of things you can do with strings: print them, compare them in a number of ways, concatenate them, match them to patterns, extract data from them, … Tell me what your priorities are among these uses and only then can I tell you the most efficient datatype to use. But perhaps the use you are actually prioritizing is ‘it should remind me of the thrill I had all those years ago when I first programmed in ..’; that is to say, nostalgia? Be honest. Why this obsession with the past? Things have been learned since then. Look around. We should not be repeating choices made under constraints that no longer constrict us. |
Richard Russell (1920) 95 posts |
Strings, in my BASICs, have a 32-bit length. It’s limited (on a 64-bit system) but not in a meaningful way! ‘Unlimited’ length strings which can contain arbitrary data (i.e. there’s no reserved ‘terminator’ character such as NUL or CR) are incredibly useful. I routinely search multi-megabyte binary files using INSTR, or copy them like this:
|
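Richard’s BBC BASIC snippet has not survived here, but the counted-string idea itself is easy to show. In this rough Python analogue (an illustration, not his original code) the length is stored alongside the data, so the string may contain any byte, including NUL and CR, and an INSTR-style search needs no terminator:

```python
# A counted string: the length lives alongside the data, so no byte
# value is reserved as a terminator.
data = b"header\x00\x0dpayload\x00trailer"

length = len(data)            # explicit length, like BASIC's 32-bit count
pos = data.find(b"payload")   # INSTR-style search over the raw bytes
                              # (0-based here; BASIC's INSTR is 1-based)
```

A NUL-terminated representation (C-style) could not hold this data at all: the first \x00 would end the string.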
Richard Russell (1920) 95 posts |
In my BASICs I put the line-length byte first (before the little-endian line number) so the end of the program is marked just as conveniently as &0D,&00. This also allows line numbers to go up to &FFFF rather than &FEFF because it’s the line length, not the line number, which marks the end of the program. |
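Based on that description, the scan can be sketched in Python the same way as above (the exact field layout, with the length byte counting the whole line including itself and the closing CR, is an assumption drawn from the text):

```python
def russell_lines(program: bytes):
    """Yield (line_number, tokens) from a program laid out as
    <length><line-lo><line-hi><tokens><&0D>, ending with a zero
    length byte, so the file finishes ... &0D, &00."""
    i = 0
    while True:
        length = program[i]
        if length == 0:                  # one check on the length byte,
            return                       # so line numbers can reach &FFFF
        lo, hi = program[i + 1], program[i + 2]
        yield (hi << 8) | lo, program[i + 3:i + length - 1]
        i += length                      # step over the whole line
```

Because the end-of-program test is on the length byte rather than the line number, no line-number value is sacrificed as a sentinel.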