Pi Model 3
Pages: 1 2
Chris Hall (132) 3559 posts |
Both !UnTarBZ2 (version 1.06a2) and !Netsurf (as in RC14) cause the model 3 to hang with no error message until you press ALT-Break and terminate them. LANMAN98 gives an AODT at address &FC18311C which is at offset &13470 in Shared C Library. Most puzzling – no clue what is causing it! The ZeroPain log is still empty. |
Chris Mahoney (1684) 2165 posts |
In NetSurf’s case, it’s due to a UnixLib issue. I’m not sure about UnTarBZ2. |
Chris Evans (457) 1614 posts |
Are you using LM98 2.06? When does the AODT occur? |
David Pitt (102) 743 posts |
The internal bunzip2 tool uses the SharedUnixLibrary. By way of new business, attempting to build a ROM on the RPI3 failed at ‘ |
Rick Murray (539) 13851 posts |
Would there be some value in having the illegal instruction handler fake SWP’s behaviour? |
Jeffrey Lee (213) 6048 posts |
Yeah, quite possibly. Another alternative would be to see if we can get the patcher to detect and patch the code sequences.
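As a rough illustration of what "faking" SWP in an undefined instruction handler might involve, here is a minimal Python sketch that recognises the ARM SWP/SWPB encoding and performs the swap on a toy register file and memory. The names `is_swp`, `emulate_swp`, `regs` and `mem` are hypothetical and only stand in for real CPU state; the sketch ignores condition codes, alignment and atomicity, all of which a real handler would have to consider.

```python
SWP_MASK = 0x0FB00FF0  # fixed bits of the ARM SWP/SWPB encoding (B bit excluded)
SWP_BITS = 0x01000090


def is_swp(instr):
    """Return True if the 32-bit ARM instruction word is SWP or SWPB."""
    return (instr & SWP_MASK) == SWP_BITS


def emulate_swp(instr, regs, mem):
    """Emulate SWP{B} Rt, Rt2, [Rn] on a toy machine.

    regs is a list of 16 integers and mem is a dict of address -> value;
    both are hypothetical stand-ins, not a model of any real handler.
    """
    if not is_swp(instr):
        raise ValueError("not a SWP/SWPB instruction")
    is_byte = bool(instr & (1 << 22))        # B bit selects SWPB
    rn = (instr >> 16) & 0xF                 # base register holding the address
    rt = (instr >> 12) & 0xF                 # destination for the old memory value
    rt2 = instr & 0xF                        # source of the new memory value
    mask = 0xFF if is_byte else 0xFFFFFFFF
    addr = regs[rn]
    old = mem.get(addr, 0) & mask            # load first...
    mem[addr] = regs[rt2] & mask             # ...then store (not atomic here)
    regs[rt] = old
```

For example, `0xE1020091` is `SWP R0, R1, [R2]`; calling `emulate_swp` with that word swaps the word at the address in R2 with the value in R1 and leaves the old memory contents in R0.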
Reminder: It’s not SharedUnixLibrary that’s the issue, it’s the build of UnixLib which the programs are statically linked to. |
Jon Abbott (1421) 2651 posts |
SharedCLibrary uses deprecated ARMv7 instructions as well, such as R13 in LDM/STM reglists and STM with PC in reglist |
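To make the two cases mentioned above concrete, here is a small Python sketch of how a scanner might flag them in ARM LDM/STM instruction words. It checks only the two conditions named in this post (R13 in the register list, and STM with PC in the register list); it is not a statement of the full deprecation rules in the architecture manual, and the function name `ldm_stm_warnings` is hypothetical.

```python
def ldm_stm_warnings(instr):
    """Return a list of warnings for an ARM LDM/STM instruction word.

    Only the two cases named in the post above are checked. Words that are
    not LDM/STM (or that sit in the unconditional 0xF space) return [].
    """
    if (instr >> 28) == 0xF:                 # unconditional space: not LDM/STM
        return []
    if (instr >> 25) & 0x7 != 0b100:         # bits 27-25 = 100 for LDM/STM
        return []
    is_load = bool(instr & (1 << 20))        # L bit: 1 = LDM, 0 = STM
    reglist = instr & 0xFFFF                 # one bit per register R0-R15
    warnings = []
    if reglist & (1 << 13):
        warnings.append("R13 in register list")
    if not is_load and reglist & (1 << 15):
        warnings.append("STM with PC in register list")
    return warnings
```

Two familiar encodings trip it: `0xE92DD800` (`STMFD sp!, {fp, ip, lr, pc}`) is flagged for PC in an STM register list, and `0xE91BA800` (`LDMEA fp, {fp, sp, pc}`) is flagged for R13 in the register list.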
Jeffrey Lee (213) 6048 posts |
For ARMv8, it looks like the deprecations and behaviour of those instructions are about the same. (Actually they seem to be slightly less deprecated, e.g. STM where SP isn’t the base register but with SP in the register list is deprecated in ARMv7. But on ARMv8 it’s fine) |
Rick Murray (539) 13851 posts |
Isn’t that a basic APCS entry/exit? |
Jon Abbott (1421) 2651 posts |
Probably an errata in the ARMv8 documentation, you can’t undeprecate something that’s already been deprecated! We should probably note that although they’ve been deprecated, it doesn’t mean they’ve stopped working. They may continue to work, or simply have unexpected results. It’s fairly clear where they’ve stopped working as it generates an undefined instruction eg. STR R0, [R0], #4 |
Jeffrey Lee (213) 6048 posts |
Yes, although Acorn’s and ARM’s APCS variants have drifted apart, so I don’t think ARM’s APCS has required stack frames to be constructed in that manner for quite some time.
<Pantomime voice>Oh yes you can!</Pantomime voice> IIRC ARM deprecated a bunch of stuff in the revision B ARMv7-AR ARM, and then went back on some of it for later versions of the document after they received too many complaints from the community (I think this may have been related to the use of SP as a general-purpose register).
Now that would be an erratum in the documentation if it says something works which doesn’t.
The pseudocode in the ARMv7 ARM says this: if wback && (n == 15 || n == t) then UNPREDICTABLE; So yes, it’s expected for that instruction to not work. But as we all know different processors react in different ways when faced with instructions that are architecturally unpredictable (some may store R0, some may store R0+4, some may store the answer to the meaning of life the universe and everything, some may abort) |
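The quoted check can be applied mechanically to an instruction word. The following Python sketch decodes the P, W, Rn and Rt fields of an ARM-state STR/STRB (immediate) encoding and evaluates `wback && (n == 15 || n == t)`. The function name is hypothetical, only the quoted condition is tested, and the STRT form (P=0, W=1), which the manual decodes separately, is not distinguished.

```python
def str_imm_unpredictable(instr):
    """Evaluate the quoted UNPREDICTABLE check for an ARM STR/STRB (immediate).

    Raises ValueError for words that are not a conditional immediate-offset
    store. Only the condition quoted from the ARMv7 ARM is applied.
    """
    is_store_imm = (
        (instr >> 28) != 0xF                 # not the unconditional space
        and (instr >> 25) & 0x7 == 0b010     # bits 27-25 = 010: immediate offset
        and not instr & (1 << 20)            # L bit clear: store
    )
    if not is_store_imm:
        raise ValueError("not an ARM STR/STRB (immediate) encoding")
    p = (instr >> 24) & 1                    # P: 1 = offset/pre-indexed, 0 = post-indexed
    w = (instr >> 21) & 1                    # W: writeback requested
    n = (instr >> 16) & 0xF                  # base register Rn
    t = (instr >> 12) & 0xF                  # source register Rt
    wback = (p == 0) or (w == 1)
    return wback and (n == 15 or n == t)
```

With this, `0xE4800004` (`STR R0, [R0], #4`) returns True, while `0xE4810004` (`STR R0, [R1], #4`) and `0xE5800004` (`STR R0, [R0, #4]`, no writeback) return False.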
John Williams (567) 768 posts |
May a pedant make a plea for an erratum, many errata. I do think it’s over the top to disallow curriculums, as that is now an adopted word, but I do feel that erratum/errata is still Latin? Sorry! |
Jeffrey Lee (213) 6048 posts |
Erratum noted :-) |
Jon Abbott (1421) 2651 posts |
So now we can’t trust the documentation! The pseudocode in the ARMv7 ARM says this: Note it says UNPREDICTABLE, what it actually does is Undefined Instruction (on the Pi2 at any rate, I don’t have a Pi3 to test) |
Rick Murray (539) 13851 posts |
…UNPREDICTABLE!!!!11one11!!
As Jeffrey says:
This is a Pi(1) model B; and note that writeback with Rd=Rm being unpredictable is mentioned in the ARM ARM tome (as in, the big fat 2nd edition that got made into a real book), so it isn’t as if it’s a new issue (the book way predates UAL). |
Jeffrey Lee (213) 6048 posts |
Yes, that’s perfectly valid. Check the glossary in the ARM ARM :-) UNPREDICTABLE I think some CPU manuals also document how some of the unpredictable instructions are implemented – perhaps just whether they trigger an undefined instruction abort or not, since “UNPREDICTABLE behavior must not be documented or promoted as having a defined effect”. |
Jon Abbott (1421) 2651 posts |
Ah, my mistake, I didn’t realise the word had been redefined. Getting back on topic…how on earth are we going to make RISCOS, 3rd party Modules and Software ARMv7 compatible? From the issues I’m seeing, it looks worse than previous compatibility issues as behaviour isn’t predictable between ARM core implementations. Is one generic Pi build feasible going forward, without recoding to take the worst case scenario? This would entail monumental recompiles/rewrites of the majority of high-level language software and a new APCS standard to avoid the R13 and R14/PC LDM/STM issue. SharedCLib and UnixLib are going to require a major overhaul and that’s probably the tip of the iceberg. |
Rick Murray (539) 13851 posts |
Have they really deprecated writing R13, or is it only if the base register is R13? If the latter then we ought to be okay as (from memory), APCS uses a different register as the base. IP? FP? I forget which. I believe writing PC is only informational, to aid backtrace. It could be deprecated, but given that it is there for a reason, might be better to see how the current APCS does it. |
Jeffrey Lee (213) 6048 posts |
Is that a typo for ARMv8? RISC OS has been (mostly) ARMv7 compatible for years now.
Short term? Yes. We lose out on some things (e.g. lots of code will avoid using ARMv7-only instructions in order to avoid the hassle of doing a runtime check for whether they’re supported), but on the whole ARMv6 has enough similarities to ARMv7+ that it’s possible to have one OS build that services all three architectures. Long term? (multi-core and beyond) Possibly not – the kernel could get pretty messy if it needs to deal with too many different architectures at runtime. Although if we’re going to continue to support such a large range of architectures in general (ARMv3 through to ARMv8) then we should probably focus more on restructuring the kernel source to make this easier. In which case we might find that the source restructuring would also make it easier to support different architectures at runtime (e.g. use function pointers to switch between different implementations of key routines, similar to how the ARMops are handled).
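The function-pointer idea can be sketched in a few lines. This Python example is only an analogy for the general pattern of picking a table of routines once at start-up. It does not reflect how the kernel's ARMops are actually implemented, and every name in it (`CpuOps`, `OPS_TABLE`, `select_ops`, the routine names and their return strings) is hypothetical.

```python
from typing import Callable, Dict, NamedTuple


class CpuOps(NamedTuple):
    """Table of routines chosen once at start-up (all names hypothetical)."""
    cache_clean: Callable[[], str]
    memory_barrier: Callable[[], str]


def _cache_clean_v6() -> str:
    return "cache clean (ARMv6 variant)"


def _cache_clean_v7() -> str:
    return "cache clean (ARMv7 variant)"


def _barrier_v6() -> str:
    return "barrier (ARMv6 variant)"


def _barrier_v7() -> str:
    return "barrier (ARMv7 variant)"


# One entry per supported architecture; here ARMv8 simply reuses the ARMv7
# routines, which is an assumption made purely for illustration.
OPS_TABLE: Dict[str, CpuOps] = {
    "ARMv6": CpuOps(_cache_clean_v6, _barrier_v6),
    "ARMv7": CpuOps(_cache_clean_v7, _barrier_v7),
    "ARMv8": CpuOps(_cache_clean_v7, _barrier_v7),
}


def select_ops(arch: str) -> CpuOps:
    """Pick the routine table for the detected architecture."""
    try:
        return OPS_TABLE[arch]
    except KeyError:
        raise ValueError(f"unsupported architecture: {arch}") from None
```

Callers would then do something like `ops = select_ops("ARMv7")` once and call `ops.cache_clean()` afterwards, so the architecture test is not repeated at every call site.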
What LDM/STM issue? I’ll admit I haven’t had a chance to do much testing with the Pi 3 yet, but it’s my understanding that they haven’t changed the behaviour of LDM/STM with R13/R14/PC in the register list. It’s still deprecated, definitely, but just because it’s deprecated it doesn’t mean that it won’t work. They’re just reserving the right to remove support for it in future.
If they changed LDM/STM behaviour then yes, the SCL and any code which uses it would be an issue. I’m not quite sure what the situation is with UnixLib – I think they may have drifted away from ye olde APCS and towards newer versions.
http://infocenter.arm.com/help/topic/com.arm.doc.subset.swdev.abi/index.html There are quite a few documents to read through – I’m not sure which (if any) defines the stack frame format. |
Jeffrey Lee (213) 6048 posts |
I believe writing PC is only informational, to aid backtrace. It could be deprecated, but given that it is there for a reason, might be better to see how the current APCS does it. Answer: It looks like a mandatory stack frame layout has been dropped from the standard (along with any standard way of defining stack limits or supporting chunked stacks). Instead it’s expected that debuggers which need to unwind the stack do so via DWARF debug_frame sections, and that language exceptions (i.e. C++ exceptions) use a set of unwinding tables that’s based around those used by Itanium (and which can only deal with “soft” exceptions – hardware-triggered exceptions like floating point exceptions or data aborts aren’t catered for) |
Rick Murray (539) 13851 posts |
Serious question: Are ARM losing the plot somewhat? |
Jeffrey Lee (213) 6048 posts |
Not to my knowledge. What’s so upsetting? |
Rick Murray (539) 13851 posts |
A mere few days ago, Fred Graute said it best: Alas it’s not really RISC anymore, it’s more like a Z80 on steroids. That’s sort of what I thought when browsing the ARM64 1 documentation. Anyway, I’m not that fussed that Perhaps what is more upsetting is the removal of conditional execution. I see this as one of the things that gave ARM real strength and made it unique. The instructions that can act upon the flags following a Why are the registers Wx and Xx? Wouldn’t it have been more logical to retain the ‘Rx’ nomenclature for 32 bit access? Sure, it is Word and eXtended (couldn’t use Doubleword as D is claimed by the FP for Double precision). It looks like It seems only a number of instructions recognise “
Incidentally, instructions that don’t support Special/preserved registers are scattered around in the register allocation. One of the nice things about the 32 bit APCS-R was that it was a fairly clean implementation. The low registers were trashable. Then came ones that could be used if they were preserved. Towards the end, the frame, stack, link, and current instruction locations. Consider these ARM implementation details. Useful stuff added, to keep it balanced: More flexibility with ARM and NEON/VFP. It’s more like part of the ARM rather than a separate “co-processor”. More registers, always useful. And they’re proper registers, not the microcontroller style cop-out where they claim loads and loads of registers but they really mean the first bit of memory acting as registers. It looks as if a PC-relative address can be generated with a 4GiB range in two instructions without using a literal value loaded from memory; plus literal pool addresses can now be +/- 1MiB for loads/stored or 128MiB for jumps. So maybe no more ADR/ADRL distinctions. Finally we have integer division. ;-)
Looking at this from a greater distance, it seems weird and alien to somebody used to writing assembler, though there is a sort of logic going on if you see it through the eyes of a compiler. I guess ARM are hoping to streamline the instruction set and chuck away some of the eccentricities to get a processor that can be better optimised for in compiled code; even if it looks and feels icky in disassembled form. Though I wonder how many people can still read machine level code or even care if it is ARM or MIPS, so long as it works well enough to make all the cool stuff users want to happen actually sort2 of happen. In terms of an assembler writer, I think the things that will cause the most pain are:
1 A rant from Linus Torvalds – expect steam and smoke… 2 Small qualifier – as it is not so useful to have HD videos if watching ‘em can pwn the device because the media library doesn’t bother to adequately sanitise inputs; plus operating systems / devices where it’s a snowball’s chance in hell you will see an update to the firmware – okay, it’s not ARM’s fault, but it does affect the “cool stuff” rating… |
Kuemmel (439) 384 posts |
@Rick: We’ve integer division already for all boards using Cortex A7/A15 and of course now A53 as it’s part of some ARMv7 implementations. It’s also supported by the inline BASIC assembler. Regarding ARMv8 and NEON/VFP I’m a bit annoyed as they stopped the mapping of Sx,Dx and Qx registers. Now they are different sets. The mapping allowed some tricks that are now obsolete and of course if one used that, all NEON code has to be rewritten for ARMv8 therefore. But the real big bonus for me is finally double precision for NEON ! I agree totally that they give up a lot of things ARM assembler is fun about…though if it really pays off in terms of speed/efficiency I’m willing to adapt my coder’s mind to it ;-) …but I guess it’s for sure that we will stick around with AARCH32 for quite a while to enjoy the old-school ARM. |
Jeffrey Lee (213) 6048 posts |
What’s so upsetting? They’re two different instruction sets designed for two very different CPUs. What made sense for a CPU with a 3-stage pipeline and no cache doesn’t make so much sense for a superscalar CPU with ~10 pipeline stages, multiple cache levels, multiple cores, out-of-order execution, branch prediction, speculative instruction and data fetching, etc. There are also two different approaches to programming in play – when the ARM1 was being designed the only way to get acceptable performance out of a CPU was to write everything directly in assembler. But as CPUs got faster and compilers got better people realised that in most cases writing good code quickly is more valuable than writing fast and buggy code slowly. ARM may have lost its ARM-ness, but that’s not a sign that ARM have lost the plot – far from it. Making such a big change to the architecture is a sign that they recognise that the original architecture doesn’t fit well with the kind of optimisations that modern CPU implementations are founded on (more pipelines, caches, out-of-order execution, etc. – they’re all optimisations designed to squeeze more performance out of code). If they’d simply made AArch64 the same as AArch32 but with wider registers then that would have been a sign that they’d lost the plot. Also, be glad that they didn’t go down the Itanium route, where instructions are organised into packets which must be scheduled at compile time in order to avoid any data/pipeline hazards – I can’t imagine anyone wanting to try manually writing assembler for that. |