armv8 - modules
Michael Grunditz (467) 531 posts |
Hi, I am trying to run modules on an ARMv8 target in AArch32 mode. Many of them fail with an abort on instruction fetch. The RPi 3 runs on ARMv8. Is there anything I need to do for this? |
Michael Grunditz (467) 531 posts |
Modules that fail: and UnSqueezeAIF gives me errors after bootup; I can’t run BASIC and the “supervisor” string isn’t printed. (Can this be related to printing to the screen?) |
Jon Abbott (1421) 2651 posts |
Although the Pi3 is AArch32, there are some instructions which were deprecated in ARMv7 but still work. You may find the aborts you’re seeing are due to either deprecated instructions or implementation-defined behaviour. What are the instructions that are triggering the aborts? You might have fun trying to trace them, since they are prefetch aborts, but R14 might give you a clue if you dump the registers. |
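To illustrate the R14 hint: on a prefetch abort the ARM architecture leaves R14_abt holding the address of the aborted instruction plus 4, and on a data abort the address plus 8. The Python sketch below only models that arithmetic. The function name and the sample values are hypothetical and not taken from any RISC OS source.

```python
# Offsets defined by the ARM architecture for the abort-mode link register.
PREFETCH_ABORT_LR_OFFSET = 4   # R14_abt = aborted instruction + 4
DATA_ABORT_LR_OFFSET = 8       # R14_abt = aborting instruction + 8


def faulting_address(r14_abt, prefetch=True):
    """Recover the address of the instruction that aborted from R14_abt."""
    offset = PREFETCH_ABORT_LR_OFFSET if prefetch else DATA_ABORT_LR_OFFSET
    # Registers are 32 bits wide in AArch32, so wrap the result accordingly.
    return (r14_abt - offset) & 0xFFFFFFFF


# Hypothetical register dump value, purely for illustration.
print(hex(faulting_address(0x20001004)))
```

If the recovered address lies inside a module, subtracting the module's base address gives the offset to look up in its listing.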
Michael Grunditz (467) 531 posts |
Do I need a new handler for this? Or does RISC OS include it somehow… I don’t know where it fails, so it is a bit pointless to try dumping from inside the module. |
Jeffrey Lee (213) 6048 posts |
Abort on instruction fetch implies it’s a memory management issue, rather than an invalid instruction. DADebug is one of the simplest modules you can get, so worst-case it shouldn’t be too hard to narrow down the problem by disabling bits of code. One thing I can think of which all of those modules have in common is that they all write code to memory – so maybe it’s a problem with OS_SynchroniseCodeAreas (i.e. the IMB_Range and IMB_Full ARMops). What model of ARM CPU is it? Just as we’ve had problems due to the many differences with ARMv7 implementations, it wouldn’t surprise me if we’re going to start seeing problems due to the many different ARMv8 implementations (especially since AArch32 is starting to get phased out) |
Michael Grunditz (467) 531 posts |
I did think so at first, but I added printouts in EHCI around them and they all run without aborts. The CPU is an A53. Currently I am running in EL1; is that the correct level? I do the switch in u-boot. |
Michael Grunditz (467) 531 posts |
OK, so EHCI crashes in: (*usb_ca.ca_attach)(0, softc, bus); Like it did in the RK port. My memory is short :( |
Michael Grunditz (467) 531 posts |
With RK it was timer/IRQ related. But in this case the timer and interrupt HAL work as expected. |
Michael Grunditz (467) 531 posts |
DADebug fails right here: ADR R1, DefaultStartString01 LDRB R0, [R1], #1 |
Michael Grunditz (467) 531 posts |
was the problem; I changed ADR to LDR and it went through. However, when Debug is on for USBDriver I get an abort in that one, and that didn’t happen before. I get an abort from DADebug when it is loaded, just every time it wants to print with ticket. |
Jon Abbott (1421) 2651 posts |
It’s possible line 266 is the cause and the ADR being reported is just a side effect. |
Jeffrey Lee (213) 6048 posts |
Cortex-A53 is the same CPU as in the ARMv8 Pi models, so in theory there shouldn’t be any problems. But, the errata list is quite long – so maybe there’s something there which is affecting your system but not the Pi. What revision CPU is it, and are you aware of any errata workarounds which are performed by the bootloader or by your code? The Pi uses r0p4, and I don’t think we have any errata workarounds for the Cortex-A53 in either the HAL or kernel (they’re either being dealt with by the firmware, or they’re not serious enough to cause any problems. Although I think the errata list has grown since I last looked at it, so it’s probably worth me double-checking it at some point) ADR R1, DefaultStartString Unfortunately you’ve also completely changed the operation of the function. Now instead of copying “Debug start” to the buffer it’ll be copying complete garbage. There’s a reason why “You have a strong knowledge of C and ARM assembler” is the first thing on the list of requirements when porting RISC OS to a new platform. |
Michael Grunditz (467) 531 posts |
OK, I should perhaps hide under a rock… It is just that I have used LDR R1,=string a lot. I need to read up on why it doesn’t make R1 a pointer to the string. Or, to take another example: if I have a bitmap, I do LDR R1,=bitmap and start working on it. Update… http://infocenter.arm.com/help/topic/com.arm.doc.dui0068b/Bgbbihdc.html I can’t see the obvious error… My very uneducated guess is that for some reason ADR gets out of range; ADR can’t do long references. Although that is usually caught by the compiler. But if that can happen at runtime it would explain some weirdness… |
Jeffrey Lee (213) 6048 posts |
Yes, ADR range errors should be caught by the assembler. ADR is basically just a more user-friendly way of writing “ADD Rdest, PC, #xxx”, so unless the assembler is broken there’s not much which can go wrong (and it definitely won’t cause an abort). “LDR Rdest, =XXXX” will translate to “MOV Rdest,#XXXX” if the number is representable as an immediate constant. If not, the assembler will stick the XXXX value in a literal pool and use a PC-relative LDR to load it. Regular “LDR Rdest, XXXX” is a user-friendly way of loading PC-relative data (i.e. “LDR Rdest, [PC, #yyyy]”) – so if you simply changed “ADR R1, DefaultStartString” to “LDR R1, DefaultStartString” then it would load the first four bytes of “Debug start” into R1, which would be a complete garbage address. If you’d changed it to LDR R1, =DefaultStartString then I think it would have ended up setting R1 to a small number – because (I think) modules are assembled as if they start at address zero. Using LDR R1, =Label is fine in the HAL and kernel because the assembler is told what the correct base address will be. For other modules it generally isn’t used since it complicates things when it comes to creating softloadable builds of the module (plus for most modules the build system won’t tell the assembler the base address of the module within the ROM). N.B. LDR R1, =Label is only fine in the HAL/kernel for code that’s run after the MMU has been enabled. Before then, you’d almost always want to stick to ADR/ADRL because either you don’t know what address the ROM image is going to be loaded at, or the build system hasn’t been told the address. |
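The difference between ADR and a plain PC-relative LDR, and the "representable as a constant" test that decides whether LDR Rdest, =XXXX becomes a MOV, can be modelled in a few lines of Python. This is only a sketch of the arithmetic. The load address, the image layout and all function names are hypothetical, and the immediate test covers only the classic ARM-state encoding (an 8-bit value rotated right by an even amount), not MVN or Thumb-2 forms.

```python
def adr(pc, offset):
    """ADR Rd, label: Rd = PC + offset, i.e. the address of the label."""
    return (pc + offset) & 0xFFFFFFFF


def ldr_pc_relative(memory, base, pc, offset):
    """LDR Rd, label: Rd = the 32-bit little-endian word stored at the label."""
    start = (pc + offset) - base
    return int.from_bytes(memory[start:start + 4], "little")


def is_arm_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by rot undoes a rotate-right by rot.
        rotated = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False


BASE = 0x20000000                      # hypothetical load address of the module
IMAGE = bytes(16) + b"Debug start\0"   # hypothetical layout: string at offset 16
PC = BASE + 8                          # PC as read by an ARM-state instruction at BASE
OFFSET = 16 - 8                        # distance from that PC value to the string

pointer = adr(PC, OFFSET)                            # 0x20000010, the string's address
garbage = ldr_pc_relative(IMAGE, BASE, PC, OFFSET)   # 0x75626544, the bytes "Debu"
print(hex(pointer), hex(garbage))
```

Here ADR yields a usable pointer wherever the image is loaded, while the LDR form returns the string's first four characters reinterpreted as an address.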
Michael Grunditz (467) 531 posts |
This was good info. But if it is a ROM module, doesn’t it have a fixed address? I mean, if I get an error from a module I can usually trace it down by the address. But I still don’t understand why ADR causes an abort in this case. I don’t know if this applies here, but with ARMv8 it seems like treating PC as a general register is a no-no.
Yes, of course. Btw, I am still stubborn and keep my ROM at the bottom of RAM. So if the module copies the code to random addresses, the branches can be very long. The RAM is also very high up in the address space, starting at &40000000, which means, if I understand this correctly, that the top of the RAM is a no-no. |
Jeffrey Lee (213) 6048 posts |
It does have a fixed address, but the build system usually won’t tell the assembler what that address is. It’s the difference between the “C” and “ASM” types in ModuleDB – “C” components get told the address, because the C compiler is incapable of producing read-only position-independent code. But “ASM” components don’t get told the address, and I don’t really know why. Maybe it was a limitation of the assembler that was used before everything was switched over to ObjAsm. |
Colin Ferris (399) 1814 posts |
Just not trying to teach Granny to suck eggs! But what is the ARM instruction – 2 before – the one being flagged? |
Jon Abbott (1421) 2651 posts |
The code in question is:
It’s line 277 that’s triggering the abort, changing the ADR on line 273 is simply causing line 275 to set EQ and bypass the branch. So the value of R12 is the issue. Either it’s not being set correctly, or there’s no accessible code at R12. If R12 is not being loaded correctly, SETPSR on line 266 is the likely cause. If R12 is being loaded correctly then the CPU can’t access the code at R12, either because:
|
Rick Murray (539) 13840 posts |
R12 is a pointer to the module’s private word, which is set at module initialisation. All calls to modules should set R12 (you don’t explicitly load it). It is indirected to get the workspace address (the LDR R12, [R12]). So the question becomes – what is R12 before and after that LDR? If that checks out, what is the WriteC offset there pointing to? That said, if “many” modules are crashing for a similar reason, it sort of implies that the OS may not be handling R12 correctly – either setting it up for a call, or retrieving it after module initialisation…? |
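The "before and after that LDR" check can be pictured with a toy memory model. In the Python sketch below, a dict stands in for word-addressed memory and the function models LDR R12, [R12]. The addresses, the function name and the status strings are all hypothetical, chosen only to show which step of the indirection went wrong.

```python
def resolve_workspace(memory, r12):
    """Model LDR R12, [R12]: follow the private word to the workspace pointer.

    Returns (status, workspace), where workspace is None on failure.
    """
    if r12 not in memory:
        # R12 on entry does not point at readable memory at all.
        return ("private word not accessible", None)
    workspace = memory[r12]
    if workspace == 0:
        # The private word is readable but holds no workspace pointer.
        return ("private word is zero", None)
    return ("ok", workspace)


# Hypothetical addresses: private word at 0x30000100, workspace at 0x30200000.
memory = {0x30000100: 0x30200000}
print(resolve_workspace(memory, 0x30000100))
```

A failure in the first step points at how R12 is set up for the call, and a failure in the second points at module initialisation.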
Colin Ferris (399) 1814 posts |
How are you debugging? Is there any kind of screen output? [edit 1] 275: TEQ R0, #0 |
Michael Grunditz (467) 531 posts |
Yes and yes. But I have a strange feeling about Synch code areas. What happens if it runs without error but isn’t doing anything? How do other modules communicate with it? I guess that they access the DA_WriteC. The ticket thing calls the printout function. BAH. I need to learn not to do critical debugging at 3-5am :) OS_Module: is EHCI using that to enter USBDriver? OS_Module not working could be the error. |
Jeffrey Lee (213) 6048 posts |
Try making sure that bits 19 & 20 of the system control register are clear. On ARMv7/v8 they’re used to control whether code can be executed from writable memory. http://infocenter.arm.com/help/topic/com.arm.doc.ddi0500g/BABJAHDA.html In particular, since the OS is at least half-working, I’m guessing that bit 19 is clear but bit 20 is set. |
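In the AArch32 SCTLR, bit 19 is WXN (writable regions are treated as execute-never) and bit 20 is UWXN (regions writable from unprivileged mode are treated as execute-never at PL1). On the real hardware the register is read and written with MRC/MCR p15, 0, Rt, c1, c0, 0 from a privileged mode. The Python sketch below models only the bit arithmetic, and the helper names and sample value are hypothetical.

```python
SCTLR_WXN = 1 << 19    # write permission implies execute-never
SCTLR_UWXN = 1 << 20   # unprivileged write permission implies PL1 execute-never


def wxn_bits_set(sctlr):
    """Return the names of the write-implies-XN bits set in an SCTLR value."""
    names = []
    if sctlr & SCTLR_WXN:
        names.append("WXN")
    if sctlr & SCTLR_UWXN:
        names.append("UWXN")
    return names


def clear_wxn_bits(sctlr):
    """Return the SCTLR value with bits 19 and 20 cleared, others unchanged."""
    return sctlr & ~(SCTLR_WXN | SCTLR_UWXN) & 0xFFFFFFFF


# Hypothetical value with bit 20 set and bit 19 clear, the case guessed above.
sample = 0x00100005
print(wxn_bits_set(sample), hex(clear_wxn_bits(sample)))
```

With only UWXN set, code in privileged-only writable memory still executes, which would fit an OS that is "at least half-working" while code in user-writable memory aborts on fetch.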
Michael Grunditz (467) 531 posts |
LOL CRAP, I had that commented out! Why oh why… well, everything works now. Thanks, and sorry for the mess. |
Rick Murray (539) 13840 posts |
Hey, don’t sweat it. We can learn as much from our mistakes as the stuff we do right. If somebody else runs into a similar situation, it’ll now be something “we’ve seen before”. ;-) |
Michael Grunditz (467) 531 posts |
Everything works, I am up and running; stay tuned for a proper announcement. This is supercool! |