Crosscompiling to Orange Pi PC
Tristan M. (2946) 1039 posts |
This doesn’t really go anywhere and isn’t worth putting in the Orange Pi thread. I just thought I’d share what I’ve worked out so far.
First, I grabbed the arm-none-eabi gcc toolchain for building bare metal ARM binaries. blink.zip on the linked forum page is a version which has been modified from the Raspberry Pi version of the project. It makes a good sanity check for the toolchain. Just do a “make clean” and “make”. Funny thing I noticed is the binary I built is twice the size of the one in the .zip. Weird.
Next up I found dumping the suitable Armbian image on an SD card is the way to go for a start, as U-Boot already contains the commands needed for loading an arbitrary binary.
Things got a little weird after this. The type of program I need hasn’t really been made in maybe 20 years: a terminal that handles per character tx / rx, and has X/Y/Zmodem and Kermit protocols. I have an FT232 USB UART connected to the debug UART pins of the Orange Pi PC. The computer talking to it needs to be set to 115200bps. What I had to do was turn on the OPi PC whilst bashing a key in PuTTY furiously to interrupt boot. Then I had to close PuTTY and open CuteCom, type in “loadx 0x41000000 115200” (I think that’s the right way round) then upload blink.bin using XMODEM. Y and ZMODEM commands are supported in the way you’d think.
It’s also possible to put a .bin in the boot partition of the SD card and load that, but it’d involve either card swapping or extra rebooting / loading the OS. Pretty sure U-Boot can be set to wait indefinitely but I also use Armbian so I don’t want to mess with that. It’d probably avoid needing the PuTTY stage though.
Step 0 complete. Booting to an arbitrary binary. I didn’t work out anything new. Just collated a few bits of scattered information and got it to work. |
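For anyone wondering what the terminal program actually sends during the XMODEM upload described above, here is a minimal sketch of how a binary such as blink.bin gets cut into XMODEM blocks. It is not taken from PuTTY, CuteCom or U-Boot; the function name `xmodem_blocks` and the `use_crc` switch are made up for illustration. It shows the original 8-bit checksum framing and, optionally, the CRC-16 variant that many implementations prefer. The handshake (NAK, ACK, EOT) is left out.

```python
import binascii

SOH = 0x01  # marks the start of a 128-byte data block
PAD = 0x1A  # Ctrl-Z, the conventional filler for a short final block


def xmodem_blocks(payload, use_crc=False):
    """Return the framed XMODEM blocks for payload (illustrative helper)."""
    blocks = []
    for offset in range(0, len(payload), 128):
        # Every block carries exactly 128 data bytes; pad the last one.
        chunk = payload[offset:offset + 128].ljust(128, bytes([PAD]))
        # Block numbers start at 1 and wrap around after 255.
        number = (offset // 128 + 1) & 0xFF
        header = bytes([SOH, number, 0xFF - number])
        if use_crc:
            # CRC-16 with polynomial 0x1021 and initial value 0, high byte first.
            trailer = binascii.crc_hqx(chunk, 0).to_bytes(2, "big")
        else:
            # Original XMODEM: sum of the data bytes, modulo 256.
            trailer = bytes([sum(chunk) & 0xFF])
        blocks.append(header + chunk + trailer)
    return blocks
```

A real sender waits for the receiver to ask for the first block, resends any block that is not acknowledged, and finishes with EOT. The framing overhead alone is 4 bytes per 128 in checksum mode.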
Rick Murray (539) 13850 posts |
RISC OS is reasonably well endowed in this respect: Nettle, Hearsay, Connector… |
Tristan M. (2946) 1039 posts |
Rick, I agree totally! Problem is I’m working with the gcc toolchain I mentioned, which I do not have on RISC OS. I mean I could set up an nfs share on the PC (or Pi) running Linux, then use that, but it just seems a little convoluted.
My Raspberry Pi 3 grew too big for its case so it’s elastic banded on to the top of the acrylic Pi 3 case that I mangled to hold the Orange Pi PC. The Pi 3 has a PiFi DAC+ v2.0 on it now. That’s why it’s too tall. I suppose if someone were using the correct toolchain to build RISC OS for the Orange Pi PC it’d be a very nice and compact dev environment. Pocket sized even.
Looking at the H3 datasheet again yesterday, there is a little more information than I thought. It’s just hard to unpick because it involves jumping around the document a lot. Between that and a few discussions around on the framebuffer there seems to be enough to possibly glean how to initialise it for HDMI and possibly composite.
There is one thing that has been troubling me from the beginning. CPU throttling. It’s a capable chipset but it needs some hand holding. From what I’ve read, legacy kernels run the SoC @ 1.6GHz maximum. It’s really 1.2GHz, but can be taken up to 1.6GHz. Plus it’s just a hot chip. The correct PLL(s) should probably be set low to be safe initially. |
Tristan M. (2946) 1039 posts |
I tried putting a daily mainline kernel build of Armbian on the SD card instead of the legacy sunxi kernel. It was an interesting experiment. The framebuffer terminal was running at the correct resolution for my DVI monitor on first boot as detected by U-boot. This gives me some hope. It doesn’t seem to be able to run an X server but that’s no big deal. |
Tristan M. (2946) 1039 posts |
A little update. Never got around to trying to write to the fb in Linux. Instead I used the “mw”(memory write) command in the U-boot prompt to draw a horizontal line on the screen! I really have to grab the source for one of the OMAP ports and see how much would actually need to be changed for the initial pre-boot to work. |
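As a rough sketch of the “mw” trick above: U-Boot’s `mw.l address value [count]` writes the same 32-bit value to `count` consecutive words, so one command is enough for a horizontal run of pixels. The helper below only builds the command string. Everything about the display is an assumption: 32 bits per pixel, rows packed with no padding, and a framebuffer base that you have to find yourself (the helper name and the address in the usage note are hypothetical).

```python
def mw_horizontal_line(fb_base, width, y, x0, length,
                       colour=0x00FFFFFF, bytes_per_pixel=4):
    """Build a U-Boot mw.l command for a horizontal line (illustrative only).

    Assumes a linear framebuffer at fb_base with no padding between rows.
    """
    stride = width * bytes_per_pixel                  # bytes per screen row
    address = fb_base + y * stride + x0 * bytes_per_pixel
    # U-Boot reads numbers as hexadecimal; the 0x prefix is accepted.
    return f"mw.l {address:#010x} {colour:#010x} {length:#x}"
```

For example, `mw_horizontal_line(0x7E000000, 1280, 100, 0, 1280)` gives a command for a full-width white line on row 100 of a 1280 pixel wide screen, where `0x7E000000` is a placeholder and not the real H3 framebuffer address.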
Tristan M. (2946) 1039 posts |
I forgot where I took note of the fb address range. Such a shame.
Over the past few days I spent a few minutes on a little idea of building a raw binary for the Orange Pi PC using gcc on RISC OS. I used the “blink” demo program I found on the Orange Pi forum as a basis for it as I know it compiles and works using arm-none-eabi.
Why am I doing this? Good question. Not entirely sure myself. I think it may be useful somehow. It’d be nice if gcc could be invoked for building the HAL but I’m not even sure that it could be made to generate something that could be talked to by RO, or even do the start call.
The example program that I’m messing with outputs “Hello, World” to the UART. It doesn’t use any libraries. Just some simple routines for the UART. If anyone is out there screaming “DDE could do this!” please tell me how! I haven’t been able to find anything on building raw executable code with Norcroft C.
When I get a chance I’ll check the binary on the OPi PC. I predict it will collapse in a heap because of the hand waving I did to avoid the stack checking issue, but if it manages anything it’s a great start. |
Tristan M. (2946) 1039 posts |
After revisiting the stack overflow issue and implementing a subset of the ARM recommended method I loaded the binary on the OPiPC via the U-Boot serial terminal. This is a good thing. It means that !GCC can build a raw OS-less program for the Orange Pi PC. Incidentally the means I used to handle stack overflows is an infinite loop. There’s no real way to report an error if the stack has been smashed in a “Hello World” demo. |
Tristan M. (2946) 1039 posts |
I’ve been leaving a trail of small, mangled projects in my wake for the sake of self education. There’s so many different ways to do things! Most of them use the “blink” program as a template because it works. Putting this here as much for my future reference as anyone else’s. My goal with this mess is to use it to get everything needed for the HAL working, using the premade simple hardware accessor functions as much as possible, and avoiding magic numbers so I can focus on getting all the hardware to talk to me. |
Alan Robertson (52) 420 posts |
Very interesting. Sounds like you are making good progress towards your goal. Looking forward to seeing how you get on. |
Tristan M. (2946) 1039 posts |
Thanks! I hope quite well. My plan is to do what I normally do. Restrict direct hardware / memory address access to as few places in the code as humanly possible. It’s way easier to debug. |
Tristan M. (2946) 1039 posts |
A little update. I’ve found I can’t use Cortex-A7 compiler flags, among other things, with the RISC OS C compiler. Oh well. The Orange Pi PC “Blink” demo has just about been completely replaced by other code and extended. It’s modelled off the RO HAL, albeit a little loosely. I’m trying to get some basic functions of some of the hardware happening. I’m trying to use HAL function names and a reasonably similar file structure. One weird thing has got me confused. When I changed “clean” in the makefile to work with RO, I got it working, then it didn’t. It said that rm wasn’t found. |
Jeffrey Lee (213) 6048 posts |
Apparently there’s an ‘rm’ utility in the RISC OS build system (Library.Unix.rm); maybe it’s that which you were using? (if you’d selected a build environment via !Builder then it would have added it to your path) |
Tristan M. (2946) 1039 posts |
Hmm. Perhaps that was the case. I can’t remember if I had !Builder open before I rebooted. It may explain why the syntax which worked for rm broke later. My last post got interrupted and I lost my train of thought. What I was intending to say is once the original example code is banished and I’ve made sure the little bit I’ve done works I’ll throw it in a zip and upload it for the curious. Thinking about my attempts to port the OMAP3 HAL, I think I’m dead wrong. Perhaps some of the things like USB can be used with some modification, but the core of the H3 is so very, very different. Speaking of cores, it only just dawned on me that both the RPi3 and OPiPC2 are A53 based. |
Tristan M. (2946) 1039 posts |
I’ve got the OPi PC and the RPi B UARTs directly connected. It took a bit of time to track down an issue but comms work in both directions now. It seemed to start working after I turned on the pullup on the RPi’s UART Tx pin. |
Tristan M. (2946) 1039 posts |
Can’t work out the !Connector issue. It’s sending the right number of bytes and reports no errors. I guess all I can do is transfer a .bin then dump that memory region to see what’s changed. I reconfigured the setup to use the USB UART and PuTTY / CuteCom again. Tried a few different versions of the .bin that failed via !Connector. They all worked. Over the last couple of days I’ve been looking at gcc intermediate assembly files, objdump disassemblies and misc other disassemblies after compiling with different parameters. It may be quite possible to have relocatable code generated by the RO gcc. In my testing I also tried loading the code to other sane addresses and running it. It almost worked. The disassembly showed why. |
Tristan M. (2946) 1039 posts |
I can’t remember what I said and it’s all tl;dr really. I haven’t really been near it in a couple of weeks at least. No time. However progress has been made. What I have learned:
!GCC can be used in the RO build system with some compiler option fiddling. The resulting objects are compatible and the symbols seem to be compatible, however the DDE ‘Link’ is a little less flexible with object formatting.
asasm in the current version of !GCC is broken. It has problems with vfp flags. A copy from an older version of !GCC needs to be copied in.
gcc ld works fine to link the objects. In my port attempt I have both link and ld outputting their own version so I can compare code.
I managed to get the source tree to work via nfs. Big explanation, but really the important thing is to make Sunfish add the ,xxx extensions to all files, make sure it’s set up to be case insensitive, and have symbol conversion set up.
gcc was playing nice with the dependency file generation and the perl script to parse the dependencies. It’s kind of cool to see the makefile being populated by dynamic dependencies from a few dissimilar compilers / assemblers.
Just now I noticed some things were still pointing to my USB drive and not the correct mapped nfs directory so I re-ran prepare. Now I need to fix the module databases etc again.
Did I mention because I have it stored on my NAS (Orange Pi Zero) I also have git running on it, and a !Nettle ssh session so I can do git commits etc? Very useful for this.
Something else to note. It seems that having some static dependencies set that can be dynamic may interfere with other dynamic dependencies being generated. I commented out the static ones in the makefile and ran it to see what would happen, and a way more complete set of dynamic dependencies was generated. |
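For readers who have not met the ,xxx convention mentioned above: on the NFS server side each file name carries its RISC OS filetype as three hex digits after a comma, for example `ReadMe,fff` for a Text file (type &FFF). The snippet below is only a sketch of that naming scheme as seen from the Linux side, handy if a script on the NAS needs to know what type a file will appear as. The helper name `split_filetype` is invented here and is not part of Sunfish.

```python
import re

# A trailing comma followed by exactly three hex digits is treated as a filetype.
_SUFFIX = re.compile(r"^(?P<name>.+),(?P<type>[0-9a-fA-F]{3})$")


def split_filetype(host_name):
    """Split 'name,xxx' into (name, filetype); filetype is None if absent."""
    match = _SUFFIX.match(host_name)
    if match is None:
        return host_name, None
    return match.group("name"), int(match.group("type"), 16)
```

So `split_filetype("ReadMe,fff")` gives the name `ReadMe` and the type `0xFFF`, while a name without a suffix is returned unchanged with `None` as its type.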
Michael Grunditz (467) 531 posts |
Link can give you pure binaries. But I have never tried to do bare-metal C on RISC OS, only assembler. ObjAsm is great! The free clone also works, but I don’t know if all the spiffy features are there. I use !Connector for my serial terminal needs. |
Tristan M. (2946) 1039 posts |
I do suspect that the cause of my issues with the direct SBC to SBC connection is the need for pullups. I’ve considered getting a couple of 3v3<->RS232C adapters or even connecting the serial pins to an esp8266 and connecting to it via a network port using telnet via WiFi. (It does work. I used to do it with the Pi Zero in Linux until I connected the SPI ethernet adapter, then the USB ethernet adapter.) There’s something unknown wrong with X/Y/Zmodem transfers directly between RPi and OPi.
Aaaand it just dawned on me that I’m using my Pi 3 currently because the Zero failed and the B is pretty slow and kind of unstable. My Pi 3 has a DAC board on it. It already has an RTC soldered piggyback on the DAC so I guess I could go and piggyback a couple of wires for UART too.
Getting the RISC OS gcc to work like the “none” gcc only really required routines for stack overflows to be defined and a few flags to stop it including a runtime library etc.
I’m slowly progressing getting a HAL up and running using my mess as my extremely limited time allows. Given that the RO build tree is still pretty mysterious to me it has been a challenge and a learning experience. I’ve been relying on the source trees from multiple ports and the comments in the source to work out how things work. |
Tristan M. (2946) 1039 posts |
Frustration! I can’t seem to build anything successfully. It took days before I could get the RPi SMP branch to build. I can’t build any of my code successfully any more, or even the untouched OMAP3 source.
Then there’s my file transfer issue. Both !FTDI and SerialUSB seem to load drivers, or something, but I can’t do anything useful with them. I could actually talk to the U-Boot prompt using SerialUSB and the demo program via an ANSI session on !Nettle (if I recall correctly), but that’s nearly useless because I have no means of doing a file transfer. I could copy the HAL binary to the boot partition, then re/boot the OPiPC and work from there but it’s fiddly at best.
Interestingly, I had better success building RO using the Linux port on aarch64 than I did on the Pi 3. The ROM join failed with egrep, but I couldn’t be bothered going through doing all the grep -E substitutions.
It’s all very frustrating, especially because I’ve started to untangle the HAL code I wrote so far, now that I’ve felt out the limits of what can and can’t be achieved. I know it’s reinventing the wheel trying to do the HAL with GCC. I have however seen that I can use RO to write pre-init code with extremely minimal assembly.
What I’m doing now is using ObjAsm, the DDE linker, and GCC together. The DDE linker isn’t as versatile as ld, however it can accept objects generated with ObjAsm and use them if some care is taken. Why did I disable ld… hmm. My source was generating both DDE-linked and ld-linked binaries. I had to disable ld for some reason.
Edit: I got “irritated” and purchased “SystemDisc” early this morning. When the order is processed I’m going to make a nice big fresh OS image with all the build tools on it. If things go a bit peculiar a new copy can just be dumped to card. Trying to track down what obscure thing is tripping over a complex build is maddening. |
Jeffrey Lee (213) 6048 posts |
You may be stuck with a choice between “fiddly” and “slow”. Transferring a HAL over serial shouldn’t take too long, but once you’re building full ROM images it would be madness to use plain serial. If the OPi supports USB device mode then you might be able to use a direct USB CDC connection to U-Boot – although my past experience with that hasn’t been great. Other than that, the fastest option is likely to be using a card reader. |
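To put rough numbers on “fiddly” versus “slow”, here is a back-of-envelope sketch of how long an XMODEM upload takes at 115200bps. The assumptions are mine and not from the thread: 8N1 framing (10 bits on the wire per byte), checksum-mode XMODEM (132 bytes sent per 128 bytes of payload), no retries and no delay waiting for acknowledgements. The image sizes in the usage note are guesses chosen only to show the scale.

```python
def transfer_seconds(size_bytes, baud=115200, bits_per_byte=10,
                     payload=128, block=132):
    """Estimate the time to send size_bytes over a serial link with XMODEM."""
    raw_bytes_per_second = baud / bits_per_byte   # 11520 bytes/s at 115200 8N1
    blocks = -(-size_bytes // payload)            # ceiling division
    return blocks * block / raw_bytes_per_second
```

With these assumptions a hypothetical 64 KiB HAL, `transfer_seconds(64 * 1024)`, comes out at a little under 6 seconds, while a hypothetical 4 MiB ROM image, `transfer_seconds(4 * 1024 * 1024)`, takes a bit over 6 minutes per attempt. Real transfers will be slower because of acknowledgement turnaround.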
Rick Murray (539) 13850 posts |
No, madness would be using a parallel port JTAG interface. ;-) |
Tristan M. (2946) 1039 posts |
You say this to a person who installed Windows 95 via serial cable.
That would be an absolute last resort. I’m not fond of handling MicroSD cards, and often resort to using tweezers, especially when it comes to getting them out of an RPi! At least the OPi has a spring loaded launch mechanism.
It’s also wrong. I can’t provide figures but I know they have a much higher data throughput. Even using one with an X1641 cable over the slowest bus in history the throughput is still higher. I also have a Trantor parallel – SCSI adapter somewhere that used to perform acceptably. Oh! Then there’s parallel Zip drives. Just saying is all. Probably doing a network copy into Linux, dumping into the boot partition and rebooting would be the fastest “acceptable” method. The USB CDC connection is one I didn’t really bother considering. I know it supports it, but it seems a bit of a pain. |
Jeffrey Lee (213) 6048 posts |
Actually, there’s another boot option that modern versions of u-boot support – TFTP. https://www.riscosopen.org/forum/forums/5/topics/1012 |
Rick Murray (539) 13850 posts |
I installed W95 from floppies. That probably wasn’t much faster when you factor disc changes and the fact that once you got beyond about disc 25, it seemed to request them randomly, including ones it had already asked for… :-/
“It’s also wrong. I can’t provide figures but I know they have a much higher data throughput.”
Well, I had read that, but I wondered if it was correct. I have not located the source of my 40 minute time, however I did find this:
[source: https://wiki.openwrt.org/doc/hardware/port.jtag.cables]
First up, we must consider how the interface is built. If the interface uses data pins to construct the interface, it ought to run fairly quickly but will likely need specific host support (some sort of high resolution timer) for the interface timing. I think “Wiggler” is like this. A simpler interface (like the Olimex one) will bang data out using the STROBE signal. In this case, it is fairly host agnostic as most parallel ports have a 0.5µs setup time, followed by asserting STROBE for 0.5µs. This is mandatory and performed by the interface chip (82C711 etc). The benefit to this is that since the hardware times itself, the code to drive the parallel port can be really simple as we don’t need to worry about timing, just push the data… This is the slowest option, so we’ll run with this.
Now the problem is that RAM does not normally have JTAG support. As such, one must push the data into ARM registers, then use an instruction to write the registers to memory. To keep things easy, we will assume R0-R7 are data registers (8 in total, or 32 bytes). So, it looks like there are something like 20 bit transitions to change state (IR→DR etc). I will assume all state transitions take 20 bits as I really can’t be bothered reading the spec to work out how the chain actually works. I’m sure some transitions may be shorter, while others may take a parameter, so let’s just assume 20 for now to keep the maths easier. (=0) To write to the registers we need to set SCAN_N to IR (=20), then select chain 1 in DR (=40), then enable the chain in IR (=60), then set SYSSPEED=0 (=80) and then push the opcode for LDMIA xx,{R0-R7} (=102). Go to RTI for a clock to commit (=122) then set SYSSPEED=0 (=142) and push a NOP instruction into the pipeline (=162). Go to RTI for a clock to advance the earlier LDM from decode to execute state of the pipeline (=182).
Clock RTI again to advance LDM to the memory stage of the pipeline (just assume no state change as we’re already in RTI). It may be that we need to set up the LDM/NOP in debug state in order to push data into the registers, and then set up LDM/NOP all over again in order to actually execute it. That sounds the more logical, but, to hell with it, this stuff is confusing, so let’s just say “about 800 bits” for each 32 bytes shoved up the JTAG bus. Feel free to refine this to a better number if you’re really bored. ;-) So, if it takes an interface using STROBE one microsecond per “transaction”, that means one can send 1,000,000 transactions per second. Now if it takes 800 such transactions to bit-bang the data we want, that means we can do this 1250 times per second. Which means if we are using R0-R7, or 32 bytes, and we can do this 1250 times per second, we can send about 40,000 bytes per second. It is worth noting that OpenWRT’s about 400,000 bits-per-second when divided by eight comes to 50,000 bytes per second. So… what gives here? If we can wobble 40,000 bytes per second (by running our 800 bit data push loop 1250 times), doesn’t this imply that one ought to be able to transfer 256K in a little under seven seconds? Now let’s assume I’m way off on all of my maths. Let’s assume it takes four times as many bits – 3200 bits to send 32 bytes. Now let’s also assume that overheads and stuff mean that our access to the parallel runs a quarter of the speed. Well, the maths here shouldn’t be too hard. Round up to seven seconds and multiply it by four, then by four again. This gives me 112 seconds. Or a perfectly reasonable value of a shade under two minutes to transfer 256K. Okay, I give up now. I think seven seconds for a bit-banged parallel port transferring 256K is a little fast (hell, sending a page to a fast Laser printer took longer, and that was with all eight bits available!). Two minutes sounds more feasible. 
But there’s no way I can munge these numbers into something that resembles 25 minutes. … Has anybody actually used a simple parallel port JTAG device? If so, how fast (or not) was it?
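The estimate above can be written out as a small calculation, which makes it easier to plug in better numbers if anybody has them. This is only a sketch of the arithmetic in this post; the function names are invented, and the defaults are the post’s own guesses (one microsecond per STROBE transaction, about 800 bits per 32 bytes pushed).

```python
def jtag_bytes_per_second(transactions_per_second=1_000_000,
                          bits_per_push=800, bytes_per_push=32):
    """Throughput if every push of bytes_per_push costs bits_per_push transactions."""
    pushes_per_second = transactions_per_second / bits_per_push
    return pushes_per_second * bytes_per_push


def jtag_transfer_seconds(size_bytes, bytes_per_second):
    """Time to move size_bytes at the given throughput."""
    return size_bytes / bytes_per_second


optimistic = jtag_bytes_per_second()             # 1250 pushes/s of 32 bytes each
pessimistic = jtag_bytes_per_second(250_000, 3200)  # port 4x slower, 4x more bits
```

With the defaults this gives 40,000 bytes per second, so 256K takes about 6.6 seconds. The pessimistic case is exactly sixteen times slower, about 105 seconds for 256K, a little less than the 112 seconds you get by rounding up to seven seconds first. Either way it is nowhere near tens of minutes.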
USB is a lot faster than parallel.
Zip and your SCSI adaptor will be able to use all eight bits, maybe a few more if they do fancy stuff with various control lines. When you are bit-bashing, you can take whatever you know and instantly divide by eight.
That’s what I’d do if I was going to use my older Pi as a testbed or something – I’d see if I can set up ShareFS or Samba – whatever plays nice with standard TCP/IP (which rules out AUN IIRC). Easier on the hardware, no card swapping, and easier on the user (draggy-droppy to the remote machine). |
Rick Murray (539) 13850 posts |
…and in the course of writing this, I discover that wrapping some text in + symbols makes it appear in underlined green for some reason. This is mentioned in the Textile reference linked below the text area, but it’s one of these things most of us probably paid no attention to. Why green? Why underlined? <shrug> |