Crosscompiling to Orange Pi PC

35 posts, 7 voices

Dec 12, 2016 12:54am Tristan M. (2946) 1039 posts	This doesn’t really go anywhere and isn’t worth putting in the Orange Pi thread. I just thought I’d share what I’ve worked out so far. First, I grabbed the arm-none-eabi gcc toolchain for building bare metal ARM binaries. blink.zip on the linked forum page is a version which has been modified from the Raspberry Pi version of the project. It makes a good sanity check for the toolchain. Just do a “make clean” and “make”. http://www.orangepi.org/orangepibbsen/forum.php?mod=viewthread&tid=1934 Funny thing I noticed is the binary I built is twice the size of the one in the .zip. Weird. Next up I found dumping the suitable Armbian image on an SD card is the way to go for a start as U-Boot already contains the commands needed for loading an arbitrary binary. Things got a little weird after this. The type of program I need hasn’t really been made in maybe 20 years. A terminal that handles per character tx / rx, and has X/Y/Zmodem and Kermit protocols. I have an FT232 USB UART connected to the debug UART pins of the Orange Pi PC. The computer talking to it needs to be set to 115200bps. What I had to do was turn on the OPi PC whilst bashing a key in PuTTY furiously to interrupt boot. Then I had to close PuTTY and open CuteCom, type in “loadx 0x41000000 115200” (I think that’s the right way round) then upload blink.bin using XMODEM. Y and ZMODEM commands are supported in the way you’d think. Next up just type in “go 0x41000000” and I’m greeted with “Hello World!” on the terminal and possibly a GPIO pin toggling when I press a key. Admittedly I didn’t bother because the message was enough. It’s also possible to put a .bin in the boot partition of the SD card and load that, but it’d involve either card swapping or extra rebooting / loading the OS. Pretty sure U-Boot can be set to wait indefinitely but I also use Armbian so I don’t want to mess with that. It’d probably avoid needing the PuTTY stage though. Step 0 complete.
Booting to an arbitrary binary. I didn’t work out anything new. Just collated a few bits of scattered information and got it to work.

Dec 12, 2016 1:25pm Rick Murray (539) 13850 posts	“The type of program I need hasn’t really been made in maybe 20 years. A terminal that handles per character tx / rx, and has X/Y/Zmodem and Kermit protocols.” RISC OS is reasonably well endowed in this respect: Nettle, Hearsay, Connector…

Dec 12, 2016 9:39pm Tristan M. (2946) 1039 posts	Rick, I agree totally! Problem is I’m working with the gcc toolchain I mentioned, which I do not have on RISC OS. I mean I could set up an nfs share on the PC (or Pi) running Linux, then use that, but it just seems a little convoluted. Oh wait… Is there a driver for USB UARTs? My Raspberry Pi 3 grew too big for its case so it’s elastic banded on to the top of the acrylic Pi 3 case that I mangled to hold the Orange Pi PC. The Pi 3 has a PiFi DAC+ v2.0 on it now. That’s why it’s too tall. I suppose if someone were using the correct toolchain to build RISC OS for the Orange Pi PC it’d be a very nice and compact dev environment. Pocket sized even. GCC is fine for me for now because it means I have a way to poke at the hardware. Speaking of hardware, I grabbed that vector above from the thread in the link. I really don’t think it’s the correct base address though. Could have sworn I saw a different one somewhere else. Looking at the H3 datasheet again yesterday there is a little more information than I thought. It’s just hard to unpick because it involves jumping around the document a lot. Between that and a few discussions around on the framebuffer there seems to be enough to possibly glean how to initialise it for HDMI and possibly composite. There is one thing that has been troubling me from the beginning. CPU throttling. It’s a capable chipset but it needs some hand holding. From what I’ve read, legacy kernels run the SoC @ 1.6GHz maximum. It’s really 1.2GHz, but can be taken up to 1.6GHz. Plus it’s just a hot chip. The correct PLL(s) should probably be set low to be safe initially.

Dec 13, 2016 12:15pm Tristan M. (2946) 1039 posts	I tried putting a daily mainline kernel build of Armbian on the SD card instead of the legacy sunxi kernel. It was an interesting experiment. The framebuffer terminal was running at the correct resolution for my DVI monitor on first boot as detected by U-boot. This gives me some hope. It doesn’t seem to be able to run an X server but that’s no big deal. dmesg was kind enough to have the address, mapped address (not all that useful), pixel format, buffer size etc for simple-framebuffer. When I get some time I’m going to try to mess with it with ioctl() and if that’s successful, mess with it in bare metal and see what happens. If I can toggle a pixel it’ll be a step in the right direction.

Dec 15, 2016 5:46am Tristan M. (2946) 1039 posts	A little update. Never got around to trying to write to the fb in Linux. Instead I used the “mw”(memory write) command in the U-boot prompt to draw a horizontal line on the screen! I really have to grab the source for one of the OMAP ports and see how much would actually need to be changed for the initial pre-boot to work.
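
The U-Boot “mw” experiment above could be carried over to bare metal roughly as follows. This is only a sketch: it assumes a 32-bit-per-pixel framebuffer, and the function name, the stride parameter and the colour value are hypothetical. The real base address, pitch and pixel format would have to come from the simple-framebuffer details that dmesg reported.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill pixels [x0, x1) on row y of a 32-bit-per-pixel framebuffer.
 * stride_px is the row pitch in pixels (assumed; use the real pitch
 * reported for the framebuffer, divided by 4 for 32bpp). */
static void draw_hline(volatile uint32_t *fb, size_t stride_px,
                       size_t x0, size_t x1, size_t y, uint32_t colour)
{
    volatile uint32_t *row = fb + y * stride_px;

    for (size_t x = x0; x < x1; x++) {
        row[x] = colour;   /* one 32-bit store per pixel, like "mw.l" */
    }
}
```

On the board, fb would be a pointer to the framebuffer base address; on a desktop the same function can be pointed at an ordinary array to check the indexing before risking it on hardware.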

Apr 29, 2017 12:35am Tristan M. (2946) 1039 posts	I forgot where I took note of the fb address range. Such a shame. Over the past few days I spent a few minutes on a little idea of building a raw binary for the Orange Pi PC using gcc on RISC OS. I used the “blink” demo program I found on the Orange Pi forum as a basis for it as I know it compiles and works using arm-none-eabi. It seems to work in arm-unknown-riscos with some minor tweaking. The binary looks sane from a quick look at it in Zap. Right start address etc. Somehow I mislaid what I did and had to recreate it today. It only took a few minutes. Still trying to work out if I can get it to build without needing to make a stub for __rt_stkovf_split_small. Again not sure what happened to my work but I know last night -fstack-check=no didn’t seem to help. I tried it again today and the resultant binary was a slightly different size. The stubs are still in the .s file so I haven’t tested that yet. Why am I doing this? Good question. Not entirely sure myself. I think it may be useful somehow. It’d be nice if gcc could be invoked for building the HAL but I’m not even sure that it could be made to generate something that could be talked to by RO, or even do the start call. The example program that I’m messing with outputs “Hello, World” to the UART. It doesn’t use any libraries. Just some simple routines for the UART. If anyone is out there screaming “DDE could do this!” please tell me how! I haven’t been able to find anything on building raw executable code with Norcroft C. When I get a chance I’ll check the binary on the OPi PC. I predict it will collapse in a heap because of the hand waving I did to avoid the stack checking issue, but if it manages anything it’s a great start.

May 4, 2017 3:24am Tristan M. (2946) 1039 posts	After revisiting the stack overflow issue and implementing a subset of the ARM recommended method I loaded the binary on the OPiPC via the U-Boot serial terminal. It displayed “Hello, World” as it should via UART. This is a good thing. It means that !GCC can build a raw OS-less program for the Orange Pi PC. Incidentally the means I used to handle stack overflows is an infinite loop. There’s no real way to report an error if the stack has been smashed in a “Hello World” demo.

May 6, 2017 10:54pm Tristan M. (2946) 1039 posts	I’ve been leaving a trail of small, mangled projects in my wake for the sake of self education. There are so many different ways to do things! Most of them use the “blink” program as a template because it works. Last night I wrote some C source which consists of peek and poke functions of different widths. Easy to do in assembly and definitely way more efficient but the point was to write code that does something in C, try to get it into the makefile, and to build the whole lot. Putting this here as much for my future reference as anyone else’s. What I need to do in gcc is to build each C and asm source file with gcc individually as a .o object, using the -c flag so no linking occurs. After that they can be slapped together with the linker, including memmap and vectors. Because they are linked manually later on, it pays to have the header for bar.c #included in foo.c. My goal with this mess is to use it to get everything needed for the HAL working, using the premade simple hardware accessor functions as much as possible, and avoiding magic numbers so I can focus on getting all the hardware to talk to me.
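
For reference, peek and poke functions of different widths might look something like the sketch below. The post does not show the actual source, so the function names and the use of uintptr_t for addresses are assumptions. In the build described above, a file like this would be compiled on its own with the -c flag and its prototypes put in a header that the calling source files #include.

```c
#include <stdint.h>

/* Width-specific memory accessors (hypothetical names).
 * "volatile" stops the compiler from caching, merging or dropping
 * the accesses, which matters for memory-mapped hardware registers. */
static inline uint8_t peek8(uintptr_t addr)
{
    return *(volatile uint8_t *)addr;
}

static inline uint16_t peek16(uintptr_t addr)
{
    return *(volatile uint16_t *)addr;
}

static inline uint32_t peek32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

static inline void poke8(uintptr_t addr, uint8_t value)
{
    *(volatile uint8_t *)addr = value;
}

static inline void poke16(uintptr_t addr, uint16_t value)
{
    *(volatile uint16_t *)addr = value;
}

static inline void poke32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}
```

Taking the address as an integer rather than a pointer keeps call sites close to the assembly PUT32 style, where the address is just a number in a register.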

May 7, 2017 3:11am Alan Robertson (52) 420 posts	Very interesting. Sounds like you are making good progress towards your goal. Looking forward to seeing how you get on.

May 11, 2017 2:32am Tristan M. (2946) 1039 posts	Thanks! I hope quite well. That brings me to another minor update. Lack of time is the big issue here but I did test the all C POKE32 function by directly substituting it with the assembly PUT32 function. It worked, so it means that an arbitrary memory write with C functions as expected. Of course reading and writing to memory is trivial, but it showed me what I needed to see. That arm-unknown-riscos complies with what I want to do. This should make messing with the hardware easy. My plan is to do what I normally do. Restrict direct hardware / memory address access to as few places in the code as humanly possible. It’s way easier to debug.
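
As an illustration of keeping raw address access in as few places as possible, a UART character routine could be layered on a pair of accessors along these lines. This is a sketch under assumptions: it presumes a 16550-style UART with registers on a 4-byte stride, and the register offsets, macro names and function names are illustrative and should be checked against the H3 datasheet. The base address is deliberately left as a parameter.

```c
#include <stdint.h>

/* Assumed 16550-style register layout, 4-byte register stride. */
#define UART_THR       0x00u        /* transmit holding register */
#define UART_LSR       0x14u        /* line status register */
#define UART_LSR_THRE  (1u << 5)    /* transmit holding register empty */

/* The only two functions that touch raw addresses. */
static inline uint32_t peek32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

static inline void poke32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

/* Wait until the transmitter can take a byte, then send it. */
static void uart_putc(uintptr_t base, char c)
{
    while ((peek32(base + UART_LSR) & UART_LSR_THRE) == 0) {
        /* busy-wait */
    }
    poke32(base + UART_THR, (uint8_t)c);
}

/* Send a NUL-terminated string one character at a time. */
static void uart_puts(uintptr_t base, const char *s)
{
    while (*s != '\0') {
        uart_putc(base, *s++);
    }
}
```

Everything above the accessors deals only in named offsets, so a wrong address shows up in one place rather than scattered through the code.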

May 23, 2017 10:09am Tristan M. (2946) 1039 posts	A little update. I’ve found I can’t use Cortex-A7 compiler flags among other things with the RISC OS C compiler. Oh well. The Orange Pi PC “Blink” demo has just about been completely replaced by other code and extended. It’s modelled off the RO HAL, albeit a little loosely. I’m trying to get some basic functions of some of the hardware happening. I’m trying to use HAL function names and reasonably similar file structure. One weird thing has got me confused. When I changed “clean” in the makefile to work with RO, I got it working, then it didn’t. It said that rm wasn’t found. So I ran !CoreUtils but the syntax is different and it doesn’t work! What was I using first time round, and where did it go?

May 23, 2017 10:44am Jeffrey Lee (213) 6048 posts	“What was I using first time round, and where did it go?” Apparently there’s an ‘rm’ utility in the RISC OS build system (Library.Unix.rm); maybe it’s that which you were using? (if you’d selected a build environment via !Builder then it would have added it to your path)

May 23, 2017 11:44am Tristan M. (2946) 1039 posts	Hmm. Perhaps that was the case. I can’t remember if I had !Builder open before I rebooted. It may explain why the syntax which worked for rm broke later. My last post got interrupted and I lost my train of thought. What I was intending to say is once the original example code is banished and I’ve made sure the little bit I’ve done works I’ll throw it in a zip and upload it for the curious. Somehow I suspect I’m the only person on here with an Orange Pi PC, but I still find it interesting that the RISC OS toolchain can be used to write bare metal C without any libraries. Thinking about my attempts to port the OMAP3 HAL, I think I’m dead wrong. Perhaps some of the things like USB can be used with some modification, but the core of the H3 is so very, very different. Speaking of cores, it only just dawned on me that both the RPi3 and OPiPC2 are A53 based. Sure the ARM core is like a growth on the VC bus, but it does have me wondering if my OPiPC2 could have a use beyond ARM build server.

May 28, 2017 1:14pm Tristan M. (2946) 1039 posts	I’ve got the OPi PC and the RPi B UARTs directly connected. It took a bit of time to track down an issue but comms work in both directions now. It seemed to start working after I turned on the pullup on the RPi’s UART Tx pin. !Connector talks nicely to it. However there is an issue I can’t pin down. File transfers seem to work but they crash on the OPi PC. Even a known good version of the .bin doesn’t work. It means more hardware rearranging to test with CuteCom and PuTTY again because while a pain it did work.

May 29, 2017 10:50pm Tristan M. (2946) 1039 posts	Can’t work out the !Connector issue. It’s sending the right number of bytes and reports no errors. I guess all I can do is transfer a .bin then dump that memory region to see what’s changed. I reconfigured the setup to use the USB UART and PuTTY / CuteCom again. Tried a few different versions of the .bin that failed via !Connector. They all worked. Over the last couple of days I’ve been looking at gcc intermediate assembly files, objdump disassemblies and misc other disassemblies after compiling with different parameters. It may be quite possible to have relocatable code generated by the RO gcc. In my testing I also tried loading the code to other sane addresses and running it. It almost worked. The disassembly showed why. Long story short, I was using StrongEd to do editing. Some code stayed neatly folded so I didn’t see code that I should have changed. It was calling assembly routines by absolute address for some reason.

Jul 18, 2017 9:36am Tristan M. (2946) 1039 posts	I can’t remember what I said and it’s all tl;dr really. I haven’t really been near it in a couple of weeks at least. No time. However progress has been made. What I have learned: !GCC can be used in the RO build system with some compiler option fiddling. The resulting objects are compatible and the symbols seem to be compatible, however the DDE ‘Link’ is a little less flexible with object formatting. asasm in the current version of !GCC is broken. It has problems with vfp flags. A version from an older version of !GCC needs to be copied in. gcc ld works fine to link the objects. In my port attempt I have both link and ld outputting their own version so I can compare code. I managed to get the source tree to work via nfs. Big explanation but really the important thing is to make Sunfish add the ,xxx extensions to all files, make sure it’s set up to be case insensitive, and have symbol conversion set up. One file in the RO tests will break. It’s named with something like up arrow, down arrow, pound. The symbols really don’t translate. gcc was playing nice with the dependency file generation and perl script to parse the dependencies. It’s kind of cool to see the makefile being populated by dynamic dependencies from a few dissimilar compilers / assemblers. Just now I noticed some things were still pointing to my USB drive and not the correct mapped nfs directory so I re-ran prepare. Now I need to fix the module databases etc again. Did I mention because I have it stored on my NAS (Orange Pi Zero) I also have git running on it, and a !Nettle ssh session so I can do git commits etc? Very useful for this. Something else to note. It seems that having some static dependencies set that can be dynamic may interfere with other dynamic dependencies being generated. I commented out the static ones in the makefile and ran it to see what would happen and a way more complete set of dynamic dependencies was generated.

Jul 31, 2017 8:50pm Michael Grunditz (467) 531 posts	Link can give you pure binaries. But I never tried to do baremetal c with riscos only assembler. ObjAsm is great! The free clone also does work, but I don’t know if all spiffy features are there. I use !Connector for my serial terminal needs.

Aug 1, 2017 7:38am Tristan M. (2946) 1039 posts	I do suspect that my issue with direct SBC to SBC connection is the need for pullups. I’ve considered getting a couple of 3v3<->RS232C adapters or even connecting the serial pins to an esp8266 and connecting to it via a network port using telnet via WiFi. (It does work. I used to do it with the Pi Zero in Linux until I connected the SPI ethernet adapter, then the USB ethernet adapter.) There’s something unknown wrong with X/Y/Zmodem transfers directly between RPi and OPi. Aaaand it just dawned on me that I’m using my Pi3 currently because the Zero failed and the B is pretty slow and kind of unstable. My Pi 3 has a DAC board on it. It already has an RTC soldered piggyback to the DAC so I guess I could go and piggyback a couple of wires for UART too. Getting the RISC OS gcc to work like the “none” gcc only really required routines for stack overflows to be defined and a few flags to stop it including a runtime library etc. I’m slowly progressing getting a HAL up and running using my mess as my extremely limited time allows. Given that the RO build tree is still pretty mysterious to me it has been a challenge and a learning experience. I’ve been relying on the source trees from multiple ports and the comments in the source to work out how things work. It’s been a while since I last tested anything. A good couple of months at least. But it’s going through that phase where a few things have to be gotten into basic working order before it can be tested.

Aug 16, 2017 10:29pm Tristan M. (2946) 1039 posts	Frustration! I can’t seem to build anything successfully. It took days before I could get the RPi SMP branch to build. I can’t build any of my code successfully any more, or even the untouched OMAP3 source. Then there’s my file transfer issue. Both !FTDI and SerialUSB seem to load drivers, or something, but I can’t do anything useful with them. I could actually talk to the U-Boot prompt using SerialUSB and the demo program via an ANSI session on !Nettle (if I recall correctly), but that’s nearly useless because I have no means of doing a file transfer. I could copy the HAL binary to the boot partition, then re/boot the OPiPC and work from there but it’s fiddly at best. Interestingly, I had better success building RO using the Linux port on aarch64 than I did on the Pi 3. The ROM join failed with egrep, but I couldn’t have been bothered going through doing all the grep -E substitutions. The build of the linux port is based off older source too. I can’t seem to get the current head to build. It’s all very frustrating, especially because I’ve started to untangle the HAL code I wrote so far, now that I’ve felt out the limits of what can and can’t be achieved. I know it’s reinventing the wheel trying to do the HAL with GCC. I have however seen that I can use RO to write pre-init code with extremely minimal assembly. What I’m doing now is using ObjAsm, DDE Linker, and GCC together. The DDE linker isn’t as versatile as ld, however it can accept objects generated with ObjAsm and use them if some care is taken. Why did I disable ld… hmm. My source was generating a DDE and ld linked binaries. I had to disable ld for some reason. e: I got “irritated” and purchased “SystemDisc” early this morning. When the order is processed I’m going to make a nice big fresh OS image with all the build tools on it. If things go a bit peculiar a new copy can just be dumped to card. 
Trying to track down what obscure thing is tripping over a complex build is maddening.

Aug 17, 2017 11:45am Jeffrey Lee (213) 6048 posts	“I could copy the HAL binary to the boot partition, then re/boot the OPiPC and work from there but it’s fiddly at best.” You may be stuck with a choice between “fiddly” and “slow”. Transferring a HAL over serial shouldn’t take too long, but once you’re building full ROM images it would be madness to use plain serial. If the OPi supports USB device mode then you might be able to use a direct USB CDC connection to U-Boot – although my past experience with that hasn’t been great. Other than that, the fastest option is likely to be using a card reader.

Aug 17, 2017 11:59am Rick Murray (539) 13850 posts	“once you’re building full ROM images it would be madness to use plain serial.” No, madness would be using a parallel port JTAG interface. ;-) Parallel ports often self-clock to something akin to 5uS, so they don’t run too fast for the printer, each transaction on the port represents a single bit of data shifted into the chain…I think I read somewhere that a 256K transfer would take something in the region of 40 minutes. That’s madness.

Aug 17, 2017 12:52pm Tristan M. (2946) 1039 posts	“once you’re building full ROM images it would be madness to use plain serial.” You say this to a person who installed Windows 95 via serial cable. “the fastest option is likely to be using a card reader.” That would be an absolute last resort. I’m not fond of handling MicroSD cards, and often resort to using tweezers, especially when it comes to getting them out of an RPi! At least the OPi has a spring loaded launch mechanism. “No, madness would be using a parallel port JTAG interface. ;-)” I have one lying around here somewhere that I used to use for uploading and downloading ROM images, and other very useful JTAGgy stuff. “I think I read somewhere that a 256K transfer would take something in the region of 40 minutes. That’s madness.” It’s also wrong. I can’t provide figures but I know they have a much higher data throughput. Even using one with an X1641 cable over the slowest BUS in history the throughput is still higher. I also have a Trantor parallel – SCSI adapter somewhere that used to perform acceptably. Oh! then there’s parallel Zip drives. Just saying is all. Probably doing a network copy into linux, dumping into the boot partition and rebooting would be the fastest “acceptable” method. The USB CDC connection is one I didn’t really bother considering. I know it supports it, but it seems a bit of a pain.

Aug 17, 2017 2:56pm Jeffrey Lee (213) 6048 posts	Actually, there’s another boot option that modern versions of u-boot support – TFTP. https://www.riscosopen.org/forum/forums/5/topics/1012 https://www.riscosopen.org/forum/forums/5/topics/2177#posts-27002

Aug 17, 2017 5:56pm Rick Murray (539) 13850 posts	“You say this to a person who installed Windows 95 via serial cable.” I installed W95 from floppies. That probably wasn’t much faster when you factor disc changes and the fact that once you got beyond about disc 25, it seemed to request them randomly, including ones it had already asked for… :-/ “I think I read somewhere that a 256K transfer would take something in the region of 40 minutes.” “It’s also wrong. I can’t provide figures but I know they have a much higher data throughput.” Well, I had read that, but I wondered if it was correct. I have not located the source of my 40 minute time, however I did find this: “In fact, the JTAG spec allows for up to 25 million bits-per-second transfers. With a parallel port cable, however, you will be lucky to achieve more than about 400,000 bits-per-second. With these speeds it is not unusual to spend 25 minutes writing a mere 256 KB of data over a JTAG cable. Programming an entire 2 MB or 4 MB flash chip can literally take hours.” [source: https://wiki.openwrt.org/doc/hardware/port.jtag.cables] First up, we must consider how the interface is built. If the interface uses data pins to construct the interface, it ought to run fairly quickly but will likely need specific host support (some sort of high resolution timer) for the interface timing. I think “Wiggler” is like this. A simpler interface (like the Olimex one) will bang data out using the STROBE signal. In this case, it is fairly host agnostic as most parallel ports have a 0.5µS setup time, followed by asserting STROBE for 0.5µS. This is mandatory and performed by the interface chip (82C711 etc). The benefit to this is that since the hardware times itself, the code to drive the parallel port can be really simple as we don’t need to worry about timing, just push the data… The downside, however, is that it’s slow. 1 microsecond at a time. This is the slowest option, so we’ll run with this.
Now the problem is that RAM does not normally have JTAG support. As such, one must push the data into ARM registers, then use an instruction to write the registers to memory. To keep things easy, we will assume R0-R7 are data registers (8 in total, or 32 bytes). So, it looks like there are something like 20 bit transitions to change state (IR→DR etc). I will assume all state transitions take 20 bits as I really can’t be bothered reading the spec to work out how the chain actually works. I’m sure some transitions may be shorter, while others may take a parameter, so let’s just assume 20 for now to keep the maths easier. (=0) To write to the registers we need to set SCAN_N to IR (=20), then select chain 1 in DR (=40), then enable the chain in IR (=60), then set SYSSPEED=0 (=80) and then push the opcode for LDMIA xx,{R0-R7} (=102). Go to RTI for a clock to commit (=122) then set SYSSPEED=0 (=142) and push a NOP instruction into the pipeline (=162). Go to RTI for a clock to advance the earlier LDM from decode to execute state of the pipeline (=182). Clock RTI again to advance LDM to the memory stage of the pipeline (just assume no state change as we’re already in RTI). Now we need to go to DR (+20) and write 32 bits of data into R0 (+32), then go to RTI for a clock to commit the data to the register (+20). This is done eight times, which adds up to 576 bits (=758). But now we need to come out of debug speed to system speed (SYSSPEED=1) (=778), then RTI to commit the instruction to the pipeline (=798), then set RESTART into the IR (=818) to get the instruction running. That’s because we need to be running at debug speed to talk to the processor and set up the registers, but at system speed to have the processor talk to memory. It may be that we need to set up the LDM/NOP in debug state in order to push data into the registers, and then set up LDM/NOP all over again in order to actually execute it.
That sounds the more logical, but, to hell with it, this stuff is confusing, so let’s just say “about 800 bits” for each 32 bytes shoved up the JTAG bus. Feel free to refine this to a better number if you’re really bored. ;-) So, if it takes an interface using STROBE one microsecond per “transaction”, that means one can send 1,000,000 transactions per second. Now if it takes 800 such transactions to bit-bang the data we want, that means we can do this 1250 times per second. Which means if we are using R0-R7, or 32 bytes, and we can do this 1250 times per second, we can send about 40,000 bytes per second. It is worth noting that OpenWRT’s about 400,000 bits-per-second when divided by eight comes to 50,000 bytes per second. So… what gives here? If we can wobble 40,000 bytes per second (by running our 800 bit data push loop 1250 times), doesn’t this imply that one ought to be able to transfer 256K in a little under seven seconds? Now let’s assume I’m way off on all of my maths. Let’s assume it takes four times as many bits – 3200 bits to send 32 bytes. Now let’s also assume that overheads and stuff mean that our access to the parallel runs a quarter of the speed. Well, the maths here shouldn’t be too hard. Round up to seven seconds and multiply it by four, then by four again. This gives me 112 seconds. Or a perfectly reasonable value of a shade under two minutes to transfer 256K. Okay, I give up now. I think seven seconds for a bit-banged parallel port transferring 256K is a little fast (hell, sending a page to a fast Laser printer took longer, and that was with all eight bits available!). Two minutes sounds more feasible. But there’s no way I can munge these numbers into something that resembles 25 minutes. … Has anybody actually used a simple parallel port JTAG device? If so, how fast (or not) was it? “the slowest BUS in history the throughput is still higher.” USB is a lot faster than parallel. “Oh! then there’s parallel Zip drives. Just saying is all.”
Zip and your SCSI adaptor will be able to use all eight bits, maybe a few more if it does fancy stuff with various control lines. When you are bit-bashing, you can take whatever you know and instantly divide by eight. “Probably doing a network copy into linux, dumping into the boot partition and rebooting would be the fastest ‘acceptable’ method.” That’s what I’d do if I was going to use my older Pi as a testbed or something – I’d see if I can set up ShareFS or Samba – whatever plays nice with standard TCP/IP (which rules out AUN IIRC). Easier on the hardware, no card swapping, and easier on the user (draggy-droppy to the remote machine).
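
The back-of-envelope estimate above can be put into a few lines of C so the assumptions are easy to vary. This is only a sketch of the arithmetic as stated in the post (about 800 bits per 32-byte block, one microsecond per port transaction, then a four-times-worse guess for each); the function names are made up and nothing here is measured.

```c
/* Bytes per second when every block of bytes_per_block bytes costs
 * bits_per_block port transactions of us_per_bit microseconds each. */
static double jtag_bytes_per_second(double bits_per_block,
                                    double bytes_per_block,
                                    double us_per_bit)
{
    double blocks_per_second = 1e6 / (bits_per_block * us_per_bit);

    return blocks_per_second * bytes_per_block;
}

/* Time in seconds to move total_bytes at the given rate. */
static double transfer_seconds(double total_bytes, double bytes_per_second)
{
    return total_bytes / bytes_per_second;
}
```

With 800 bits per block and 1 microsecond per bit this gives 40,000 bytes per second and about 6.6 seconds for 256K. With 3200 bits and 4 microseconds it gives 2,500 bytes per second and about 105 seconds, a little under the 112 seconds obtained by rounding up to seven seconds first. For comparison, 25 minutes for 256K would correspond to roughly 175 bytes per second.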

Aug 17, 2017 6:00pm Rick Murray (539) 13850 posts	…and in the course of writing this, I discover that wrapping some text in + symbols makes it appear in underlined green for some reason. This is mentioned in the Textile reference linked below the text area, but it’s one of these things most of us probably paid no attention to. Why green? Why underlined? <shrug>
