Orange Pi
Rick Murray (539) 13840 posts |
Clive – what? STM with a single register? |
Clive Semmens (2335) 3276 posts |
That’s what I was thinking of, yes. Don’t tell me I’ve forgotten something, or that they have decided to re-use it since I left? Surely not… Not all that tiny, I suppose, but not exactly enormous. Very tangled to decode – okay if you make it all off limits and then use odd bits of it for odd things that each use small chunks of instruction set space I suppose. |
Clive Semmens (2335) 3276 posts |
Another fun bit…STM (and LDM) with no registers at all… |
Rick Murray (539) 13840 posts |
…the fact that loads of people still use it for stacking/unstacking single registers? “LDM or STM of single register is probably slower than LDR or STR” makes 178 appearances in a build of RISC OS, which is pretty bloody good considering there are 156 instances of this when building Twin (which is tiny in comparison). I’m guilty too. There are times when I STM R14 because… laziness. STM is cuddly and friendly while STR is complicated, what with the #-4 offset and all. ;-)
Down a road that looks like that is the lurking ghost of x86.
Well, in theory one could use this to provide a completely new instruction.
Internal error: undefined instruction at &0000800C. It’ll throw an exception so it could be trapped. ;-) |
Chris Mahoney (1684) 2165 posts |
To be fair, it can depend on what you’re doing with it; I’ve been using it for years for Windows app development but had never needed to touch the << operator until I wrote my “Uptime” RISC OS module. A lot of the “low level” stuff is still a mystery to me… |
Rick Murray (539) 13840 posts |
To be fair, this is one of the things that really irritated me about VisualBasic: the lack of a simple binary shift. Having to multiply and divide an integer by powers of two is just so horrible, given that one can easily imagine the sort of chundering going on inside VB in order to perform the * or / when a shift is, what, a single instruction on pretty much every processor on the planet?
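For comparison, here is a minimal C sketch of what a shift operator gives you directly. The function names are purely illustrative. It sticks to unsigned values, where shifting and multiplying or dividing by a power of two agree exactly; for negative signed values a right shift does not round the same way as division.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by 2^n with a left shift. Bits pushed off the top are lost,
   exactly as they would be in a 32-bit unsigned multiplication. */
uint32_t mul_pow2(uint32_t x, unsigned n)
{
    assert(n < 32);   /* shifting a 32-bit value by 32 or more is undefined in C */
    return x << n;
}

/* Divide by 2^n with a right shift (rounds towards zero for unsigned values). */
uint32_t div_pow2(uint32_t x, unsigned n)
{
    assert(n < 32);
    return x >> n;
}
```

Without a shift operator the same effect needs a multiplication or division by a computed power of two, which is the clumsy route being complained about here.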
That’s not a surprise given the sorts of fruity stuff you can do in C, like: return *(void *)&this->that.something; I made that up. It’s probably wrong. But I’ve seen worse. Then there’s the “let’s make functions look like an array so we can call them programmatically” thing. I’m sure it’s very useful for emulators and the like, but it leads to the sort of code that makes spaghetti GOSUB of the ’80s look tame in comparison. |
Tristan M. (2946) 1039 posts |
You aren’t far off. I’ve forgotten a whole lot :( It’s incredibly frustrating. I’ll be digesting all this info properly when I have a chance. Can’t really do anything useful until next week. I did something a couple of days ago that had the results I expected unfortunately. Because of USB serial issues with RO, I set up a serial to TCP/IP WiFi bridge using an esp8266. It didn’t work well. I’m amazed it worked at all. A fun thing that I did notice though is if I used xterm-colour in Nettle, it could correctly display the extended character attributes like colours from the OPiPC. I had suspected it could, but this proved it. Because it was just a bridge it lacked Telnet extensions. One of the reasons I tried this was the Serial USB drivers for RO end up with an irritating local echo happening so 2x every character was printed. In the end it was all pointless anyway. I mean I can load binaries from ext4. It takes a few steps, including copying over the network. I would have preferred to use X/YMODEM protocols, but I just can’t seem to get Hearsay or Connector to behave. I was hoping to check for data corruption after doing a transfer with one of them instead. e: Jeffrey, you can say my code is horrible if you want. I know it is. It’s a mess. I know. I don’t even like looking at it. |
Clive Semmens (2335) 3276 posts |
Now this is where my background shows. I wasn’t aware of that at all.
Of course a clever assembler could automagically assemble any instance of LDM/STM single register as the corresponding LDR/STR. And since the instructions are the same length (!) automagically replacing them in existing code wouldn’t be that difficult either.
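As a rough illustration of that idea, the sketch below takes a 32-bit ARM (A32) instruction word, checks whether it is an LDM/STM naming exactly one register, and if so builds the LDR/STR word that makes the same access. The function names are made up for this example, and it is deliberately conservative: it refuses anything with the S bit set or with the base register in the list alongside writeback, and it has not been checked against corner cases such as the PC appearing in the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if insn is a conditional A32 LDM/STM (bits 27:25 = 100) whose
   16-bit register list has exactly one bit set. An empty list is rejected. */
bool is_single_reg_ldm_stm(uint32_t insn)
{
    uint32_t list = insn & 0xFFFFu;

    if ((insn >> 28) == 0xFu)            /* cond 1111 is a different encoding space */
        return false;
    if (((insn >> 25) & 7u) != 4u)       /* not a block data transfer */
        return false;
    return list != 0 && (list & (list - 1u)) == 0;
}

/* Writes the equivalent LDR/STR (immediate offset) to *out and returns true,
   or returns false if the instruction is left alone. */
bool ldm_stm_to_ldr_str(uint32_t insn, uint32_t *out)
{
    uint32_t p, u, w, rn, rd, list, np, nu, nw, imm;

    assert(out != NULL);
    if (!is_single_reg_ldm_stm(insn) || (insn & (1u << 22)))
        return false;                    /* not a candidate, or S bit set */

    p  = (insn >> 24) & 1u;              /* 1 = "before" addressing (IB/DB) */
    u  = (insn >> 23) & 1u;              /* 1 = increment, 0 = decrement */
    w  = (insn >> 21) & 1u;              /* base writeback */
    rn = (insn >> 16) & 0xFu;
    list = insn & 0xFFFFu;
    for (rd = 0; ((list >> rd) & 1u) == 0; rd++)
        ;                                /* find the one register in the list */
    if (w && rd == rn)
        return false;                    /* base in list with writeback: skip */

    if (p)      { np = 1; nu = u; nw = w; imm = 4; }  /* IB/DB   -> [Rn, #+/-4]{!} */
    else if (w) { np = 0; nu = u; nw = 0; imm = 4; }  /* IA!/DA! -> [Rn], #+/-4 */
    else        { np = 1; nu = 1; nw = 0; imm = 0; }  /* IA/DA   -> [Rn] */

    /* Keep cond (31:28) and L (20); bits 27:25 become 010, byte bit stays 0. */
    *out = (insn & 0xF0100000u) | (1u << 26) | (np << 24) | (nu << 23)
         | (nw << 21) | (rn << 16) | (rd << 12) | imm;
    return true;
}
```

For instance, 0xE92D4000 (STMDB sp!, {lr}) comes out as 0xE52DE004 (STR lr, [sp, #-4]!), and 0xE8BD4000 (LDMIA sp!, {lr}) as 0xE49DE004 (LDR lr, [sp], #4). Because both forms are one word long, a patching tool along these lines could rewrite the word in place.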
There’s a fair few examples of that already in ARMv7. Okay, not so extreme – hence my comment about “tangled”. Some day I’ll take a look at v8. Or maybe not. I find I’m not missing any of the apps I wrote with chunks of assembler in them (yet?) – all the ones I’m still using are BASIC only. Don’t know if I’ll ever write Assembler again. |
Tristan M. (2946) 1039 posts |
A lot’s happened recently. I’m also sick. When I’m better and get my brain back I’m going to tackle this some more. With the new useful information and time to think, I believe some may have actually penetrated my thick skull. |
Tristan M. (2946) 1039 posts |
I’m looking forward to tackling this again soon. The problem with AllWinner SBCs has been that some hardware acceleration has only been possible by using linux with a legacy Android kernel which had some nasty bugs and issues. The Mali support for mainline linux kernels comes in the form of a blob still, but at least it’s not a mess that nobody was willing to touch like it has been. I have absolutely no idea if the blob can be made to work for other OSes, but it’s still interesting. e: This is for OpenGLES btw, not for the video codecs. Not sure what the status on that is, because tbqh I find the OPiPC better for watching video than the RPi3. Edit: I’m thinking I can borrow a few GPIO pins for a simple output for debugging if I can’t catch the issue via UART, assuming it wasn’t actually the UART code. I’m hoping it’s failing where Jeffrey pointed out and not before. Jeffrey, I know the AddRAM bit I did is garbage by the way. I fully expected the code to crash before there. I did run into the issue of having no idea what areas could be considered allocatable too. |
Tristan M. (2946) 1039 posts |
The USBSerial blockdriver has sped things up immeasurably. I don’t have much time to work on this but I’ve already found a few failure points. There are some anomalies. Most notably the PC is in a HAL function stub which never gets called. According to the PC the undefined instruction is a very valid looking MOV. So this is an odd one. My bet is on linker issues again. e: Yep. |
Jeffrey Lee (213) 6048 posts |
It won’t.
I’d check the PSR as well; it’s possible the instruction is undefined because it’s dropped into Thumb mode.
It looks like GCC does have an equivalent of Norcroft’s register void *RO_Base asm ("v8"); https://gcc.gnu.org/onlinedocs/gcc/Global-Register-Variables.html However that may end up generating an error due to it conflicting with one of the APCS registers (I think v8 is typically the APCS frame pointer?). Possibly you’ll be able to fix this with
Ultimately it doesn’t really matter what register RO_Base is stored in, since the HAL can always locate the OS image directly. But when the MMU is enabled and you need to keep track of the HAL workspace pointer (which will be in v6) it’s going to be important to have a tried-and-tested way of preserving the register (otherwise you’d need stubs for almost every call into C, to allow the workspace pointer to be supplied as a function argument). (Note also that using v6 for both RO_Base and the HAL workspace pointer should be fine, since one is only used pre-MMU and the other is only used post-MMU) |
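A minimal sketch of what the GCC form might look like, assuming the pointer is pinned to r9 (which is v6 in APCS naming). The variable and function names are hypothetical, and whether GCC accepts a given register without complaint depends on the target and build options, so treat it as a starting point rather than a tested recipe. On anything other than GCC targeting ARM it falls back to an ordinary static so the file still compiles.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && !defined(__clang__) && defined(__arm__)
/* GCC global register variable: must be declared before any function
   definitions in the file and cannot have an initialiser. */
register void *hal_workspace __asm__("r9");
#else
/* Fallback for other compilers and hosts: a plain variable. */
static void *hal_workspace;
#endif

/* Record the workspace pointer (hypothetical helper). */
void set_hal_workspace(void *ws)
{
    assert(ws != NULL);
    hal_workspace = ws;
}

/* Fetch it again from C without needing it passed as an argument. */
void *get_hal_workspace(void)
{
    return hal_workspace;
}
```

The point of the register form is that C code reached from assembler entry points can pick up the workspace pointer directly, without a stub per call to pass it in as an argument.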
Tristan M. (2946) 1039 posts |
It didn’t drop to thumb. Only reason I didn’t include the reg dump was I’m typing this on the PC. I think before any of these issues are tackled, the big, big issue of the landing points of branches missing their target at build time needs to be tackled. Not sure if I mentioned but I have a second build setup. It’s the same tree as my C one except the HAL is a different directory. It’s kind of my testbed. I just want some confirmation on a question I asked at an earlier point. An idea came to me. I can build a complete ROM. The ROM is large. The HAL which isn’t very capable yet needs RO to make it more than a few instructions in. While I’m here. For posterity, the machid for the Orange Pi PC is 0x1029. |
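On the Thumb check discussed above: given a PSR value from a register dump, the test comes down to one bit. The sketch below uses the standard ARM PSR layout (T is bit 5, the mode field is bits 4:0); the function names are illustrative only. For an undefined instruction exception, the value to look at is the saved PSR of the code that faulted, not the PSR of the handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSR_T_BIT     (1u << 5)   /* set when the CPU was in Thumb state */
#define PSR_MODE_MASK 0x1Fu       /* processor mode field, bits 4:0 */

static_assert(PSR_T_BIT == 0x20u, "T flag is bit 5 of the PSR");

/* True if the saved PSR says the faulting code was running as Thumb. */
bool psr_is_thumb(uint32_t psr)
{
    return (psr & PSR_T_BIT) != 0;
}

/* Short name for the processor mode held in the PSR. */
const char *psr_mode_name(uint32_t psr)
{
    switch (psr & PSR_MODE_MASK) {
    case 0x10: return "USR";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "SVC";
    case 0x17: return "ABT";
    case 0x1B: return "UND";
    case 0x1F: return "SYS";
    default:   return "???";
    }
}
```

A dump showing a PSR ending in &13 would be SVC mode in ARM state; the same mode with the T bit set would end in &33.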
Jeffrey Lee (213) 6048 posts |
No.
Yes, that should work fine. While I’m here. For posterity, the machid for the Orange Pi PC is 0x1029. Allwinner’s codename for the H3 is sun8i. Meanwhile, http://www.arm.linux.org.uk/developer/machines/ shows that ID 0x1029 (i.e. 4137) is for the sun6i. So it sounds like they’ve just re-used a generic ID of theirs. As more and more platforms start to make use of device trees, it should become less and less necessary for devices to be given unique(-ish) machine IDs (for device tree-supporting OSes, at least!) |
Tristan M. (2946) 1039 posts |
I just wanted to say this isn’t dead. I’m restarting, in a sense. I’ve found more necessary information and learned some more. Also I worked out how to use tftp for loading binary images in U-Boot, which has simplified everything. A couple of days ago I transferred some of my work to a new OS build tree. Something was fatally damaged in the old one preventing it from building. No loss. The source from that period was pretty finicky. Although I’ve only just started, what I’ve got builds and executes without issue. This is a very good thing. Would it be possible for someone knowledgeable in the HAL to do a small update to the RISCOS_InitARM page, please? |
Jeffrey Lee (213) 6048 posts |
There aren’t any defined flags. SBZ = “Should be zero” |
Tristan M. (2946) 1039 posts |
…Oh. I see now. I feel silly. I saw that, but completely misunderstood what I was looking at.
Serves me right for doing this while I’m sick. I guess I interpreted it as:
As always Jeffrey, thanks for putting me back on the right path. |
Tristan M. (2946) 1039 posts |
Using the OMAP3 source as a guide is a double-edged sword. It’s useful, but unpicking what’s going on with all the macros is difficult because of all the variants, and the complexity of the OMAP SoCs. On the other end of things, the AllWinner SoCs are what I’d call a flat, monolithic design. No layers, no weird memory tricks like the BCMnnnn. It’s very simple. |
Jeffrey Lee (213) 6048 posts |
It’s worth bearing in mind that you don’t need the UART entry points in order to boot the system.
For “minimal set of ROM modules”, I think you’ll want the following:
It’s probably easiest to start with a known-good components file (OMAP, Pi, doesn’t really matter), and when disabling the unwanted modules just mark them as “-type EXP” so that they’re still involved in the header/lib export phase. That way you won’t have to deal with build errors due to missing headers. Bearing in mind that the OMAP3 port is the only port where I’ve been there from day one, it’s possible the above isn’t completely accurate. But I’m pretty confident that you can get quite far (or at least far enough for the OS to start giving you error messages) with just HAL_Init, HAL_DebugTX, HAL_DebugRX and HAL_KbdScanDependencies, and that above list of modules. And if/when you do get things working, report back here so that any documentation can be updated accordingly. |
Tristan M. (2946) 1039 posts |
1. Good to know. Already got that for HAL_UART* entries, but nothing else yet. The rest are just NullEntry, which now I look may actually be exactly that anyway.
2. That’s very useful information. As it is I can send my code blindly into OS_Start where there is a perceptible pause before the CPU resets. I haven’t bothered with anything on the far side of OS_Start yet. It’s mostly just there as an end point for the things I’m trying to get working which never gets executed because of an infinite loop I put in beforehand.
3. That doesn’t seem too hard. I’ve already got code that does most of #1, just not within the HAL framework. Do you mean that HAL_KbdScanDependencies can just be a stub which returns -1, or does there need to be more going on?
4. I’m genuinely surprised it can make it as far as the command prompt at all without these. My current tree is based off OMAP3 from about a week ago. Most of the modules are disabled. I think I can strip a little further but not much. It seems silly but to make things easier for myself I made a simple Obey file that pops open filer windows for all the necessary component / module / etc directories so I can tweak all that in a hurry if I want to.
Right now my ROMlet isn’t being copied anywhere at boot. I’m leaving it near the bottom of RAM at 0x41000000 (I could probably shift it down near 0x40000000) where it’s safe from being clobbered by anything in U-Boot or when I turn the MMU back off.
I was chasing a nasty build bug since last night. Finally got it today. Long story short, I removed my version of the board.* files from my code. It’s just not needed. I am a little hung up on how to enumerate the UARTs in a neat way. Also, how do I treat SRAM? |
Jeffrey Lee (213) 6048 posts |
It’s probably worth concentrating on getting OS_Start working, if it’s currently failing. No point having a UART driver which will never be called!
That should be fine, at least for now. Later on you might find you have limited screen memory available, due to the way the kernel sets aside the chunk of RAM that’s used for video memory (unless told otherwise, it takes
Ignore it for now. The kernel does try to (half-heartedly) prioritise using fast RAM over slow RAM, but with most systems I don’t think there’s enough SRAM for it to make any significant difference. There are also some limits on how small memory chunks can go until the kernel breaks – and I don’t think anyone knows for certain what those limits are (probably somewhere around 256KB). |
Tristan M. (2946) 1039 posts |
Who knows if it’s working or failing at this point. It has nowhere to go, and no way to show what it’s doing.
Is there something besides OS_AddRAM that defines the location of video memory? I was just going to slap it where U-Boot has it defined for the time being. It’s not like any of it is hard to shift around later. I guess maybe the stack could be shoved in SRAM. Possibly. |
Jeffrey Lee (213) 6048 posts |
There are two kinds of video memory – there’s the kernel-managed video memory, and there’s driver-managed video memory. The location of kernel-managed VRAM will be taken from one of the chunks passed into OS_AddRAM. You can influence the kernel’s logic by specifically flagging a chunk as VRAM (bit 0), or as unsuitable for DMA (bit 7). Theoretically bit 1 does something as well, but I’m not sure if that’s working (nothing uses it). Pages of kernel-managed VRAM which aren’t currently in use by the screen can be reused as regular RAM, which was obviously very useful on early machines (both 32bit and 8bit) where memory was much scarcer. The location of driver-managed VRAM is indicated via GraphicsV 9 / HAL_VideoFramestoreAddress. Spare pages of driver-managed VRAM can’t be used as regular RAM, since as far as the kernel is concerned it isn’t actually RAM (the kernel treats it as arbitrary IO space, like the hardware registers used by all the CPU’s peripherals). However the key thing is that the kernel will always reserve a chunk of RAM as kernel-managed VRAM, even if no video drivers make use of it. (This is on my long list of things to fix)
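The two flag bits described above could be written down in C along these lines. The macro and function names are invented for this sketch and are not taken from the RISC OS headers; only the bit positions (bit 0 for VRAM, bit 7 for unsuitable for DMA) come from the description here, and bit 1 is left out since its behaviour is uncertain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the OS_AddRAM flag bits mentioned in the post. */
#define ADDRAM_FLAG_VRAM    (1u << 0)   /* chunk is to be treated as VRAM */
#define ADDRAM_FLAG_NO_DMA  (1u << 7)   /* chunk is unsuitable for DMA */

static_assert((ADDRAM_FLAG_VRAM & ADDRAM_FLAG_NO_DMA) == 0, "flags must not overlap");

/* Build the flags word for one chunk of RAM. */
uint32_t addram_flags(bool is_vram, bool dma_capable)
{
    uint32_t flags = 0;

    if (is_vram)
        flags |= ADDRAM_FLAG_VRAM;
    if (!dma_capable)
        flags |= ADDRAM_FLAG_NO_DMA;
    return flags;
}
```

A chunk set aside as sacrificial video memory would be flagged with addram_flags(true, true), while ordinary DMA-capable RAM would pass zero.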
Yes, placing the initial stack in SRAM would be a good way of making sure RISC OS doesn’t clobber it. You could also use it to pass information over the OS_Start call, since there’s no proper way of doing that at the moment. E.g. for OMAP machines without physical NVRAM, the U-Boot script is set up to load the CMOS file into SRAM. Then in HAL_Init the HAL maps in that memory and checks to see if valid-looking CMOS is there so that it can offer it to the OS. |
Michael Grunditz (467) 531 posts |
The debug tool *Memory is very cool and useful! |
Tristan M. (2946) 1039 posts |
I’ve been stuck for a bit. It goes in to OS_Start and never comes out. No idea what happens when it goes in there. I guess I’ll need to sprinkle some debug characters around in kernel.s to work out where it’s choking. I wanted to ask about OS_AddRAM. The end address of a block is exclusive. So would that mean as a quick example
I realised my static workspace was actually an ethereal workspace too. Not that it has much use at this point. Does it? At least it’s shackled to an address now. I fed AddRAM a chunk of sacrificial VRAM just in case. Actually it is the location of the U-boot framebuffer which I don’t imagine would change unless acted upon. Interestingly it’s parked in the top of RAM with a little gap up top where the U-Boot stack lives. |
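On the exclusive end address: one way to picture it is as a half-open range, where the end is the first address after the block, so end minus start gives the size. The sketch below uses made-up type and function names, and the figures in the note after it (RAM starting at 0x40000000, 1 GB fitted) are assumptions for illustration rather than anything confirmed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A block of physical RAM as the half-open range [start, end). */
typedef struct {
    uint32_t start;   /* first byte of the block */
    uint32_t end;     /* first byte AFTER the block (exclusive) */
} ram_chunk;

/* Build a chunk from a base address and a size in bytes.
   Note: a chunk reaching the very top of the 32-bit space would wrap
   end to 0, which this sketch does not try to handle. */
ram_chunk chunk_from_size(uint32_t start, uint32_t size)
{
    ram_chunk c;

    c.start = start;
    c.end = start + size;
    assert(c.end > c.start);
    return c;
}

/* Size in bytes: no "+ 1" needed because end is exclusive. */
uint32_t chunk_size(ram_chunk c)
{
    return c.end - c.start;
}

/* Address of the last byte that actually belongs to the chunk. */
uint32_t chunk_last_byte(ram_chunk c)
{
    return c.end - 1u;
}

/* True if addr lies inside the chunk; the end address itself does not. */
bool chunk_contains(ram_chunk c, uint32_t addr)
{
    return addr >= c.start && addr < c.end;
}
```

With those assumed figures, chunk_from_size(0x40000000, 0x40000000) gives an end of 0x80000000, and the last usable byte is at 0x7FFFFFFF.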