VFP advice/tutorial
Chris Gransden (337) 1207 posts |
The VFP/NEON support in gcc 4.7.4 for RISC OS isn’t complete yet. A further patch is required. Even with the patch there is no support for the VFP/NEON ‘hard’ abi as this is already used by the old FPE/FPA support in the compiler. You’re getting that error as the option you’ve specified compiles the source as ‘hard-float fpa’ but it is linking to the soft-float fpa version of unixlib. It’s normally used when compiling and linking to CLIB. |
Jan Rinze (235) 368 posts |
Ok. good to know. Hopefully VFP support and NEON will arrive soon. There are a couple of apps that can benefit from that. FPE/FPA support seems a bit outdated i.m.h.o. There are very few systems with FPA anyway. Soft float should perform quite well on old systems and newer systems would have VFP/NEON. Since the compiler does allow outputting to assembler, is it possible to use VFP regardless of the support in ld? |
Jan Rinze (235) 368 posts |
Drat! this surely must be a misunderstanding with gcc somewhere.. |
Chris Gransden (337) 1207 posts |
If you want to build an executable that uses VFP, the following will work:
gcc -mno-unaligned-access -mfloat-abi=softfp -mfpu=vfp -o demo demo.c
You’ll probably get an abort when you run it, as the necessary support in unixlib is missing. |
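One way to see which floating-point configuration a given set of options actually selects is to look at GCC's predefined macros. The sketch below is only an illustration: `fp_build_description` is a hypothetical helper name, and although `__SOFTFP__`, `__VFP_FP__` and `__ARM_PCS_VFP` are standard GCC ARM macros, how the RISC OS port of gcc 4.7.4 sets them for its FPA modes has not been checked here.

```c
#include <stddef.h>

/* Hypothetical helper: describes the floating-point setup the compiler
 * was asked for, based on GCC's predefined ARM macros. */
const char *fp_build_description(void)
{
#if defined(__SOFTFP__)
    /* -mfloat-abi=soft: FP is done by library calls, no FP instructions */
    return "software floating point";
#elif defined(__ARM_PCS_VFP)
    /* VFP registers are used for passing FP arguments */
    return "VFP instructions, hard-float VFP calling convention";
#elif defined(__VFP_FP__)
    /* e.g. -mfloat-abi=softfp -mfpu=vfp */
    return "VFP instructions, soft-float calling convention (softfp)";
#else
    /* assumption: this is where an FPA build or a non-ARM target ends up */
    return "no VFP macros defined";
#endif
}
```

Printing the returned string from a small test program built with each option set should make it clearer whether an object is FPA or VFP before it reaches the linker.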
Jan Rinze (235) 368 posts |
hmm.. and adding the context calls mentioned in this thread? would that make any difference? Building with: gcc -O3 -march=armv7-a -mcpu=cortex-a9 -mlibscl -o demo demo.c works. Not sure if it matters much but at least it is something :-) |
Jan Rinze (235) 368 posts |
gee.. textile is weird. |
Jan Rinze (235) 368 posts |
adding the context calls mentioned in this thread? would that make any difference? Building with: gcc -O3 -march=armv7-a -mcpu=cortex-a9 -mlibscl -o demo demo.c works. Not sure if it matters much but at least it is something :-) |
Chris Gransden (337) 1207 posts |
I’ve not tried it but I assume so.
That works but it is using the hard float FPA/FPE. |
Steve Drain (222) 1620 posts |
No. I seem to be able to do VMOV and VLDR, but anything that actually manipulates numbers causes ‘undefined instruction’ errors at the next normal ARM instruction. Looking at Michael Kubel’s code, written for ExtASM, he seems to be stacking registers and storing R13, but I cannot work out the logic of this.
In the absence of information about the VFP instructions supported by the BASIC assembler, I took the hint and now have a copy of the UAL Quick Reference Card, rather than the earlier one. However, StrongED disassembles using the earlier one; is Fred looking at this? ;-)
Also, can the “VFPSupport” SWIs be called at the start and end of the BASIC program, as Rick had it, or must they be called from the assembled code, as Michael has it? I have tried both, but there does not seem to be a difference when I am just using the instructions that I can get to work.
Again, after the diversions into Rick’s work and the C compiler, can I ask for some example BASIC assembler that works? Thanks |
Fred Graute (114) 645 posts |
Yep. :-) StrongED simply uses SWI Debugger_Disassemble to disassemble instructions so the Debugger module needs to support the new format. A quick look at the sources suggests that VFP disassembly hasn’t been updated yet. |
Rick Murray (539) 13840 posts |
+1 What he said… |
Jan Rinze (235) 368 posts |
For the record, Unixlib using softfloat is a bit faster than SharedClib with gcc 4.7.4. P.S. the demo is a graphics demo which runs in 1280×720 at a reasonable framerate (2x Utah Teapot with zbuffer and gouraud shading and some spheres.) |
Jan Rinze (235) 368 posts |
somehow gcc and ld don’t understand each other properly.. ld: error: demo.o uses VFP instructions whereas demo does not
|
Jan Rinze (235) 368 posts |
Interesting work-around for the above: compile with -S to get assembler and use the .s files to build the executable.
gcc -o demo demo.s works well and results in faster executable :-) (there are some complaints in the output about the assembler though) |
Jeffrey Lee (213) 6048 posts |
Either should be fine.
Correct. A new disassembler is in the works, but isn’t quite ready yet. However since everyone seems to have come down with a case of VFP fever, I could prioritise it and probably get VFP/NEON disassembly in and working sometime in this coming week? (The full disassembly engine will have to wait a bit longer, as it is a bit chunky, and Sprow’s recently updated the original code with most of the missing ARM instructions anyway) |
Jan Rinze (235) 368 posts |
For those interested: the demo, but no FP yet. When / if VFP arrives this may get a bit faster. Also with NEON it could improve quite a bit. |
Jan Rinze (235) 368 posts |
On a side note, it runs at about 25 fps on my Panda and about 17 fps on the Beagleboard. Due to performance measurement wait-for-vsync is not enabled. |
Jan Rinze (235) 368 posts |
oh, the obligatory screenshot: |
Chris Gransden (337) 1207 posts |
I get screen mode not available when I run it. If you can send me the source code I can compile it for VFP/NEON. |
Jan Rinze (235) 368 posts |
@Chris: it runs in 1280×720. So you need at least one 1280×720 16M colour mode. |
Chris Gransden (337) 1207 posts |
My monitor aspect ratio is 16:10 so I didn’t have a 1280×720. Only 1280×800.
I assumed you were using single or double precision floating point and/or all the code was in C.
I’m running it on a Pandaboard ES.
I just use GCCSDK on Linux patched so that it produces working executables using VFP/NEON. |
Jan Rinze (235) 368 posts |
@Chris: due to no VFP/NEON i decided to write it in fixed-point math. running the demo in 1920×1080 shows that although the same amount of pixels are rendered the framerate drops dramatically. |
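For readers unfamiliar with fixed-point math, here is a minimal sketch of the general idea in C. It assumes a 16.16 format; the demo's real number format is not stated in the thread, and the `fix16` type and `fix_*` helper names are made up for illustration.

```c
#include <stdint.h>

/* Assumed 16.16 format: upper 16 bits integer part, lower 16 bits fraction. */
typedef int32_t fix16;

#define FIX_SHIFT 16
#define FIX_ONE   ((fix16)1 << FIX_SHIFT)   /* the value 1.0 */

/* Convert a small integer to fixed point (multiply avoids shifting negatives). */
static inline fix16 fix_from_int(int32_t v) { return (fix16)(v * FIX_ONE); }

/* Convert back to an integer, truncating toward zero. */
static inline int32_t fix_to_int(fix16 v) { return v / FIX_ONE; }

/* Multiply: widen to 64 bits so the intermediate product cannot overflow,
 * then drop the extra 16 fraction bits. Relies on gcc's arithmetic right
 * shift for negative values. */
static inline fix16 fix_mul(fix16 a, fix16 b)
{
    return (fix16)(((int64_t)a * (int64_t)b) >> FIX_SHIFT);
}

/* Divide: pre-scale the dividend so the quotient keeps 16 fraction bits.
 * The caller must make sure b is not zero. */
static inline fix16 fix_div(fix16 a, fix16 b)
{
    return (fix16)(((int64_t)a * FIX_ONE) / b);
}
```

With these helpers, `fix_mul(FIX_ONE / 2, fix_from_int(4))` gives the fixed-point value for 2, using only integer instructions, which is why this approach works on machines without usable FP hardware.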
Jan Rinze (235) 368 posts |
Updated !teapot to do triple buffering. Now 640×480 looks ok and does 42 fps without flicker. If the download doesn’t work, i might be testing the app on the beagle (also webserver). |
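The bookkeeping behind triple buffering can be sketched as below. This is not !teapot's code, which is not shown in the thread; the `triple_buffer` struct and `tb_*` functions are hypothetical names, and the actual screen bank switching calls are deliberately left out.

```c
/* Hypothetical state for three screen banks numbered 0, 1 and 2. */
typedef struct {
    int displayed; /* bank currently shown on screen */
    int ready;     /* finished frame waiting for the next vsync, -1 if none */
    int drawing;   /* bank the renderer is writing into */
} triple_buffer;

void tb_init(triple_buffer *tb)
{
    tb->displayed = 0;
    tb->ready = -1;
    tb->drawing = 1;
}

/* Call when the renderer has finished a frame. The finished bank is queued
 * and the renderer moves straight on to a free bank, so it never waits. */
void tb_frame_done(triple_buffer *tb)
{
    int old_ready = tb->ready;
    tb->ready = tb->drawing;
    if (old_ready >= 0) {
        /* A queued frame was never shown: drop it and reuse its bank. */
        tb->drawing = old_ready;
    } else {
        /* Bank numbers sum to 3, so this picks the one remaining bank. */
        tb->drawing = 3 - tb->displayed - tb->ready;
    }
}

/* Call at vsync. Flips to the queued frame if there is one and returns
 * the bank that should now be displayed. */
int tb_vsync(triple_buffer *tb)
{
    if (tb->ready >= 0) {
        tb->displayed = tb->ready;
        tb->ready = -1;
    }
    return tb->displayed;
}
```

Because the displayed bank only changes at vsync and is never the bank being drawn into, tearing and flicker are avoided while the renderer can still run faster than the display refresh.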
Chris Gransden (337) 1207 posts |
I get 54fps at 1280×720 and 32fps at 1920×1200. It flickers and tears a lot. More so the higher the frame rate. I’ve sent the patch via email. |
Jan Rinze (235) 368 posts |
@Chris: the latest version triple buffers so won’t flicker anymore. |