Safeguarding the past, present and future of RISC OS for everyone

FracNEON on 64Bit Linux

4 posts, 2 voices

Jun 12, 2021 8:36am Kuemmel (439) 384 posts	Meanwhile I ported my good old FracNEON single precision benchmark from aarch32 to aarch64 so it runs on 64-bit Linux. It uses some C++/SDL2 to display the results, but the benchmark itself is written in assembler. There are now 3 different versions, showing 3 different optimisation possibilities for single-core coding. I hope to implement multithreading some day as well. I did not only port the initial FracNEON, which is now called ‘opt1’; I also created new versions from scratch with 2 and 3 independent instruction blocks (opt2 and opt3), still calculating exactly the same thing (iterations/result). The doubled number of NEON registers available on aarch64 makes this possible. This ended up as more than 1000 lines of assembler to catch all the events when you iterate 12 Mandelbrot pixels in 3 NEON paths in the main loop at the same time…got me some nightmares when bug hunting ;-) …but especially the results (thanks to Chris) on virtual Ubuntu on an Apple M1 show what huge potential there is in using this coding technique at the lowest level on the latest cores. ‘opt3’ is like 137% faster than ‘opt2’. On the RPI4 there are only small gains. You’ll find everything on my homepage, including some graphs and source, here. If you have some other 64-bit Linux ARM device, I’m looking forward to other measurement data. Still hoping to do that someday on RISC OS :-) though I can’t complain about my first experience coding on Linux.
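For readers who want a feel for the "independent instruction blocks" idea without reading assembler, here is a rough sketch in portable C++. It is not the FracNEON source: the names `Vec4`, `mandel_scalar` and `mandel_blocks` are made up for illustration, a plain 4-float array stands in for one 128-bit NEON register, and finished pixels are simply masked off instead of being replaced with fresh ones as a real benchmark loop would probably do. `Blocks = 3` corresponds to 12 pixels in flight in 3 paths.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one 128-bit NEON register holding 4 floats.
constexpr std::size_t kLanes = 4;
using Vec4 = std::array<float, kLanes>;

// Scalar reference: number of iterations before |z|^2 exceeds 4.
int mandel_scalar(float cr, float ci, int max_iter) {
    float zr = 0.0f, zi = 0.0f;
    int n = 0;
    while (n < max_iter) {
        float zr2 = zr * zr, zi2 = zi * zi;
        if (zr2 + zi2 > 4.0f) break;
        zi = 2.0f * zr * zi + ci;
        zr = zr2 - zi2 + cr;
        ++n;
    }
    return n;
}

// Iterates Blocks * 4 pixels together. The blocks share no data, so a
// compiler (or hand-written assembler) is free to interleave their
// instructions and hide the latency of each multiply/add chain.
template <std::size_t Blocks>
std::vector<int> mandel_blocks(const std::vector<float>& cr,
                               const std::vector<float>& ci, int max_iter) {
    assert(cr.size() == ci.size());
    assert(cr.size() % (Blocks * kLanes) == 0);
    std::vector<int> out(cr.size(), 0);
    for (std::size_t base = 0; base < cr.size(); base += Blocks * kLanes) {
        std::array<Vec4, Blocks> zr{}, zi{};  // all lanes start at z = 0
        std::array<std::array<bool, kLanes>, Blocks> active;
        for (auto& a : active) a.fill(true);
        bool any = true;
        for (int n = 0; n < max_iter && any; ++n) {
            any = false;
            for (std::size_t b = 0; b < Blocks; ++b) {      // independent paths
                for (std::size_t l = 0; l < kLanes; ++l) {  // lanes of one path
                    if (!active[b][l]) continue;
                    const std::size_t p = base + b * kLanes + l;
                    const float zr2 = zr[b][l] * zr[b][l];
                    const float zi2 = zi[b][l] * zi[b][l];
                    if (zr2 + zi2 > 4.0f) {  // pixel escaped: mask the lane off
                        active[b][l] = false;
                        continue;
                    }
                    zi[b][l] = 2.0f * zr[b][l] * zi[b][l] + ci[p];
                    zr[b][l] = zr2 - zi2 + cr[p];
                    ++out[p];
                    any = true;
                }
            }
        }
    }
    return out;
}
```

Calling `mandel_blocks<1>`, `mandel_blocks<2>` and `mandel_blocks<3>` on the same coordinates should give identical iteration counts; only the amount of independent work per loop pass changes, which is the part a wide out-of-order core can exploit.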

Jun 12, 2021 10:49am Colin Ferris (399) 1814 posts	I wonder if anyone has managed to jump between 32-bit and 64-bit – you know the British are mad – but we can but try :-)

Jun 16, 2021 7:37pm Kuemmel (439) 384 posts	Updated the results with numbers from an Odroid N2+ (Cortex A73) and, thanks to Chris, from that 80-core monster, the Ampere Altra (Neoverse N1). The Neoverse is initially best for the ‘opt1’ variant and then loses big time against the Apple M1, but generally shows the same trend for the optimisations and a big step compared to A72/73. Kind of funny to use only 1 of 80 cores :-) I guess it runs almost idle ;-) I’ll probably convert the benchmark to double precision, as ARMv8 supports that on NEON in contrast to ARMv7, and then I’ll see if I can get my mind into multicore coding on Linux some day…while I wait for RISC OS.

Aug 4, 2021 7:27am Kuemmel (439) 384 posts	Meanwhile I created a NEON double precision version for Linux (a feature only possible with 64-bit aarch64). It runs at more or less half the speed of the single precision NEON version. But of course it’s more ‘realistic’ to use double floats with Mandelbrot sets, as you’ll need the precision if you go deeper into the set. You’ll find it on the same page here. I also added text-only versions for single and double, so people can test even if they only have a command line Linux running. Thanks again to Chris for testing on Apple M1 and Neoverse. When I look at the double precision results and compare them to similar code/results I did for x86, it’s even more clear that Apple reached or even exceeded the performance of x86 in terms of floating point efficiency. Next step is to implement threading/multicore…
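Both points in the post above (roughly half the speed, and needing the precision at deeper zoom) can be illustrated with a small C++ sketch. The helper names `lanes_per_q_register` and `distinct_coords` are hypothetical and not part of the benchmark; the centre and step values are picked only as an example of a deep zoom.

```cpp
#include <cstddef>

// A 128-bit NEON register is 16 bytes wide, so it holds 4 floats but only
// 2 doubles: half the pixels per instruction in double precision.
template <typename Real>
constexpr std::size_t lanes_per_q_register() {
    return 16 / sizeof(Real);
}

// Counts how many of `width` neighbouring pixel x-coordinates, starting at
// `centre` and spaced `step` apart, are still distinguishable in type Real.
template <typename Real>
std::size_t distinct_coords(Real centre, Real step, std::size_t width) {
    std::size_t distinct = 1;
    Real prev = centre;
    for (std::size_t i = 1; i < width; ++i) {
        const Real x = centre + static_cast<Real>(i) * step;
        if (x != prev) {  // the coordinate moved to a new representable value
            ++distinct;
            prev = x;
        }
    }
    return distinct;
}
```

With a centre of -0.75 and a pixel spacing of 1e-9, `distinct_coords<float>(-0.75f, 1e-9f, 8)` should report a single distinct coordinate, because the spacing is far below what a float can resolve near 0.75, while `distinct_coords<double>(-0.75, 1e-9, 8)` should keep all 8 apart. At that zoom a single precision image would collapse into identical columns.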
