VIDC1 / RO3.1 frame store emulation
Jon Abbott (1421) 2651 posts |
One solution I suppose is to trigger my own VSyncs, at 50Hz. How can I block the GPU from passing its VSync through to the OS though? |
Jeffrey Lee (213) 6048 posts |
You’ll receive the GraphicsV 1 call before the OS does, so all you need to do is to claim it (e.g. set R4 to zero) and that should stop the OS from seeing it. Just remember to add a flag to your code so that you can pass through the call that your code generates for the fake VSync! You’ll probably have to use a fake VSync for the Pi anyway, as we don’t have any control over the mode which the hardware uses – all that happens when we change screen mode is the GPU resizes the overlay that we use as our framebuffer. So we don’t have any control over the mode timings. |
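A minimal sketch of a handler doing that, assuming it has been installed with OS_Claim on GraphicsV (vector &2A); the labels and the flag are illustrative, not anything taken from ADFFS:

    gv_handler
            AND     R12, R4, #&FF           ; low bits of R4 carry the GraphicsV reason code
            TEQ     R12, #1                 ; GraphicsV 1 = VSync occurred?
            MOVNE   PC, LR                  ; not a VSync, pass the call on untouched
            LDR     R12, fake_vsync_flag
            TEQ     R12, #0
            MOVNE   PC, LR                  ; our own fake VSync, let the OS see it
            ; ...do the real-VSync work (e.g. blit DA2 to the GPU framestore) here...
            MOV     R4, #0                  ; claim it, so the OS never sees this VSync
            MOV     PC, LR
    fake_vsync_flag
            DCD     0                       ; set non-zero around the fake GraphicsV 1 this code generates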
Jon Abbott (1421) 2651 posts |
I’ll go the fake VSync route then, as it allows VSync to be triggered with the same timing a VIDC1/20 would, based on the requested pixel rate etc. |
Rick Murray (539) 13840 posts |
:rolls eyes:

A resource I have (for knocking VGA out of a microcontroller) times it for 640×480 as the VSync starting a minimum of 0.45ms after the last line. The pulse stays low for 64μs, and then the first line of the next frame will begin a minimum of 1.02ms after VSYNC ends. Obviously this changes depending on the resolution/refresh rate.

LCD monitors using analogue VGA use exactly the same mechanism as CRT monitors. Here’s a shocker. So do digital monitors – the HDMI spec describes HSYNC and VSYNC signalling (in the TMDS data, not as discrete signals), as it is the device’s way of knowing when new lines and frames begin. Even with 480p (525 lines, NTSC style) an HDMI signal has a VBI period.

Looking at the above, 0.45ms prior and 1.02ms after, it would seem logical that the VSYNC would be triggered close to the end of the current frame. It serves a purpose lost in history (there’s no need to wait for an electron beam to fly back, and indeed most modern CRT monitors could do it a lot quicker than the sync allowed for), however it is also useful for the programmer as a time to draw into screen buffers or switch buffering or… Without this, computer games would look really poor.

When, and how, the sync signals occur is an important thing, and is precisely defined in the relevant standards.

tl;dr summary: yes, it’s important. |
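To put numbers on it, the standard 640×480@60Hz timing uses a 25.175MHz pixel clock with 800 clocks per line and 525 lines per frame, so roughly:

    line time         ≈ 800 ÷ 25.175MHz ≈ 31.8µs
    vertical blanking ≈ (525 − 480) × 31.8µs ≈ 1.4ms per frame

which is in the same ballpark as the 0.45ms + 64µs + 1.02ms minimums quoted above, and is the window the programmer gets to do that buffer work in.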
Rick Murray (539) 13840 posts |
? The code says:
And the Pi forum says:

Note the GPU uses timers 0 and 2. 3 is reserved for linux, so would be most suitable for a bare metal OS. 1 is currently unallocated, so could be used.

Which might restrict your options slightly… |
Theo Markettos (89) 919 posts |
Rick, there are two sets of timings: GTF, which is designed for CRTs, and CVT, which is intended for LCDs. The advantage of CVT is that more of each frame is spent plotting pixels and less is dead time, which is quite handy because it reduces the required video/pixel bandwidth for a given frame rate. While this is bad news for the programmer (less blanking time to work in), it means machines that are limited by bandwidth (like the Risc PC and perhaps the BeagleBoard) can squeeze out larger modes. I’m not sure if it’s useful for VIDC1 machines as they’re typically limited by video RAM size anyway (unless you fancy black and white ‘high res’ modes). |
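The arithmetic behind that: the pixel clock a mode needs is total columns × total lines × refresh rate, blanking included. Using the standard 1280×1024@60Hz timing (1688×1066 including blanking) as an illustration:

    pixel clock = 1688 × 1066 × 60Hz ≈ 108MHz

Shrink the blanking totals and the same visible resolution at the same refresh rate needs a lower clock, which is where the bandwidth saving comes from.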
Jeffrey Lee (213) 6048 posts |
Can you simply claim the first timer not in use?

Nothing I’ve said contradicts the sources. The HAL exposes GPU timers 1 and 3 to the OS. But since the HAL API requires timers to be numbered sequentially starting from zero, the HAL renumbers them such that HAL timer 0 corresponds to GPU timer 1, and HAL timer 1 corresponds to GPU timer 3. On all systems HAL timer 0 is used by the OS for the centisecond timer, and the other timers (however many there may be) are free for other software to use. |
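Spelling the renumbering out, combining the Pi forum note above with Jeffrey’s description:

    GPU timer 0 – used by the GPU firmware (not exposed)
    GPU timer 1 – HAL timer 0 – OS centisecond timer
    GPU timer 2 – used by the GPU firmware (not exposed)
    GPU timer 3 – HAL timer 1 – free for other software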
Rick Murray (539) 13840 posts |
Thank you for the clarification. |
Rick Murray (539) 13840 posts |
Ah, the “reduced blanking” option; the Pi uses that in monitor-style modes to run an insanely high (120Hz?) refresh.
No thanks. I used to do DTP work in MODE 23 on a rather underpowered machine way back when. Mmm… Wasn’t that the one where the screen would blank out while accessing the floppy disc? ;-) |
Jeffrey Lee (213) 6048 posts |
FYI: I’ve just thrown a bit of a spanner in your works by checking in this set of changes. Specifically, you’ll have to watch out for the following things stopping your code from working:
Once that’s done you should probably consider making ADFFS register itself as a proper GraphicsV driver instead of just intercepting calls to the original one. This will be the best for long-term compatibility, although in the short term I suspect there’ll still be some changes I’ll be making which will break things (e.g. for multiple head support your code will probably need to be aware of which head it should be writing stuff to). There’s a brief overview at the bottom of the GraphicsV page for how the registration/deregistration process works. |
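From memory the registration goes through OS_ScreenMode; the reason codes and register usage below are recalled rather than checked, so verify them against the GraphicsV page Jeffrey mentions before relying on them:

    ; Register as a GraphicsV driver (sketch only - verify against the GraphicsV docs)
            MOV     R0, #64                 ; OS_ScreenMode 64: register graphics driver
            MOV     R1, #0                  ; flags
            SWI     XOS_ScreenMode
            STR     R1, driver_number       ; the OS hands back a driver number
                                            ; (in a real module this would live in workspace)
    ; In the GraphicsV handler, only act on calls addressed to this driver
            LDR     R12, driver_number
            TEQ     R12, R4, LSR #24        ; driver number sits in the top byte of R4
            MOVNE   PC, LR                  ; not for us, pass the call on
    driver_number
            DCD     0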
Jon Abbott (1421) 2651 posts |
I should really get the code over to you to look at; you could potentially include it in RO. I’m not actually doing much – the OS does all the work as it switches the frame buffer to DA2. All my code does is force the GPU to mirror the OS MODE, but in 8 bits, and copy the frame buffer from DA2 to the GPU frame buffer with conversion to 8-bit when the GPU triggers a VSync. There’s a bit of code to copy palette changes, to speed up the conversion, but apart from that the OS is doing all the work.

The one thing it does need is a means to convert the physical address of the GPU frame buffer to its logical address and vice versa. Extending existing SWIs to support IO memory would be the most sensible route.

I’ve not actually touched the code for a few weeks. I put it on hold whilst I code the ARM3 JIT, as I need some overscan games running on the Pi to check the blitter routines take account of the VIDC1/20 registers correctly and can add mid-frame palette change support. The JIT is now up to StrongARM compatibility, so I’m not far off full 32-bit compatibility.

I have a few games running on the Pi that use 4-bit modes for you to look at. Terramex is working; Pac-mania is running on StrongARM although crashing on the Pi, and I’m busy debugging the 32-bit code to track the issue down at the minute. |
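For anyone following along, the conversion described above can be done a byte at a time with a lookup table; a rough sketch of such an inner loop (register allocation and names are illustrative, not ADFFS code):

    ; R1 -> DA2 source (4bpp), R2 -> GPU framebuffer (8bpp), R0 = source bytes to convert,
    ; R4 -> 256-entry table of halfwords mapping one source byte (two 4bpp pixels)
    ;       to the corresponding pair of 8bpp pixels
    expand_loop
            LDRB    R3, [R1], #1            ; fetch two 4bpp pixels
            MOV     R3, R3, LSL #1          ; index into the halfword table
            LDRH    R3, [R4, R3]            ; look up the two 8bpp pixels
            STRH    R3, [R2], #2            ; write them to the GPU frame
            SUBS    R0, R0, #1
            BNE     expand_loop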
Jeffrey Lee (213) 6048 posts |
Sure, feel free to send the code my way. I’m interested in seeing exactly why you’re having trouble getting the GPU framebuffer logical address! As I’ve said before, I believe it’s your responsibility to map in the memory using OS_Memory 13, so I’m interested in seeing why that’s not working, or why you might have implemented things differently to how I would have done it (or would have tried doing it).

Having said that, I don’t think we’re yet at the stage where including the code in the OS would make sense. At the moment handling of screen memory is too primitive – ideally we’d need OS-level support for multiple pools of screen memory (so that the OS can handle allocation of both the GPU framebuffer and the emulation framebuffer), and an improved data abort handler (to allow the mode emulation code to track page writes so it knows which pages need translating). Both of those features would also help with other things, so I’m hoping to implement them at some point. |
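For reference, my understanding of the OS_Memory 13 call being referred to (register usage from memory, and the physical address below is purely illustrative):

    ; Map a chunk of physical (IO) address space and get its logical address back
            MOV     R0, #13                 ; OS_Memory reason 13: map in physical memory
                                            ; (flag bits in R0 select cacheable/bufferable mappings)
            LDR     R1, =&5E000000          ; physical base of the region (illustrative value only)
            LDR     R2, =&00100000          ; size of the region in bytes
            SWI     XOS_Memory
            ; on exit R3 = logical address the region has been mapped at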
Jon Abbott (1421) 2651 posts |
Sent… along with the ARM3 JIT and the original Terramex floppy image, to test both.

The GPU’s HAL layer is mapping the memory; I don’t need to touch it. You’ll see from the code it’s doing very little – it’s less than 100 instructions if you ignore the blitter code.

You’ve given me an idea though: instead of waiting for OS_Memory 0 to be extended to support IO memory, I could monitor OS_Memory 13 and create my own IO physical>logical map. Is there a reason OS_Memory 0 doesn’t cover IO memory? I can’t think why it wouldn’t, unless there was either a problem with it doing so, or it was simply an oversight and it wasn’t extended when OS_Memory 13 was added? |
Jon Abbott (1421) 2651 posts |
Actually, I don’t think that will work, as the GPU framestore will have already been mapped during POST. I’ll put my thinking cap on; I’m sure there is a reliable way of converting a physical address to a logical address – direct from the L1/L2 page tables, I guess. |
Jon Abbott (1421) 2651 posts |
Jeffrey – in reply to your message (email is down, so I can’t reply), firstly I’m sorry for the CMOS issue. I forgot to mention that you’ll need SparkFS or similar loaded before using “Boot floppy”, as the Boot scripts are in a ZIP file. The floppy normally contains the script as well, but I’ve not added it to the floppy I sent you… sorry.

Regarding CMOS protection, ADFFS does that already and the floppy is flagged as requiring CMOS protection… I’ve yet to add the code to force it on when the floppy is mounted though. |
Jeffrey Lee (213) 6048 posts |
No problem! |
Jon Abbott (1421) 2651 posts |
“Is GraphicsV 1 (VSync occurred) triggered at the top or bottom of the frame on the Pi?”

Answering my own question: it starts at the top of the frame. By the time the Pi has copied 80KB for MODE 13, it’s advanced by ~6 rasters. I’ll probably have to implement dual frames on the GPU to get around that, as it will affect games that palette swap. |
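As a rough sanity check on that figure, assuming the ~6 rasters are rasters of the emulated 50Hz, 64µs-per-line VIDC-style mode (an assumption, not something stated above):

    6 × 64µs ≈ 384µs for the 80KB copy, i.e. a copy rate in the region of 200MB/s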
Jon Abbott (1421) 2651 posts |
This is now all coded and a beta available on the JASPP site. It won’t work on RO5.21 alphas past 14-12-13 though due to the GraphicsV changes – that will be corrected in a later beta. Pac-mania is also available and runs under this and the ARM3 JIT on RO5. |
Jon Abbott (1421) 2651 posts |
Does the DAG start address have to be aligned on the Pi? I’m trying to frameswap, but the second buffer is always shifted – as if the DAG has to be aligned. I’ve tried 1K, 4K, 32K, 64K, they all seem to produce the same result. |
Jeffrey Lee (213) 6048 posts |
The only requirement for the Pi is that the address needs to be aligned to the start of a scanline, relative to the start of the GPU screen memory. This is because we can’t directly specify the DAG addresses to the GPU; instead we’re limited to displaying a 2D subrectangle of a 2D framebuffer. So what we actually do is request a buffer that’s N times taller than the mode RISC OS wants, and then we adjust the vertical offset of the displayed rectangle according to which screen bank needs to be displayed. |
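A worked example of that constraint for a 320×256, 8bpp mode (the numbers are just the obvious ones for such a mode, not taken from the driver):

    one scanline   = 320 bytes
    one whole bank = 320 × 256 = &14000 bytes
    bank n offset  = n × &14000, which is automatically a whole number of scanlines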
Jon Abbott (1421) 2651 posts |
Next problem: does a call to GraphicsV 6 trigger a VSync? When I make the call I’m seeing the VSync rate double! |
Jeffrey Lee (213) 6048 posts |
Try setting BCMVideo$ScreenBanksEnabled to 2. Remember that:
|
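For reference, that’s an ordinary system variable, so it can be set from the command line or a boot script, e.g. (assuming it just needs to be set before the driver next looks at it):

    *Set BCMVideo$ScreenBanksEnabled 2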
Jon Abbott (1421) 2651 posts |
BCMVideo 2 fixes the bizarre VSync issue; however, palette changes then stop working! |
Jon Abbott (1421) 2651 posts |
What triggers RISC OS to reset the palette? Is it a GraphicsV 1 call, Event 4 or something else?

To abstract the GPU VSync from the VSync software gets, I use the (true VSync) call to GraphicsV 1 to blit the frame from DA2 to the GPU and then return claiming the call. This, as far as I can tell, prevents RO from seeing the VSync event. I then issue a GraphicsV 1 call at 50Hz which my code exits as unclaimed, letting RO do its thing. RO should then reset the palette and trigger Event 4 etc at 50Hz – which it does, and once RO has done its thing, games change the palette back to their own; music plays at the correct speed, as do games.

This worked okay until I added frame swapping on the GPU to avoid tearing. The immediate issue was that VSync went from 50Hz to 100Hz, causing everything to go too quickly – the visible frame rate however halved, due to the VSync delay in the HAL drivers when changing the DAG. Using BCMVideo 2 resolved these issues, however RO is now changing the palette after the game has. One of two things may be happening:

1. The palette change code in RO isn’t triggered by GraphicsV 1

The first thing I did was download the latest alpha and see if anything has changed post the GraphicsV updates; unfortunately the build was so unstable I had to go back and didn’t get to really test it. I will try again in a few days though.

EDIT: Could the HAL buffer be filling up perhaps? How many palette changes, frame swaps etc can it buffer when BCMVideo is set to 2?

EDIT2: It looks like the buffer is reaching its limit: if I cache all the palette changes and then issue a GraphicsV 11 before switching frame, I don’t see the problem. However, this introduces more problems:

1. The palette change doesn’t occur until the frame after next (that’s with the GraphicsV 11 call before or after GraphicsV 6) |
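A sketch of that EDIT2 work-around as a direct vector call; the vector number and the layout of R4 are from memory, and the data registers for each reason code are left to the GraphicsV documentation rather than guessed at here:

    ; Per frame: flush the cached palette writes in one block, then swap banks
            ; (set up R0.. with the cached palette block, per the GraphicsV 11 docs)
            MOV     R4, #11                 ; GraphicsV 11: write palette entries
            MOV     R9, #&2A                ; GraphicsV vector number
            SWI     XOS_CallAVector
            ; (set up R0.. with the new bank's DAG address, per the GraphicsV 6 docs)
            MOV     R4, #6                  ; GraphicsV 6: set DAG / displayed bank
            MOV     R9, #&2A
            SWI     XOS_CallAVector
            ; (newer builds with the December 2013 GraphicsV changes also expect the
            ;  driver number in the top byte of R4)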
Jon Abbott (1421) 2651 posts |
There seems to be another alignment requirement in addition to being the start of a scanline on the Pi. When in 320×256×32, if I try to get the DAG to +&50500 (ie 320*257) the display is shifted. Is there a restriction on where the 2nd screen buffer can start in the vertical buffer?

I’ve had a quick look at the Pi’s GraphicsV driver, although the code contains no comments, so it’s not clear what it’s actually doing.

EDIT: Ignore me, I was adding size to get the 2nd buffer instead of size*4.

Could GraphicsV be extended so graphics driver restrictions can be discovered? The start alignment and width modulus differ across video drivers, but there’s no way of discovering what they are. Likewise, minimum screen width and height would be useful. And a GraphicsV call to return the logical DAG address would be useful, instead of having to re-map the memory via OS_Memory 13 to discover it. |
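For the record, under the whole-screen-bank layout described earlier the numbers for 320×256×32 work out as (illustrative arithmetic only):

    one scanline   = 320 × 4 = 1280 = &500 bytes
    one whole bank = 320 × 256 × 4 = &50000 bytes
    so the 2nd bank starts at +&50000, and +&50500 is exactly one scanline beyond it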