Proposed GraphicsV enhancements
Steve Pampling (1551) 8154 posts |
Poke nemo to wake him up and there may be an opinion on what should be in place. Mind you, you did mention the magic “gamma curve” phrase:
Me? I’m just happy seeing things happen faster. |
Jeffrey Lee (213) 6048 posts |
I now have a rough prototype of this in BASIC, so expect to see it soon! |
Jon Abbott (1421) 2641 posts |
“(r1² + g1² + b1²) – (r2² + g2² + b2²) != ((r1-r2)² + (g1-g2)² + (b1-b2)²)” – it did in my head yesterday morning! LOL. “(abs(r1-r2) + abs(g1-g2) + abs(b1-b2))” is Manhattan distance and is a useful approximation. Might be worth trying on a 256-entry palette to see what the result is like. I did consider a tree structure, but thought it was probably going to be slower for 256 values. Perhaps the many variants of the equation need to be centralised and a user-controlled weighting put into the Screen configuration.
Considering the number of MULs going on, would a 256-entry SQR table help? It should all fit in the cache, although it would be slow initially.
Screen caching will make a big difference – the “chocolate” code is still in the source, from memory. The OS_ScreenMode call to turn it off doesn’t work though, not in RO4 at any rate, and I doubt it’s been touched since then. IIRC it put the screen memory into another domain and watched aborts to know if a cache flush was needed at VSync – a nasty hack to stop all the visual problems screen caching causes. |
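For reference, a rough C sketch of the two metrics being compared above – the palette layout and names are purely illustrative, not from any RISC OS source:

    #include <stdlib.h>

    typedef struct { unsigned char r, g, b; } rgb;

    /* Squared Euclidean distance: no square root needed when we only
       ever compare distances against each other. 3 MULs per candidate. */
    static int dist_sq(rgb a, rgb b)
    {
        int dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
        return dr*dr + dg*dg + db*db;
    }

    /* Manhattan distance: the cheaper approximation - no MULs at all. */
    static int dist_manhattan(rgb a, rgb b)
    {
        return abs(a.r - b.r) + abs(a.g - b.g) + abs(a.b - b.b);
    }

    /* Brute-force nearest match over a 256-entry palette. */
    static int nearest(rgb c, const rgb pal[256])
    {
        int best = 0, bestd = dist_sq(c, pal[0]);
        for (int i = 1; i < 256; i++) {
            int d = dist_sq(c, pal[i]);   /* or dist_manhattan(c, pal[i]) */
            if (d < bestd) { bestd = d; best = i; }
        }
        return best;
    }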
Jeffrey Lee (213) 6048 posts |
My gut says it’ll be about the same, but I haven’t looked at the instruction timings to be sure. One downside to using a lookup table is that you’d have to find a spare register to hold its address, which could be tricky for some of the code. |
Jon Abbott (1421) 2641 posts |
Pre-weight three SQR tables for R, G and B, and drop the weight-loading registers, to remove 3 MULs and get back 3 registers? You’d probably have to drop the table to 8-bit though, to keep the size down and increase the cache hit ratio, which would band low-intensity colours. You could use 16-bit values, but a 1.5KB table is probably pushing it speed-wise; 6 MULs may be quicker than 3 LDRHs due to the cache hit rate. “(abs(r1-r2) + abs(g1-g2) + abs(b1-b2))” might be a good compromise. |
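A sketch of the pre-weighted table idea in C – one 256-entry table per channel holding weight × difference², indexed by the absolute difference. Three tables of 16-bit entries is the 1.5KB mentioned above; the weights (2, 4, 3) and the scaling shift are illustrative assumptions, not values from the thread:

    #include <stdlib.h>

    static unsigned short wsq_r[256], wsq_g[256], wsq_b[256];

    static void init_tables(void)
    {
        for (int d = 0; d < 256; d++) {
            /* Scaled down by 4 so the largest weighted square
               (4 * 255 * 255) still fits in an unsigned 16-bit entry. */
            wsq_r[d] = (unsigned short)((2 * d * d) >> 2);
            wsq_g[d] = (unsigned short)((4 * d * d) >> 2);
            wsq_b[d] = (unsigned short)((3 * d * d) >> 2);
        }
    }

    /* Replaces the 3 MULs per candidate with 3 LDRH-class table loads. */
    static int dist_weighted(int r1, int g1, int b1, int r2, int g2, int b2)
    {
        return wsq_r[abs(r1 - r2)] + wsq_g[abs(g1 - g2)] + wsq_b[abs(b1 - b2)];
    }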
Jon Abbott (1421) 2641 posts |
Speaking of weighting, the ColourDistance function near the bottom of the page below looks like an interesting approximation of Luv and might be worth investigating: |
Jeffrey Lee (213) 6048 posts |
My implementation of the tiled OS_SpriteOp is now in CVS, along with Wimp and Pinboard updates to make use of it where possible. On the Iyonix and Pi it’ll make use of hardware acceleration, resulting in significant performance gains (particularly for the Iyonix – redrawing an empty desktop in 16M colours now takes 4cs instead of 50cs, and the “window full of files” test in 16M colours has dropped from 70cs to 20cs).
However, on OMAP (or at least on a BB-xM) I found that the hardware acceleration made things slower – the DMA engine/memory bus just doesn’t seem to be designed for high-throughput transfers. So on OMAP (and IOMD) it avoids trying to use hardware acceleration and just renders all the sprites manually.
I also found a couple of bugs with sprite rendering in general, which were causing some common sprite operations/types to render significantly slower than they should. |
Sprow (202) 1155 posts |
Should the “EXIT VC” be outside the “standalonemessages” switch? Otherwise it’d plot twice – once via the tiled sprite op, then again by falling through (in the ROM case). |
Jeffrey Lee (213) 6048 posts |
Well spotted! For the ROM case it was actually meant to be exiting immediately, under the assumption that OS_SpriteOp 65 would always be available and therefore retrying with a different sprite op would be futile and result in the same error. But you’ve also led me to a deeper issue – the code was relying on r1 being a pointer to the redraw block, despite the source comments indicating that the function takes no arguments. So for the ROM case it looks like it was successfully drawing using OS_SpriteOp 65 and then skipping the manual redraw loop due to using a duff redraw block pointer. |
Rick Murray (539) 13806 posts |
Oh, well… That’ll be TI’s wonderful graphics capabilities. <mumble> <mumble> <mumble> |
Jeffrey Lee (213) 6048 posts |
Doug: Re: Jeffrey’s submission 15th Dec. I’ve now added a new *Configure item, *Configure NVidia. This will show up in tomorrow’s ROM and allow you to control the red/blue swapping that the driver/OS is performing. E.g.:
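(Judging from the options referred to in the replies, something along these lines – the exact syntax is an assumption:)

    *Configure NVidia -auto
    *Configure NVidia -manual -swap
    *Configure NVidia -device 123 -manual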
The settings are stored in CMOS on a per-card basis, so if you’ve got multiple cards which need different settings you can find the PCI device number (the first column in *PCIDevices) and then do something like ‘*Configure NVidia -device 123 -auto’ to change its settings (*Configure NVidia without specifying the device will set the settings for all cards). Also note that new settings will only take effect after a mode change.
At some point I’ll be updating the screen setup plugin in !Configure to allow these settings to be controlled, but I’m not sure if that will be before or after RISC OS 5.22 is released. However, there shouldn’t be any more ROM changes needed to get the plugin working, so it doesn’t matter too much if it’s only available once 5.22 is released. |
Doug Webb (190) 1158 posts |
Hi Jeffrey,
Thanks for the update. With my non-hardware-swapped-colours card I get the following:
With it set to -manual and Geminus loaded, 256 and 16M colours are OK, but 64K colours still have a funky look and 32K are swapped.
With the additional manual -swap options, I correctly get 256 colours swapped but, as expected, 64K is still garbled/funky colours, 32K and 16M are swapped colours, and the JPEG background is garbled in monochrome colours.
So I have a number of options due to the update, but I think I’ll stick with built-in swapping as it seems more consistent with my non-hardware-modified card, though I lose the JPEG acceleration.
Thanks once again for the update and, in general, for the great additions you have made to RISC OS. |
Jeffrey Lee (213) 6048 posts |
As requested quite a while ago, I’ve now updated the GraphicsV documentation with a summary of the contexts under which each driver entry point may be called. Most of this is based around how the OS calls the drivers – basically anything which the OS calls from an interrupt handler is flagged as a background call, and all the others I’ve flagged as being foreground-only in order to ease driver implementation. Let me know if you spot any glaring errors, or think things could do with being explained a bit clearer. |
Malcolm Hussain-Gambles (1596) 811 posts |
Thanks for that Jeffrey. I read your post thinking “yeah, who’s going to be able to actually read and understand that ****”. Surprisingly enough, I’m one of them, although I’ll admit I don’t understand how it all hangs together – which is entirely down to my lack of time to read it properly. |
Jon Abbott (1421) 2641 posts |
Looks good. My only suggestion would be to detail whether IRQs are enabled or disabled during the calls. What caused me lots of issues was the fact that GraphicsV 2 executes with IRQs enabled – games were crashing when writing to screen memory whilst it was being remapped, if they updated the screen via an IRQ. |
Jeffrey Lee (213) 6048 posts |
Done. Basically:
|
Jeffrey Lee (213) 6048 posts |
There are a couple of issues coming to light due to the EDID bounty that could do with some discussion here.
Interlace revisited
At the moment we don’t really have any drivers which honour the interlaced flag properly, and in the past I’ve been pretty confused as to why we’ve got two ways of specifying interlace – one via a control list setting, and one via the sync polarity flags (https://www.riscosopen.org/forum/forums/3/topics/309#posts-23312). With us about to start pulling interlaced mode definitions from EDID, we need to come up with some hard and fast rules as to how interlace should and shouldn’t be handled. E.g. if a display supports 1080i but not 1080p (or vice versa) then we need to make sure the driver honours the interlace flag it’s given, and doesn’t start doing its own thing like doubling the pixel rate and vertical timings to convert an interlaced mode to a progressive one. As far as I can see, we need to decide what to do about the following:
EDID extension blocks
To read anything beyond the first 256 bytes of EDID you need to do an IIC write to a device at a different address from the main EDID EPROM, in order to select the bank that the EPROM will return for future reads. We don’t have any clear way of handling this with the current GraphicsV IIC API – if it becomes the driver’s responsibility to write to the bank register then the driver will end up doing much more than just accessing the IIC address that the caller provides. On the other hand, if it’s the caller’s responsibility to write to the bank register then there’s the possibility of conflict if two callers try to access different banks.
The lower-level OS_IICOp API doesn’t have this problem, as you’re able to provide your own sequence of bus transactions which other callers won’t be able to interrupt. But with the GraphicsV API the only operations that are possible are the basic write-read or write-write sequences that are used to access the EPROM-like device that’s assumed to be on the other end of the bus.
Personally I’d like to see the current API replaced (or extended) to support a lower-level, OS_IICOp-like API. But as we currently make light use of GraphicsV IIC ops, I’m happy with a quick-fix interim solution of making it the caller’s responsibility to set the bank register correctly.
Anyone else have any thoughts on the above two topics? |
Dave Higton (1515) 3497 posts |
I have experience with IIC but not EDID. If you simply keep reading, you read through the entire device, don’t you? (Writes wrap within the write page; reads don’t, in general.) So why is it necessary to write an address at all? Assuming that it is necessary, though, I’d suggest adding a function to set the read base address, rather than adding a function with arbitrary write capability. The idea is to limit the damage – blitzing the EDID info – that can be done by an inexpert call. |
Jeffrey Lee (213) 6048 posts |
Because that’s what the spec says! Basically, the EPROM at bus address &50 only allows access to up to 256 bytes of data at a time, using the standard protocol (write one byte containing the start address, then read/write N bytes to access the data). For devices with more than 256 bytes of EDID there’s a second device at address &30 which has a ‘segment pointer’ register that controls which 256-byte page you’re accessing. Also, after double-checking the spec, it doesn’t look like it is possible to use the current GraphicsV API to program the segment pointer – it doesn’t use the standard register-based addressing scheme that most IIC devices use, and it needs to be a repeated-start transfer because the value resets to zero at the end of each transaction.
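A C sketch of the bus sequence the spec describes, using hypothetical iic_start/iic_write_byte/iic_read_byte/iic_stop primitives as stand-ins for a driver’s low-level bus access – only the protocol itself is from the spec:

    /* Hypothetical low-level IIC primitives - not a real RISC OS API. */
    extern void iic_start(void);                     /* (repeated) start  */
    extern void iic_stop(void);
    extern int  iic_write_byte(unsigned char b);     /* returns ack state */
    extern unsigned char iic_read_byte(int ack);     /* ack = more wanted */

    /* Read 128 bytes of EDID from 'offset' within 256-byte page 'segment'. */
    void read_edid(int segment, unsigned char offset, unsigned char buf[128])
    {
        if (segment != 0) {
            iic_start();
            iic_write_byte(0x60);      /* segment pointer device, &30 write */
            iic_write_byte((unsigned char)segment);
            /* No stop here: the register resets to zero at the end of a
               transaction, so we must chain straight into the EDID read
               with a repeated start. */
        }
        iic_start();
        iic_write_byte(0xA0);          /* EDID EPROM, &50 write */
        iic_write_byte(offset);        /* start address within the page */
        iic_start();                   /* repeated start */
        iic_write_byte(0xA1);          /* EDID EPROM, &50 read */
        for (int i = 0; i < 128; i++)
            buf[i] = iic_read_byte(i < 127);   /* NAK the final byte */
        iic_stop();
    }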
Good point, I’ve been bitten by that in the past. So it looks like the options are:
AFAIK EDID is the only interesting thing available via DDC (there’s DDC/CI, but it doesn’t appear to be widely supported by monitors). So although a full OS_IICOp-like interface would be nice, it probably isn’t worth implementing, and we’d be better off going down the route of option 1. |
Dave Higton (1515) 3497 posts |
Disallow them. There is no reason ever to write anything other than a segment pointer to the EDID device. All anything else will do is cause damage to the information stored there. A GraphicsV_IICOp operation can by all means specify the segment address, and handle writing it internally. |
Sprow (202) 1155 posts |
If (1) is chosen, in GraphicsV 14 the offset is handily defined as 16 bit, so the page register poke could happen by magic and the caller could assume the address space is flat; or bits 16-23 could accept writes at 0x30; or one of the 8 reserved bits could be used to flag something or other. Options, options. |
Jeffrey Lee (213) 6048 posts |
I’d be in favour of simply specifying it in bits 0-15. Requiring two separate calls (one to set the segment pointer, one to read the EDID) won’t really work due to the way the segment pointer is implemented in hardware. So it would be a case of ‘if bits 16-23 == 0xa1 and bits 8-15 != 0 then write the segment pointer at the start of the transfer’. |
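In C terms, the decode being proposed would presumably look something like this – the field layout is my reading of the two posts above, not the documented GraphicsV 14 API:

    /* Unpack the GraphicsV 14 address word: bits 0-15 a flat EDID offset
       (low byte = offset within page, high byte = DDC segment), bits 16-23
       the 8-bit IIC address. Assumed layout, for illustration only. */
    void iic_op(unsigned int addr_word)
    {
        unsigned int offset  = addr_word & 0xFF;
        unsigned int segment = (addr_word >> 8) & 0xFF;
        unsigned int address = (addr_word >> 16) & 0xFF;

        if (address == 0xA1 && segment != 0) {
            /* Write the segment pointer to &30 at the start of the
               transfer, chained by a repeated start into the EDID read. */
        }
        /* ...then perform the transfer to 'address'/'offset' as before... */
        (void)offset;
    }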
Rick Murray (539) 13806 posts |
+1
Tale of woe. I normally run my Pi hooked to an analogue monitor (1280×1024) by way of an HDMI→VGA adaptor. There’s a tiny MCU in the adaptor that reads status information via the IIC lines of the monitor cable (I presume it transposes these into an EDID block). So the other weekend, I hook up a crappy Vista machine. The owner, a woman from work, uses MSIE, has no anti-virus, and complains that her mouse freezes when she visits Facebook. Somehow it was decided that I was the best guy to fix it (or maybe my French and/or politeness is not up to replying “casse toi!” – “get lost!”). Whatever. I plug the Pi back in.
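For reference, the usual config.txt lines for forcing that mode would be something like the following (assuming hdmi_mode 36 is the 1280×1024@75 DMT entry):

    hdmi_group=2
    hdmi_mode=36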
This is supposed to force a DMT mode that is 1280×1024 at 75Hz. Only, it looks as if the GPU is receiving something completely different from the monitor. I power everything down, leave it for fifteen minutes, then power up. 800×600. I hate Microsoft. Thankfully there is a “fix”. I need to set my Pi’s configuration as follows:
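Presumably the same two lines plus the two documented overrides (0xa5000080 being the magic “ignore EDID” value):

    hdmi_group=2
    hdmi_mode=36
    hdmi_ignore_edid=0xa5000080
    hdmi_force_hotplug=1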
This instructs the GPU to ignore any EDID that is being read, and to specify that an HDMI display is connected. Power up like that and I’m back to RISC OS in a 1280×1024 mode. Comment out the additional two lines, reboot, and it stays in 1280×1024 – so I guess either the monitor has sorted itself out, or the Pi’s GPU has updated the EDID? I kind of wish the EPROM used was an old-fashioned one so I could just pop out the R/!W leg and pull it high, so stuff leaves the settings the hell alone.
Maybe this weekend I have to put that Vista box back onto the network (which means I’ll be stuck with just the iPad – no way I’m having that on the intranet at the same time as any of my PCs) to install Avast! and Firefox. If the mouse doesn’t mess up, I’ll send the machine back. I’d like to run a check on the box using a Pendrive Linux with A/V, but the box is too dumb to know how to boot off of that sort of media. Pffft!
Can anybody here (given what I have described above) come up with a useful real-world case for having the EDID writeable?¹
¹ Discount the case of “manufacturer is an idiot and messed up all the timings”; this is a production fault at best. It shouldn’t be up to random operating systems to “fix” what they perceive as being incorrect. |
Chris Evans (457) 1614 posts |
I’ve just realised that for our Pi Laptop project it will be a lot easier for us and for users if we include an EDID EEPROM on the HDMI bus and are able to write to it. So please include a way of writing to it. |
Dave Higton (1515) 3497 posts |
Chris: you, as manufacturer, are in a different position from all the users. I believe the correct way forward is:
|