What interfaces do you feel are important to RISC OS development and testing?
Charles Ferguson (8243) 427 posts |
I’m trying to work out which areas of RISC OS Pyromaniac I should focus on. I have been updating the APIs and implementations according to the interfaces which are important and required for testing and developing RISC OS components. This has been very successful on many applications and modules so far. But obviously this is based on my own experience, and so might not match up to the expectations of others. So a quick query here may help identify areas of interest. What RISC OS API or interface do you require for your development and testing of RISC OS components? Doesn’t matter whether it’s common or obscure – although there are a lot of common interfaces that are provided, there are also some that are missing as they haven’t been needed up to now. Responses…
|
|||||||||||||||
Paolo Fabio Zaino (28) 1882 posts |
Hey Charles, It’s common, but I think it would be very useful to many, not just me. Obviously one could write their own little routine to print out, or use report to show, the data passed to the Wimp SWI, but that also means adding extra checks all over the code when debugging something. Having an easy way to turn on “I want to see all the data structures/register sets/blocks I am passing to the WIMP” (and, for that matter, to other APIs as well) would make things easier, I think. Just my 0.5c |
|||||||||||||||
Steffen Huber (91) 1953 posts |
An easy way to provide “hardware simulators” would be good. I.E. something like device simulators for SCSI/ATAPI/IDE/USB stuff. But ISTR that you did something in the CDFS area, so maybe this is already possible – I solved the problem for my past development needs one (or two?) layer(s) closer to the application, but having it on the OS level would be good. |
|||||||||||||||
Charles Ferguson (8243) 427 posts |
Paolo asked for the ability to report on Wimp SWIs. This is just a specific instance of the general case of reporting on SWI interfaces. In RISC OS Pyromaniac the SWIs can be traced easily, by enabling the

For example, using the example Wimp_ReadSysInfo 3 call – as I’ve got that to hand on one of my tests (

    7001a54: SWI Wimp_ReadSysInfo
      => r0 = &00000003, r1 = &deaddead, r2 = &deaddead, r3 = &00000000
         r4 = &00000000, r5 = &00000003, r6 = &00000000, r7 = &00000000
         r8 = &00000000, r9 = &00000000, r10 = &00000000, r11 = &00000000
         r12 = &07001bbc, sp = &07001fbc, lr = &04107fe0, pc = &07001a54
         CPSR= &00000010 : USR-32 ARM fi ae qvczn
      <= r0 = &00000000, r1 = &deaddead, r2 = &deaddead, r3 = &00000000
         r4 = &00000000, r5 = &00000003, r6 = &00000000, r7 = &00000000
         r8 = &00000000, r9 = &00000000, r10 = &00000000, r11 = &00000000
         r12 = &07001bbc, sp = &07001fbc, lr = &04107fe0, pc = &07001a54
         CPSR= &00000010 : USR-32 ARM fi ae qvczn

Alternatively, with the argument decoding with a similar command line (

    7001a54: SWI Wimp_ReadSysInfo
      => r0 = &00000003 3 constant 3
      <= r0 = &00000000 0 state

And of course this works for any SWIs which have information available. This example comes from the Font metrics tests (invoked with

    7001ff0: SWI Font_FindFont
      => r1 = &07001e64 117448292 pointer to string font_name "Homerton.Medium"
         r2 = &00000100 256 xsize
         r3 = &00000100 256 ysize
         r4 = &00000000 0 xres
         r5 = &00000000 0 yres
      <= r0 = &00000001 1 font
         r4 = &0000005a 90 xres_out
         r5 = &0000005a 90 yres_out
    ...
    7002178: SWI Font_ReadInfo
      => r0 = &00000001 1 font
      <= r1 = &fffffffe 4294967294 x0
         r2 = &fffffff6 4294967286 y0
         r3 = &00000028 40 x1
         r4 = &00000026 38 y1

Of course, you may not care about all SWI calls, but only specific ones. In which case a ‘SWI trap’ can be used. This is a trap which can print out information about the SWI call at the point of its invocation.
For this example, I’ve chosen the ‘rtime’ tool, and trapping the SWI

    Connecting to pool.ntp.org for UDP NTP request
    a2a0: <SWIin> &61209 (Socket_Sendto)
      r0 = &00000000, r1 = &0000d620, r2 = &00000030, r3 = &00000000
      r4 = &00000000, r5 = &00000000, r6 = &0000b49c, r7 = &0000b49c
      r8 = &0000d648, r9 = &ffffffff, r10 = &0000c994, r11 = &0000d5dc
      r12 = &0000d5f0, sp = &0000d5c8, lr = &0000a1dc, pc = &0000a2a0
      CPSR= &20000010 : USR-32 ARM fi ae qvCzn
      Locations:
        r1 -> [&fa04001b, &00000000, &00000000, &00000000] in DA 'Application Space'
        r6 -> [&00000000, &00000000, &00000001, &00000001] in DA 'Application Space'
        r7 -> [&00000000, &00000000, &00000001, &00000001] in DA 'Application Space'
        r8 -> [&fb44bae6, &000000b0, &00000000, &00000000] in DA 'Application Space'
        r10 -> [&00000000, &00000000, &00000000, &00000000] in DA 'Application Space'
        r11 -> [&0000a298, &00000000, &0000d620, &00000030] in DA 'Application Space'
        r12 -> [&00000000, &00000000, &00000000, &0000d620] in DA 'Application Space'
        pc is DA 'Application Space': Function sendto+&18
        lr is DA 'Application Space': Function send+&34
      C backtrace:
        a2a0 function sendto
          Arg1: &00000000 0
          Arg2: &0000d620 54816 [&fa04001b, &00000000, &00000000, &00000000]
          Arg3: &00000030 48
          Arg4: &00000000 0
          Arg5: &00000000 0
        a1dc function send
          Arg1: &00000000 0
          Arg2: &0000d620 54816 [&fa04001b, &00000000, &00000000, &00000000]
          Arg3: &00000030 48
          Arg4: &00000000 0
        93ec function sntp
          Arg1: &0000b574 46452 "pool.ntp.org"
        98e4 function main
          Arg1: &00000003 3
          Arg2: &0000db70 56176 [&0000db88, &0000db99, &0000db9f, &00000000]
        3824cb8 function _main
          Arg1: &0000d740 55104 "$.Commands.RTime -sntp pool.ntp.org"
          Arg2: &000097ec 38892 Function: main
        a358 anonymous function
    a2a0: <SWIex> &61209 (Socket_Sendto)
      r0 = &00000030, r1 = &0000d620, r2 = &00000030, r3 = &00000000
      r4 = &00000000, r5 = &00000000, r6 = &0000b49c, r7 = &0000b49c
      r8 = &0000d648, r9 = &ffffffff, r10 = &0000c994, r11 = &0000d5dc
      r12 = &0000d5f0, sp = &0000d5c8, lr = &0000a1dc, pc = &0000a2a0
      CPSR= &20000010 : USR-32 ARM fi ae qvCzn
      Locations:
        r1 -> [&fa04001b, &00000000, &00000000, &00000000] in DA 'Application Space'
        r6 -> [&00000000, &00000000, &00000001, &00000001] in DA 'Application Space'
        r7 -> [&00000000, &00000000, &00000001, &00000001] in DA 'Application Space'
        r8 -> [&fb44bae6, &000000b0, &00000000, &00000000] in DA 'Application Space'
        r10 -> [&00000000, &00000000, &00000000, &00000000] in DA 'Application Space'
        r11 -> [&0000a298, &00000000, &0000d620, &00000030] in DA 'Application Space'
        r12 -> [&00000000, &00000000, &00000000, &0000d620] in DA 'Application Space'
        pc is DA 'Application Space': Function sendto+&18
        lr is DA 'Application Space': Function send+&34
      C backtrace:
        a2a0 function sendto
          Arg1: &00000000 0
          Arg2: &0000d620 54816 [&fa04001b, &00000000, &00000000, &00000000]
          Arg3: &00000030 48
          Arg4: &00000000 0
          Arg5: &00000000 0
        a1dc function send
          Arg1: &00000000 0
          Arg2: &0000d620 54816 [&fa04001b, &00000000, &00000000, &00000000]
          Arg3: &00000030 48
          Arg4: &00000000 0
        93ec function sntp
          Arg1: &0000b574 46452 "pool.ntp.org"
        98e4 function main
          Arg1: &00000003 3
          Arg2: &0000db70 56176 [&0000db88, &0000db99, &0000db9f, &00000000]
        3824cb8 function _main
          Arg1: &0000d740 55104 "$.Commands.RTime -sntp pool.ntp.org"
          Arg2: &000097ec 38892 Function: main
        a358 anonymous function
    Local internal time is: Wed Aug 31 21:37:15.72 2022
    Remote time (UTC) is: Wed Aug 31 21:37:15.72 2022
    New internal time is: Wed Aug 31 21:37:15.72 2022

That’s the full output from the command, untruncated. There are two reports present – one for the SWI entry, and one for the SWI exit state. In both cases a full report is given on the registers and state of the stack. The registers are listed, followed by a description of where the registers point, if they refer to areas of memory. The link register and PC have, in this case, also been decoded to show the functions they are in.
If there is more information available – such as the AMB that was active at the time – this will be listed here as well, but when there’s only a single AMB present this information is omitted. Finally, if there is an APCS stack frame present, it will be traversed, showing the function names and the parameters as preserved on the stack. From this it is easy to see that the point at which the SWI was called was a function called ‘sendto’, which was called from a function ‘send’, which was called from ‘sntp’, and that from ‘main’ and ‘_main’. Such tracing is valuable for recognising what is happening within the system.

These traps are configurable at run time – if you feel so inclined – so it is possible to enable and disable SWI traps from the command line using a command like

In some cases, SWI interfaces in RISC OS Pyromaniac have a lot of diagnostics already present. For example, the ColourTrans module provides

Of course, that’s not the only way to see such information. To make it possible to see all the information that is being passed through the SWIs you can provide a Python implementation of the module you’re interested in (or just of the SWI you are interested in) and make it report what you want, and return whatever values you want. This is useful if you want to see how your program functions in odd circumstances, to ensure that it will behave in the correct manner. Whilst a whole Python module might be more involved, providing just a single SWI call is relatively simple.
For example, the implementation of

    @handlers.swi.register(swis.OS_IntOff)
    def swi_OS_IntOff(ro, swin, regs):
        """
        OS_IntOff

        <= Interrupts disabled
        """
        if not ro.config['kernel.interrupt_control_swis']:
            raise RISCOSSyntheticError(ro, "OS_IntOff disabled - change kernel.interrupt_control_swis to enable")

        if ro.config['apiwarnings.intonoff']:
            ro.trace.warning(name='API warning',
                             label="OS_IntOff SWI should not be used in normal applications")

        regs.cpsr_i = True

This implementation does what you would expect when none of the conditions fire – it turns off interrupts – but it can also be configured to raise an API warning to notify you that it shouldn’t normally be used. API warnings are just like the trace reports shown above and include all the same information. Alternatively, this SWI can return an error. This can be useful to see that the behaviour of the program you are testing is still good in the face of errors. |
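As a rough illustration of the pattern described in the post above (a decorator registers a Python function for one SWI, and configuration decides whether it warns, fails or acts), here is a minimal standalone sketch. Everything in it is hypothetical: `SWIDispatcher`, `SWIError`, `Registers`, the config keys and the SWI number constant are stand-ins invented for this example and are not the Pyromaniac API.

```python
# Hypothetical stand-ins only: none of these names come from RISC OS Pyromaniac.
class SWIError(Exception):
    """Error handed back to the caller instead of a normal SWI result."""


class Registers:
    def __init__(self):
        self.cpsr_i = False  # True means interrupts are disabled


class SWIDispatcher:
    def __init__(self, config):
        self.config = config
        self.handlers = {}
        self.warnings = []

    def register(self, swi_number):
        """Decorator that records a handler function for one SWI number."""
        def decorator(func):
            self.handlers[swi_number] = func
            return func
        return decorator

    def call(self, swi_number, regs):
        handler = self.handlers.get(swi_number)
        if handler is None:
            raise SWIError("SWI &%x not known" % swi_number)
        handler(self, swi_number, regs)


OS_INTOFF = 0x14  # number used for illustration in this sketch

dispatcher = SWIDispatcher({'allow_intoff': True, 'warn_intoff': True})


@dispatcher.register(OS_INTOFF)
def swi_intoff(ro, swin, regs):
    # Error injection: configuration can make the call fail outright.
    if not ro.config['allow_intoff']:
        raise SWIError("OS_IntOff disabled by configuration")
    # API warning: record that the call was made, then carry on.
    if ro.config['warn_intoff']:
        ro.warnings.append("OS_IntOff should not be used in normal applications")
    regs.cpsr_i = True
```

Flipping `allow_intoff` to `False` makes the same call raise `SWIError`, which is one simple way to check how the code under test copes with a failing SWI.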
|||||||||||||||
Charles Ferguson (8243) 427 posts |
Steffen Huber asked about a way to provide hardware simulators. RISC OS Pyromaniac does not care about hardware – in the sense of memory mapped and directly accessible devices. If you want to see how a particular memory mapped interface works, the best way (only way?) to do it is on the hardware it was intended for. That’s the only way you’ll know how that device functions.

What RISC OS Pyromaniac does care about, as Steffen implied in his request, is the RISC OS interface that those drivers and APIs offer. This means that you cannot just run the drivers for an alien piece of hardware within RISC OS Pyromaniac. Instead, modules and drivers within RISC OS Pyromaniac respond as they would for some given implementation. In some cases, they respond in different ways because different implementations have done different things.

This is better explained by way of example – Steffen mentioned the CDFS interface, so let’s have a look at that. RISC OS Pyromaniac provides a CDFSDriver module and a CDFSSoftPyromaniac module. The former is a reimplementation of the CDFSDriver module. It implements most of the SWI calls, including the CDFSSoft* module registration and deregistration and dispatch. This means that when you call those CDFSDriver SWIs, they will be passed on to the soft driver just like they would be with the regular module. But of course, they can be debugged by

This means that you can see exactly how the calls are made and what’s going on. The CDFSSoftPyromaniac driver is, as mentioned, a driver for CDFS, but one which uses a file on disc for its data. This means that it can return data from a disc image very easily, and you can replace the disc image on the fly. The values returned by the Inquiry SWI can be modified so that the device looks like a different type of system. Its audio support is limited, but it does support the implicit pausing of audio when data is accessed (audio read and playback isn’t supported).
The CDFS system is a good example of a case where providing some error injection would be good, because the different software drivers have returned odd or incorrect responses from some calls over the years. Being able to say that when you call ReadData on a particular format you get back a different error, or that the data is offset by an amount at times, would be useful. The driver as supplied doesn’t do this. But it would be trivial to add new configurations to allow the driver to report errors at certain times, or to change the behaviour of some of the SWIs to make it more like the aberrant (or abhorrent, depending on your view :-) ) interfaces.

Similarly, the interfaces for SCSI, ATAPI, etc can be created, but they don’t exist at the moment – I’ve not got around to them. SCSI might be interesting to provide different implementations for. ATAPI less so. I’ve got an intention to add a MIDI interface, and improve the sound system.

GPIO is provided, using the original API, with a couple of the non-master version interfaces. GPIO is a good example of an interface where what’s connected to it really matters, but it’s hard to simulate, so I’ve not yet got any way to ‘attach’ to the pins of the GPIO devices. However, RISC OS Pyromaniac offers a number of implementations of GPIO which can be switched in. Where the CDFSSoftPyromaniac module in the above example had only a single implementation and configuration, the GPIO module can be backed by many different implementations. Two of these are intended for use as debug and diagnostics. The ‘null’ implementation will always fail everything. I guess its use is pretty obvious. The ‘static’ implementation can report any number of pins, and can be configured to report a given value on each pin. This allows the interface to provide a (very rudimentary) set of static responses. Not very impressive, but enough to demonstrate how you might begin to provide a set of responses to test your interface.
Simple example:

    pyrodev --basic --config gpio.implementation=static --config gpiostatic.pin_constants=10
    BASIC V version 1.36 © RISCOS Ltd

    Starting with 1044732 bytes free

    >SYS "GPIO_WriteOE",0,1:SYS "GPIO_WriteOE",1,1: REM Set pin 0 and 1 to input
    >SYS "GPIO_ReadData",0 TO value:P."Pin 0: ";value
    Pin 0: 1
    >SYS "GPIO_ReadData",1 TO value:P."Pin 1: ";value
    Pin 1: 0
    >

There are also GPIO implementations which will talk to real hardware. 7 different implementations deal with real hardware – CH341, MCP2221, MCP23008, MCP23017, PCF8574, PiGPIO and RPI.GPIO. The last two of these are for use with a Raspberry Pi and provide access to its GPIO ports. In this way you can test against real hardware, whilst using the debug functions provided by the system.

Using IIC is similar in how it’s managed. There are 4 implementations of real hardware access – CH341, CP2112, MCP2211 and PiGPIO. There is a ‘null’ implementation which responds negatively to all requests, and there is an ‘internal’ implementation which allows registered device implementations to respond. The internal implementations are registered Python classes which provide interfaces to read and write what looks like an IIC device. In this way you can configure RISC OS Pyromaniac to have a fake device connected, and the IIC requests will respond as if that device were present with appropriate data transfers.

There are a number of IIC device implementations currently present – AHT10, DS1307, DS3231, LM75, MAX30205, MCP4725, MCP9808, PCF8563, and PCF8583. The last of these, the PCF8583, is an implementation of the responses of the NVRAM/clock chip in the Risc PC. If you use the RISC OS Select RTCHW module for the RiscPC, it will operate exactly as if it were on a RiscPC because the IIC communication works as it expects. The LM75 is a temperature sensor. Its fake implementation returns, by default, 22.5 degrees.
    pyrodev --config iicinternal.devices=lm75 --command BASIC
    BASIC V version 1.36 © RISCOS Ltd

    Starting with 1044732 bytes free

    >DIM b% 2
    >b%?0=1:b%?1=0:SYS "IIC_Control",&a0,b%,2:REM Enable device
    >b%?0=0:SYS "IIC_Control",&a0,b%,1:REM Select temperature register
    >SYS "IIC_Control",&a1,b%,2:REM Read temperature
    >value=((b%?0)<<8) + b%?1
    >value=(value>>7)*0.5
    >PRINT "Temperature: ";value;" degrees"
    Temperature: 22.5 degrees
    >

Those internal implementations are more like what I think you would want for a SCSI type system. It would be nice to be able to provide a means by which you could say “there’s a tape device on the SCSI bus, connected to XYZ”, and it would respond as you would like it to. And similarly for other types of buses. Debugging such interfaces is specialised, so it’s more a matter of making it possible to test the behaviour of those cases, than of actually providing all the facilities. This is why the GPIO and IIC devices have been implemented – I found them interesting and wanted to provide a means to support them in a useful way.

Part of the point of RISC OS Pyromaniac is to make it possible to provide the interfaces that you want to use, either as a representative version of the system you’re accessing, or to provide aberrant responses so that you can exercise your code in different ways to be sure that it works properly. So whilst that doesn’t mean I’ve got those sorts of simulators, I can at least say that I’ve got some of the same class of simulation system present. |
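To make the idea of a fake IIC device a little more concrete, the sketch below models an LM75-style sensor as a small Python class with `write` and `read` methods, plus the same decoding arithmetic as the BASIC session above (take the top 9 bits of the 16-bit register, 0.5 degrees per step). The class name, method names and register handling are assumptions made for this example; the real Pyromaniac device classes may look quite different.

```python
class FakeLM75:
    """Hypothetical fake IIC temperature sensor; not the Pyromaniac class."""

    def __init__(self, celsius=22.5):
        self.celsius = celsius
        self.pointer = 0  # currently selected register (0 = temperature)

    def write(self, data):
        # The first byte written selects a register; any further bytes
        # (such as a configuration value) are ignored in this sketch.
        if data:
            self.pointer = data[0]

    def read(self, length):
        if self.pointer != 0:
            return bytes(length)  # other registers just read as zero here
        # 9-bit two's complement count of half degrees, left-aligned in 16 bits
        half_degrees = int(round(self.celsius * 2)) & 0x1FF
        raw = half_degrees << 7
        return bytes([(raw >> 8) & 0xFF, raw & 0xFF])[:length]


def decode_lm75(data):
    """Convert the two bytes of the temperature register to degrees."""
    raw = (data[0] << 8) | data[1]
    half_degrees = raw >> 7
    if half_degrees & 0x100:  # sign bit of the 9-bit value
        half_degrees -= 0x200
    return half_degrees * 0.5


device = FakeLM75()
device.write(bytes([0]))          # select the temperature register
temperature = decode_lm75(device.read(2))
```

Unlike the BASIC one-liner, `decode_lm75` also handles the sign bit, so a fake device configured below zero decodes to a negative temperature.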
|||||||||||||||
Paolo Fabio Zaino (28) 1882 posts |
@ Charles
Thanks a lot, this is very very useful! :) |
|||||||||||||||
Charles Ferguson (8243) 427 posts |
No other takers? I’d kinda expected people to say that things like some of the filing system interfaces were important, or the subtleties of FontManager’s string scanning, or maybe some smart alec saying that it was vital that top to bottom written text was important at the VDU 4 cursor. Nobody else interested? |
|||||||||||||||
Rick Murray (539) 13840 posts |
What’s important is that existing software continues to function. Which means an API can be extended, but there’s an awful lot of crap that should be retained. Like that torturous VDU sequence to disable the cursor, as there’ll be something that uses it instead of the friendly SWI call that does the same thing. In other words, tread lightly. |
|||||||||||||||
Steve Pampling (1551) 8170 posts |
Well, the keyboard handling is sorta, and people complain about the current inability (or extreme difficulty) to make new keyboard maps, but unless you’re already halfway insane it’s not nice to point anyone there. |
|||||||||||||||
Rick Murray (539) 13840 posts |
Well, if we’re looking at wishlists of stuff to fix [1], may I suggest a few itty bitty [2] tweaks to FontManager that could greatly enhance the future development of RISC OS.

Firstly, if FontManager is in UTF-8 mode and it runs into an invalid UTF-8 code, instead of plotting boxes or bursting into tears or whatever, switch to an eight bit character set (Latin1 by default, but might be nice to have it configurable) and render it as such. This will mean we can all switch to UTF-8 the following day, and all óûr åççèñts will still work, and things can be UTF8ified in due course.

And, secondly, if a UTF-8 string is given to FontManager, it should do what is necessary in order to render it. It really shouldn’t be up to the application to detect “hey, this part is in hiragana / hangul / greek” to then split it apart, work out which fonts contain the relevant characters, and paint them piece by piece… because we all know every application will likely do it just differently enough (assuming that there aren’t weird quirks and/or bugs [3]) that it looks a mess, so it would be best for FontManager to be on top of this so that it “just works” for everybody.

[1] I’m going to gloss over Territory and the keyboard, my thoughts on both are well known.
[2] Typical British understatement.
[3] I wonder how many such usages would think to correctly shift Hebrew over to the right margin and plot from the right towards the left? |
|||||||||||||||
Paul Sprangers (346) 524 posts |
A long standing dream, mine at least, would come true. |
|||||||||||||||
Charles Ferguson (8243) 427 posts |
No. I’m not looking for things to fix. As I stated, I’m asking what interfaces people rely on in their development and testing of RISC OS components, so that I can ensure that they are present and debuggable.
No, that would be stupid because it will produce unpredictable results for some things, mojibake for others and you’ll still end up in the same circumstance where you’ve not got something presentable. What you’re asking for is ‘I want to have everything work perfectly, knowing what I mean’, which isn’t sensible. However, feel free to prove me wrong by implementing such a translation algorithm and demonstrating its use. Having done so, dropping said algorithm into any code such as font manager (in Pyromaniac or classic) should be trivial. Providing such an algorithm does not require anything special from me… or from RISC OS for that matter. |
|||||||||||||||
Rick Murray (539) 13840 posts |
Such as? UTF-8 sequences are extremely regular in their arrangements, and by nature of how the data is laid out, do not resemble actual real life words. It would be a contrived case that mixes them up.
What I’m asking for is a coherent manner of dragging some parts of the OS kicking and screaming into the 21st century. Because as it stands the big impediment to using UTF-8 is that doing so breaks all of the messages, icons, and texts based around the 8 bit character set. Change isn’t going to happen overnight (hell, it hasn’t happened in the decade or so I’ve been around here) so it’ll need a little bit of persuasion.
You know, if I could find my way around the insides of FontManager, I would have done this by now.
DADebug. That’s pretty much what I use. Spew lots of information to the buffer, then call it up in a TaskWindow and read it. Hardly sophisticated. ;) |
|||||||||||||||
Charles Ferguson (8243) 427 posts |
The question I was asking was precisely that – what APIs are important to people. I’ve made a lot of decisions about APIs that must be provided to perform testing, but there are many that just aren’t important, and shouldn’t be retained. I disagree that ‘software continues to function’ is the most important however. If you want everything to continue functioning then nothing can change and you’re left with a stagnant OS where everyone wants everything to be exactly as it was. Which is precisely the situation that you’re in. I’ve never held this belief – it is important to keep as much as possible working, but the rigid adherence to everything that went before results in the user base contracting as users move away due to it not keeping up with ‘modern’ demands, or they stop using the system entirely. The system has to continue to develop and to take on new features, and drop old ones the weight of which is crippling. However, that was not in the slightest what I was discussing – I’m not asking for ‘what new features do you want’, but ‘what features must exist for the development and testing of RISC OS components?’
Those sequences are a requirement because things use them widely and there is no replacement interface. Changing those takes time, and must be done with a suitable upgrade path. Initially provide a new recommended interface. Mark the old as deprecated. Move the deprecated function to an optional component. Remove the optional component. Stop supplying the optional component.

The interface you mention for the cursor is an example of one that might be important for testing software – things don’t quite work right if you cannot disable the cursor, although it’s pretty unimportant for most things you might want to test, unless you were checking the behaviour of the screen. The cursor control sequence is a hangover from the BBC where it was a control for the 6845 – registers 10 and 11. Usually this was programmed through

The behaviour of these controls is implemented in RISC OS Pyromaniac, and

The ANSIText implementation isn’t as free in its definition of the shape as the graphics implementation, but it provides the flashing state, a translation of underline/block cursor and the ability to disable the cursor. The graphics implementation is relatively accurate, although it works off the ticker rather than the VSyncs – and configuration options can be used to disable its rendering entirely. The ability to disable the rendering of the cursor is important when testing, to be able to check that the system does not leave a random cursor lying around when you’re capturing screenshots to compare to ‘gold masters’. Plus, having the cursor flashing is a waste of time if you’re not checking the behaviour of the cursor. In any case, the VDU control sequence is swallowed and internal state updated appropriately even if you don’t have any graphics output system, or you’re using the plain text output.

The cursor interface is retained largely because it allows you to test that your regular rendering code is correctly calling OS_RemoveCursors and OS_RestoreCursors.
If you don’t call it properly, the cursor will get in the way, but you wouldn’t know if the cursor was not implemented at all. Looking at it, there doesn’t appear to be a test for the cursor being able to be turned on and off in the current test suite. That is something that’s missing. |
|||||||||||||||
Charles Ferguson (8243) 427 posts |
Inside Pyromaniac, keyboard handling is either…
The keyboard system is unfun, and isn’t handled exactly like the RISC OS Classic system because it’s so insane. I didn’t see any point in retaining that interface because it was so awful and unmanageable. The benefits in doing so are near non-existent – really the only thing you’d gain by retaining the old behaviour would be to aid people working with that interface. It should be put out to pasture and shot. A replacement interface would, however, be able to be integrated into the system, or even tested by replacing the KEYV handler which is currently present. In that sense all the development and testing of a replacement would be possible without having to worry about the legacy of the old keyboard input system. Which is a very good thing because it should have been replaced long ago, and forcing people to continue to work with it is an example of why the OS should change and evolve rather than retaining the terrible interfaces from the past. So.. um… no, it doesn’t really help with the Classic input system. Because it’s awful and I despise it so much, and see no reason to continue its use. I’d say sorry for that but, I’m not really sorry. Really it’s that awful. So.. um… no, it doesn’t really help with the Classic input system. Because it’s awful and I despise it so much, and see no reason to continue its use. I’d say sorry for that but, I’m not really sorry. Really it’s that awful. But it could help you develop a new one. |
|||||||||||||||
Charles Ferguson (8243) 427 posts |
I think you’ve missed the point. You don’t need to know anything about FontManager to implement an algorithm. You know what you want to do – take arbitrary input and produce useful output. You can now take those inputs and produce some code, in whatever language you care to do it in, to produce the outputs. You write the algorithm in a way that you can try it out with all the forms of input you care about and check that the outputs come out properly. And thus you prove the algorithm works. Having done that you merely transplant that code into the language and system of your choice – whether it be the FontManager in RISC OS or the rendering system that passes to Cairo. That’s a trivial job once you’ve worked out the complexities of whatever translation you need. |
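As an example of trying such an algorithm outside FontManager, here is a minimal sketch of one possible reading of the fallback suggested earlier in the thread: decode strictly as UTF-8 and, if that fails, treat the whole string as Latin-1. The function name and the whole-string granularity are assumptions made for this example; the original suggestion could equally be read as switching per invalid sequence.

```python
def decode_with_fallback(data, fallback="latin-1"):
    """Return (text, encoding_used) for a byte string.

    Hypothetical helper: strict UTF-8 first, then an 8-bit fallback
    applied to the whole string if any byte sequence is invalid.
    """
    try:
        return data.decode("utf-8", errors="strict"), "utf-8"
    except UnicodeDecodeError:
        return data.decode(fallback), fallback


# Latin-1 'cafe' with e-acute (byte &E9) is not valid UTF-8, so it falls back.
legacy = decode_with_fallback(b"caf\xe9")

# The same word encoded as UTF-8 is accepted as UTF-8.
modern = decode_with_fallback(b"caf\xc3\xa9")

# Ambiguous case: the two Latin-1 characters A-tilde and copyright sign
# (bytes &C3 &A9) are also a valid UTF-8 sequence, so a Latin-1 sender
# gets a single e-acute instead of the two characters it intended.
ambiguous = decode_with_fallback(b"\xc3\xa9")
```

Running a table of inputs like this through the function is a cheap way to see which legacy strings survive and which are silently reinterpreted, before any of it goes near a renderer.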
|||||||||||||||
Charles Ferguson (8243) 427 posts |
If you say you want to render a UTF-8 string, you want that string rendered. You don’t want random decisions to be made out of your control. UTF-8, as you say, is regular, but it’s also strict. If you say you want to render something in UTF-8 and what you give isn’t UTF-8 and is detected as such, it should not be making other decisions to draw something else. That’s a solid “no” to making up other things. The fact that ‘a contrived case mixes them up’ is pretty much irrelevant. There is a defined way for UTF-8 to be rendered and going against that is stupid. If that means that your 20 year old software doesn’t work in UTF-8 mode then maybe the manner in which the system has been designed wasn’t very well thought out – maybe there should have been a better thought out way to transition the system. But changing the font manager to guess what you meant is definitely not the way to do it. |
|||||||||||||||
Rick Murray (539) 13840 posts |
Actually, I’m setting the alphabet to UTF-8 and the system is assuming it’s all correct UTF-8, but as we all know this isn’t the case.
Yes, there should probably be a flag to say “this really is UTF-8, don’t mess with it”.
Coulda shoulda woulda doesn’t help when we have a system that works in a certain way, a huge amount of legacy software (a fair bit of which isn’t ever going to be updated) and a system which is capable of UTF-8 but has zero impetus to transition because at the moment it’s mostly a binary choice.
I’m happy to hear other options. Anybody? |
|||||||||||||||
Rick Murray (539) 13840 posts |
No apology needed. Anybody who has gone near it has run away screaming. |
|||||||||||||||
Clive Semmens (2335) 3276 posts |
Or just with their head in their hands. One becomes somewhat numb to these things. |
|||||||||||||||
Rick Murray (539) 13840 posts |
Hmm, might explain #10… |
|||||||||||||||
Clive Semmens (2335) 3276 posts |
I presume #10 means the tenth post in this thread? You mean I have to count…??? |
|||||||||||||||
Steve Pampling (1551) 8170 posts |
+1 |
|||||||||||||||
Charles Ferguson (8243) 427 posts |
I’m sorry, my sarcasm about it being unfinished clearly didn’t make its way through to the post. Apologies. The example you cite is quite different, though, because there /are/ different APIs and they coexist with the original APIs that used the configured codepage for the system. Since AppLocale stopped working reliably (and I believe has been removed from distribution), Windows can only use the system configuration for the legacy code page when you use legacy APIs.

One way in which RISC OS could have gone would have been to use new APIs for the other UTF-8 aware rendering calls, but of course the encoding is intended for exactly this sort of thing, so allowing the encoding to specify how the text is encoded is a better way to do it. The problem lies with the assumption that this will be just fine for all the applications that use the system encoding and assume that this will work – or with the user for assuming that their legacy applications will continue to work when their assumptions (eg that the language is latin-1) are violated.

POSIX systems tend to use the locale environment variables to define the behaviour of the command line tools (and to some extent the desktop ones, depending on the libraries in use). Because of this, the globality behaviour isn’t present, and you can fix things up on a per process basis. Because each process has its own context and can be configured accordingly. And because applications and tools on those systems are updated more regularly.

But I’m diverging from what I thought was an obvious point… which was that it’s a half finished feature, without any transitionary interface provided. Your suggestion was effectively saying that because nobody did the proper work to allow things to work well together, it’s fine to botch more things into the system, because that’s the only way to get things to change.
The botch you’re suggesting is based on the ill-founded belief that changing the FontManager to ignore what is configured, and instead use some heuristics to decide that the caller has lied to it about the content being UTF-8, is a good idea. It is not. It means that you cannot trust what you want to send to the FontManager to be rendered correctly. Your application (and anything you pass strings to) will not understand that the thing that you claim to be UTF-8 is not actually UTF-8 but has been guessed to be something else. And if the FontManager is guessing at these things, then what about when you’re rendering text at the command line, because not everything uses the FontManager. Do they get changed too? Regardless of how you manage that process, UTF-8 is UTF-8 and what you’re proposing is not UTF-8 – it’s something else. If you were to do something like that, it should be given a different encoding name and alphabet. That would make it ‘ok’ in the sense that at least it would be clear that it wasn’t actually UTF-8, and thus not a violation of the standard. Go about it that way and you might have a way to allow you to progress – but it would be a hack because the introduction of UTF-8 is incomplete, rather than a solution to the problem.
Personally I see this as kinda a no brainer. The problem is that you have a number of components which have different ideas about what the capabilities of the system are, because they’re older and aren’t being updated, and those that understand and can handle the cases of a more complex environment. The fact that the interfaces are global means that you cannot have both present and not have conflict. So the way to approach a solution is to isolate the components’ use of those capabilities in such a way that the global state that they are relying on becomes local to them, within their context. That is – remove the globalness of the state. This allows components to have different ideas about what the alphabet is so that they can claim compatibility with new features – rather than having that compatibility guessed at.

Give a context to the components so that the alphabet isn’t global. Allow components to be able to say that they are aware of UTF-8 handling, and thus will use it, but otherwise they use the system global setting, which is whatever non-UTF-8 legacy system you might expect (Latin-1, et al). Ensuring that a context can have a different configuration means that you can have applications that coexist with one another but which know the system to have different settings. There will be a lot of corner cases, of course, but it smooths out the edges in a way that ensures that when you say ‘I want this plotting in the current alphabet’ that is what you get.

This process is significantly simpler when you have a process model into which such information can be placed. This is the way that most systems get around such issues – you allow the interfaces to function in the old manner so that the components are unaware that there is any difference to what they are expecting, but new components can say ‘hey yeah, I’m good with that’. |
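The idea of making the alphabet local to a component's context rather than global can be sketched with Python's standard `contextvars` module. This is only an analogy: the variable and function names are invented for the example, and nothing here reflects how RISC OS or Pyromaniac actually stores the alphabet.

```python
import contextvars

# System-wide default: what legacy components see unless they opt in.
current_alphabet = contextvars.ContextVar("current_alphabet", default="latin-1")


def plot_text(data):
    """Decode bytes using the alphabet of the calling context."""
    return data.decode(current_alphabet.get())


def legacy_app(data):
    # Unaware of UTF-8: relies on whatever the default alphabet is.
    return plot_text(data)


def utf8_aware_app(data):
    # Declares UTF-8 support; the change only applies inside this context.
    current_alphabet.set("utf-8")
    return plot_text(data)


def run_in_own_context(app, data):
    """Run an application entry point in a private copy of the context."""
    return contextvars.copy_context().run(app, data)


new_result = run_in_own_context(utf8_aware_app, b"\xc3\xa9")  # UTF-8 e-acute
old_result = run_in_own_context(legacy_app, b"\xe9")          # Latin-1 e-acute
```

Both calls produce the same character from different byte encodings, and the default seen by everything else is still `latin-1` afterwards, which is the coexistence property the post argues for.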
|||||||||||||||
Charles Ferguson (8243) 427 posts |
Not sure what happened to that post – the content I wrote (and which is visible when I click the ‘edit’) doesn’t repeat the ‘So.. um… no, it doesn’t really help …’ paragraph. I think it’s a processing bug converting some of the textile to HTML, although I didn’t think I did anything special in there. As for ‘in a dark corner gibbering’… honestly that’s not a bad reaction to the key input system. I can kinda see how it got there, and table driven key content isn’t a bad thing, but it hurts so much and it seems (to me) to be so very fragile. A restart on the input methods would really help and maybe allow on the fly reconfiguration – something that’s not easy to do with the current system. Well, you could but you’d be putting more pain into the system. |