Chars error handling
Chris (121) 472 posts |
I’m looking into ticket #472 at the moment, where Chars chokes on a converted TrueType font. At present, whenever Chars hits an error while painting a character in its viewer pane, it reports the error and reverts straightaway to the System Font. This presumably made sense when font files were smaller and originated on RISC OS. Now that fonts can have lots more characters, and may well have been converted from files designed to work on other OSs, the prospect of errors is probably higher, so I’m wondering about how Chars ought to handle errors in rendering. The actual error in question is More generally, I’m thinking of rejigging things so that if Chars comes across an error when painting a character, it marks it as bad and moves on. The character will be rendered as something like a red rectangle in the viewer pane, and won’t be transmitted if clicked on. If lots of errors are picked up during redraw (say, more than 32?), then Chars will give up and revert to the System Font. Does that sound sensible? Or is there a good reason why Chars currently gives up straightaway in the face of errors from Font_Paint (ie, are there possible errors that could cause further problems unless things bail out immediately)? |
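A minimal sketch of the proposal above, written in Python for illustration only (Chars itself is in BASIC). The `paint` callable stands in for the Font_Paint call and is hypothetical, as are the function name `redraw_grid` and its return shape; the threshold of 32 is just the figure floated in the post.

```python
from typing import Callable, Iterable, Set, Tuple

MAX_ERRORS = 32  # figure floated in the post; not a tested value


def redraw_grid(codes: Iterable[int],
                paint: Callable[[int], None],
                max_errors: int = MAX_ERRORS) -> Tuple[Set[int], bool]:
    """Paint each code; return (bad_codes, reverted_to_system_font)."""
    bad: Set[int] = set()
    for code in codes:
        try:
            paint(code)  # hypothetical stand-in for a Font_Paint call
        except Exception:
            # Mark the character as bad and move on; the caller would
            # draw a red rectangle in this grid cell instead.
            bad.add(code)
            if len(bad) > max_errors:
                # Too many errors in one redraw: give up on this font.
                return bad, True
    return bad, False
```

The caller would switch to the System Font only when the second value is true; otherwise it carries on with the chosen font and the set of marked characters.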
Jeffrey Lee (213) 6048 posts |
Sounds reasonable. The way I’d probably handle character-related errors is:
Reverting to the system font if there are lots of errors feels unnecessary. |
Chris (121) 472 posts |
I was thinking that Chars ought to keep a list of all the bad characters in the font, in order to know not to transmit one when clicked over (as it’s written in BASIC, having a limit on the number of characters to store would make that easier). But it might be better not to do this at all, and perhaps do a ‘test’ Font_Paint of the character prior to transmission, out of view, just to check that it won’t cause problems. In which case there’d be no reason to count the errors within a font when displaying in the Chars viewer. |
Jeffrey Lee (213) 6048 posts |
That’s easy – all you care about are the characters which are currently visible, of which there’s a hard upper limit because they’re displayed on a regular grid in a fixed size window. During the redraw code you can check if the window’s been scrolled and discard the error status of any characters which are no longer visible. |
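A rough Python sketch of that idea, for illustration only. It assumes the grid shows one contiguous run of character codes starting at `first_visible`, which may not match how Chars actually lays out its viewer; the class and method names are hypothetical.

```python
from typing import Set


class VisibleErrorTracker:
    """Remembers bad characters only while they are on screen."""

    def __init__(self, columns: int, rows: int) -> None:
        # Fixed grid in a fixed-size window gives a hard upper limit.
        self.capacity = columns * rows
        self.bad: Set[int] = set()

    def on_redraw(self, first_visible: int) -> None:
        # Discard the error status of anything scrolled out of view.
        last = first_visible + self.capacity
        self.bad = {c for c in self.bad if first_visible <= c < last}

    def mark_bad(self, code: int) -> None:
        self.bad.add(code)

    def is_bad(self, code: int) -> bool:
        return code in self.bad
```

Because entries are dropped on scroll, the set can never hold more than `columns * rows` codes, so a fixed-size array would do the same job in BASIC.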
Chris (121) 472 posts |
Ah, yes! |
Rick Murray (539) 13850 posts |
Given BASIC’s simplistic memory management, that could be… complicated… if there’s a badly converted font with lots of issues in the CJK character set. I’d agree with others – just fail it once and thereafter draw it as a red box.
Why? There’s no saying that the destination app is even using the same font (how many times have people used Chars in system font with a text editor using Trinity or even ZapRedraw?). It’s not really up to Chars to vet what codes are good and bad; send it and let the receiver deal with it (or crash, whatever). By the way, would it be possible to have a UTF-8 override to allow the display of UTF-8 and sending of codes when the system is in Latin1? (It’s an easy change, I’ve hacked mine to do it, just hoping for something a little more official ;-) ). I think complicated fonts easily overwhelm FontManager. When creating my KanaFont, it took a while as FontEd (save as Draw, then feed into D2Font) would go only so far before crashing. I don’t think it was exactly designed with several billion Asian ideograms in mind. |
Chris (121) 472 posts |
That’s true. My instinctive feeling, though, is that an app shouldn’t transmit a known duff character, hoping that the target app is using a different font where it won’t cause an issue (even if that’s a perfectly possible scenario).
My original submission to ROOL allowed this, but I was asked to take it out because there hadn’t (and hasn’t yet) been an agreed protocol for determining how to interpret the bytes transmitted in all cases. I did have some ideas for this, but to be honest a number of people expressed strong views on the whole Unicode topic a while back, and I thought it best to let the issue get hammered out by those who know what they’re talking about, after which I’d be happy to update Chars to take advantage. |
Rick Murray (539) 13850 posts |
Ah, but the character isn’t duff. That specific rendering in that specific font is what is wrong. I still think it should be the job of Chars to send clicked on characters (you aren’t doing things automatically here, the user has to specifically choose one of the red box characters). It should not be the job of Chars to start vetting whether or not it is a good character or Piers Morgan.
Well, support for this is somewhere between “kludge” and “broken”, so such discussions tend to be recursive loops. I don’t think there will be much move forward until the Wimp is capable of running UTF-8 aware applications regardless of what the current alphabet is, and/or FontManager is capable of detecting invalid sequences and failing back to interpreting them as Latin1 or the like. As far as I’m concerned, UTF-8 support in the desktop should be the default standard (enabled by some sort of switch to Wimp_Initialise) regardless of whatever the system alphabet is. The alphabet? That should exist purely in order to instruct things on how to interpret non UTF apps. … So, don’t hold your breath. |
Chris (121) 472 posts |
Well, it would certainly make it simpler to implement if I didn’t make a check for duffness before sending. Any other views on this? |
Jeffrey Lee (213) 6048 posts |
I don’t mind. |
Steffen Huber (91) 1953 posts |
I agree with Rick – the character is not duff, but (maybe) the font used to visualize it (or maybe only !Chars’ take on it). So send the character anyway, the receiving application might really really want it and handle it OK. Think about NetSurf using a downloaded TTF instead and converting it, or one of the new browsers using FreeType rendering the original TTF. |
Chris (121) 472 posts |
If Chris Hall reads this: you mentioned on the bug ticket page that you’d encountered some fonts that Chars trips over. If you still have copies, could you send them to me? My email is chris at cdwraight dot plus dot com. Thanks! |
Chris Hall (132) 3558 posts |
The fonts concerned are GillSans and WideLatin, both of which were converted to RISC OS from Windows using !TTF2f – interestingly my conversion (using the same method) of Arial worked OK even though I had added eighths fractions and a ″ character and made up the encoding code for hex 81 to 85. There is very little help provided with TTF2f. PM on the way. |
Chris (121) 472 posts |
OK, some interesting progress with AsanaMath, the font used in the bug ticket. It turns out that the font does display correctly if I increase the Font Cache to 32MB – the largest I can drag to on my system. Up until hitting the problematic characters, the cache was about 180K in size, and the limit on the cache growing automatically was 4096K. So it looks like something odd’s going on. Even odder, on my current version of Chars which just skips duff characters in the viewer, if I keep the cache size reasonable, switch to AsanaMath, UTF-8, All, then scroll down to the bottom, about 17 characters provoke the error about memory being full. If I then scroll up, so these characters are out of view, then down again, only one is still marked. Do the process again, and none are. So it looks as if the Font Manager can draw these characters, or at least not error, if it has more than one go at it! Frustratingly, it’s hard to tell if the problematic characters are redrawing properly in this case, since the ones around them are outsized and overlap the relevant spaces in the grid. Clicking on some of these ‘bad’ characters to insert them into Draw (with the Alphabet set to UTF-8) caused a nasty crash that shut the system down. I’ll keep at it… |
Rick Murray (539) 13850 posts |
What happens if you tweak the code so that, upon error, it’ll try a few times more to redraw that character before giving up?
Hmm, somebody’s error handling isn’t as good as yours. ;-) Still, could be worse… |
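The retry idea in this post could look something like the following Python sketch (illustration only, not how Chars is written). `paint` is a hypothetical stand-in for the Font_Paint call, and the default of three attempts is an arbitrary assumption.

```python
from typing import Callable


def paint_with_retries(code: int,
                       paint: Callable[[int], None],
                       attempts: int = 3) -> bool:
    """Try to paint a character up to `attempts` times.

    Returns True as soon as one attempt succeeds, False if all fail.
    """
    for _ in range(attempts):
        try:
            paint(code)  # hypothetical stand-in for Font_Paint
            return True
        except Exception:
            continue  # try again before giving up on this character
    return False
```

Whether a retry helps would depend on why the first attempt failed; the scroll-away-and-back result reported earlier only suggests that a second go sometimes succeeds.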
Steve Pampling (1551) 8172 posts |
I’ve been reading this and ever since I saw the mention of a converted font I’ve been wondering whether the problem characters are a carry over of a rather bad character rendering on the original font that then produces a silly large character in the conversion process. I recall many years ago playing with font converters and noting that some fonts had characters that massively increase the file size. I’d say the best idea is to extract the problem characters and examine those at extreme magnification where you will probably find a few “fractal wannabees” in the outline. Whether the origin is a bad converter routine or a terminally naff TTF original is probably an interesting question. |
Chris (121) 472 posts |
Yeah, probably so, looking at some of the sprawling characters in this font that do render. The issue for Chars is, I guess, how far (if at all) to protect the user from dodgy fonts. The errors are raised by the Font Manager, which will be used by most (all?) of any apps that want to make use of the characters. So even if Chars did lots of clever things to get strange characters to render in its own viewer without error, that wouldn’t help if those fonts were used in a document. Obviously, in an ideal world we’d have (a) plenty of home-grown RISC OS Unicode fonts, and (b) a more robust Font Manager. But neither is on the cards right now, so it’s always going to be a balance between the usefulness of these converted fonts and the possible problems they cause. I think the current behaviour is probably the right balance: Chars skips any characters that error with Font_Paint, and reports to the user that there are errors in the font and that using it in apps may cause further errors (or something like that). If the user carries on, then at least they’ve been warned, and it does mean that big fonts like AsanaMath are (mostly) usable. |
Chris (121) 472 posts |
(Aside: I wonder if there’d be any interest in a bounty to extend the character sets of Trinity, Homerton and Corpus, etc? I have no idea of the process involved in creating outline font glyphs, but it’d be a really useful step in moving towards a Unicode-ready OS, and I’m sure some peeps on these forums might know about what would be involved. I’d contribute :) |
Rick Murray (539) 13850 posts |
Don’t.
Sorry, should have written more words, but it was break time at work and not a lot of time. As I said, current behaviour is a good compromise.
Do we have any?
Ever fiddled with FontEd? It looks sort of like creating characters as if they were Draw files, only FontEd is complicated, unfriendly, and prone to crashing if you do so much as raise your voice in frustration. And don’t kid yourself here, the Asian characters number five digits. There’s no way that would be supported in any context other than importing them from somebody else’s font. The question, I suppose, is to try to understand what is going wrong because it is probably going to be much simpler and quicker to fix the conversion process than to create a new font. And, allow me to take the place of nemo and point out that FontManager is ancient and horribly limited so… ;-) Anyway, first step is to ask if anybody who understands how the font rendering actually works (from the raw data) would be willing to look at the font file to see if they can identify what is actually going wrong for the errant characters…? |
Steve Pampling (1551) 8172 posts |
My money is on something in the original font outline description having some silly complications in an otherwise sweeping curve, which causes the converter to generate the “fractal” like elements I mentioned and so makes the buffer/cache used when displaying overflow. In which case the fix required is in the converter utility. If Chars flagged particular character numbers as “faulty” that might help identify a means of fixing the converter. |
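If Chars did collect the faulty character numbers, one way to report them compactly is sketched below in Python (illustration only). The function name and the `U+XXXX` range format are assumptions, not anything Chars currently produces.

```python
from typing import Iterable, List, Tuple


def summarise_bad_codes(codes: Iterable[int]) -> List[str]:
    """Collapse faulty character numbers into sorted hex ranges."""
    ordered = sorted(set(codes))
    ranges: List[Tuple[int, int]] = []
    for code in ordered:
        if ranges and code == ranges[-1][1] + 1:
            # Extends the current run of consecutive codes.
            ranges[-1] = (ranges[-1][0], code)
        else:
            # Starts a new run.
            ranges.append((code, code))
    return [f"U+{a:04X}" if a == b else f"U+{a:04X}-U+{b:04X}"
            for a, b in ranges]
```

A list like this, attached to a bug report, would let whoever looks at the converter go straight to the affected glyphs.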
Chris (121) 472 posts |
Yeah, I should have been clearer – I wasn’t really thinking of extending into thousands of characters, more like extending to cover the Latin/Greek/Cyrillic characters, Maths symbols, Dingbats, etc – a few hundred extra characters, I suppose. Still a lot of work, but perhaps more within the realms of the possible. It would obviously be ideal to have truly global fonts in the OS, but getting to a stage where the bundled RISC OS fonts could cope with the requirements of every European Territory, say, would be nice. |
Steve Pampling (1551) 8172 posts |
I’d say the phrase “that ship has sailed” is partway there, but if you’re thinking who I’m thinking then the proper phrase is “P***ed off” as in being irritated by someone(s) |