webpage with riscos text
entityfree (3332) 77 posts |
How do I view a webpage with text that came from a RISC OS machine? The wrong characters are being displayed. FYI, I set the alphabet to UTF-8 and composed with the Andika font in the “StrongED” app using wide character mode, and then used the keypad to type the UTF codes for the Esperanto characters. |
nemo (145) 2546 posts |
There are a number of different RISC OS Alphabets (encodings), but the most common one is called “Latin1” but is actually AcornLatin1. This is an extension of ISO 8859-1 with additional characters in the &80-&9F range. So if you specify “ISO-8859-1” as the encoding, it will be almost entirely correct. The additional characters Acorn added to Latin1 include, from &81: Ŵ ŵ. You may also find arrows from &88, though they aren’t even part of AcornLatin1: ← → ↓ ↑. I am hoping to move the RISC OS world onto using UTF-8, in much the same way that a jellyfish might hope to move a supertanker. |
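For anyone converting such text on another machine, here is a minimal Python sketch of the idea above: decode as ISO 8859-1, but override the few &80-&9F positions mentioned in this post. The table is deliberately partial and assumes the listed characters sit at consecutive code points (Ŵ ŵ at &81-&82, the arrows at &88-&8B); `ACORN_EXTRAS` and `decode_acorn_latin1` are made-up names, not part of any library.

```python
# Partial table: only the characters named in the post above.
# Assumption: they occupy consecutive code points starting at &81 and &88.
ACORN_EXTRAS = {
    0x81: "Ŵ", 0x82: "ŵ",
    0x88: "←", 0x89: "→", 0x8A: "↓", 0x8B: "↑",
}


def decode_acorn_latin1(data: bytes) -> str:
    """Decode bytes as ISO 8859-1, overriding the known Acorn extras.

    chr(b) for a byte value 0-255 is exactly the ISO 8859-1 mapping,
    so every byte missing from the table falls through to Latin-1.
    """
    return "".join(ACORN_EXTRAS.get(b, chr(b)) for b in data)


if __name__ == "__main__":
    print(decode_acorn_latin1(b"caf\xe9 \x81\x82 \x88\x89"))
```

The result is an ordinary Python string, so it can be written back out as UTF-8 for the web page with `.encode("utf-8")`.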
Rick Murray (539) 13840 posts |
:-) If, if, FontManager had some sort of sensible fallback, one could transition the desktop to UTF-8 and know that older apps using Latin1 would still work. It’s not perfect, but it’s a start. |
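To make "sensible fallback" concrete, here is one very rough way it could look, sketched in Python rather than anything FontManager actually does: treat a string as UTF-8 if it is valid UTF-8, otherwise reinterpret the whole string in the old 8-bit alphabet. `decode_with_fallback` is a hypothetical helper, and plain ISO 8859-1 stands in for Latin1 here.

```python
def decode_with_fallback(data: bytes, fallback: str = "latin-1") -> str:
    """Whole-string fallback: prefer UTF-8, else use the 8-bit alphabet."""
    try:
        # Strict decoding fails on any byte sequence that is not valid UTF-8.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assume the string came from an older app using the 8-bit alphabet.
        return data.decode(fallback)


if __name__ == "__main__":
    print(decode_with_fallback("ĝi".encode("utf-8")))  # valid UTF-8
    print(decode_with_fallback(b"caf\xe9"))            # Latin1 bytes
```

The obvious weakness is that some Latin1 strings happen to be valid UTF-8 as well, in which case they are read as UTF-8. That is the "not perfect" part.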
nemo (145) 2546 posts |
I am up to my elbows in a UnicodeSupport module that makes Unicode unpleasantness go away. e.g.:

WHEN wimp_keypress
  REPEAT
    SYS"US_Input",wimpkeycode%,keystate% TO uni%,keystate%
    IF uni%>-1 PROCuse(uni%)
    wimpkeycode%=-1
  UNTIL keystate%>=0

The above copes with 8-bit Alphabets, UTF-8 Alphabet, mixed-mode input, wimp keypresses, fallback using the configured FallbackAlphabet, overlong encodings, surrogates etc. while using no persistent state beyond that single keystate% integer. Basically US_Input turns anything into Unicode, so you don’t have to write two different sets of key handling. The looping is necessary to cope with fallback, which is harder to get your head around when processing a stream like keyboard input than when the FontManager is consuming a string it has complete access to. It can also be used for reading files of course. Also… ;-)

SYS"US_Case",uni% TO ,lowercase%,uppercase%,titlecase%

Lots more of course. |
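For readers without the module, the "turn anything into Unicode" idea can be approximated in Python for the easier case where the whole buffer is available, such as a file. This is only a sketch under stated assumptions and says nothing about how US_Input works internally: `utf8_length` and `to_code_points` are hypothetical names, and ISO 8859-1 stands in for the configured FallbackAlphabet.

```python
def utf8_length(lead: int) -> int:
    """Expected UTF-8 sequence length for a lead byte, or 0 if it cannot start one."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0  # stray continuation byte, or &C0/&C1/&F5+ which are never valid


def to_code_points(data: bytes, fallback: str = "latin-1") -> list:
    """Decode mixed UTF-8 / 8-bit bytes into a list of Unicode code points."""
    out = []
    i = 0
    while i < len(data):
        n = utf8_length(data[i])
        chunk = data[i:i + n]
        try:
            if n == 0 or len(chunk) != n:
                raise ValueError("not a complete UTF-8 sequence")
            # Strict decoding rejects overlong forms and encoded surrogates.
            text = chunk.decode("utf-8")
        except ValueError:  # UnicodeDecodeError is a subclass of ValueError
            # Fall back: reinterpret just this one byte in the 8-bit alphabet.
            text = data[i:i + 1].decode(fallback)
            n = 1
        out.append(ord(text))
        i += n
    return out


if __name__ == "__main__":
    mixed = "ĉ".encode("utf-8") + b"\xe9"  # UTF-8 ĉ followed by a Latin1 é
    print([hex(cp) for cp in to_code_points(mixed)])
```

With a complete buffer the decoder can simply back up and re-read a byte when a sequence turns out to be invalid. A keyboard stream cannot do that, which is presumably why the real thing needs the loop and the state integer.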