RISC OS Open: Forum: Keyboard Handling

Apr 23, 2019 7:22pm

First step, switch the French keyboard module to Latin1
https://www.riscosopen.org/forum/forums/4/topics/10855#posts-76638

Old and annoying bug.

Apr 23, 2019 9:14pm

Rick Murray (539) 13406 posts

I propose a new Service_International reason code to select keyboard by “dialling code”

Seriously?
Given there’s like an actual keyboard right in front of you, why don’t we just do away with this inadequate dialling code rubbish and instead use some sort of textual code.

First consideration: I can’t tell you Germany’s dialling code. I can tell you it’s “de”. Same for Holland and “nl”.

Second consideration: You reference Canada being unable to be told apart from America. Well, we’re halfway there if we have “ca” versus “us-f**kyeah” or “maga”. But the question should be asked – is a keyboard layout a property of a country or of the language within the country that it pertains to?

I may be wrong, but I think Canada uses two different layouts – American, and Canadian bilingual (looks like this).
The Swiss keyboard layout has multiple accents shown, and which ones are used depend on whether the driver is providing Swiss German or Swiss French (link).

Ctrl-Alt-F1 selects the UK keyboard layout (because Rule Britannia, apparently) and Ctrl-Alt-F2 selects the configured keyboard (because Know Your Place), but F3-F11 ought to be available for binding to the user’s preferred layouts.

Probably F1 is Rule Britannia also because it’s a known layout guaranteed to be present in every version of RISC OS. Otherwise, yes, I agree. But does it need to be F3-F11? How many keyboard layouts does one man need? [especially given we have no IME for fancy wibbles]

And if you don’t need the wordy labels, it’s less than 40KB:

Sounds like something the ROM builder could deal with. The source version is the wordy labels, and something can strip ’em out before making it a part of the ROM. 40K is a lot better than ~700K!

but are worried about breaking various applications.

I’m not. Create a new API that works, make some sort of fudge that presents the old API to older applications. If it looks like a duck and quacks like a duck, they’ll think it’s a small yellow waterfowl.

As long as we’re tied to the many weird, peculiar, and just BAD decisions of the Territory/Country/Keyboard system, we will always have this broken baggage messing things up. It needs to be made a strictly legacy interface, for it is horrifically broken. I’ve pontificated for many screenfuls on my fear and loathing of it all so I won’t bother rehashing the same old rants once more; suffice to say Country != Keyboard != Language != Timezone (and the whole shebang is hardwired).

Apr 23, 2019 9:29pm

Chris Mahoney (1684) 2100 posts

Country != Keyboard != Language != Timezone

Bingo. Ignoring timezone, my work PC is in New Zealand, has a Maori keyboard, and is used exclusively in English. My home Mac is also in New Zealand, has a physical US keyboard but is configured for Japanese, and displays the OS in English but allows me to type in both languages.

As for timezones, where I live it’s +12:00. If I lived on the Chatham Islands then it’d be +12:45 (but I’d still be in NZ!)
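
To make the "Country != Keyboard != Language != Timezone" point concrete, here is a minimal Python sketch of settings held as independent fields. The `UserLocale` class, its field names and the keyboard labels are hypothetical illustrations, not an existing RISC OS structure; the values are taken from the examples in this post.

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass(frozen=True)
class UserLocale:
    """Hypothetical record: each setting is chosen independently of the others."""
    country: str            # where the machine is
    keyboard: str           # layout label (illustrative only)
    languages: tuple        # ordered language preferences
    utc_offset: timedelta   # local time offset from UTC

# Work PC: in New Zealand, Maori keyboard, used exclusively in English.
work_pc = UserLocale("NZ", "Maori", ("en",), timedelta(hours=12))

# Same country, different timezone: the Chatham Islands case.
chatham = UserLocale("NZ", "Maori", ("en",), timedelta(hours=12, minutes=45))

# Country alone cannot predict the offset.
same_country = work_pc.country == chatham.country
same_offset = work_pc.utc_offset == chatham.utc_offset
```

Here `same_country` is true while `same_offset` is false, which is exactly the case a single hardwired Country number cannot express.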

Apr 24, 2019 5:10am

Clive Semmens (2335) 3130 posts

With judicious hacking, I was able to type in every European language that uses the Roman alphabet*, plus Greek, Russian, and Hindi, on RiscPCs – without having to remember silly codes for all the accented characters. But oh, what a hack it was – not something you could release into the wild. Except that I did – it was the subject of an article in Acorn User, with the software on the cover disc.

If the same keyboard handling (or one on the same principles) could deliver UTF-8 output, how happy I would be.

* and others, probably – but not Vietnamese, with its extreme multiply-accented characters.

Apr 24, 2019 7:06am

John Williams (567) 768 posts

Canadian bilingual (looks like this).

I expect the Scoll indicator light on the top right (temoin) is for poor spellers with sore feet!

Apr 24, 2019 2:48pm

nemo (145) 2437 posts

Rick misthought

Given there’s like an actual keyboard right in front of you, why don’t we just do away with this inadequate dialling code rubbish and instead use some sort of textual code.

And what are the symbols on that keyboard? Are they ABC, or are they ऄभख? How do you type “Arabic” and in what language when your keyboard has Greek keytops? This is why we use numbers or function keys.

How many keyboard layouts does one man need?

On this PC I’m running three different Latin ones, a Hindi one and a Japanese. I’ve had more but I forget the layouts.

my work PC is in New Zealand, has a Maori keyboard, and is used exclusively in English

It is extraordinary that there’s no New Zealand country code.

Now for something beefier in a separate post.

Apr 24, 2019 2:53pm

nemo (145) 2437 posts

Country codes are used for a number of things, but pragmatically it’s a language selector. I think the IANA Language/Script/Region system is the most sensible, but for backwards compatibility (and also for reasons of sanity) we should define what Country number means in terms of IANA LSR. This is an attempt at that, but also includes those languages that might be implied by the “country”, but which can’t be by the Country number.

I know this is a lot of data, but some consensus will have to be reached before this is set in stone.

Country name is in quotes if it is not a country. There are many “countries” that don’t actually define a country. And I have no idea what “Lapp” was supposed to mean. Suggestions on a postcard, or indeed here.

  Num     Name             Language          Script         IANA
-----------------------------------------------------------------------
  0       "Default"
  1       UK               English           Latin          en-GB
  2       "Master"         <varies>          <varies>       <varies>
  3       "Compact"        <varies>          <varies>       <varies>
  4       Italy            Italian           Latin          it-IT
  5       Spain            Spanish           Latin          es-ES
  6       France           French            Latin          fr-FR
                           Breton            Latin          br-FR
  7       Germany          German            Latin          de-DE
                           Sorbian (Lower)   Latin          dsb-DE
                           Sorbian (Upper)   Latin          hsb-DE
  8       Portugal         Portuguese        Latin          pt-PT
  10      Greece           Greek             Greek          el-GR
  11      Sweden           Swedish           Latin          sv-SE
                           Sami (Northern)   Latin          se-SE
                           Sami (Southern)   Latin          sma-SE
                           Sami (Lule)       Latin          smj-SE
  12      Finland          Finnish           Latin          fi-FI
                           Sami (Northern)   Latin          se-FI
                           Sami (Inari)      Latin          smn-FI
                           Swedish (Fin)     Latin          sv-FI
  13      reserved
  14      Denmark          Danish            Latin          da-DK
  15      Norway           Norwegian         Latin          no-NO
                           Bokmal            Latin          nb-NO
                           Nynorsk           Latin          nn-NO
  16      Iceland          Icelandic         Latin          is-IS
  17      Canada           French (CA)       Latin          fr-CA
  18      Canada           English (CA)      Latin          en-CA
  19      Canada           <varies>          Latin          <varies>-CA
                           Inuktitut(Inuk)   Inuktitut      iu-Cans-CA
                           Inuktitut(Lat)    Latin          iu-Latn-CA
                           Mohawk            Latin          moh-CA
  20      Turkey           Turkish           Latin          tr-TR
  21      "Arabic"         Arabic            Arabic         ar-<varies>
  22      Ireland          English (IE)      Latin          en-IE
                           Irish             Latin          ga-IE
  23      Hong Kong        Cantonese (HK)    Chinese(Simp)  zh-HK
                           Cantonese         Chinese(Simp)  yue-Hans
                           Cantonese         Chinese(Trad)  yue-Hant
  24      Russia           Russian           Cyrillic       ru-RU
                           Bashkir           Cyrillic       ba-RU
                           Yakut             Cyrillic       sah-RU
                           Tatar             Cyrillic       tt-RU
  25      Russia2          Russian           Cyrillic       ru-RU
  26      Israel           Hebrew            Hebrew         he-IL
  27      Mexico           Spanish (MX)      Latin          es-MX
  28      "LatinAm"        Spanish (419)     Latin          es-419  (many!)
  29      Australia        English (AU)      Latin          en-AU
  30      Austria          German (AT)       Latin          de-AT
  31      Belgium          French (BE)       Latin          fr-BE
                           Dutch (BE)        Latin          nl-BE
  32      Japan            Japanese          Kana+Kanji     ja-JP
  33      "MiddleEast"     Arabic            Arabic         ar-<varies>
  34      Netherlands      Dutch             Latin          nl-NL
                           Frisian           Latin          fy-NL
  35      Switzerland      German (CH)       Latin          de-CH
                           French (CH)       Latin          fr-CH
                           Italian (CH)      Latin          it-CH
                           Romansh           Latin          rm-CH
  36      Wales            English (CY)      Latin          en-GB  ?
                           Welsh             Latin          cy-GB
  37      "Maori"          Maori             Latin          mi-NZ
  38-47   reserved
  48      United States    English (US)      Latin          en-US
                           Spanish (US)      Latin          es-US
  49      Wales2           Welsh             Latin          cy-GB
  50      China            Mandarin          Chinese(Simp)  zh-Hans-CN
  51      Brazil           Portuguese (BR)   Latin          pt-BR
  52      South Africa     English (SA)      Latin          en-ZA
  53      South Korea      Korean            Hangul         ko-KR
  54      Taiwan           Mandarin (TW)     Chinese(Trad)  zh-Hant-TW
  55-69   reserved
  70      "DvorakUK"       English           Latin          en-GB
  71      "DvorakUS"       English (US)      Latin          en-US
  72-79   reserved
  80      "ISO1"           <varies>          Latin          <varies>
  81      "ISO2"           <varies>          Latin          <varies>
  82      "ISO3"           <varies>          Latin          <varies>
  83      "ISO4"           <varies>          Latin          <varies>
  84      "ISO5"           <varies>          Latin          <varies>
  85      "ISO6"           <varies>          Latin          <varies>
  86      "ISO7"           Greek             Greek          el-<varies> (GR)
  87      "ISO8"           Hebrew            Hebrew         he-<varies> (IL)
  88      "ISO9"           <varies>          Latin          <varies>
  89-94   reserved
  95-125  reserved (but overlaps with alphabet numbers)
  126     "Special"        <varies>          <varies>       <varies>
  127     "Read" - OS_Byte 70
  128     Faroe            Faroese           Latin          fo-FO
  129     Albania          Albanian          Latin          sq-AL
  130     South Africa     Afrikaans         Latin          af-ZA
                           English (SA)      Latin          en-ZA
                           Sesotho sa Leboa  Latin          nso-ZA
                           Setswana          Latin          tn-ZA
                           isiXhosa          Latin          xh-ZA
                           isiZulu           Latin          zu-ZA
  131     "Bengal"         Bengali           Bengali        bn-<varies> (IN/BD)
  132     Bulgaria         Bulgarian         Cyrillic       bg-BG
  133     "ByeloRussian"   ByeloRussian      Cyrillic       be-<varies>
  134     "Czech"          Czech             Latin          cs-<varies> (CZ)
  135     "Devang"         Hindi             Devanagari     hi-<varies> (IN)
  136     "Farsi"          Persian           Arabic         fa-<varies>
  137     "Gujarati"       Gujarati          Gujarati       gu-<varies> (IN)
  138     Estonia          Estonian          Latin          et-EE
  139     "Gaelic"         Gaelic (Scots)    Latin          gd-GB  ?
                           Gaelic (Manx)     Latin          gv-IM  ?
                           Gaelic (Irish)    Latin          ga-IE  ?
  140     "Ancient Greek"  Ancient Greek     Greek          grc-GR
  141     Greenland        Kalaallisut       Latin          kl-GL
  142     Hungary          Hungarian         Latin          hu-HU
  143     "Lapp"                                                 ?
  144     Latvia           Latvian           Latin          lv-LV
  145     Lithuania        Lithuanian        Latin          lt-LT
  146     Macedonia (FYR)  Macedonian        Cyrillic       mk-MK
  147     Malta            Maltese           Latin          mt-MT
  148     Poland           Polish            Latin          pl-PL
  149     "Punjab"         Punjabi           Guru           pa-<varies> (IN)
  150     Romania          Romanian          Latin          ro-RO
  151     "SerboCroat"     Serbo-Croatian    <varies>       sh-<varies> (many!)
  152     "Slovak"         Slovakian         Latin          sk-SK
  153     "Slovene"        Slovenian         Latin          sl-SI
  154     "Tamil"          Tamil             Tamil          ta-<varies> (IN)
  155     Ukraine          Ukrainian         Cyrillic       uk-UK
  156     "Swiss1"         French (CH)       Latin          fr-CH
  157     "Swiss2"         German (CH)       Latin          de-CH
  158     "Swiss3"         Italian (CH)      Latin          it-CH
  159     "Swiss4"         Romansh           Latin          rm-CH
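
A table like this could be carried as a lookup from Country number to candidate IANA tags. The following is a minimal Python sketch, assuming a hypothetical `COUNTRY_TO_LSR` dictionary that holds only a handful of rows copied from the table; the names and structure are illustrative and not an existing RISC OS API.

```python
# Hypothetical excerpt: Country number -> (Acorn name, candidate IANA tags).
# The first tag is the most likely meaning; an empty list stands for <varies>.
COUNTRY_TO_LSR = {
    1: ("UK", ["en-GB"]),
    6: ("France", ["fr-FR", "br-FR"]),
    15: ("Norway", ["no-NO", "nb-NO", "nn-NO"]),
    35: ("Switzerland", ["de-CH", "fr-CH", "it-CH", "rm-CH"]),
    80: ("ISO1", []),
    139: ("Gaelic", ["gd-GB", "gv-IM", "ga-IE"]),
}

def tags_for_country(number):
    """Return the candidate tags for a Country number ([] if unknown or <varies>)."""
    _name, tags = COUNTRY_TO_LSR.get(number, ("", []))
    return list(tags)

def is_ambiguous(number):
    """True when a Country number does not pin down exactly one tag."""
    return len(tags_for_country(number)) != 1
```

With these rows, Country 1 is unambiguous, while 139 ("Gaelic") and 80 ("ISO1") both need extra information from the user before a tag can be chosen.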

Apr 24, 2019 2:53pm

Clive Semmens (2335) 3130 posts

I’m running three different Latin ones, a Hindi one and a Japanese.

In the days when I did a lot of foreign language work, I only ever used one Latin keyboard layout, with accent keys rather than accented characters. But I did have separate keyboard layouts for Cyrillic, Greek and Hindi.

Apr 24, 2019 6:35pm

Rick Murray (539) 13406 posts

I know this is a lot of data, but some consensus will have to be reached before this is set in stone.

I notice there’s only one entry for Japan – Kana+Kanji. There is also a way of writing called Wāpuro rōmaji which enters romanised Japanese using a Western keyboard.

There is an incredible amount of duplication in that table. Why Wales and Wales2 (what’s the difference?) when really it belongs as a category of United Kingdom. Well, Brexit might eventually fix that, but for the moment it’s a part of UK and probably ought to be treated as such.

My guess for “Lapp” is somebody got halfway to making a territory for Lapland and didn’t realise it’s sometimes called “Lappi” but often called “Sápmi” (as “Lappi” is a subset of Sápmi).
Then they realised there’s like fifteen languages, and just gave up at that point.
That’s my guess.

Shouldn’t country and language be separate and not implied? I can nominate myself here. My system should be set as “France” (rather than hacking the UK territory), but it cannot be as there’s the assumption that France = French. While it is not an unreasonable assumption, it’s an inflexible system that can’t handle “user is located [here] and speaks [this]”. Android, iOS, XP… no problems with that concept.

Apr 24, 2019 6:41pm

Steve Pampling (1551) 7932 posts

I know this is a lot of data, but some consensus will have to be reached before this is set in stone.

Am I allowed to say that the only columns containing what I consider to be sensible and meaningful data are column 1 and column 5?
The rest seems to be only slightly better than composing a sentence by throwing darts at Scrabble tiles.

Apr 24, 2019 11:57pm

Tristan M. (2946) 1036 posts

“Honk Kong” literally the first thing I saw on the list. Sorry.

Apr 25, 2019 1:33pm

nemo (145) 2437 posts

Rick confirmed

Shouldn’t country and language be separate and not implied?

That is the central thrust of my point, yes. The purpose of the above table is to be certain what the existing Country codes mean for both country and language. As can be seen, the answer is often <varies>.

It may be that someone is using Country number n for purposes incompatible with the IANA LSR I’ve suggested, and I’d rather know that now than later!

Steve asked

Am I allowed to say

You, Steve, are allowed anything. Column 2 is just the official Acorn name for the ‘Country’, reprinted here because I don’t expect people to know that Country 151 is ‘SerboCroat’ for example.

Column 3 is the language implied by the ‘Country’. Again, someone might disagree with me – 156 is definitely French(CH), but I’ve no idea which ‘Gaelic’ 139 is supposed to be. I’m guessing Scots, but maybe it’s Irish… but then maybe Country 22 is Irish.

Column 4 is the script, which is sort of specified by the Country-to-Alphabet functionality, but not really in the case of Cyrillic (since the Acorn Cyrillic alphabets, both of them, contain Cyrillic and Latin). This is important because it informs font repertoire requirements in the Unicode world.

I’m happy you described column 5 as ‘meaningful’ – that was the intention. Lamentably the many <varies> in that column are themselves part of the problem.

I have an alphabet module here that supports (and restores, for RO5) the ‘Master’ and ‘Compact’ choices, but allows the ISO country to be separately defined. It was when I was considering extending that strategy that I started compiling this list to understand the scale of the problem.

This would allow someone using Country 139 ‘Gaelic’ to specify whether they mean GB, IM or IE and hence imply the correct IANA LSR, just as my existing code allows someone using ‘Master’ to specify ‘GR’ (because, yes, the original BFont encoding supports Greek).

No, I don’t think anyone is actually using the ‘Master’ alphabet for this reason… but that is not the point. It’s a test case. The exact same situation applies to Bengali for example, which may well be bn-IN, but could just as easily be bn-BD (or GB, if one wished).

Rick noted

There is also a way of writing called Wāpuro rōmaji which enters romanised Japanese using a Western keyboard.

Indeed, and 90% of Japanese computer users use this method. However, that affects neither the Language nor the Script… that’s a keyboard layout (or IME) issue. This is another of the problems with the euro-centric concept of “selecting keyboard by country”. It always was staggeringly parochial.

Why Wales and Wales2?

Because, as the IANA LSRs I’ve suggested make plain, one is for English in Wales and the other for Welsh. Welsh has vowels us mere mortals lack, so it benefits from the Welsh alphabet instead of Latin1 – try *Alphabet Welsh and see the small changes. This becomes unimportant in Unicode, but may still be significant in keyboard handler selection. I claim no expertise in use of Welsh, despite feeling notionally Welsh to a large degree.

My guess for “Lapp”

Is probably frighteningly close to the reality of how we got most of these. :-O

Android, iOS, XP… no problems with that concept.

Absolutely. And “bleedin immigrants” aside, you could be a 100% pureblood Breton and be offended by having to select “French”. But that’s enough politics. ;-)

Honk

Thank you! What would be good would be certainty of what language ‘Hong Kong’ is supposed to imply. I’ve no desire to provoke an international incident, but there must, historically, have been an intention to mean something… I suspect it was probably “English… but somewhere foreign!”. :-/

Apr 25, 2019 1:57pm

nemo (145) 2437 posts

Further to my earlier “MessageTrans would automatically…” arm waving. Having monitored what MT is actually asked to do, there are only a few patterns that would need to be grokked for compatibility:

• …<countryname>.<leafname>
• …Messages<countrynum>
• …<countrynum padded to 3 digits>.<leafname>

The majority however are of the “path vars make it someone else’s problem” variety:

• <a path variable>:<leafname>

However, there’s also a lot of this one, which is as close to not internationalising as it is possible to get.

• <Obey$Dir>.Messages

The same patterns apply to Templates of course, which can be intercepted through Wimp_Extend. Toolbox Resource files are more awkward though. I’ve yet to look into that.

The intention is to allow the user to specify her language preferences as a string – eg “en-GB,fr-CA,zh-HK”. This would cause MT et al to check for the following resources, in this order:

• en-GB (merged on top of en if available)
• en
• 1
• UK (this is really annoying)
• fr-CA (merged on top of fr if available)
• fr
• 6
• France
• zh-HK (merged on top of any of the following three)
• zh-yue (merged on top of any of the following two)
• zh-Hans (merged on top of zh if available)
• zh
• 23
• HongKong
• 1 (default)
• UK (default)
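
The search order above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `LEGACY`, `EXTRA_PARENTS` and `search_order` are hypothetical names, the tables contain just the three tags from the example, and the merging of resources is not modelled, only the order in which names would be tried.

```python
# Hypothetical tables, filled in only for the tags used in the example.
LEGACY = {                      # tag -> (old Country number, old Country name)
    "en-GB": (1, "UK"),
    "fr-CA": (6, "France"),
    "zh-HK": (23, "HongKong"),
}
EXTRA_PARENTS = {               # intermediate tags tried before the bare language
    "zh-HK": ["zh-yue", "zh-Hans"],
}
DEFAULT = (1, "UK")

def search_order(preferences):
    """Expand a string such as 'en-GB,fr-CA' into resource names to try, in order."""
    order = []
    for tag in (t.strip() for t in preferences.split(",")):
        if not tag:
            continue
        order.append(tag)
        order.extend(EXTRA_PARENTS.get(tag, []))
        language = tag.split("-")[0]
        if language != tag:
            order.append(language)
        if tag in LEGACY:
            number, name = LEGACY[tag]
            order.extend([str(number), name])
    # Final fallback: the default country, by number and by name.
    order.extend([str(DEFAULT[0]), DEFAULT[1]])
    return order
```

Calling `search_order("en-GB,fr-CA,zh-HK")` yields the sixteen names in the list above, in the same order.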

It’s clear that “UK” has to be retained for compatibility with a large number of existing applications. Therefore Ukrainian will have to always be “uk-UK” and never simplify.

Should the postfix pattern be “Messages-en-GB” rather than “Messagesen-GB”? I suspect so.

Ideally I’d want to check a centralised ‘Language Pack’ (eg !Territory) for pre-localised files for applications, much as happens with *IconTheme <appname> <spritefile> in the original Theme protocol… but that specifies an explicit application name. “<Obey$Dir>.Messages” really doesn’t – one can canonicalise and then work back to an enclosing appdir, but the result is flatter than the <Publisher>.<appname> used in the Theme protocol. Shame.

Oh and yes, Chinese is complicated. What the Chinese government regard as ‘dialects’ is what the rest of us would call ‘mutually unintelligible separate languages written in the same script’, but don’t tell them I said so.

So when searching for an IANA LSR:

…<countryname>.<leafname>       => …<LSR>.<leafname>
…Messages<countrynum>           => …Messages-<LSR>
…<countrynum padded>.<leafname> => …<LSR>.<leafname>
<a path variable>:<countryname> => <pathvar>:<LSR>
<a path variable>:<leafname>    => <pathvar>:<leafname>-<LSR>
<Obey$Dir>.Messages             => <Obey$Dir>.Messages-<LSR>

where <countryname> and <countrynum> are of the current country, which is the backwards-compatible bit.

And when trying an old country number:

…<countryname>.<leafname>       => …<tryname>.<leafname>
…Messages<countrynum>           => …Messages<trynum>
…<countrynum padded>.<leafname> => …<trynum padded>.<leafname>
<a path variable>:<countryname> => <pathvar>:<tryname>
<a path variable>:<leafname>    => <pathvar>:<leafname><trynum>
<Obey$Dir>.Messages             => <Obey$Dir>.Messages<trynum>

These two sets aren’t identical.
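
The two sets of rewrites can be sketched as one Python function with two modes. The function name `rewrite` and its parameters are hypothetical; it works on plain strings, treats “.” as the directory separator, and handles only the six patterns listed, so it is a rough model rather than what MessageTrans would actually do.

```python
def rewrite(path, cur_name, cur_num, new_name, new_num=None):
    """Rewrite a resource path for another localisation.

    LSR mode (new_num is None): new_name is an IANA tag such as 'fr-CA'.
    Legacy mode: new_name/new_num are an old Country name and number to try.
    """
    lsr_mode = new_num is None
    dir_for_num = new_name if lsr_mode else f"{new_num:03d}"
    suffix = f"-{new_name}" if lsr_mode else str(new_num)

    head, dot, leaf = path.rpartition(".")
    if dot:
        parent_head, parent_dot, parent = head.rpartition(".")
        if parent == cur_name:              # ...<countryname>.<leafname>
            return f"{parent_head}{parent_dot}{new_name}.{leaf}"
        if parent == f"{cur_num:03d}":      # ...<countrynum padded>.<leafname>
            return f"{parent_head}{parent_dot}{dir_for_num}.{leaf}"

    var, colon, name = path.rpartition(":")
    if colon and name == cur_name:          # <a path variable>:<countryname>
        return f"{var}:{new_name}"

    if path.endswith(f"Messages{cur_num}"):  # ...Messages<countrynum>
        return path[: -len(str(cur_num))] + suffix

    return path + suffix                    # <pathvar>:<leafname>, <Obey$Dir>.Messages
```

For example, with the current country as France (6), `rewrite("<Obey$Dir>.Messages6", "France", 6, "fr-CA")` gives `<Obey$Dir>.Messages-fr-CA`, while the legacy call `rewrite("<Obey$Dir>.Messages6", "France", 6, "UK", 1)` gives `<Obey$Dir>.Messages1`. The difference between the two suffix styles is where the sets stop being identical.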

It would be tempting to make *COUNTRY fr-CA,en-CA set a country number of 126 and be done with it, but I suspect that reusing country 6 would be safer. 126 should only be used when there’s no backwards-compatible option.

Apr 25, 2019 5:19pm

Rick Murray (539) 13406 posts

What would be good would be certainty of what language ‘Hong Kong’ is supposed to imply.

Isn’t Honk (!) the enclave of Cantonese speakers? I don’t know, I am simply saying that because movies from Hong Kong don’t sound the same as the likes of the various over-the-top wuxia productions that make it to the west (Jet Li, Donnie Yen, Zhang Ziyi…).

UK (this is really annoying)

The stupid numbers are really annoying.

Therefore Ukrainian will have to always be “uk-UK” and never simplify.

<cough> uk-UA </cough>

Should the postfix pattern be “Messages-en-GB” rather than “Messagesen-GB”? I suspect so.

What’s wrong with a subdirectory? Messages.en-GB ?

Oh and yes, Chinese is complicated.

I wouldn’t worry too much about that. I mean, it’s not as if there’s any built-in support for any squiggly ideographs…

It would be tempting to make *COUNTRY fr-CA,en-CA set a country number of 126 and be done with it, but I suspect that reusing country 6 would be safer.

And there lies the problem. Not that you’re still mixing up country and language, but mostly because country numbers are broken. Language numbers are broken too. Both should be retained in the minimal implementation required for compatibility. And then…

*Country CA
*Language fr-CA, en-CA
*Country
Canada (CA)
*Language
1. Française Canadienne (fr-CA)
2. Canadian English (en-CA)

Don’t quote me on the spelling (or gender) of the fr-CA, I didn’t bother looking it up.

This might simplify it for you too. All the UK, France, 1, 6, etc etc can be part of how the legacy stuff operates (I think most will likely do it according to whatever ResFind or the like detects).
Your version? fr-CA → fr → en-CA → en. If somebody wants to support proper country codes, it isn’t hard to rename “UK” to “en”, is it?
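
The fr-CA → fr → en-CA → en order can be sketched as a small Python function. The name `fallback_chain` is hypothetical; it simply tries each preferred tag and then its progressively shorter prefixes, skipping anything already listed.

```python
def fallback_chain(languages):
    """Expand an ordered list of language tags into the order they would be tried."""
    chain = []
    for tag in languages:
        parts = tag.split("-")
        # Try the full tag first, then drop one subtag at a time.
        for end in range(len(parts), 0, -1):
            candidate = "-".join(parts[:end])
            if candidate not in chain:
                chain.append(candidate)
    return chain
```

So `fallback_chain(["fr-CA", "en-CA"])` gives `["fr-CA", "fr", "en-CA", "en"]`, with no country numbers or country names involved at all.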

Apr 25, 2019 6:42pm

Steve Pampling (1551) 7932 posts

You, Steve, are allowed anything.

I’m afraid you and my wife differ on that and you’re outvoted.

I’m happy you described column 5 as ‘meaningful’ – that was the intention. Lamentably the many <varies> in that column are themselves part of the problem.

Maybe you’re looking from the wrong end? How many duplicates of the IANA code are there?
I think the answer is that an IANA code might be one of a number of distinct codes which maps backward to one entry in earlier columns.

Try resorting on column 5 and it becomes obvious that IANA may have had a clue what they were doing.

Note also that the “LatinAm” is another naffness, and is about as useful as expecting everyone in Europe to speak English (or, since we’re allegedly leaving, ask them to speak German – cos they’re more sensible)

Picking something we know is a total dog’s dinner as your index field is only ever going to lead to pain.
To restate Rick’s comment: those fields are broken.

Let’s build things on a sensible basis, look to see what breaks and how to drop in a crutch for those items.

Apr 25, 2019 10:36pm

Steve Pampling (1551) 7932 posts

Try resorting on column 5

Upon sorting one of the first obvious items was the partial duplication of some entries:

Welsh Latin cy-GB
English(SA) Latin en-ZA
French(CH) Latin fr-CH
Italian(CH) Latin it-CH
Nynorsk Latin nn-NO
Romansh Latin rm-CH

Deleting those tidies things a little.

Apr 26, 2019 11:47am

nemo (145) 2437 posts

Steve first

How many duplicates of the IANA code are there?

Many. I’m not trying to achieve a one-to-one mapping, I’m trying to be clear about what the Country numbers actually mean, because it doesn’t seem to be clearly defined.

So where I’ve put multiple LSRs against one Country number, it’s because there may be people who are using that country number to mean any of the choices. ‘Gaelic’ for example. I realise that this is an academic question because we actually know the first names of everyone who still uses RISC OS.

“LatinAm” is another naffness

Indeed. But “Chinese” is actually worse. When you see the choice between “Chinese (Simplified)” and “Chinese (Traditional)”, this is a choice of script – it’s like being given a choice between 𝒊𝒕𝒂𝒍𝒊𝒄 and 𝖋𝖗𝖆𝖐𝖙𝖚𝖗 (you’ll need a proper browser and OS to see what I did there) – not really a choice of language. Someone from Shanghai would not write the same thing in either script that someone from Hong Kong would, colloquially.

Think of it like this: We now get to specify sprite types by colour depth, bits per component and colourant order… but first we had to be sure what the old MODE numbers actually mean in the new terms. This is the same exercise.

Nynorsk Latin nn-NO

I’m not sure why you’re saying these are duplications. My point is that Country 15 is probably going to mean Norwegian in Latin script… but it could have been used to localise into Nynorsk. Again, an academic point. If someone selects ‘nn-NO’ and runs an old program, would they prefer Norwegian or English? I suppose it would be better to use the rest of their ordered preference rather than assume that there are any Nynorsk localised applications that used 15 as the Country number. Is that what you mean?

Rick said

Isn’t Honk (!) the enclave of Cantonese speakers?

Indeed, hence my zh-yue clarification, and also of Simplified rather than Traditional orthography. However, I’m not convinced (pardon my scepticism) that whoever at Acorn coined that Country didn’t actually mean “English as she is spoken in our glorious occidental colony”. Do you see what I mean?

<cough> uk-UA </cough>

…Yes well I obviously put that in as a test to see if you’re paying attention. Which you are…

What’s wrong with a subdirectory? Messages.en-GB ?

I’m thinking of the backwards-compatible behaviour. A program that tries to select its own translation will detect whether it has to fall back to English by checking for the existence of the appropriate file. Thanks to that now ancient API revision to OS_File,5, many programs get this wrong by checking R0=1 on exit. So if it is expecting ‘Messages’ to be a file, we can’t turn it into a Directory. And the idea is for this to work without modifying existing programs. New programs would use “Messages” and allow MT to select (and compose) the appropriate one (I take the point though that apps could pass the directory name to the new MT… but that wouldn’t be backwards compatible with older MTs).
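
The existence-check problem can be illustrated with a tiny Python model. It assumes the object type values that OS_File 5 is documented as returning in R0 (0 not found, 1 file, 2 directory, 3 image file); the function `naive_has_messages` is a hypothetical stand-in for what such an application does, not real application code.

```python
# Object types as returned in R0 by OS_File 5 (assumed from the documentation).
NOT_FOUND, FILE, DIRECTORY, IMAGE_FILE = 0, 1, 2, 3

def naive_has_messages(object_type):
    """Model of an application that only accepts R0 == 1 as 'the file exists'."""
    return object_type == FILE

# If 'Messages' were turned into a directory, the same check would report
# that the translation is missing and the application would fall back.
found_as_file = naive_has_messages(FILE)
found_as_directory = naive_has_messages(DIRECTORY)
```

Here `found_as_file` is true and `found_as_directory` is false, which is why “Messages” has to stay a file for unmodified programs.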

If MessageTrans et al are smarter, then we can in effect ignore which localisation the app has chosen as MT will override it… but only under the most common scenario of the app asking MT to load the file. It is also permissible for the App to select and load the file itself, and present it to MT via its message block, and that can’t really be overridden. I haven’t checked how many things actually do this though, I expect it to be a small number.

And there lies the problem. Not that you’re still mixing up country and language, but mostly because country numbers are broken. Language numbers are broken too. Both should be retained in the minimal implementation required for compatibility.

My point is only that if the user selects “es-PE”, the Country number has to be set to something so that awkward applications don’t just stubbornly present English. Country 126 is guaranteed to not have a localisation, whereas 5 might (more likely than 28 anyway).
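
A minimal Python sketch of that choice, under assumptions: `EXACT`, `BY_LANGUAGE` and `country_number_for` are hypothetical names, and the tables hold only a few entries taken from the Country table earlier in the thread.

```python
# Hypothetical tables with a few entries from the Country table.
EXACT = {"en-GB": 1, "es-ES": 5, "fr-FR": 6, "es-MX": 27}
BY_LANGUAGE = {"en": 1, "es": 5, "fr": 6}   # most likely localised Country per language
SPECIAL = 126                               # 'Special': no legacy localisation expected

def country_number_for(tag):
    """Pick the legacy Country number to report for a selected IANA tag."""
    if tag in EXACT:
        return EXACT[tag]
    language = tag.split("-")[0]
    # Reuse a same-language Country if there is one, otherwise fall back to 126.
    return BY_LANGUAGE.get(language, SPECIAL)
```

With these tables, “es-PE” reports Country 5, “es-MX” keeps its own number 27, and a tag with no related legacy Country reports 126.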

If somebody wants to support proper country codes, it isn’t hard to rename “UK” to “en”, is it?

It would be better if ‘old’ applications continued to work without having to rename bits of them.

Apr 26, 2019 2:00pm

Steve Pampling (1551) 7932 posts

I’m not sure why you’re saying these are duplications.

Half duplications – as in only some of the columns have an entry, which since you’re referencing the numeric code (absent) and the name (absent) makes them an effective null in the old/existing system does it not?

Yes well I obviously put that in as a test to see if you’re paying attention. Which you are

Whereas I was dealing with a non-maskable interrupt instead (feed the cats – several times)
