ResFinder
Rick Murray (539) 13851 posts |
Windows XP SP3 with East Asian language support. It’s in Control Panel → Regional and Language Options → Languages (tab) → Install files for East Asian languages (tickbox). I did, at one point, have the IME working so I could type phonetically ( ku da sa i ) and have the correct squashed spiders come out. Now it only seems to work if I switch my keyboard to kana mode and type in hiragana. I don’t plan on fixing this as I kind of need a little bit of impetus to actually try learning the kana. BTW, did anybody try clicking on the text? (^_^) |
WPB (1391) 352 posts |
佳奈も? Was that a deliberate joke? 仮名じゃないの? Seriously, though, what is all that junk that *Country Japan spits out? Just tried it on RPCEmu, and no such junk… |
Steve Drain (222) 1620 posts |
I was teasing. I could hardly complain about such small numbers when there are Mb around, but “small is beautiful” still. |
Rick Murray (539) 13851 posts |
<blatant lies> Yes! </> Actually, it was the result of a lot of mucking around in Google Translate to get something that sort-of made sense when reverse translated. Because… well… given the horrible horrible mess GT usually makes of translating pages written in Japanese, I figured it would be sort of the same in the other direction.
I’m so glad I thought to photograph it. Evidence. Proof, even! |
WPB (1391) 352 posts |
Not bad for a machine translation – I’m impressed, and slightly scared for my job. ;)
Yeah, for just a second, I really believed it might be some apologetic message like you concocted explaining to the unwitting user that the experience they’re about to have will be very, very painful! |
Rick Murray (539) 13851 posts |
Oh, but this stuff matters for assembly language programmers! If a person wants to waste megabytes pointlessly, they can go use some hoity-toity scripting nonsense like PHP or VBS. ;-) Here’s a tiny snippet from the current ResFinder that I’m working on:
It will, ultimately, work out to take more bytes in the program to do this three times (12*4*3 = 144 bytes; vs (4+4+4+7)*4 = 76 bytes), but in terms of execution, all three will be 36 instructions with no stack access and no branches, versus 90 instructions with six stack accesses and a whopping 24 branches. I’ve not bothered to calculate timings; it doesn’t seem terribly relevant on a processor clocking over half a GHz. Suffice to say, I think it is quicker to write a short static string directly than to waste time calling a function. Better to use the function for strings that can change, or are long enough that direct insertion would look dumb. Yeah. I know. I have no life. |
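The size figures in the post above can be checked with a few lines of arithmetic. This is only a sketch: it assumes one 4-byte word per ARM instruction, and it reads "(4+4+4+7)" as three 4-instruction call sites plus one 7-instruction shared routine, which is an interpretation rather than something the post spells out. The function names are made up for illustration.

```python
WORD_BYTES = 4  # assumption: 32-bit ARM, one instruction per 4-byte word


def inline_bytes(instructions_per_copy, copies):
    """Bytes used when the string-writing code is repeated inline."""
    return instructions_per_copy * WORD_BYTES * copies


def shared_bytes(call_site_instructions, routine_instructions):
    """Bytes used by several call sites plus one shared routine."""
    return (sum(call_site_instructions) + routine_instructions) * WORD_BYTES


inline_total = inline_bytes(12, 3)          # 12 instructions, used 3 times
shared_total = shared_bytes([4, 4, 4], 7)   # assumed split of (4+4+4+7)
inline_executed = 12 * 3                    # straight-line, no branches

print(inline_total, shared_total, inline_executed)
```

The 90-instruction and 24-branch figures for the function-call version depend on the loop inside the routine, which is not shown in the thread, so they are not reproduced here.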
Rick Murray (539) 13851 posts |
ResFinder (v0.06) is now available. http://www.heyrick.co.uk/software/resfinder/ (39K) Here’s what’s new:
- Removed the “u[language]” method of determining a UTF-8 resource and replaced it with the much tidier “[language].UTF8”.
- Juggled the code, so it looks for languages in UTF-8 then not-UTF-8, so it all works properly if your UTF8 directory contains only the things that need to be converted.
- Tidied up the code and stuff.
Anything else to add? |
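To make the "UTF-8 then not-UTF-8" lookup order from the v0.06 notes concrete, here is a minimal sketch in Python. It is not ResFinder's code (ResFinder is written in assembler); the function name, the `utf8` flag and the choice to pair each language's ".UTF8" directory directly with its plain directory are all assumptions made for illustration. The only RISC OS convention relied on is that a path variable is a comma-separated list of prefixes, each ending in a dot.

```python
def resfinder_style_path(resources_dir, languages, utf8=True):
    """Build a comma-separated RISC OS style search path.

    For each language, the ".UTF8" subdirectory is tried first (only when
    utf8 is true), then the plain directory, so a UTF8 directory may hold
    just the files that needed converting.
    """
    parts = []
    for lang in languages:
        if utf8:
            parts.append(f"{resources_dir}.{lang}.UTF8.")
        parts.append(f"{resources_dir}.{lang}.")
    return ",".join(parts)


# Hypothetical application directory and language list.
print(resfinder_style_path("<App$Dir>.Resources", ["France", "UK"]))
```

Because a file missing from a ".UTF8" directory simply falls through to the next prefix on the path, only the resources that actually contain non-ASCII text need a UTF-8 copy.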
Rick Murray (539) 13851 posts |
I can fix that. http://translationparty.com/#11236194 ;-) If you put in other numbers, you can see other people’s ideas. Here’s one that is Made Of Fail as far as translation goes: [warning – some are NSFW and some make hentai look tame – wander at your own risk] |
WPB (1391) 352 posts |
Good point, well made. The original English looks worryingly like the sort of thing I might actually have to translate. Sometimes the original JA is so weird, I find myself seriously doubting my comprehension! Anyway, so to wander off topic… |
Steve Drain (222) 1620 posts |
I would probably do it your way, but try this for size:
Nor I. ;-) |
Rick Murray (539) 13851 posts |
Steve:
Which…being the tail end of a pathname, is not an assumption we have the luxury of making.
WPB:
Yikes! What sort of translation jobs do you do that “Once upon a time a clown was very scary and scared a village of children so bad the rivers filled with pee” is even remotely similar? To me, that looks like something that might have been written by an eight year old. For those who didn’t click the link, or don’t have browser support for the site, we iterate through many comical translations of that to and from Japanese. The aim is to find something that when translated and back again, is the same. The above sentence never finds “equilibrium” (as they call it) and the attempt gives up after twenty attempts leaving us with “看護師の看護村川おしっこ時間かなり悪い怖い子。”, and the smarter people in the room would take the last sentence in English (“One of the nurses nurse village River pee time pretty bad scary child.”) and set it running again. It finds equilibrium fairly quickly to end up with “Murakawa pee time pretty bad scary child nurse.” [http://translationparty.com/#11238404] It’s a fun site to play with when you are bored. Here’s one I just thought up, with a somewhat interesting end result[1]: http://translationparty.com/#11238415
[1] Come on, where else would you veer from childish pee gags to hardcore existentialism in the space of one sentence? |
WPB (1391) 352 posts |
For my sins, video games and manga/anime (mainly the former).
That’s brilliant! |
Rick Murray (539) 13851 posts |
ResFinder (v0.07) is now available. http://www.heyrick.co.uk/software/resfinder/ (39K) No functional changes. A few tweaks to the code, and it is now fully EUPL (v1.1). You can also download the source as an archive (I think – forgot to make the archive (duh!) so built one under Windows, let me know if it is all messed up[1]), or you can view the source on-line with natty !Zap style colourisation. :-)
[1] Probably won’t have filetypes – so, everything is Text except the licence (PDF) and the MakeFile. |
Steve Drain (222) 1620 posts |
I think we are agreed about the way the Resources directories should be organised for international use, but I would like to offer an alternative approach to setting the necessary path. I argued that this can be done flexibly in an Obey file, but it turned out that this really needed access to the Country or Territory names. I provided those with system code variables, but having moved into this area of programming for the first time, I realised that more could be done. I have written a Utility that creates an international path in a code variable. I have called it Obey$Path, so that you could include these lines in the !Run file:
I doubt whether that name would get registered, but I like the symmetry. ;-) Any alternative, such as Int$Path or Res$Path, would work as well. To be clear, the Utility creating the variable needs to be run once only. In passing, this method also allows for straightforward access to international !Help, a problem which has not yet been tackled here. A !Help Obey file could contain:
and the appropriate Help file would be displayed. As it stands, Obey$Path creates a similar path to ResFinder, but only provides user flexibility through Country and Territory. Country takes precedence and UK is the default. I have kept clear of user-defined system variables such as ResFinder$Fallback or ResFind$LanguagePref and ResFind$LanguageSuff for simplicity, but it is easy to include them. If anyone would like to comment, my efforts are at: http://www.kappa.me.uk/Miscellaneous/International.zip |
Steve Drain (222) 1620 posts |
I have been thinking further about internationalisation and I think we may be barking up the wrong tree. The essential requirement seems to be having resources in a particular Language, not necessarily attached to a Country or a Territory. This would mean resource directories such as English and German rather than UK and Germany. To this end, I have been seeing how this might be done, and I have written a Language Configure plug-in to configure preferred and alternative languages from a list of 90. I also wrote a utility to create a system code variable, Language$Path, along similar lines to the Obey$Path I mentioned above. To ensure this all works, the plug-in has 8 language resources and presents language choices in the currently appropriate language, with defaults as discussed above. So far, so good, but I wrote it for the Toolbox on a RiscPC with RO 4.39. Transferred to a Pi with RO 5.19 I hit a problem. Toolbox_Initialise does not like to be given a path, so I was passing it “LangSetup:$”. All worked fine on the RPC, but I got “Buffer overflow” followed shortly after by the disappearance of the Task Manager icon and a machine hangup on the Pi. No combination of path and names would get round this, although the app worked when given a directory name directly, eg “LangSetup:Resources.English”. The error was the same when setting up a path manually in the !Run file, so it should not be anything to do with Language$Path. First, does the idea of configured languages seem reasonable? Second, can anyone suggest how to get Toolbox_Initialise to accommodate a path, as it does with 4.39? The plug-in can be written, more simply, to just use English resources, but that seems a shame. |
Steve Pampling (1551) 8172 posts |
Not that I have any connection with the specific areas, but a UK keyboard could quite likely be used by someone speaking/typing Welsh[1] so any facility to disconnect language and country is, I believe, a good idea.
[1] I often wonder why people from the west of the UK claim to be happy to be foreign – since it seems Welsh is a rather old word for foreign/other/outsider. |
Steve Pampling (1551) 8172 posts |
Is this perhaps related? https://www.riscosopen.org/forum/forums/5/topics/1754 |
Rick Murray (539) 13851 posts |
I said ages ago that, ideally, the entire thing needs to be taken out, shot, and rewritten. While selecting by country is a quick way to set up a language and a keyboard mapping (plus various Territory preferences), it does not provide flexibility for countries with several official languages (Belgium, Switzerland), nor does it permit selection of a minority language (Breton…). ResFinder was written to work with what we have now, but if you feel up to reinventing the entire internationalisation system (while keeping some sort of backwards compatibility), feel free… ;-) I’ve thought about this a lot, and it is the backwards compatibility that bites. |
WPB (1391) 352 posts |
Steve (Kappa) – sounds very interesting. Please give some consideration to different encodings, as we did above, if you haven’t already. It’s probably sufficient just to assume all resource files are Latin-1 encoded, unless the system alphabet is UTF8, in which case it would be good to point all resource paths to what they would have been anyway, but with “.UTF8” appended. Have you been using ISO 639 codes for language identification, or something similar? |
Steve Drain (222) 1620 posts |
@Steve P I did remember that topic about the directory name, but I do not think it is relevant to this problem. I can reproduce the effect of tagging on a “.” and it only causes a “file not found” type of error, not a crash. One further bit of evidence is that on a couple of occasions I saw a “data abort” at an address in the TaskManager module, but I forgot to note it and I have not seen it since. That would tie in with the disappearance of the Task Manager icon. It may also be relevant that I have the same problem if I use Toolbox v1.71, which is the public release from ROL in 2003. This is relevant to the whole topic of internationalisation, because if the Toolbox cannot use resources in directories on a path, either by language or country/territory with ResFind[er] then we have an insurmountable barrier. |
Steve Drain (222) 1620 posts |
For proof of principle I have been using full language names. I took an official list of recognised languages from Wikipedia, cut out a few I did not recognise and ended up with 90. I numbered them in their English alphabetical order. I also ran this list through Google Translate for the local lists. However, the whole matter of what languages and how to represent them is up for grabs as long as this is seen to be a “good thing” and the Toolbox problem can be resolved. It is a shame that it all works as I planned on RO 4.39, but there is such a problem here. It is not the only Toolbox hiccup that has to be worked around – I could not use my original choice of a ScrollList and had to revert to a StringSet. |
Steve Drain (222) 1620 posts |
I thought about encodings, because some of my language lists, such as Polish, do not survive in Latin1. I am not sure of the onward effects when only a few apps would be happy with Latin2, say. As for UTF8, remember that I suggested the “.UTF8” postfix, so that is accounted for in Language$Path for when such resources become available. ;-) |
Steve Drain (222) 1620 posts |
Well, creating the Language configuration is relatively simple and should be backward compatible as far as you like, as long as apps use it. But I think you mean using the resource directories that already exist. Do you recall that list you made of the different ways that apps designate their resources? I do not think we could accommodate all of them. However, if you contemplate dropping in ResFinder in place of ResFind in existing apps using Country names, you might easily think about editing a !Run file and renaming directories to use Language names. A further thought is that Language$Path could be set up to search, say, both English and UK, French and France. Have you realised that the Toolbox_Initialise problem may also occur with ResFinder? I have not checked. Anyway, there seems to be some support for the idea, so I will make a simple version of the Language plug-in for RO 5 and upload it later, along with the multi-language version for those who have RO 4. ;-) |
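The "search both English and UK, French and France" idea above can be sketched as follows. This is a hypothetical illustration in Python, not the Language$Path utility itself: the language-to-country table, the function name, the "Resources" layout and the English default are all assumptions chosen to match the names used in this thread.

```python
# Hypothetical mapping from language directory names to the older
# country-named directories that existing apps may still ship.
LEGACY_COUNTRY = {"English": "UK", "French": "France", "German": "Germany"}


def language_path(app_dir, preferred, alternatives=(), default="English"):
    """Return a comma-separated search path: each language in order of
    preference, each followed by its legacy country name, ending with
    the default language. Duplicates are dropped."""
    ordered = []
    for lang in (preferred, *alternatives, default):
        for name in (lang, LEGACY_COUNTRY.get(lang)):
            if name and name not in ordered:
                ordered.append(name)
    return ",".join(f"{app_dir}.Resources.{name}." for name in ordered)


print(language_path("<App$Dir>", "French", alternatives=("German",)))
```

With this ordering an app that has only been given a "France" directory is still found for a user who prefers French, which is the backwards-compatibility case discussed above.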
Steve Drain (222) 1620 posts |
I have uploaded something to http://kappa.me.uk/Miscellaneous/LangSetup.zip |
Rick Murray (539) 13851 posts |
Sorry for delays in writing and if this doesn’t make much sense. Sore throat, achy muscles, runny nose…. Please, roll on springtime! I went to bed at 8pm last night, and the same tonight. But, as you can see, sleep erratic does. Anyway:
Only Western European languages, if I remember correctly. That said, and given we mostly speak English, French, German, Dutch around these parts, it isn’t an unreasonable suggestion because the alternative makes things really complicated. If we’re going to try to talk authors into supporting alternative character sets, I think UTF-8 is better than “English in Latin1, and Latin2, and LatinX”. Besides, how do we refer to such things? Better, IMHO, to say the choice is (predominantly (change at your own risk)) Latin1 or UTF-8.
How have you implemented languages? (have not had the computer on, using tablet) Ideally this would be part of an updated TerritoryManager, with a SWI to read a language code, and SWIs to convert this to/from both English name and localised name; though in the resources lookup, we’d probably want to stick to English naming (French vs Française, if for no other reason than accents in localised names wouldn’t survive Latin/UTF changes).
No, especially not if we are to look up by country and/or language. With alternatives. And fallbacks. And…
I will be happy to amend ResFinder to support the process, but it’s the last five words that are the important ones.
Is this not something of a white elephant? I mean, yes, clearly there is a problem. However the toolbox is a part of the OS so once the problem is identified and resolved, it should be fixed from that point onward – with a soft load build available for people using outdated/older versions of RISC OS. Hmm… Time to attempt to sleep again. Night… |
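The TerritoryManager idea in the post above (read a language code, convert it to and from English and localised names, but keep English names for resource directories) could look roughly like this. It is a sketch only: no such SWIs exist in the thread, the function names are invented, and the use of two-letter ISO 639-1 codes is an assumption prompted by the earlier ISO 639 question.

```python
# code -> (English name, localised name); a tiny illustrative table.
LANGUAGES = {
    "en": ("English", "English"),
    "fr": ("French", "Français"),
    "de": ("German", "Deutsch"),
    "nl": ("Dutch", "Nederlands"),
}


def english_name(code):
    """Language code to English name."""
    return LANGUAGES[code][0]


def local_name(code):
    """Language code to the language's own name for itself."""
    return LANGUAGES[code][1]


def code_for(english):
    """English name back to its code (case-insensitive)."""
    for code, (eng, _local) in LANGUAGES.items():
        if eng.lower() == english.lower():
            return code
    raise KeyError(english)


def resource_dir_name(code):
    """Directory names stay ASCII-only English so that they survive
    Latin/UTF-8 alphabet changes."""
    return english_name(code)


print(resource_dir_name("fr"), local_name("fr"), code_for("German"))
```

Keeping the directory name separate from the display name means a configuration window can show the localised name while file lookups never depend on the current alphabet.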