French without accents
GavinWraith (26) 1563 posts |
The standard ASCII character set works well enough for English. For other languages searching out top-bit-set characters for all those accents, diacritics and other decorations is a bore, and one cannot be sure what a receiving computer will do with them at the other end of one’s email. I have tried, without much success, to find out if there are any generally accepted practices for trying to write French words when no accents are available. Here follow my own suggestions. The circumflex often denotes a character, usually ‘s’, that dropped out of pronunciation many centuries ago, so the obvious shift here is simply to use the omitted character. But should it be “s’il vous plaict” or “s’il vous plaist”? Maybe it does not matter at this stage. The grave and acute accents could be signified by ‘, prepended or appended: de cette mani’ere, ‘a ton gre’. The cedilla might become an underscore: comme ci comme c_a. Diaeresis could be a hyphen perhaps? Bien qu’il soit de’goustant, est-il ne’anmoins compre’hensible? |
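For anyone who wants to experiment, the scheme suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ascii_french` is made up, the circumflex is always rendered as an appended "s" (the post only says it is "usually" an s), the diaeresis hyphen is placed after the letter (the post does not say where it goes), and a plain ASCII apostrophe stands in for the accent mark.

```python
import unicodedata

# Unicode combining marks for the five French diacritics.
GRAVE, ACUTE, CIRCUMFLEX = "\u0300", "\u0301", "\u0302"
DIAERESIS, CEDILLA = "\u0308", "\u0327"


def ascii_french(text: str) -> str:
    """Rewrite accented French in plain ASCII (hypothetical helper).

    Assumed mapping, loosely following the post:
      grave      -> apostrophe before the letter   (è -> 'e)
      acute      -> apostrophe after the letter    (é -> e')
      circumflex -> an "s" after the letter        (û -> us)
      diaeresis  -> hyphen after the letter        (ë -> e-)
      cedilla    -> underscore after the letter    (ç -> c_)
    """
    out = []
    # Compose first so each accented letter is a single character.
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        base, marks = decomposed[0], decomposed[1:]
        prefix = suffix = ""
        for mark in marks:
            if mark == GRAVE:
                prefix += "'"
            elif mark == ACUTE:
                suffix += "'"
            elif mark == CIRCUMFLEX:
                suffix += "s"
            elif mark == DIAERESIS:
                suffix += "-"
            elif mark == CEDILLA:
                suffix += "_"
            elif not unicodedata.combining(mark):
                # Not a diacritic at all: keep it untouched.
                suffix += mark
            # Any other combining mark is silently dropped.
        out.append(prefix + base + suffix)
    return "".join(out)


print(ascii_french("de cette manière, à ton gré"))  # de cette mani'ere, 'a ton gre'
print(ascii_french("comme ci comme ça"))            # comme ci comme c_a
print(ascii_french("dégoûtant"))                    # de'goustant
```

The blanket circumflex rule is the weak point: it produces "plaist" and "chasteau" happily enough, but it would also turn "sûr" into "susr", so a real tool would need a word list rather than a rule.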
John Williams (567) 768 posts |
What you need, for all such languages, is a desktop utility called Custom Characters by Jonathan Rawle 1997 (RO User?). It translates to top-bit set characters. This gives a programmable window listing such characters for many selectable languages to click-on. You could even invent your own mapping if you were so inclined! I would like to see this more widely available. Informally, you can find my (an) address on CSA, or Archive mail-list. But, as I write in French quite often, it is invaluable! Très utile! Sans doute! |
Steve Pampling (1551) 8170 posts |
I think you may have missed the critical portion, or at the very least misread the end portion: “The standard ASCII character set works well enough for English. For other languages searching out top-bit-set characters for all those accents, diacritics and other decorations is a bore, and one cannot be sure what a receiving computer will do with them at the other end of one’s email.” The net is littered with examples of text composed with what the author recognised as “correct”, which on another system appears as gibberish. Essentially, for French that means the accent-free Old French – which, entertainingly for Gavin, uses additional letters in the words rather than squiggles over the individual letters. :) |
Chris Hall (132) 3554 posts |
Unfortunately the ASCII character set only works well for English where the pound symbol (which is top-bit-set) is recognised correctly – some software, Messenger included, mangles it and turns it into Australian pounds! |
John Williams (567) 768 posts |
point? Because AFAICS, there isn’t one. The real French, IME, often ignore accents in e-mail, and the text remains perfectly comprehensible. Substituting “chasteau” for “château” merely makes it less easy to read, and the “rules” are very indistinct. Gavin does say ‘often’ for his circumflex ‘s’ substitution, for example, but that is definitely not ‘always’. A trivial example: the words “sur” and “sûr” are pronounced exactly the same, and the circumflex is there merely to differentiate their meaning. Attempting to impose a false logic on the use of accents in French is as doomed to failure as attempting to rationalise the assignment of gender to French words. A bit of fun, perhaps, like the rationalisation of English spelling, but no more! |
GavinWraith (26) 1563 posts |
Of course. I was just musing on what might be acceptable in emails containing French words, and wondered if anybody had established, or even suggested, some guidelines. Gender in Indo-European languages is an interesting topic. We think of it as a somewhat archaic feature, to be found in Latin, Greek and Sanskrit. But I believe that genders evolved from a system that discriminated between animate and inanimate; you can see how that might be important. Mistake a rock for a bear, no harm done. The other way round could be tricky. The animate evolved into masculine, the inanimate into neuter, and feminine evolved from neuter plurals. Abstract qualities, represented by goddesses, were expressed as collectives, e.g. in Greek andreia can be bravery or manly deeds. So this gender business is not so ancient after all – just a temporary adaptation during the last three or four millennia. |
Rick Murray (539) 13840 posts |
If there is doubt, just omit the accents. |
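Omitting the accents is also the easiest option to automate. A minimal sketch in Python, assuming the text is already a Unicode string (the name `strip_accents` is made up for this example): decompose each letter into base plus combining marks, then throw the marks away.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop diacritics by removing combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("château"))  # chateau
print(strip_accents("sûr"))      # sur
```

Letters that are not a base plus an accent, such as the ligature in "cœur", pass through unchanged, and the second example shows the price: "sûr" and "sur" become the same word.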
John Williams (567) 768 posts |
Rick has reinforced my practical experience:
I suppose we could have an amusing competition to see if we could devise something which was delightfully ambiguous without the accents – any takers? |
Rick Murray (539) 13840 posts |
English? It’s accent free and lends itself to horrific ambiguity. |
Rick Murray (539) 13840 posts |
And just for fun, I would recommend written Vietnamese for the language with the most accents, often two per letter! Some random copy-pasting from the web to illustrate the point (even works in NetSurf):
|
Peter Howkins (211) 236 posts |
Just use UTF8 encoding, in the real world this is a solved problem. |
Rick Murray (539) 13840 posts |
Do you have any accents in your address? I do, and there are numerous times that I see an accented word (such as “côté”) arrive looking like “cÃ´tÃ©”. Thankfully my name has no accents, and it isn’t common around these parts. Goodness only knows what would happen if I had been called Amélie or Gaëlle. Would post ever arrive?!? UTF-8 solved a good few problems. |
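Garbling of this kind is typically what you get when text is written out as UTF-8 and then read back as a single-byte encoding such as Latin-1. A small Python sketch reproduces it; the helper names `garble` and `repair` are made up, and Latin-1 is only an assumed culprit, since the post does not say which encoding the other end used.

```python
def garble(text: str, wrong_encoding: str = "latin-1") -> str:
    """Encode as UTF-8, then decode with the wrong single-byte encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "latin-1") -> str:
    """Undo garble(), provided no bytes were lost along the way."""
    return garbled.encode(wrong_encoding).decode("utf-8")


print(garble("côté"))          # cÃ´tÃ©
print(repair(garble("côté")))  # côté
```

Each accented letter occupies two bytes in UTF-8, so every one of them turns into two unrelated characters. The damage is reversible only as long as nothing in between has replaced the stray bytes with question marks.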
Steffen Huber (91) 1953 posts |
In theory, yes. However, even Amazon.de manages to get it wrong. Completely wrong. My colleagues and I all have the “Schei? Encoding” t-shirt, because in the real world, data is almost never in the same encoding from start to end of processing. On numerous occasions, we have to deal with a mixture of 1252, UTF-16, UTF-8, 1141 and 273. Now guess which of those encodings is used by the database. “Ah, the Euro sign was not printed correctly! How could that ever happen.” |
Chris Mahoney (1684) 2165 posts |
I’m no stranger to encoding issues either. We have a public Web site, and occasionally have to remind editors not to use macronised vowels (ā, ē, ī, ō, ū) in page names. If you create a page called, say, Maketū, then it gets a URL of oursite.co.nz/maketū/. That appears to work, and you can browse to it… but then you link to it from another page and it falls over because the tool we use to check for 404s doesn’t like the characters. Another “internationalisation” issue that I see relatively frequently is an assumption that language equals location. I have my browser language set to en-GB and frequently get, e.g., prices displayed in pounds, despite having an NZ IP address. |
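One likely reason such a URL trips up tools is that a non-ASCII character cannot travel in a URL path as-is; it is normally sent as percent-encoded UTF-8 bytes, and a tool that compares the two forms naively will not see them as the same page. The following Python sketch shows the two forms using the standard library. It makes no claim about what the site's link checker actually does.

```python
from urllib.parse import quote, unquote

path = "/maketū/"

# Percent-encode the path; "/" is left alone by default.
encoded = quote(path)
print(encoded)           # /maket%C5%AB/

# Decoding gives the readable form back.
print(unquote(encoded))  # /maketū/
```

The letter ū is U+016B, which is the two bytes C5 AB in UTF-8, hence the two percent escapes.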
John Williams (567) 768 posts |
Not to mention the increased risk of becoming pregnant! Would << Pièrre >> or << Raphaël >> not have served as a better example? Or is there something you have not hitherto revealed? Our local posh restaurant is la Pièrre Bleue, or the Blue Peter as we like to call it. Highly recommended! (Sorry – textile is stripping the spaces from my guillemets) |
Rick Murray (539) 13840 posts |
:-P I work with about 140 females and maybe 10 or so males; so when I needed a French name with accents, the ones that came to mind were female: Céline, Éléonore, Adélaïde, Zoé, Noémi, Océane, etc…
That’s because you’re using angle brackets instead of guillemets. Compare << Amélie >> with 《 Amélie 》. |