RISC OS Open: Forum: Chars

Jul 11, 2016 8:55pm

Steffen Huber (91) 1953 posts

It is easy to scan through the file and determine if it is or is not UTF-8 by looking at the character sequences.

You might be able to do that if there are illegal sequences (i.e. determine that it is not UTF-8), but in the generic case you cannot determine if a file is UTF-8-encoded or a single byte encoding.
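To illustrate the asymmetry described here, a minimal Python sketch follows. The function name `is_valid_utf8` is made up for this example. A failed strict decode proves the data is not UTF-8; a successful one only means it could be.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without error.

    False proves the data is not UTF-8; True only means it might be.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# A lone 0xE9 byte (Latin-1 "é") is an illegal sequence in UTF-8.
print(is_valid_utf8("café".encode("latin-1")))  # False
# The UTF-8 form is legal, but that alone proves nothing about intent.
print(is_valid_utf8("café".encode("utf-8")))    # True
```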

Jul 11, 2016 9:48pm

Rick Murray (539) 13840 posts

You can also determine that a file is UTF-8 if you come across legal sequences.

However, as I said – if the file has no high bit set characters (such as plain English with normal punctuation), you cannot tell any difference between UTF-8 and Latin1. That said, for such a file there is no difference…

Having markers in the file adds complication. What should Edit do with such a thing? Show it? Hide it? Allow it to be edited? Insert it upon saving? The thing I like about Edit is that you see what is there – even if you’re looking at binary files.

Speaking of binary files, the reason I installed Notepad++ on my PC was because I got fed up of various bits of Windows (such as the RTF handler) “deciding” that a binary file was some sort of Unicode and thus displaying the file in bits of random Chinese. At least Notepad++ can be told what sort of file it is, so I see what is there and not what some algorithm thinks ought to be there…

Which is circular. We’re back to Edit. Showing what’s there. It can be useful, looking in a data file, you know. Days when magazine cover discs would unhelpfully provide files in Impression format. I’m an Ovation guy. So I used to dump them into Edit or Zap and just read the content straight out of the file. ;-)

Jul 11, 2016 10:33pm

Frederick Bambrough (1372) 837 posts

Chris,

Once you’ve got Chars running, does it perform as expected (ie bring up the UCS character names

Yes.

Too many nested structures at line 915

Haven’t seen that. I do get File ‘<Chars$Dir>.!Help’ not found at line 1800 on selecting Help from the icon bar menu. Chars exits.

Desktop is using the standard Homerton font, though with an altered theme for the sprites.

Jul 11, 2016 11:03pm

Steffen Huber (91) 1953 posts

You can also determine that a file is UTF-8 if you come across legal sequences.

Every legal UTF-8 sequence is also a legal single byte encoding sequence.

Just witness encoding auto detection in browsers – they often get it wrong, because it is an unsolvable problem.
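A small Python sketch of that claim, assuming three single-byte encodings picked purely for illustration: the two bytes that UTF-8 uses for "é" also decode without error under each of them, and only the resulting text differs.

```python
utf8_bytes = "é".encode("utf-8")  # the two bytes 0xC3 0xA9

# The same bytes are accepted by several single-byte encodings.
# None of these decodes raises an error; they just yield different text.
decodings = {
    name: utf8_bytes.decode(name)
    for name in ("latin-1", "iso8859_2", "iso8859_9")
}

for name, text in decodings.items():
    print(f"{name}: {text!r}")
```

Under Latin-1 the result is the familiar "Ã©" mojibake, which is legal text as far as a decoder is concerned.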

Jul 12, 2016 8:10am

Chris (121) 472 posts

Thanks Frederick.

I do get File ‘<Chars$Dir>.!Help’ not found at line 1800 on selecting Help from the icon bar menu.

OK, I’ve spoken to ROOL who also can’t reproduce the problems with running out of memory, etc. Could you report the results of these commands:

*Show Chars*
*Ex Resources:$.Apps.!Chars
*Show Wimp*

Are you using a standard ROM download from the site, rather than building your own?

Jul 12, 2016 9:37am

Rick Murray (539) 13840 posts

Every legal UTF-8 sequence is also a legal single byte encoding sequence.

While this is correct, you need to keep looking and not just judge based upon the first sequence found. I think if you encounter, say ten UTF sequences and no invalid high bit stuff, you may be able to have confidence in the file being UTF-8. It would surely be a very rare file that wasn’t UTF-8…while only containing valid UTF-8 sequences.
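The "keep looking" idea could be sketched roughly as below. This is only a heuristic under stated assumptions: the name `judge_utf8`, the verdict strings and the default threshold of ten (taken from the "say ten" above) are all illustrative, and the result is a judgement rather than a determination.

```python
def judge_utf8(data: bytes, threshold: int = 10) -> str:
    """Scan the whole buffer and return a confidence label, not a proof."""
    try:
        text = data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # One illegal sequence is enough to rule UTF-8 out.
        return "not UTF-8"

    # Each non-ASCII character came from exactly one multi-byte sequence.
    multibyte = sum(1 for ch in text if ord(ch) > 0x7F)

    if multibyte == 0:
        # No high-bit bytes: UTF-8 and Latin-1 are indistinguishable here.
        return "ASCII only"
    if multibyte >= threshold:
        return "probably UTF-8"
    return "possibly UTF-8"


print(judge_utf8(("é" * 12).encode("utf-8")))  # probably UTF-8
print(judge_utf8("café".encode("utf-8")))      # possibly UTF-8
print(judge_utf8("café".encode("latin-1")))    # not UTF-8
```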

Jul 12, 2016 11:32am

Steffen Huber (91) 1953 posts

While this is correct, you need to keep looking and not just judge based upon the first sequence found. I think if you encounter, say ten UTF sequences and no invalid high bit stuff, you may be able to have confidence in the file being UTF-8. It would surely be a very rare file that wasn’t UTF-8…while only containing valid UTF-8 sequences.

Experience says: no, not rare. Especially if your decision is not only “UTF-8 or ISO-8859-1(5)”, but also includes other single byte encodings.

You can try to make an educated guess. It can be “judged”. But it cannot be determined.

Jul 12, 2016 12:03pm

Rick Murray (539) 13840 posts

But it cannot be determined.

That’s why I said confidence rather than absolute. It’s like science – it only takes one experiment to disprove something, but any number of “proofs” only increase confidence by virtue of the theory not having been disproven. ;-)

Jul 12, 2016 2:22pm

Steffen Huber (91) 1953 posts

That’s why I said confidence rather than absolute.

You said “determine”. That’s why I responded at all. “Determine” is – according to my dictionary – not the same as “guess with some confidence”.

Jul 12, 2016 3:09pm

Paul Sprangers (346) 524 posts

You can try to make an educated guess. It can be “judged”. But it cannot be determined.

But, cough… how does Windows do it then? Firefox, Thunderbird, Word – even the humblest notepad displays Unicode and I never noticed any failure.

Jul 12, 2016 3:42pm

Steffen Huber (91) 1953 posts

But, cough… how does Windows do it then? Firefox, Thunderbird, Word – even the humblest notepad displays Unicode and I never noticed any failure.

You are trying the wrong things :-)

Firefox has no problem if proper HTML is used – after all, specifying the correct encoding is part of “proper HTML”. Now place a plain text file on your server, with a single byte encoding of your choice using high-bit characters. There are very good chances that Firefox “guesses” UTF-8 content.

Thunderbird usually has no problem because modern emails usually carry the correctly specified encoding (or something like “quoted printable”). Give it an email with unspecified encoding, again single byte encoding with high-bit, and watch it fail miserably.

Word always knows which encoding to use because it is either a default (old binary format) or explicitly specified (XML formats).

Bottom line: guessing the encoding is difficult.

Jul 12, 2016 4:39pm

Paul Sprangers (346) 524 posts

Bottom line: guessing the encoding is difficult.

Then only one conclusion seems to be left over: RISC OS should be rewritten so that it expects specified encodings in text files.
(No idea if this makes sense, I finished some glasses of delightful grappa already.)

This also seems to contradict Rick’s statement, which actually was mine too. But again, grappa and all that…

Jul 12, 2016 4:40pm

Frederick Bambrough (1372) 837 posts

Chris

*show Chars*

*Ex Resources:$.Apps.!Chars
Dir. Resources:$.Apps.!Chars Option 02 (Run) 
CSD  Resources:"Unset"
Lib. Resources:"Unset"
URD  Resources:"Unset"
!Help        WR/     Text      10:27:26 09-Jul-2016    5 kbytes
!Run         WR/     Obey      10:27:23 09-Jul-2016  235  bytes

*Show Wimp*
Wimp$IconTheme : Bluberry.
Wimp$Scrap : SDFS::HardDisc0.$.!BOOT.Resources.!Scrap.ScrapDirs.ScrapDir.ScrapFile
Wimp$ScrapDir : SDFS::HardDisc0.$.!BOOT.Resources.!Scrap.ScrapDirs.ScrapDir
Wimp$State : desktop
*

After running Chars I get;

*Show Chars*
Chars$Dir : SDFS::HardDisc0.$.Public
Chars$Path : SDFS::HardDisc0.$.Public.,Resources:$.Resources.Chars.

Public being the dir I’m using for the altered !Run.

Yup, I’m using the standard ROM. I wouldn’t know how to build one!

Jul 12, 2016 5:15pm

Frederick Bambrough (1372) 837 posts

Doh! It eventually occurred to me you want the results after a clean boot and without the changed !Run. Here it is.

*show Chars*
Chars$Dir : Resources:$.Apps.!Chars
Chars$Path : Resources:$.Apps.!Chars.,Resources:$.Resources.Chars.

*Ex Resources:$.Apps.!Chars
Dir. Resources:$.Apps.!Chars Option 02 (Run) 
CSD  Resources:"Unset"
Lib. Resources:"Unset"
URD  Resources:"Unset"
!Help        WR/     Text      10:27:26 09-Jul-2016    5 kbytes
!Run         WR/     Obey      10:27:23 09-Jul-2016  235  bytes

*Show Wimp*
Wimp$IconTheme : Bluberry.
Wimp$Scrap : SDFS::HardDisc0.$.!BOOT.Resources.!Scrap.ScrapDirs.ScrapDir.ScrapFile
Wimp$ScrapDir : SDFS::HardDisc0.$.!BOOT.Resources.!Scrap.ScrapDirs.ScrapDir
Wimp$State : desktop
*

I thought cycling was supposed to improve one’s wits.

Jul 12, 2016 5:16pm

Rick Murray (539) 13840 posts

Now place a plain text file on your server, with a single byte encoding of your choice using high-bit characters. There are very good chances that Firefox “guesses” UTF-8 content.

Yes. It does. And there are often very good reasons why – sniffing the index page of this site (which I note requests two cookies to be set, but doesn’t pop up the obligatory annoying notice ;-) ), the first line is:

Content-Type:	text/html; charset=utf-8

If you serve a text file and your server is set to include that within the HTTP header, then Firefox is only doing what it was told…

I ran into this myself, which is why my site doesn’t specify any encoding in the http header. I used http://web-sniffer.net to look at the headers.
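For anyone wanting to check what a server declares, here is a minimal Python sketch that pulls the charset out of a Content-Type header value like the one quoted above. The helper name `charset_from_content_type` is made up for this example; the parsing itself uses the standard library's `email.message.Message`.

```python
from email.message import Message


def charset_from_content_type(header_value: str):
    """Return the declared charset in lower case, or None if there is none."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=utf-8"))  # utf-8
print(charset_from_content_type("text/plain"))                # None
```

When the result is `None`, the client has been told nothing and is left to guess.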

Give it an email with unspecified encoding, again single byte encoding with high-bit, and watch it fail miserably.

I don’t know about newer versions of Thunderbird. Older ones never seemed to suffer too badly for receiving Latin1 emails from a RISC OS application. It would be the usual stuff (fancy quotes in a different place in CP-1252) but nothing extraordinary.

Given that I sometimes received mangled address labels, with my “é” turned into some gibberish, I’m wondering if this whole problem isn’t being made harder than it ought to be.

Bottom line: guessing the encoding is difficult.

Guessing the encoding with any level of confidence is harder, but then anybody who attempts to determine UTF-8 by looking only at the first sequence found needs a kick in the goolies. There may well be some obscure Polish word in Latin2 that actually contains a valid UTF-8 sequence, so you really need to scan through to find a few sequences to make any sort of judgement.

That said, we are really getting off the topic of how the Wimp can be expected to cater for older applications (by older, I mean “every one thus written”) and Unicode applications? Being in the UTF-8 alphabet is a non-starter as it breaks everything else for non-English users…

Jul 12, 2016 5:38pm

Steve Pampling (1551) 8170 posts

It’s like science – it only takes one experiment to disprove something, but any number of “proofs” only increase confidence by virtue of the theory not having been disproven. ;-)

Ah, the joys of misunderstanding the language, even born and bred English speakers get that one wrong.

In the context given, “proof” is the result of the test and “prove” is “test”, so multiple tests giving the same result do imply the theory is correct but they do not categorically rule any other option out.
BTW, since many people wrongly believe “prove” to mean “demonstrate to be true”, there is often a good debate. They are, however, wrong, no matter how many of them believe otherwise (ref. the old adage about flies).

Jul 12, 2016 6:34pm

Doug Webb (190) 1180 posts

Chris

Here are my results after deleting EasyFonts from the start up menu.

*show Chars*

*ex Resources:$.Apps.!Chars
Dir. Resources:$.Apps.!Chars Option 02 (Run) 
CSD  Resources:"Unset"
Lib. Resources:"Unset"
URD  Resources:"Unset"
!Help        WR/     Text      10:27:26 09-Jul-2016    5 kbytes
!Run         WR/     Obey      10:27:23 09-Jul-2016  235  bytes

*show Wimp*
Wimp$Font : Homerton.Medium
Wimp$IconTheme : PandaLand2.
Wimp$Scrap : SDFS::ARMiniX.$.!BOOT.Resources.!Scrap.ScrapDirs.ScrapDir.ScrapFile
Wimp$ScrapDir : SDFS::ARMiniX.$.!BOOT.Resources.!Scrap.ScrapDirs.ScrapDir
Wimp$State : desktop


Then after attempting to run !Chars

*show Chars*
Chars$Dir : Resources:$.Apps.!Chars
Chars$Path : Resources:$.Apps.!Chars.,Resources:$.Resources.Chars.
*

I would do it in a nice textual way if the help file was any use whatsoever :-)

Jul 12, 2016 7:33pm

Chris (121) 472 posts

Frederick: I’m the one whose wits are slow :) The reason you’re getting the error when selecting Help from the menu is that you’ve moved the !Run file, thus setting Chars$Dir to that directory. I’d forgotten you were doing that in order to get it to run with a larger wimpslot.

So that’s one thing solved. But I’m no closer to understanding why Chars on your/Doug’s system runs out of memory. I suppose it would be useful to know if it’s running as it should on OMAP3/4 ROMs generally, or whether this is something that affects all Beagle/Pandaboards.

Jul 12, 2016 8:12pm

Rick Murray (539) 13840 posts

so multiple tests giving the same result do imply the theory is correct but they do not categorically rule any other option out.

Which is why it was put in quotes. A “proof” (layman’s definition) doesn’t really prove anything other than “here’s one more test that doesn’t disprove the theory”.

Jul 12, 2016 9:30pm

Steffen Huber (91) 1953 posts

Then only one conclusion seems to be left over: RISC OS should be rewritten so that it expects specified encodings in text files.
(No idea if this makes sense, I finished some glasses of delightful grappa already.)

It is the job of whatever application is showing the text file to support different encodings and, if it cannot be determined, let the user choose the correct encoding.

It would be a good idea if the OS would support conversion between different common encodings. Apart from that, the OS should be encoding agnostic. All IMHO of course.
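One way an application might follow that division of labour is sketched below in Python. Everything here is an assumption for illustration: the names `decode_for_display` and `convert`, the order of precedence (declared encoding, then the user's choice, then a guess), and Latin-1 as the last-resort fallback.

```python
from typing import Optional, Tuple


def decode_for_display(
    data: bytes,
    declared: Optional[str] = None,
    user_choice: Optional[str] = None,
) -> Tuple[str, str, str]:
    """Return (text, encoding used, how the encoding was chosen)."""
    if declared is not None:
        # The file or transport said what it is: trust that.
        return data.decode(declared, errors="replace"), declared, "declared"
    if user_choice is not None:
        # The user overrode the guess from the application's menu.
        return data.decode(user_choice, errors="replace"), user_choice, "user"
    try:
        # Only a guess: valid UTF-8 is not proof of UTF-8.
        return data.decode("utf-8"), "utf-8", "guess"
    except UnicodeDecodeError:
        # Latin-1 maps every byte, so this fallback cannot fail.
        return data.decode("latin-1"), "latin-1", "guess"


def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode data from one known encoding to another."""
    return data.decode(source).encode(target)


print(decode_for_display(b"caf\xe9"))
print(convert(b"caf\xe9", "latin-1", "utf-8"))
```

Reporting how the encoding was chosen lets the application flag a guess to the user instead of silently presenting it as fact.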

Jul 12, 2016 10:42pm

Doug Webb (190) 1180 posts

Chris,

I think I know what is the issue and it seems to be related to the number of Fonts in the !Fonts directory in Resources.

I installed a clean !Boot and then rebooted so all the choices were set up as new, then ran !Chars and it worked.

I then reintroduced all of the added Fonts I had in !Fonts and rebooted and tried !Chars and got the failure.

I deleted them gradually, testing each time after a reboot, until I had 23 different font folders in !Fonts at which point !Chars worked.

To ensure it didn’t relate to a particular font, I altered the fonts that made up the 24th entry (though I only tried another 10 different fonts, not all of them), and on each occasion !Chars gave the error.

So it does seem to be related to the number of fonts at least on this set up.

Hope this helps.

Jul 13, 2016 12:12am

Frederick Bambrough (1372) 837 posts

I think I know what is the issue and it seems to be related to the number of Fonts in the !Fonts directory in Resources.

This was easy for me to confirm. I keep two font directories, one for the default fonts (5) and another for fonts I’ve added (68). This made it easy for me to move the second dir out of Resources temporarily and reboot. Result same as Doug’s – Chars works.

Jul 13, 2016 8:06am

Chris (121) 472 posts

I think I know what is the issue and it seems to be related to the number of Fonts in the !Fonts directory in Resources.

In the source in CVS it looks like the code that creates the fontlist, which should grow the wimpslot to accommodate long lists, doesn’t. Not sure why – it used to. I think when I did some tidying of the source for submission I must have had an idiot moment and mangled the code. I’ll take a look at it tonight and should be able to send a fix in.

Apologies for the inconvenience, many thanks for your detective work!

Jul 13, 2016 9:46am

Rick Murray (539) 13840 posts

Apologies for the inconvenience,

That’s okay. That’s why this is not the “stable” release. Think of it as crowd sourced bug bashing. ;-)

While I’m here – is there somebody with a large font collection willing to zip up and mail me a copy? I ought to test Ovation with lots of fonts.

Jul 13, 2016 10:34am

Andrew Conroy (370) 740 posts

While I’m here – is there somebody with a large font collection willing to zip up and mail me a copy? I ought to test Ovation with lots of fonts.

Drop me an email to a.m.conroy (at) owlart.co.uk and I can send you tons of them!

