
Encodings

106 posts, 18 voices


Apr 15, 2018 9:34am Jeffrey Lee (213) 6048 posts	It’s actually pretty easy to identify text files encoded in UTF-8 because of the way the non-ASCII characters have a particular format to the binary representation which is quite distinctive and quite unlike Latin 1. You have to use heuristics, but it’s pretty reliable. Also, well-formed Unicode text files should start with a Unicode BOM.
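
As an illustration of the heuristic described in the post above (a sketch, not code from the thread): in UTF-8 every non-ASCII character is a lead byte followed by one to three continuation bytes of the form 10xxxxxx, and Latin 1 text with accented letters almost never follows that pattern by accident. The function name `classify_text` is hypothetical, and the check is simplified: it does not reject overlong three- and four-byte forms or surrogates, which a strict decoder would.

```python
def classify_text(data: bytes) -> str:
    """Return 'ascii', 'utf-8' or 'other', judging by byte patterns alone."""
    i, n = 0, len(data)
    saw_multibyte = False
    while i < n:
        b = data[i]
        if b < 0x80:                 # 0xxxxxxx: plain ASCII
            i += 1
            continue
        if 0xC2 <= b <= 0xDF:        # 110xxxxx: lead byte, 1 continuation byte
            need = 1
        elif 0xE0 <= b <= 0xEF:      # 1110xxxx: lead byte, 2 continuation bytes
            need = 2
        elif 0xF0 <= b <= 0xF4:      # 11110xxx: lead byte, 3 continuation bytes
            need = 3
        else:
            return "other"           # stray continuation byte or invalid lead
        tail = data[i + 1 : i + 1 + need]
        # Every continuation byte must look like 10xxxxxx.
        if len(tail) != need or any((t & 0xC0) != 0x80 for t in tail):
            return "other"
        saw_multibyte = True
        i += 1 + need
    return "utf-8" if saw_multibyte else "ascii"
```

For example, `classify_text("café".encode("utf-8"))` gives `'utf-8'`, while the same word encoded as Latin 1 gives `'other'`, because the byte 0xE9 looks like a lead byte but no continuation bytes follow it. A result of `'other'` only says the data is not UTF-8; it does not prove it is Latin 1.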

Apr 15, 2018 1:20pm Matthew Phillips (473) 721 posts	“Also, well-formed Unicode text files should start with a Unicode BOM.” Well, I’d agree if you said it the other way round: a file starting with a Unicode BOM is much more likely to be a Unicode file, but for UTF-8 a BOM is not required. The Wikipedia article gives various reasons for and against the use of a BOM with UTF-8.

Apr 15, 2018 3:34pm nemo (145) 2552 posts	Clive wrote “double acute”. Yes, it’ll always be ‘hungarumlaut’ in my head because that’s its PostScript name. ;-) “Keyboard diagrams…” Oh very nice.

Apr 15, 2018 3:37pm nemo (145) 2552 posts	“a file starting with a Unicode BOM is much more likely to be a Unicode file, but for UTF-8 a BOM is not required” Indeed. BOMs are a screaming hint to ancient systems that can’t be bothered to recognise what they’ve got. They are much more useful for UTF-16 than UTF-8.
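
To make the BOM discussion concrete, here is a minimal sketch of BOM sniffing (not code from the thread; `sniff_bom` is a hypothetical name). It uses the BOM constants from Python's standard `codecs` module. The BOM matters more for UTF-16 because it tells the reader which of the two byte orders is in use, whereas UTF-8 has only one byte order. The UTF-32 marks are tested first because the UTF-32 little-endian BOM begins with the same two bytes as the UTF-16 little-endian one.

```python
import codecs

# Longest marks first, so a UTF-32 LE file is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(data: bytes):
    """Return (encoding name, BOM length), or (None, 0) if no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

A `(None, 0)` result says nothing either way: as noted above, a UTF-8 file is not required to carry a BOM, so a caller would still need to fall back on a content heuristic.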

Apr 15, 2018 4:01pm Clive Semmens (2335) 3276 posts	“it’ll always be ‘hungarumlaut’ in my head because that’s its Postscript name” Oh, I know that. I wasn’t blaming you!!

Jun 20, 2018 1:47pm Glen Walker (2585) 469 posts	“Let’s get a text editor that allows you to specify the encoding on input and output before worrying about characters in filenames!” Sorry for resurrecting an old thread… but I’ve been out of the loop somewhat. I did start work on an editor that could do as you ask (although I was only going to have it load and save between UTF-8 and the Latin1 in RISC OS). I haven’t got very far at all with it, but it’s on a list of things I would like to do with RISC OS at some point! If anyone is interested, progress will likely be posted here when it happens: xsltpro.co.uk/content/thorn
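
The load/save conversion mentioned above can be sketched in a few lines (an illustration only, not taken from the editor linked in the post; `transcode` is a hypothetical helper). One assumption to flag: Python's `latin-1` codec is ISO 8859-1, used here as a stand-in, and the RISC OS Latin1 alphabet may differ from it for some byte values.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes from the src encoding and re-encode them as dst.

    Characters that dst cannot represent are replaced with '?', so the
    conversion is lossy in that direction.
    """
    text = data.decode(src)                   # raises UnicodeDecodeError on bad input
    return text.encode(dst, errors="replace")
```

Going from Latin 1 to UTF-8 never loses anything, since every Latin 1 character exists in Unicode. The reverse direction can: a character such as the euro sign has no ISO 8859-1 code, so an editor would have to warn the user or substitute a placeholder, as this sketch does.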
