Ticket #435 (Open) Tue Feb 14 23:31:40 UTC 2017
LanManFS Filer Error
Reported by: | Richard Coleman (3190) | Severity: | Normal |
Part: | RISC OS: Module | Release: | |
Milestone: | | Status: | Open |
Details by Richard Coleman (3190):
I downloaded a file to my Windows laptop and tried to access it via LanManFS (version 2.56) on my ArmX6.
The file's name contains a top-bit ASCII character (a dot character), which LanManFS changes to an underscore, but when I try to copy the file across I get a "file not found" error.
If I rename the file on the laptop to remove the dot character, it copies across okay.
Changelog:
Modified by Sprow (202) Fri, February 17 2017 - 08:39:13 GMT
Can you be more specific about the dot character?
For example, press F2 on the filename on Windows, copy it to the clipboard, and paste it into Notepad and attach it here.
Or, if the offending file was off the internet, paste a link to the file.
Modified by Richard Coleman (3190) Fri, February 17 2017 - 23:00:47 GMT
I couldn’t see how to attach a file when I logged the ticket but I see the button for that now.
I’d printed the article off the Internet, with the filename being “10 Reasons Ministry Isnt for Wimps • ChurchLeaders.pdf”.
The page is at http://churchleaders.com/smallgroups/small-grou…
and I notice that in the HTML title tag, the dot is the • character.
Hope that helps.
For interest LanMan98 also gets confused by this, but with that the filer display gets corrupted. Robin has confirmed the bug but doesn’t know if it’ll get fixed.
Modified by Richard Coleman (3190) Fri, February 17 2017 - 23:04:33 GMT
That should have said: in the HTML title tag, the dot is the &bull; entity (bullet, Unicode U+2022).
Modified by Sprow (202) Sat, February 18 2017 - 21:21:50 GMT
Hmm, for reasons best known to Microsoft, the • has been sent on the wire as character 7. That’s definitely not valid in a RISC OS filename. Maybe it’s been mapped like http://usefulshortcuts.com/alt-codes/bullet-alt…
Modified by Richard Coleman (3190) Sun, February 19 2017 - 09:25:34 GMT
That explains why I couldn’t find the character when I tried the alt codes: I only tried those above 126.
LanManFS does map character 7 to an underscore in the filer display, so it displays okay there. It’s when you try to copy the file to RISC OS that it comes up with "file not found".
So I can only surmise that LanManFS has forgotten which character it changed to an underscore, and is trying to access the file on Windows with the underscore character rather than character 7, which would explain why it can’t then find the file on the Windows side.
Modified by Jeffrey Lee (213) Sun, February 19 2017 - 16:24:12 GMT
If it’s character 7 then the filename has probably been mapped to a DOS code page.
https://en.wikipedia.org/wiki/Code_page_437
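As a rough illustration of why a bullet could arrive as character 7: in code page 437 the byte values 0x01-0x1F have printable glyphs as well as their control-code meanings, and the glyph at 0x07 is •. The sketch below looks a character up in that glyph table. The function name is made up for this example, and whether Windows really performed this mapping for the file in question is only a guess based on the thread.

```python
# Glyphs that code page 437 displays for byte values 0x01 to 0x1F.
CP437_LOW_GLYPHS = "☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"


def cp437_byte_for(ch):
    """Return the CP437 byte whose on-screen glyph is ch, or None.

    Hypothetical helper for illustration only.
    """
    index = CP437_LOW_GLYPHS.find(ch)
    if index >= 0:
        return index + 1  # table starts at byte 0x01
    try:
        # Python's cp437 codec covers 0x20-0xFF glyphs; it treats
        # 0x00-0x1F as plain control codes, hence the table above.
        return ch.encode("cp437")[0]
    except UnicodeEncodeError:
        return None


print(cp437_byte_for("\u2022"))  # the bullet from the filename
```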
Hopefully some part of the LanMan protocol will indicate/negotiate what encoding scheme is in use.
On the RISC OS side of things, I’m wondering whether there’s any way for LanMan to encode characters it can’t directly translate, similar to percent-encoding in URLs. That way it should be possible to do symmetric conversion between remote and local filenames.
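A minimal sketch of what such a symmetric escaping scheme might look like follows. Everything in it is an assumption for illustration: the escape character `~`, the set of characters passed through unchanged, and the function names are not taken from LanManFS, and the usual DOS-to-RISC OS swap of `.` and `/` is left out to keep it short.

```python
import string

ESC = "~"  # hypothetical escape character, not something LanManFS defines
# Assumed set of characters that are safe on both sides without escaping.
SAFE = set(string.ascii_letters + string.digits + "-_+,")


def to_local(remote_bytes):
    """Encode remote filename bytes as a local name, escaping unsafe bytes."""
    out = []
    for b in remote_bytes:
        ch = chr(b)
        if ch in SAFE:
            out.append(ch)
        else:
            out.append(f"{ESC}{b:02X}")  # e.g. byte 7 becomes ~07
    return "".join(out)


def to_remote(local_name):
    """Decode a local name produced by to_local back to the remote bytes."""
    out = bytearray()
    i = 0
    while i < len(local_name):
        if local_name[i] == ESC:
            out.append(int(local_name[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(local_name[i]))
            i += 1
    return bytes(out)


name = b"Wimps \x07 ChurchLeaders.pdf"
local = to_local(name)
print(local)
print(to_remote(local) == name)
```

Because the escape character is itself escaped, every remote name has exactly one local form and decodes back to the original bytes, which is the symmetric property described above.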
Modified by Sprow (202) Mon, February 20 2017 - 08:07:09 GMT
> Hopefully some part of the LanMan protocol will indicate/negotiate what encoding scheme is in use.
Sadly not. There’s only the CAP_UNICODE flag, which is currently ignored. A quick rummage through Samba shows they have a config file to set the client/server code page number, since there’s no message to negotiate it.
For characters 32-126 and 128-255 LanManFS makes a substitution, but it also has a table of ambiguous substitutions (because RISC OS has more reserved characters than DOS, think %$@*# and so on) which it resolves using a directory enumeration if needed. The problem with • is that characters 0-31 are all mapped to “duff” and no reverse lookup is attempted.
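The failure can be modelled with a toy example. This is not the LanManFS source: the function names are invented, and the only behaviour taken from the thread is that characters 0-31 are shown as an underscore and that ambiguous names can be resolved by enumerating the remote directory. The sketch shows why using the displayed name directly misses the file, and how a reverse lookup over the listing would find it when the match is unique.

```python
def to_display(remote_name):
    """Map a remote name to its displayed form: chars 0-31 become '_'."""
    return "".join("_" if ord(c) < 32 else c for c in remote_name)


def resolve(display_name, remote_listing):
    """Reverse lookup by directory enumeration (hypothetical helper).

    Returns the single remote name that displays as display_name,
    or None if there is no match or the match is ambiguous.
    """
    matches = [r for r in remote_listing if to_display(r) == display_name]
    return matches[0] if len(matches) == 1 else None


listing = ["Wimps \x07 ChurchLeaders.pdf", "notes.txt"]
shown = to_display(listing[0])

print(shown in listing)                        # direct use of the shown name
print(resolve(shown, listing) == listing[0])   # lookup via enumeration
```

If two remote names collapse to the same displayed name, `resolve` returns `None` rather than guessing, which is the case where omitting the entry or escaping it would still be needed.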
It would probably be better to just omit duff filenames so there’s no false hope of being able to see the file there but it’s out of reach! Or switch to Unicode. Or escape it somehow.