LanManFS losing directory contents after a while
Martin Avison (27) 1494 posts |
Rick: LMcheck here has now been running on my Pi for 12 hours, enumerating one directory containing 981 files, at intervals from 30 to 90 minutes. No problems so far. Will: I tried a LM directory called ‘Laser Tank Ver 1/60 For RISC OS_files’ and although I did see some ‘directory not found’ errors when trying to rename a file within it from the Filer, LMcheck ran without problems. Another confusion. |
Will Ling (519) 98 posts |
@Martin, the .Newer was a directory inside, so I think it was when it tries to read the files the next level down in that. You might struggle to get files written to the next level to test it though using just lanmanfs. |
Martin Avison (27) 1494 posts |
@Will: with another directory it did fail. Now testing LMcheck with error handling to investigate what weird combination of / and blanks causes this problem! |
Martin Avison (27) 1494 posts |
Re Will’s missing directory problem: Using LanMan98 I created 4 directories, all with names AxByC where x & y were both blank; both forward slashes; blank & /; and / & blank. Inside these directories was a small text file, and a directory containing a small text file. LMcheck enumerated the two blanks, and the two slashes without problem. I do have a Reporter output for this using Colin’s debug module, and I can send it to him if it is of any use. While not obviously related to the missing file problem, it is certainly a bug: LanMan98 works without any problems. |
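As an aside, the four test names described in this post can be generated mechanically. This is a minimal sketch in Python, not part of LMcheck; the helper name `test_directory_names` is hypothetical, and the names are RISC OS leaf names, where '/' is a legal character.

```python
from itertools import product

def test_directory_names(separators=(" ", "/")):
    """Return every name of the form AxByC, with x and y drawn from separators."""
    return [f"A{x}B{y}C" for x, y in product(separators, repeat=2)]

# The four combinations: two blanks, blank & /, / & blank, two slashes.
names = test_directory_names()
for name in names:
    print(repr(name))
```

Generating the names rather than typing them makes it harder to miss one of the mixed blank and slash combinations.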
Martin Avison (27) 1494 posts |
I have just uploaded LMcheck v0.02 which has a few small changes: improvements in error handling; ability to randomly vary the time delay; and Task name identifies initial directory. It is available here. It has been running here on my Pi for 18 hours, enumerating a single directory with 981 files. |
Colin (478) 2433 posts |
Send it; it may reveal something. It may be useful to put Re the missing file problem: it is possible, I suppose, that the debug version of lanmanfs doesn’t have the problem. Jeff doesn’t seem to be able to repeat the problem using it. It won’t be the first time that the addition of debug code fixed a problem. |
Martin Avison (27) 1494 posts |
@Colin: Sent. |
Will Ling (519) 98 posts |
I’m now running LMcheck v0.02 on a Pi and an Iyonix. I’ve also got another Pi serving, as well as my NAS. I’ve not managed to get a failure today. (I’m not using the debug lanmanfs). |
Jeff Blyther (1856) 47 posts |
Success at last Colin, Just like you I was thinking that the debug code was having an effect, hopefully the info will be of some use. |
Martin Avison (27) 1494 posts |
Bit premature, I think! The info I sent Colin was for the secondary missing directories error. There has been no hard evidence so far of the original missing files problem, which seems to be currently hiding? Afterthought: or perhaps you meant you had some success at reproducing the missing files problem? I do hope so. |
Colin (478) 2433 posts |
Yes, Jeff did send me a file. A quick look shows that the last request for filenames has a response from the server containing no filenames and indicating that it has no more to find. Jeff, did you use Martin’s program to trigger the problem? I haven’t had a chance to look at your report yet, Martin. |
Will Ling (519) 98 posts |
Reviewing the *cat listing from my earlier post, I ran Colin’s debug LM to compare, and it’s clear that only the first batch of names is received from the server. The debug log shows that, from *cat, 7 files are fetched in the first batch, matching the 7 that *cat showed when failing. The filer request gets more: 14, then 3. The 14 are the ones I saw in the filer, and the 3 that were missing are what would have been fetched in the second batch. |
Martin Avison (27) 1494 posts |
@Will: when you have some missing files in a *Cat, does LMcheck also have them missing? |
Colin (478) 2433 posts |
does it end like this:
64 is the smallest response possible – indicating no filename data in the response. This bit:
is printed in the debug listing when the server flags the end of search has been reached. The last filename of the previous chunk is used as a key to fetch the next chunk. It may be that the server is not finding this key. If the key was corrupted you’d expect a refresh to solve the problem. |
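The chunked enumeration described in this post, and the suspicion that the server may not be finding the resume key, can be illustrated with a toy model. This is only a sketch in Python under stated assumptions: `fetch_chunk` is a hypothetical stand-in for the server, not the real SMB exchange or LanManFS code, and the `corrupt_key` switch exists purely to show what a key the server cannot match would look like.

```python
def fetch_chunk(server_names, resume_key, chunk_size):
    """Hypothetical server: return (names, end_of_search) for the chunk after resume_key."""
    if resume_key is None:
        start = 0
    elif resume_key in server_names:
        start = server_names.index(resume_key) + 1
    else:
        # Assumed behaviour: an unknown key yields no names and flags end of search.
        return [], True
    chunk = server_names[start:start + chunk_size]
    return chunk, start + chunk_size >= len(server_names)

def enumerate_directory(server_names, chunk_size, corrupt_key=False):
    """Client loop: use the last name of each chunk as the key for the next one."""
    found, key = [], None
    while True:
        chunk, end_of_search = fetch_chunk(server_names, key, chunk_size)
        found.extend(chunk)
        if end_of_search or not chunk:
            return found
        key = chunk[-1] + ("?" if corrupt_key else "")

all_names = [f"file{i:02d}" for i in range(17)]
complete = enumerate_directory(all_names, chunk_size=7)
truncated = enumerate_directory(all_names, chunk_size=7, corrupt_key=True)
print(len(complete), len(truncated))
```

In this model an unmatched key leaves the client with only the first chunk, which resembles the truncated listings reported in the thread, but it does not show that this is the actual cause.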
Will Ling (519) 98 posts |
I can’t say right now I’m afraid as when I captured that cat, that was the last fail I’ve managed to get. |
Jeff Blyther (1856) 47 posts |
Colin, I didn’t use Martin’s prog to trigger the problem. |
Will Ling (519) 98 posts |
Jeff, once it’s gone wrong, does it affect all folders with more than the number of items it cuts off at? That was 14 for me, for the 3 folders I got a screenshot of. |
Jeff Blyther (1856) 47 posts |
Will, yes, once it’s gone wrong, any folder with more than 7ish files in it is only partly displayed. |
Colin (478) 2433 posts |
If anyone would like to try it LanManFS_NameFix/zip should fix the name translation problem Will highlighted. I’ve done some testing with it and found no problems. It may be safer to test it out initially on a test share. This version still has debug output enabled. |
Will Ling (519) 98 posts |
Colin, I’ve taken that for a spin on my Iyonix and it’s looking good, thanks! |
Martin Avison (27) 1494 posts |
Colin, I have also tried NameFix here on my Iyonix with all the directories I had problems with … and now there are no problems with them. I suppose in these sorts of changes the question then is – has it caused any other new problems!? Presumably this was not related to the missing files problem? |
Martin Avison (27) 1494 posts |
My Pi has now run LMcheck on one directory containing 981 files for 48 hours at intervals of 30 to 90 minutes, without any missing files. My conclusion is that my current test environment does not cause the problem! I think that LMcheck now uses the same OS_GBPB parms that the Filer uses to read directories, so suspicions are turning to the directory contents or activity. My latest tests were all file names like ‘P1250nnn/JPG’ where nnn was 001 to 999, and there was no update activity in that directory. I will try on more mixed files. |
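The kind of check described in this post (re-enumerate a directory at a randomly varied interval and look for names that have vanished) can be sketched as follows. This is a Python illustration only, not LMcheck itself: it uses `os.listdir` where LMcheck uses OS_GBPB, and the function names and the `lister` parameter are hypothetical.

```python
import os
import random

def missing_names(path, expected, lister=os.listdir):
    """Return the names in expected that a fresh listing of path no longer shows."""
    return sorted(set(expected) - set(lister(path)))

def next_delay_seconds(min_minutes=30, max_minutes=90, rng=random):
    """Pick a random delay, in seconds, between two enumerations."""
    return rng.uniform(min_minutes, max_minutes) * 60

# Example with a fake listing function, so the sketch runs without a network share.
baseline = ["P1250001/JPG", "P1250002/JPG", "P1250003/JPG"]
lost = missing_names("share", baseline, lister=lambda p: baseline[:2])
print(lost, next_delay_seconds())
```

A real run would take the baseline once, then loop: sleep for `next_delay_seconds()`, call `missing_names`, and report any non-empty result.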
Colin (478) 2433 posts |
The pathname conversion to DOS is quite complicated. The problem was that, in looking for the leaf part of the path, the path would get a RISC OS to DOS conversion applied to it more than once, so ‘A B/C’ would get changed to ‘A B.C’ and then back to ‘A B/C’; it would then try to find ‘A B’ and return No such directory. It would only be a problem if there were ‘contentious characters’ in the path, so a folder ‘AB/C’ was ok but ‘A B/C’ wasn’t. I can’t see that the change will cause problems. I don’t think it’s related to the missing file problem. |
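Why applying the conversion twice undoes it can be shown with a deliberately simplified model. The sketch below, in Python, assumes the translation does nothing more than swap '/' and '.'; the real LanManFS translation code handles more than that, and `riscos_to_dos` is a hypothetical name, not a function from the module.

```python
# Assumed simplification: the name translation only swaps '/' and '.'.
_SWAP = str.maketrans("/.", "./")

def riscos_to_dos(name):
    """Translate a RISC OS leaf name to its DOS form (toy version)."""
    return name.translate(_SWAP)

once = riscos_to_dos("A B/C")   # 'A B.C', the intended DOS name
twice = riscos_to_dos(once)     # 'A B/C', back to the RISC OS form

print(once, "|", twice)
```

Because a swap is its own inverse, a second pass silently restores the untranslated name, so the lookup is made with the wrong form.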
Richard Walker (2090) 431 posts |
Colin, I have given your LanManFS a whirl. It is brilliant. I was able to create and use a directory called ‘Acorn C/C++’. That fixes my problem as reported here https://www.riscosopen.org/forum/forums/11/topics/11694 I did start looking at the code and the xlate file was making my head spin. I was studying the first half, then realised it wasn’t even built because of the long name flag! |
Colin (478) 2433 posts |
Good. I’ve submitted the change to ROOL so hopefully it will be added. |