RISC OS Open: Forum: OS_GBPB9 in practice

Jan 15, 2023 9:01am

Ronald (387) 195 posts

Nemo, your link to heebygeeby for this topic is ‘not found’
My goal is to build something along the lines of ‘Enumdir [dir] vdu: [wildcard]’ in C so it can be
merged with bigger unixlib programs.
This PB9.c file builds with default unixlib or mlibscl.
Edit: updated file
This version has no riscosifying or unixifying (native RISC OS naming)
The resulting unix sharedlib version is about 1Kb smaller than my previous usage of unixlib functions.

I found the use of r[3] to be necessary for extracting ‘only’ wildcarded entries.
I went a bit further and added a no match function by printing when r[3]==0
In this situation the entry number needs to be tracked so no processing is done when r[3]==0
and there are no new entries.
At a minimum this happens once at the last entry. (as it is known, it could happen mid fetching)
r[3]=0 ‘could’ indicate the fetch end one loop earlier than r[4]== -1
Maybe that was the temptation to mistakenly use it for that purpose?
So I am checking against r[4] which I hope climbs numerically even if the entries are random.
It appears GPB9 is geared towards ordinary wildcarded entry retrieving though as the issues aren’t
there when checking for r[3]!=0

I am looking for something minimal but would like to hear of any critical errors or improvements.

useage is pb9 [dir] [wildcard] [!]
The plink arg is to do the ‘NOT’ wildcard.

Jan 15, 2023 10:35am

Steve Pampling (1551) 8172 posts

Nemo, your link to heebygeeby for this topic is ‘not found’

If you’re after that specific item, then some re-arrangement of the URL in the link will get you there.

https://nemo20000.keybase.pub becomes https://keybase.pub/nemo20000/

Looking at the style, I’ll make a guess that the DNS entry points at what Citrix describe in the link here as a content switch setup that directs all nemo2000.keybase.pub requests to keybase.pub/nemo2000
Or it should…

Jan 15, 2023 10:48am

Rick Murray (539) 13850 posts

char *entbuf[256];

Haven’t you just declared an array of 256 pointers?

/* should the regs and buf be freed here? */

You free what you specially allocate (malloc, calloc, etc). Variable declarations in procedures are allocated on the stack (unless static), so they’re discarded when the function exits.

So I am checking against r[4] which I hope climbs numerically even if the entries are random.

GBPB 9 is very clearly documented https://www.riscosopen.org/wiki/documentation/show/OS_GBPB%209

It states two things:

R3 can return zero, it’s allowed.
DO NOT make assumptions about R4.

While R4 generally does run in order for FileCore filing systems, it typically does not for everything else.

Jan 15, 2023 10:50am

Ronald (387) 195 posts

becomes https://keybase.pub/nemo20000/

That got it thanks.
Eek that looks different in a few ways.
Bedtime downunder..

Jan 15, 2023 10:56am

Ronald (387) 195 posts

Haven’t you just declared an array of 256 pointers?

Yes that should be just entbuf,
not sure how that snuck back in after I initially typed it. I have been moving stuff a bit.
Edit: I guess the only other way to check for an actual entry while r[3]==0 would be to reset byte 0 of the buffer each time and see if it changes. ‘\0’ should serve the purpose.

Jan 15, 2023 2:01pm

nemo (145) 2556 posts

Ronald reported

your link to heebygeeby

I’m an idiot:

Look at what the idiot did

It’s GIF/JIF all over again. I’ve fixed all the links, and the name of the file.

Jan 15, 2023 2:07pm

nemo (145) 2556 posts

Rick responded

GBPB 9 is very clearly documented

GBPB,8 is also ‘very clearly documented’, but is completely wrong.

Please see the HeebyGeeby ReadMe for the gory details.

“Don’t use GBPB,8” continues to be good advice.

Jan 15, 2023 2:25pm

nemo (145) 2556 posts

Ronald speculated

I guess the only other way to check for an actual entry while r[‍3]==0 would be to reset byte 0 of the buffer each time and see if it changes.

No. R3=0 means there is no result in the buffer, it does not mean the buffer did not get written to.

In any case, it is not impossible for a filing system to support – and therefore report – zero-length leafnames – bit 29 of the Filing System Information Word documents this for an FS (though it’s really unusual – I’m only aware of DeviceFS using it, though FS numbers 1,2 & 4 probably would if they were implemented).

Jan 15, 2023 4:30pm

Rick Murray (539) 13850 posts

“Don’t use GBPB,8” continues to be good advice.

The “buffer”. Oh my god.

Jan 15, 2023 4:39pm

nemo (145) 2556 posts

Oh my god

That GBPB,8 has never been implemented correctly is only of historical interest now, but it ought to be properly documented because it is so wrong.

The only way to be completely safe when using it is to allocate a 1MB buffer, which is… excessive. The pragmatic approach is to use a 256-byte-per-item buffer and to always ask for more than one item, though the implementational error means early termination is still possible. Going by the documentation is neither safe nor guaranteed to do what is asked.

Jan 15, 2023 6:53pm

Ronald (387) 195 posts

R3=0 means there is no result in the buffer, it does not mean the buffer did not get written to.

Means there is no result in the buffer you require (normally)
however I have updated the file to set and observe entbuf[0] and this enables the inverse of
matching to be processed. As Rick pointed out, the previous use of r[4] would fall over eventually.
It is not necessary to implement it as a scripts typical usage would be to collect a set of .c files into
a variable by ‘matching’ in a Makefile.
I suppose it could be useful for removing unwanted files.

Jan 15, 2023 9:45pm

Stuart Swales (8827) 1357 posts

this enables the inverse of matching to be processed

Oh dear god no. It CANNOT be used like that. A filing system may spew nonsense into the buffer if R3out is zero. All you can say is what is documented – there are R3out valid matches returned on a particular pass, you pass R4out as R4in next time round (no guarantee on value) and end looping (after processing any valid entries) when R4out is -1.

Jan 15, 2023 11:43pm

Ronald (387) 195 posts

A filing system may spew nonsense into the buffer if R3out is zero

have you had a look at the file? R3 is set to 1 before gbpb9 is run for each single entry fetch.
The regs structure is in memory and should be able to retain r4 for the next running of gbpb9 surely?
Multiple entries in the buffer separated by ‘\0’ would need further processing, I’m sure there is a way to do
this, So far I have only been with asking for one.

Jan 16, 2023 12:38am

Stuart Swales (8827) 1357 posts

Yes, but you can’t rely on the buffer being filled by GBPB 9 for an entry that isn’t being returned as it’s failed the wildcard match. I think that’s what you’re trying to achieve with the negation?

For doing what it’s meant to do it is rather more efficient – especially for network fs – to set a reasonably large R3 (say buffer size / 16 to accomodate some longer names).

Jan 16, 2023 1:21am

nemo (145) 2556 posts

Ronald said

[wrongness]

The filing system implementation, and FileSwitch, and EVERY VECTOR HANDLER ON GBPBV is entitled to do whatever they like to the whole of your buffer, in the process of deciding what to return to you as the final result.

If R3=0 on exit, there is nothing of value in the buffer, certainly not whatever you put in it before the call.

If R3=1 on exit, there is one result in the buffer and the entirety of the rest of the buffer may have had anything written to it.

When you call GBPB with a buffer you are giving permission for anything in that codepath to write to the buffer. You cannot thereafter complain that something did in fact write to that buffer.

That’s what it’s for.

Jan 16, 2023 3:39am

Ronald (387) 195 posts

Yes, but you can’t rely on the buffer being filled by GBPB 9 for an entry that isn’t being returned as it’s failed the wildcard match. I think that’s what you’re trying to achieve with the negation?

if you compile the file with the ‘else’ removed for the r3==0 section you will see that all the entries go through the buffer regardless of the wildcard. I am not negating anything, just using the fact that the non matched entries are also there for the taking.
Of course it is not the intended use, thats why extra conditions had to be applied. It will be removed for
real installations, I’ll keep a utility with it as a novelty item, see if it fails one day.

if R3=0 on exit, there is nothing of value in the buffer

yes it means there is nothing there that was requested, it does appear to be holding the entry that it is hiding though.
I haven’t had any issue with the buffer doing anything unusual, or not doing what it should be?
The extra tweaks aren’t needed for normal wildcard use, everything else looks fine.

R3 at buffer size/16 gbpb9 should do the job in one step?

Jan 16, 2023 7:10am

Steve Fryatt (216) 2105 posts

yes it means there is nothing there that was requested, it does appear to be holding the entry that it is hiding though. I haven’t had any issue with the buffer doing anything unusual, or not doing what it should be?

…on the filing systems that you’ve tested against.

What a particular FS does or doesn’t do when implementing OS_GBPB 9 is up to whoever wrote the implementation for that FS. It has to meet the API defined in the PRM, but beyond that it can do whatever it likes. Fileswitch for example, treats R4 as a linear sequence of values, but it isn’t defined to be one – which is why things which treat it as if it is (…hello, FilerAction…) behave unexpectedly when it isn’t.

The FSs that you’ve tested your code on write everything to the buffer, and use R3 to indicate whether or not it matched the wildcard filename. I’ve no idea if any do this, but consider a network FS which passes the wildcard to the server and only ever gets back matching objects. You wouldn’t see those held in the buffer while R3 was 0, because the local FS wouldn’t have even seen them.

Jan 16, 2023 10:10am

Ronald (387) 195 posts

I have moved on to using gbpb9 with r3 set at BUFSIZ/16 and to convert the buffer full of null separated strings would be difficult without being able to rely on r3 for the number of entries.
Using a wildcard r4 returns with 60 (total) but r3 returns 9 for the wildcarded entries in the buffer.

One unexpected thing for me was that even though the entries are all retrieved in one use of gbpb9,
it actually has to run a second time before r4 gets set to -1 and it can quit.
Unless I am missing something (flags?) it will always run once more than needed while
waiting for r4 == -1.
Filing systems that cant supply accurate r3 possibly would work with r1 =1 (single entry mode) but are
you going to detect what fs you are reading from in your prog?

Jan 16, 2023 10:24am

Stuart Swales (8827) 1357 posts

R3 is the number of valid entries. They are all at the start of the buffer. Anything else you observe in there should be treated as noise. Filing systems DO supply accurate R3. Many users have used this call successfully since 1987. Or even 1986; I think it was the first addition that I made to OS_GBPB.

Run until you obtain R4 == -1.

Jan 16, 2023 1:31pm

nemo (145) 2556 posts

I can see I will have to expand HeebyGeeby to also randomise the (unused part of) the buffer to avoid… misunderstandings… about the difference between defined behaviour and whatever you happen to have in your particular machine, today, for this FILENAME.

Jan 16, 2023 1:49pm

Stuart Swales (8827) 1357 posts

Please do! Preferably with no zero bytes so any errant access will wander off into the unknown.

Jan 16, 2023 6:48pm

Charles Ferguson (8243) 427 posts

I can see I will have to expand HeebyGeeby to also randomise the (unused part of) the buffer to avoid… misunderstandings… about the difference between defined behaviour and whatever you happen to have in your particular machine, today, for this FILENAME.

I love it when such changes are easy…

This was the change I’ve just added to RISC OS Pyromaniac to make it a configurable, like the changes that I made for the context handle recently (https://github.com/gerph/riscos-gbpb-context-mangling):

charles@laputa ~/projects/RO/pyromaniac (buffer-corruption-experiment↑)> git show
commit c0603e6e7cde614d10c70b044dba89786f033bb8 (HEAD -> buffer-corruption-experiment)
Author: Charles Ferguson <<a href="mailto:gerph@gerph.org">gerph@gerph.org</a>>
Date:   Mon Jan 16 18:40:30 2023 +0000

    Add option for explicitly corrupting the OS_GBPB buffer.
    
    The OS_GBPB buffer can be written to in any size (up to the length
    given) that the filesystem likes when it was called. So let's make
    sure that this happens to provoke problems. The new configuration
    option is specific to OS_GBPB but the corruption code could move
    to its own module.

diff --git a/riscos/filesystem/__init__.py b/riscos/filesystem/__init__.py
index 11daa3e2..744dcc24 100644
--- a/riscos/filesystem/__init__.py
+++ b/riscos/filesystem/__init__.py
@@ -112,6 +112,7 @@ class FilesystemConfig(Configuration):
             'mask_timestamp': riscos.rotime.QuinTimestamp,
             'enumerate_last_chunk': 'enum:efficient,classic',
             'enumerate_context': riscos.contextmangler.ContextManglerName,
+            'enumerate_destroys_buffer': 'regex:no|increment|random|[0-9a-fA-F]{1,2}',
             'info_use_sysdateformat': 'bool',
             'filetype_strict_range': 'bool',
             'close_0_effect': 'enum:error,ignore',
@@ -273,6 +274,20 @@ When this option is enabled it means that:
 
 When enabled, this option matches the behaviour of FileCore file systems
 by generating an error. When disabled, it matches many network file systems.
+""",
+
+            'enumerate_destroys_buffer': """
+Configures the type of buffer corruption applied to the OS_GBPB enumeration
+buffer. The buffers supplied may be writen in part or not at all by the
+filesystem. This configuration ensures that the buffer is written to
+with data to provoke errors.
+
+The corruption options available are:
+
+    * `no`: Do not corrupt.
+    * `increment`: Corrupt with incrementing values.
+    * `random`: Randomise every byte.
+    * <hex>: Write the supplied hex value to every byte.
 """,
     }
     native_boot_option = 'run'
@@ -284,6 +299,7 @@ by generating an error. When disabled, it matches many network file systems.
     info_use_sysdateformat = True
     enumerate_last_chunk = 'efficient'
     enumerate_context = 'identity'
+    enumerate_destroys_buffer = 'no'
     mask_timestamp = None
     native_freespace = 'host'
     filetype_strict_range = False
diff --git a/riscos/swis/fs/osgbpb.py b/riscos/swis/fs/osgbpb.py
index e6420899..ee72e591 100644
--- a/riscos/swis/fs/osgbpb.py
+++ b/riscos/swis/fs/osgbpb.py
@@ -3,6 +3,8 @@
 OS_GBPB implementation.
 """
 
+import random
+
 from riscos import handlers
 import riscos.constants.swis as swis
 from riscos.errors import RISCOSSyntheticError
@@ -18,6 +20,42 @@ handlers.osgbpb.const_object = highfsi
 handlers.osgbpb.const_prefix = "OSGBPB_"
 
 
+def corrupt_buffer(ro, form, data, datalen=None):
+    """
+    Perform a buffer corruption on the memory buffer supplied.
+
+    @param form:    Form the corruption should take:
+                        no: Do not corrupt
+                        increment: Corrupt with incrementing values
+                        random: Randomise every byte
+                        <hex>: Write the supplied hex value to every byte
+    @param data:    Memory buffer to corrupt (may be a memory range)
+    @param datalen: Size of data to corrupt, or None to corrupt the entire memory range.
+    """
+    if form == 'no':
+        return
+    if datalen is None:
+        datalen = data.size
+
+    if form == 'increment':
+        for n in range(datalen):
+            data.write_byte(n & 255, offset=n)
+
+    elif form == 'random':
+        for n in range(datalen):
+            data.write_byte(random.randrange(256), offset=n)
+
+    elif form != 'no':
+        try:
+            value = int(form, 16)
+            if value > 255:
+                raise ValueError("Too big")
+            data.write_bytes(chr(value) * datalen)
+
+        except ValueError:
+            raise RISCOSSyntheticError(ro, "Bad configuration for buffer corruption ('{}' is not valid)".format(value))
+
+
 @handlers.swi.register(swis.OS_GBPB)
 def swi_OS_GBPB(ro, swin, regs):
     """
@@ -165,6 +203,9 @@ def osgbpb_direntries(ro, reason, regs):
     if ro.kernel.filesystem.debug_osfileentries:
         print("Read direntries (GBPB %i) on '%s' context %r, count %i" % (reason, dirname, context, ntoread))
 
+    # Apply the buffer overwriting, to provoke problems.
+    corrupt_buffer(ro, ro.config['filesystem.enumerate_destroys_buffer'], data, datalen)
+
     # FIXME: This doesn't match the RISC OS interface, so tying to the filesystems
     #        will be harder.
     orderedfiles = ro.kernel.filesystem.enumerate(dirname)

<skip test expectation diffs>

diff --git a/testcode/tests-filesystem-enumeration.txt b/testcode/tests-filesystem-enumeration.txt
index e9f38a89..77f1e6af 100644
--- a/testcode/tests-filesystem-enumeration.txt
+++ b/testcode/tests-filesystem-enumeration.txt
@@ -55,3 +55,22 @@ Args: descending
 
 Test: Context: Multiplier
 Args: multiplier
+
+
+Group: Filesystem: Enumeration (corruption)
+Command: $TOOL --config filesystem.native_directory=testdata/fstest --config filesystem.enumerate_destroys_buffer=$ARG1 bin/file_enumerate c2 @
+# They should all enumerate the same order as for ReadDirEntriesCatInfo, without any differences at all.
+Expect: expect/file/enumeratecontext_c_corruption_$ARG1
+Replace: expect/file/enumerate_replacements_ic
+
+Test: Corruption: Increment
+Args: increment
+
+Test: Corruption: Random
+Args: random
+
+Test: Corruption: 0
+Args: 0
+
+Test: Corruption: FF
+Args: FF

Jan 16, 2023 7:35pm

nemo (145) 2556 posts

I’m glad you didn’t post the complicated version.

HeebyGeeby 0.02 now scribbles over the supplied buffer and let that be a lesson to you.

It has also ballooned from 256 bytes to 316 for which I can only apologise.

Jan 17, 2023 12:58pm

Charles Ferguson (8243) 427 posts

I’m glad you didn’t post the complicated version.

I’m sorry. I had assumed that it would be relatively obvious what it was doing, and the explanation made it clear.

I’ll try to explain…

It’s written in the high level language Python, just as most of the RISC OS Pyromaniac system is.
I’ve shown the git commit message and diff, as that’s how the source is managed.
The diff shows the addition of the configuration option which allows the type of corruption to be configured at runtime, and the help message that is included alongside it.
The function corrupt_buffer performs the different types of corruption that were requested, separate from the main implementation of the OS_GBPB functions. It’s separate because I’ll reuse this later for some of the other buffer uses – OS_ReadVarVal is an obvious candidate but there are a bunch of others as well.
The function osgbpb_direntries has been modified to add the call to corrupt the buffer. This function handles the OS_GBPB 9-12 entry points.
And the remaining text file change is the definition of the test code that ensures that the change is exercised in some of the forms.
And as I said, this was relatively easy to do and (I feel) understandable… but each to their own.

So basically… this is the entire change that provides a configuration option to allow the corruption of the enumeration buffer to be changed, and the tests that ensure that it works properly.

Yes, this is longer than the change that you suggest to your module, but that’s because it’s a more expressive and functional change, and is written in a high level language. Size of the target code is pretty much irrelevant – maintainability and functionality is everything.

Jan 17, 2023 4:08pm

nemo (145) 2556 posts

Not everyone’s sense of humour is the same, and that’s ok. Sorry you mistook the subject of my incomprehension.

OS_GBPB9 in practice

Reply

Search forums

Social

ROOL Store

Donate! Why?

RISC OS IPR

Description

Voices

Options

Jan 15, 2023 9:01am Ronald (387) 195 posts	Nemo, your link to heebygeeby for this topic is ‘not found’ My goal is to build something along the lines of ‘Enumdir [dir] vdu: [wildcard]’ in C so it can be merged with bigger unixlib programs. This PB9.c file builds with default unixlib or mlibscl. Edit: updated file This version has no riscosifying or unixifying (native RISC OS naming) The resulting unix sharedlib version is about 1Kb smaller than my previous usage of unixlib functions. I found the use of `r[3]` to be necessary for extracting ‘only’ wildcarded entries. I went a bit further and added a no match function by printing when `r[3]==0` In this situation the entry number needs to be tracked so no processing is done when `r[3]==0` and there are no new entries. At a minimum this happens once at the last entry. (as it is known, it could happen mid fetching) `r[3]=0` ‘could’ indicate the fetch end one loop earlier than `r[4]== -1` Maybe that was the temptation to mistakenly use it for that purpose? So I am checking against `r[4]` which I hope climbs numerically even if the entries are random. It appears GPB9 is geared towards ordinary wildcarded entry retrieving though as the issues aren’t there when checking for `r[3]!=0` I am looking for something minimal but would like to hear of any critical errors or improvements. useage is `pb9 [dir] [wildcard] [!]` The plink arg is to do the ‘NOT’ wildcard.

Jan 15, 2023 10:35am Steve Pampling (1551) 8172 posts	Nemo, your link to heebygeeby for this topic is ‘not found’ If you’re after that specific item, then some re-arrangement of the URL in the link will get you there. https://nemo20000.keybase.pub becomes https://keybase.pub/nemo20000/ Looking at the style, I’ll make a guess that the DNS entry points at what Citrix describe in the link here as a content switch setup that directs all nemo2000.keybase.pub requests to keybase.pub/nemo2000 Or it should…

Jan 15, 2023 10:48am Rick Murray (539) 13850 posts	char entbuf[256]; Haven’t you just declared an array of 256 pointers? / should the regs and buf be freed here? / You free what you specially allocate (malloc, calloc, etc). Variable declarations in procedures are allocated on the stack (unless static), so they’re discarded when the function exits. So I am checking against r[4] which I hope climbs numerically even if the entries are random. GBPB 9 is very clearly documented https://www.riscosopen.org/wiki/documentation/show/OS_GBPB%209 It states two things: R3 can return zero, it’s allowed. DO NOT* make assumptions about R4. While R4 generally does run in order for FileCore filing systems, it typically does not for everything else.

Jan 15, 2023 10:50am Ronald (387) 195 posts	becomes https://keybase.pub/nemo20000/ That got it thanks. Eek that looks different in a few ways. Bedtime downunder..

Jan 15, 2023 10:56am Ronald (387) 195 posts	Haven’t you just declared an array of 256 pointers? Yes that should be just entbuf, not sure how that snuck back in after I initially typed it. I have been moving stuff a bit. Edit: I guess the only other way to check for an actual entry while `r[3]==0` would be to reset byte 0 of the buffer each time and see if it changes. ‘\0’ should serve the purpose.

Jan 15, 2023 2:01pm nemo (145) 2556 posts	Ronald reported your link to heebygeeby I’m an idiot: Look at what the idiot did It’s GIF/JIF all over again. I’ve fixed all the links, and the name of the file.

Jan 15, 2023 2:07pm nemo (145) 2556 posts	Rick responded GBPB 9 is very clearly documented GBPB,8 is also ‘very clearly documented’, but is completely wrong. Please see the HeebyGeeby ReadMe for the gory details. “Don’t use GBPB,8” continues to be good advice.

Jan 15, 2023 2:25pm nemo (145) 2556 posts	Ronald speculated I guess the only other way to check for an actual entry while r[‍3]==0 would be to reset byte 0 of the buffer each time and see if it changes. No. R3=0 means there is no result in the buffer, it does not mean the buffer did not get written to. In any case, it is not impossible for a filing system to support – and therefore report – zero-length leafnames – bit 29 of the Filing System Information Word documents this for an FS (though it’s really unusual – I’m only aware of DeviceFS using it, though FS numbers 1,2 & 4 probably would if they were implemented).

Jan 15, 2023 4:30pm Rick Murray (539) 13850 posts	“Don’t use GBPB,8” continues to be good advice. The “buffer”. Oh my god.

Jan 15, 2023 4:39pm nemo (145) 2556 posts	Oh my god That GBPB,8 has never been implemented correctly is only of historical interest now, but it ought to be properly documented because it is so wrong. The only way to be completely safe when using it is to allocate a 1MB buffer, which is… excessive. The pragmatic approach is to use a 256-byte-per-item buffer and to always ask for more than one item, though the implementational error means early termination is still possible. Going by the documentation is neither safe nor guaranteed to do what is asked.

Jan 15, 2023 6:53pm Ronald (387) 195 posts	R3=0 means there is no result in the buffer, it does not mean the buffer did not get written to. Means there is no result in the buffer you require (normally) however I have updated the file to set and observe `entbuf[0]` and this enables the inverse of matching to be processed. As Rick pointed out, the previous use of `r[4]` would fall over eventually. It is not necessary to implement it as a scripts typical usage would be to collect a set of .c files into a variable by ‘matching’ in a Makefile. I suppose it could be useful for removing unwanted files.

Jan 15, 2023 9:45pm Stuart Swales (8827) 1357 posts	this enables the inverse of matching to be processed Oh dear god no. It CANNOT be used like that. A filing system may spew nonsense into the buffer if R3out is zero. All you can say is what is documented – there are R3out valid matches returned on a particular pass, you pass R4out as R4in next time round (no guarantee on value) and end looping (after processing any valid entries) when R4out is -1.

Jan 15, 2023 11:43pm Ronald (387) 195 posts	A filing system may spew nonsense into the buffer if R3out is zero have you had a look at the file? R3 is set to 1 before gbpb9 is run for each single entry fetch. The regs structure is in memory and should be able to retain r4 for the next running of gbpb9 surely? Multiple entries in the buffer separated by ‘\0’ would need further processing, I’m sure there is a way to do this, So far I have only been with asking for one.

Jan 16, 2023 12:38am Stuart Swales (8827) 1357 posts	Yes, but you can’t rely on the buffer being filled by GBPB 9 for an entry that isn’t being returned as it’s failed the wildcard match. I think that’s what you’re trying to achieve with the negation? For doing what it’s meant to do it is rather more efficient – especially for network fs – to set a reasonably large R3 (say buffer size / 16 to accomodate some longer names).

Jan 16, 2023 1:21am nemo (145) 2556 posts	Ronald said [wrongness] The filing system implementation, and FileSwitch, and EVERY VECTOR HANDLER ON GBPBV is entitled to do whatever they like to the whole of your buffer, in the process of deciding what to return to you as the final result. If R3=0 on exit, there is nothing of value in the buffer, certainly not whatever you put in it before the call. If R3=1 on exit, there is one result in the buffer and the entirety of the rest of the buffer may have had anything written to it. When you call GBPB with a buffer you are giving permission for anything in that codepath to write to the buffer. You cannot thereafter complain that something did in fact write to that buffer. That’s what it’s for.

Jan 16, 2023 3:39am Ronald (387) 195 posts	Yes, but you can’t rely on the buffer being filled by GBPB 9 for an entry that isn’t being returned as it’s failed the wildcard match. I think that’s what you’re trying to achieve with the negation? if you compile the file with the ‘else’ removed for the r3==0 section you will see that all the entries go through the buffer regardless of the wildcard. I am not negating anything, just using the fact that the non matched entries are also there for the taking. Of course it is not the intended use, thats why extra conditions had to be applied. It will be removed for real installations, I’ll keep a utility with it as a novelty item, see if it fails one day. if R3=0 on exit, there is nothing of value in the buffer yes it means there is nothing there that was requested, it does appear to be holding the entry that it is hiding though. I haven’t had any issue with the buffer doing anything unusual, or not doing what it should be? The extra tweaks aren’t needed for normal wildcard use, everything else looks fine. R3 at buffer size/16 gbpb9 should do the job in one step?

Jan 16, 2023 7:10am Steve Fryatt (216) 2105 posts	yes it means there is nothing there that was requested, it does appear to be holding the entry that it is hiding though. I haven’t had any issue with the buffer doing anything unusual, or not doing what it should be? …on the filing systems that you’ve tested against. What a particular FS does or doesn’t do when implementing OS_GBPB 9 is up to whoever wrote the implementation for that FS. It has to meet the API defined in the PRM, but beyond that it can do whatever it likes. Fileswitch for example, treats R4 as a linear sequence of values, but it isn’t defined to be one – which is why things which treat it as if it is (…hello, FilerAction…) behave unexpectedly when it isn’t. The FSs that you’ve tested your code on write everything to the buffer, and use R3 to indicate whether or not it matched the wildcard filename. I’ve no idea if any do this, but consider a network FS which passes the wildcard to the server and only ever gets back matching objects. You wouldn’t see those held in the buffer while R3 was 0, because the local FS wouldn’t have even seen them.

Jan 16, 2023 10:10am Ronald (387) 195 posts	I have moved on to using gbpb9 with r3 set at BUFSIZ/16 and to convert the buffer full of null separated strings would be difficult without being able to rely on r3 for the number of entries. Using a wildcard r4 returns with 60 (total) but r3 returns 9 for the wildcarded entries in the buffer. One unexpected thing for me was that even though the entries are all retrieved in one use of gbpb9, it actually has to run a second time before r4 gets set to -1 and it can quit. Unless I am missing something (flags?) it will always run once more than needed while waiting for r4 == -1. Filing systems that cant supply accurate r3 possibly would work with r1 =1 (single entry mode) but are you going to detect what fs you are reading from in your prog?

Jan 16, 2023 10:24am Stuart Swales (8827) 1357 posts	R3 is the number of valid entries. They are all at the start of the buffer. Anything else you observe in there should be treated as noise. Filing systems DO supply accurate R3. Many users have used this call successfully since 1987. Or even 1986; I think it was the first addition that I made to OS_GBPB. Run until you obtain R4 == -1.

Jan 16, 2023 1:31pm nemo (145) 2556 posts	I can see I will have to expand HeebyGeeby to also randomise the (unused part of) the buffer to avoid… misunderstandings… about the difference between defined behaviour and whatever you happen to have in your particular machine, today, for this FILENAME.

Jan 16, 2023 1:49pm Stuart Swales (8827) 1357 posts	Please do! Preferably with no zero bytes so any errant access will wander off into the unknown.

Jan 16, 2023 6:48pm Charles Ferguson (8243) 427 posts	I can see I will have to expand HeebyGeeby to also randomise the (unused part of) the buffer to avoid… misunderstandings… about the difference between defined behaviour and whatever you happen to have in your particular machine, today, for this FILENAME. I love it when such changes are easy… This was the change I’ve just added to RISC OS Pyromaniac to make it a configurable, like the changes that I made for the context handle recently (https://github.com/gerph/riscos-gbpb-context-mangling): charles@laputa ~/projects/RO/pyromaniac (buffer-corruption-experiment↑)> git show commit c0603e6e7cde614d10c70b044dba89786f033bb8 (HEAD -> buffer-corruption-experiment) Author: Charles Ferguson <<a href="mailto:gerph@gerph.org">gerph@gerph.org</a>> Date: Mon Jan 16 18:40:30 2023 +0000 Add option for explicitly corrupting the OS_GBPB buffer. The OS_GBPB buffer can be written to in any size (up to the length given) that the filesystem likes when it was called. So let's make sure that this happens to provoke problems. The new configuration option is specific to OS_GBPB but the corruption code could move to its own module. diff --git a/riscos/filesystem/__init__.py b/riscos/filesystem/__init__.py index 11daa3e2..744dcc24 100644 --- a/riscos/filesystem/__init__.py +++ b/riscos/filesystem/__init__.py @@ -112,6 +112,7 @@ class FilesystemConfig(Configuration): 'mask_timestamp': riscos.rotime.QuinTimestamp, 'enumerate_last_chunk': 'enum:efficient,classic', 'enumerate_context': riscos.contextmangler.ContextManglerName, + 'enumerate_destroys_buffer': 'regex:no\|increment\|random\|[0-9a-fA-F]{1,2}', 'info_use_sysdateformat': 'bool', 'filetype_strict_range': 'bool', 'close_0_effect': 'enum:error,ignore', @@ -273,6 +274,20 @@ When this option is enabled it means that: When enabled, this option matches the behaviour of FileCore file systems by generating an error. When disabled, it matches many network file systems. +""", + + 'enumerate_destroys_buffer': """ +Configures the type of buffer corruption applied to the OS_GBPB enumeration +buffer. The buffers supplied may be writen in part or not at all by the +filesystem. This configuration ensures that the buffer is written to +with data to provoke errors. + +The corruption options available are: + + * `no`: Do not corrupt. + * `increment`: Corrupt with incrementing values. + * `random`: Randomise every byte. + * <hex>: Write the supplied hex value to every byte. """, } native_boot_option = 'run' @@ -284,6 +299,7 @@ by generating an error. When disabled, it matches many network file systems. info_use_sysdateformat = True enumerate_last_chunk = 'efficient' enumerate_context = 'identity' + enumerate_destroys_buffer = 'no' mask_timestamp = None native_freespace = 'host' filetype_strict_range = False diff --git a/riscos/swis/fs/osgbpb.py b/riscos/swis/fs/osgbpb.py index e6420899..ee72e591 100644 --- a/riscos/swis/fs/osgbpb.py +++ b/riscos/swis/fs/osgbpb.py @@ -3,6 +3,8 @@ OS_GBPB implementation. """ +import random + from riscos import handlers import riscos.constants.swis as swis from riscos.errors import RISCOSSyntheticError @@ -18,6 +20,42 @@ handlers.osgbpb.const_object = highfsi handlers.osgbpb.const_prefix = "OSGBPB_" +def corrupt_buffer(ro, form, data, datalen=None): + """ + Perform a buffer corruption on the memory buffer supplied. + + @param form: Form the corruption should take: + no: Do not corrupt + increment: Corrupt with incrementing values + random: Randomise every byte + <hex>: Write the supplied hex value to every byte + @param data: Memory buffer to corrupt (may be a memory range) + @param datalen: Size of data to corrupt, or None to corrupt the entire memory range. + """ + if form == 'no': + return + if datalen is None: + datalen = data.size + + if form == 'increment': + for n in range(datalen): + data.write_byte(n & 255, offset=n) + + elif form == 'random': + for n in range(datalen): + data.write_byte(random.randrange(256), offset=n) + + elif form != 'no': + try: + value = int(form, 16) + if value > 255: + raise ValueError("Too big") + data.write_bytes(chr(value) * datalen) + + except ValueError: + raise RISCOSSyntheticError(ro, "Bad configuration for buffer corruption ('{}' is not valid)".format(value)) + + @handlers.swi.register(swis.OS_GBPB) def swi_OS_GBPB(ro, swin, regs): """ @@ -165,6 +203,9 @@ def osgbpb_direntries(ro, reason, regs): if ro.kernel.filesystem.debug_osfileentries: print("Read direntries (GBPB %i) on '%s' context %r, count %i" % (reason, dirname, context, ntoread)) + # Apply the buffer overwriting, to provoke problems. + corrupt_buffer(ro, ro.config['filesystem.enumerate_destroys_buffer'], data, datalen) + # FIXME: This doesn't match the RISC OS interface, so tying to the filesystems # will be harder. orderedfiles = ro.kernel.filesystem.enumerate(dirname) <skip test expectation diffs> diff --git a/testcode/tests-filesystem-enumeration.txt b/testcode/tests-filesystem-enumeration.txt index e9f38a89..77f1e6af 100644 --- a/testcode/tests-filesystem-enumeration.txt +++ b/testcode/tests-filesystem-enumeration.txt @@ -55,3 +55,22 @@ Args: descending Test: Context: Multiplier Args: multiplier + + +Group: Filesystem: Enumeration (corruption) +Command: $TOOL --config filesystem.native_directory=testdata/fstest --config filesystem.enumerate_destroys_buffer=$ARG1 bin/file_enumerate c2 @ +# They should all enumerate the same order as for ReadDirEntriesCatInfo, without any differences at all. +Expect: expect/file/enumeratecontext_c_corruption_$ARG1 +Replace: expect/file/enumerate_replacements_ic + +Test: Corruption: Increment +Args: increment + +Test: Corruption: Random +Args: random + +Test: Corruption: 0 +Args: 0 + +Test: Corruption: FF +Args: FF

Jan 16, 2023 7:35pm nemo (145) 2556 posts	I’m glad you didn’t post the complicated version. HeebyGeeby 0.02 now scribbles over the supplied buffer and let that be a lesson to you. It has also ballooned from 256 bytes to 316 for which I can only apologise.

Jan 17, 2023 12:58pm Charles Ferguson (8243) 427 posts	I’m glad you didn’t post the complicated version. I’m sorry. I had assumed that it would be relatively obvious what it was doing, and the explanation made it clear. I’ll try to explain… It’s written in the high level language Python, just as most of the RISC OS Pyromaniac system is. I’ve shown the git commit message and diff, as that’s how the source is managed. The diff shows the addition of the configuration option which allows the type of corruption to be configured at runtime, and the help message that is included alongside it. The function `corrupt_buffer` performs the different types of corruption that were requested, separate from the main implementation of the OS_GBPB functions. It’s separate because I’ll reuse this later for some of the other buffer uses – `OS_ReadVarVal` is an obvious candidate but there are a bunch of others as well. The function `osgbpb_direntries` has been modified to add the call to corrupt the buffer. This function handles the OS_GBPB 9-12 entry points. And the remaining text file change is the definition of the test code that ensures that the change is exercised in some of the forms. And as I said, this was relatively easy to do and (I feel) understandable… but each to their own. So basically… this is the entire change that provides a configuration option to allow the corruption of the enumeration buffer to be changed, and the tests that ensure that it works properly. Yes, this is longer than the change that you suggest to your module, but that’s because it’s a more expressive and functional change, and is written in a high level language. Size of the target code is pretty much irrelevant – maintainability and functionality is everything.

Jan 17, 2023 4:08pm nemo (145) 2556 posts	Not everyone’s sense of humour is the same, and that’s ok. Sorry you mistook the subject of my incomprehension.