Forums → General →

Parameters to module commands

18 posts, 4 voices

Jun 6, 2023 2:50pm

Simon Willcocks (1499) 513 posts

Can someone please confirm to me that the command tail passed to a module command has the following structure:

Each parameter is space terminated (including the last).
There is no final control character terminator.
The “number of parameters” is effectively the number of space-to-non-space transitions in the command line (including one at the start of the tail)
If any parameter is to be GSTrans’d by the OS CLI, sequences of spaces will be replaced by single spaces in the string passed to the module
the value of a parameter expanded by GSTrans is not permitted to contain CHR$127 or any spaces or control characters.

Jun 6, 2023 5:00pm

Rick Murray (539) 13840 posts

Each parameter is space terminated (including the last).

No. The parameters are literally the tail of the command line. Warning – there is no difference in behaviour if you put a string in quotes. If you use spaces to separate, you’ll need to track this yourself.
I suspect you’re supposed to use OS_ReadArgs or something…

There is no final control character terminator.

Yes, and it is not fixed. If you start up a module by calling OS_Module from a bit of BASIC, it’ll be null terminated. However if you do an RMLoad/RMRun from the command line, the terminator appears to be &0D, [&0D, [&0D,]] &00. Not sure where it’s getting the CRs from, but it looks like “two or three” is what I’m seeing.

As tabs could be validly in the string, in this case, I’d be inclined to treat CR=NULL=termination (and probably just discard anything else <32 or =127).

The “number of parameters” is effectively the number of space-to-non-space transitions

Nope. Try: *RMRun EntryParms One Two Three Four
Now try *RMRun EntryParms "One Two Three Four"

If any parameter is to be GSTrans’d by the OS CLI,

I tried passing |@ and <32> and it passed those literal. I think the environment string is “anything and everything passed after the module name, exactly as written”.

the value of a parameter expanded by GSTrans is not permitted to contain CHR$127 or any spaces or control characters.

SYS "OS_Module", 1, "EntryParms", "one"+CHR$(127)+"two"
Gave, exactly, “one<delete>two” as the environment string.

Jun 6, 2023 5:05pm

Rick Murray (539) 13840 posts

Here, play with this…

; EntryParms
; ==========
;
; by Rick Murray
;   Version: 0.01
;   Date   : Tuesday, 6th June 2023
;
;   hxxps://www.riscosopen.org/forum/forums/5/topics/17925
;    ^^ tt, but bloody Textile... 


; SWIs (to save including anything)
OS_Write0      * 2
OS_NewLine     * 3
OS_CLI         * 5
OS_Exit        * &11
OS_ConvertHex8 * &D4


; ===========
; Here we go!
; ===========

        AREA     |Module$Code|, CODE, DATA, PIC, A32bit
        ENTRY

entrypoint
        ; Standard RISC OS module header
        DCD      (start - entrypoint)        ; Start code
        DCD      (initialise - entrypoint)   ; Initialise code
        DCD      0                           ; no Finalise code
        DCD      0                           ; no Service call code
        DCD      (titlestring - entrypoint)  ; Module title string
        DCD      (helpstring - entrypoint)   ; Module help string
        DCD      0                           ; no help/command table
        DCD      0                           ; no SWI chunk
        DCD      0                           ; no SWI handler code
        DCD      0                           ; no SWI decoding table
        DCD      0                           ; no SWI decoding code
        DCD      0                           ; no Messages file
        DCD      (thirtytwo - entrypoint)    ; 32bit flag word


titlestring
        =        "EntryParms", 0
        ALIGN


helpstring
        =        "EntryParms", 9, "0.01", " (06 Jun 2023)", 0
        ALIGN


thirtytwo
        DCD      1                           ; Shiny!



; =====================
; MODULE INITIALISATION
; =====================
;
;   R3 = Non-zero for Init, Zero for Start
;   R4 = Envirostring pointer (R0 on entry)
;

initialise
        STR      R14, [R13, #-4]!
        ; Output "I", and set R3 to *non*-zero
        MOV      R4, R10          ; remember envirostring (in R10)
        MOV      R0, R10          ; and fudge it not being in R0 ;)
        SWI      256 + 'I'
        MOV      R3, #1
        B        do_the_rest

start
        ; Output "S", and set R3 to *zero*
        MOV      R4, R0           ; remember envirostring (in R0)
        SWI      256 + 'S'
        MOV      R3, #0
        ; fall through

do_the_rest
        ; Output ="<environment string> (following 'I' or 'S')
        SWI      256 + '='
        SWI      256 + 34 ; "
        SWI      OS_Write0        ; <-- corrupts R0
        SWI      256 + 34 ; "
        SWI      OS_NewLine

        ; Now make the address a hex string
        MOV      R0, R4           ; recall envirostring
        ADR      R1, addr
        MOV      R2, #10
        SWI      OS_ConvertHex8

        ; Output "@ &<address>"
        SWI      256 + '@'
        SWI      256 + ' '
        SWI      256 + '&'
        SWI      OS_Write0

        ; Replace terminull with a space so command works
        MOV      R0, #32
        STRB     R0, [R1]

        ; Now call the command to dump it in byte mode
        ADR      R0, cmnd
        SWI      OS_CLI
        SWI      OS_NewLine

        ; Now, how do we get out of here?
        CMP      R3, #0
        SWIEQ    OS_Exit        ; Start code MUST exit this way

        LDR      PC, [R13], #4  ; otherwise, exit this way

cmnd    ; Becomes: *Memory B <address> +20
        =        "Memory B "
addr    =        "          +20", 0
        ALIGN

When used, it looks like this:

*RMLoad EntryParms This is a LOAD op.
"="This is a LOAD op.
@ &205AB146
Address  :  6  7  8  9  A  B  C  D  E  F  0  1  2  3  4  5 :    ASCII Data
205AB146 : 54 68 69 73 20 69 73 20 61 20 4C 4F 41 44 20 6F : This is a LOAD o
205AB156 : 70 2E 0D 00 00 00 00 00 00 00 00 00 00 00 00 00 : p...............

*RMRun EntryParms This is a RUN op.
"="This is a RUN op.
@ &205AB145
Address  :  5  6  7  8  9  A  B  C  D  E  F  0  1  2  3  4 :    ASCII Data
205AB145 : 54 68 69 73 20 69 73 20 61 20 52 55 4E 20 6F 70 : This is a RUN op
205AB155 : 2E 0D 0D 0D 00 00 00 00 00 00 00 00 00 00 00 00 : ................

"="This is a RUN op.
@ &205AB145
Address  :  5  6  7  8  9  A  B  C  D  E  F  0  1  2  3  4 :    ASCII Data
205AB145 : 54 68 69 73 20 69 73 20 61 20 52 55 4E 20 6F 70 : This is a RUN op
205AB155 : 2E 0D 0D 0D 00 00 00 00 00 00 00 00 00 00 00 00 : ................

*

The tail ‘"’ is at the start of the line (overwriting the I or S) because it’s being called from the command line, and OS_Write0 obeys the carriage return.
Using OS_Module from BASIC doesn’t do this as the string is null terminated.

Jun 6, 2023 7:49pm

Simon Willcocks (1499) 513 posts

Thanks, both of you, but I’m not thinking of RMLoad/RMRun, rather *commands provided by a loaded module. Did I use the wrong terminology?

That code is passed a string in R0 and a count of parameters in R1. The OS will also GSTrans up to 8 of the first 8 parameters, according to bits 8-15 in the info word associated with the command.

Jun 6, 2023 8:44pm

Rick Murray (539) 13840 posts

Well, gee, that’s a whole different entry point. ;)

Mod the code to provide a fake *command and dump out what it gets as input, then you can play with it and see what happens and if there are any weird edge cases to be aware of (like the line termination thing above).

Jun 6, 2023 10:41pm

Simon Willcocks (1499) 513 posts

That’s what I have been doing, and how I came up with the five rules (with a little reading in the CLI code).

I’m pretty sure it’s accurate, but the bug I’ve been chasing showed itself in another area (my GSTrans wasn’t doing the whole job).

I’ll leave it here, on the basis that the best way to get an answer to a question on the internet is to say something wrong!

Jun 7, 2023 10:29am

Alan Adams (2486) 1149 posts

Looking at the results of your research, and Rick’s comments about endings, one test I would try is to use the star command with a long parameter list, then follow it with a shorter list.
Because I’m thinking that as described, there may be no way to detect when the new list ends, and the remains of the previous one start.

Jun 7, 2023 12:57pm

Rick Murray (539) 13840 posts

The number of parameters given?

It’s just a shame that the OS, having already parsed the input command line, didn’t go that I’ve little step further and give a list of words pointing to where each parameter begins. Instead, the OS, having already examined things, requires the module code to do the same all over again. 😪

Jun 7, 2023 1:46pm

Simon Willcocks (1499) 513 posts

There is, Alan, you have to use the count and eliminate strings of spaces. This isn’t good enough, because the OS seems to count quoted strings as a single parameter:

void c_test_command( char const *tail, uint32_t count )
{
  Write0( "Number of parameters: " ); WriteC( '0' + count ); NewLine();
  char const *p = tail;
  while (count > 0) {
    char c;
    do {
      c = *p++;
      WriteC( c );
    } while (c > ' ');
    NewLine();
    count--;
    if (count > 0) {
      while (' ' == *p) { p++; }
    }
  }
  char const *q = tail;
  while (q < p) {
    char c = *q++;
    WriteC( c );
  }
  NewLine();
}

I just tried:

*testcommand2 abc "def " ghi jkl
Number of parameters: 4
abc
“def
"
ghi
abc "def " ghi

That makes it a lot more challenging to parse. It also doesn’t seem to care if there’s a closing quote, which looks like a bug, to me. At least then there seems to be a control character terminator.

void c_test_command( char const *tail, uint32_t count )
{
  Write0( "Number of parameters: " ); WriteC( '0' + count ); NewLine();
  char const *p = tail;
  while (count > 0) {
    char c = *p;
    bool quoted = (c == '"');

    do {
      WriteC( c );
      c = *++p;
    } while ((!quoted && c > ' ') || (quoted && c >= ' ' && c != '"'));

    if (quoted) {
      WriteC( c );
      p++;
      if (c != '"') {
        Write0( "Oops? " ); if (c > ' ') WriteC( c ); NewLine();
      }
    }

    NewLine();
    count--;
    if (count > 0) {
      while (' ' == *p) { p++; }
    }
  }
  char const *q = tail;
  while (q < p) {
    char c = *q++;
    WriteC( c );
  }
  NewLine();
}

*testcommand2 abc "def " ghi jkl
Number of parameters: 4
abc
“def "
ghi
jkl
abc "def " ghi jkl

Jun 7, 2023 6:39pm

Charles Ferguson (8243) 427 posts

Here is the argument parser in RISC OS Pyromaniac…

    def parse_args(self, args):
        """
        Process the arguments to the command to honour the command requirements.

        Raises RISCOSError for any cases where the string violates its constraints.

        @param args:    Arguments string to process for this command

        @return:    tuple of (count, args) where args has been expanded according to
                    the GSTrans map
        """

        gsmap = self.gstransmap

        args = args.lstrip()

        newstring = []
        index = 0
        while args:
            # Find where this parameter ends.
            if gsmap & (1<<index):
                if newstring and args[0] == ' ':
                    newstring.append(' ')

                # String should be GSTrans'd
                offset, arg = self.ro.kernel.gstrans.translate(args,
                                                               use_escapes=True,
                                                               terminate_on_space=True,
                                                               quotes_special=True)
                # It's not allowed to expand to nothing (as this would cause problems for
                # things like a variable that's unset being substituted as a parameter).
                if arg == '':
                    raise RISCOSSyntheticError(self.ro, errors.ErrorNumber_BadParmString, "BadParmString")

                newstring.append(arg)
                args = args[offset:]
                if args:
                    newstring.append(' ')
            else:
                arg = []
                quoted = False
                if args[0] == '"':
                    quoted = True

                while args:
                    arg.append(args[0])
                    args = args[1:]
                    if not args or (args[0] == ' ' and not quoted):
                        break
                    if args[0] == '"':
                        quoted = not quoted

                newstring.append(''.join(arg))

            # Any trailing spaces after this argument are not themselves arguments; we need to consume
            # them into the string without increasing the index.
            while args and args[0] == ' ':
                newstring.append(' ')
                args = args[1:]

            index += 1

        if index < self.minargs or index > self.maxargs:
            raise RISCOSSyntheticError(self.ro, errors.ErrorNumber_Syntax, self.syntax_message())

        return index, ''.join(newstring)

Jun 7, 2023 10:44pm

Simon Willcocks (1499) 513 posts

I have to admit I can’t read python (?) all that well, but I don’t see a difference between self.gstransmap being zero or non-zero. In the former case, modules are passed all the tail, including multiple spaces (rule 4).

In this example, the gstransmap is 0 for testcommand2, 2 for testcommand:

*testcommand2 abc         def
Number of parameters: 2
abc
def
abc         def
*testcommand abc         def
Number of parameters: 2
abc
def
abc def

Jun 8, 2023 7:40am

Charles Ferguson (8243) 427 posts

I have to admit I can’t read python (?) all that well, but I don’t see a difference between self.gstransmap being zero or non-zero.

The gstransmap case performs GSTrans on the string, which replaces your environment variables and does the character escaping (like |L), and terminates on the first space. And then we append the remaining spaces. That might be different to how classic does it, but that’s what I’m doing.

I’ll need to work out how to transcribe your examples into tests but I believe the best way to go express this is essentially with the expectations checks that I’ve been doing so far.

So if I’ve read your example right:

Flags	Input	Nargs	Output
0	⟪abc def⟫	2	⟪abc def⟫
2	⟪abc def⟫	2	⟪abc def⟫

Using ⟪ and ⟫ as the string delimiters because of the way that pre is handled.

That might mean that my code is wrong… If you can give the test cases like the above (or give me the test code that you’re using), it will be relatively easy to check the behaviour is the same.

Jun 8, 2023 12:58pm

Simon Willcocks (1499) 513 posts

Your (very impressively formatted!) table is correct. If the flags byte is non-zero, the OS copies the string, replacing sequences of spaces with a single space, expanding the appropriate parameters using GSTrans, otherwise it just passes in the whole string, counting parameters as anything between spaces or inside quotes (where the spaces are ignored for the purposes of counting).

The expanded parameters are not allowed to be empty, or contain a space, control character, or ‘\x7f’. Otherwise you get “Parameter expansion contains unrecognised characters” (even when it contains none). (See nastycharscan in Kernel/s/Oscli.)

You can’t “cheat” and try something like “|”<expands$to$a$space>|"", either, and putting |" into an expanded string really confuses it.

In these examples, testcommand expands parameter 2, testcommand2 doesn’t expand any.

*testcommand hello "|"" hhh
Number of parameters: 3
hello
" hhh Oops? 

fix
hello " hhh fix

Put an expanded parameter in quotes at the end of the parameter list and everything goes to pot. (Specifically, the count is one too high.)

*testcommand Hello "two words"
Parameter expansion contains unrecognised characters
*testcommand "quoted"
Number of parameters: 1
"quoted"
"quoted"
*testcommand Hello "quoted"
Number of parameters: 3
Hello
quoted
" Oops? 

Hello quoted " 
*testcommand Hello "quoted" fix
Number of parameters: 3
Hello
quoted
fix
Hello quoted fix

As far as I can see, when there’s no GSTrans involved, there’s no problem and everything acts as expected.

*testcommand2 Hello "two words"
Number of parameters: 2
Hello
"two words"
Hello "two words"
*testcommand2 Hello "word"
Number of parameters: 2
Hello
"word"
Hello "word"
*testcommand2 Hello "word" done
Number of parameters: 3
Hello
"word"
done
Hello "word" done

Jun 8, 2023 6:13pm

Charles Ferguson (8243) 427 posts

If you’ve got a good set of examples, then can you give a similar table of flags values and expectations that can be codified. I’ve got a bunch of tests already, but if you understand it well enough to have a number of assertions that can be applied, then we can extend the tests to give more consistent expectations. I’m relatively happy with what I’ve got and I’m not especially bothered by replicating the exact behaviour, unless it matters or there can be a bunch of clear regression tests.

Jun 8, 2023 6:15pm

Charles Ferguson (8243) 427 posts

More specifically, I’m very happy to have an easy (ish) to understand python routine that does the expansion and which keeps things sane, rather than wasting time trying to exactly the same behaviour, since it won’t matter to most things and dragging along oddities isn’t going to help anyone. But if there are definite cases that need to be matched, that can be asserted on, that’s good.

Jun 8, 2023 6:55pm

Simon Willcocks (1499) 513 posts

Well, the existing system seems to be a bit of a mess, but doesn’t give problems in practice. I went down the rabbit hole because the setting of Fon$Prefix (" ") in Font$Path wasn’t working in my system, so I had to look deeper into it. Turned out I’d just missed the “remove the surrounding quotes” function of GSTrans.

Now I have many new problems, instead!

I doubt anyone depends on the number of spaces between parameters. At least I hope not.

Jun 8, 2023 7:53pm

Charles Ferguson (8243) 427 posts

Well that’s why we have tests for this – so that we know what the behaviour is and can see what is affected when we change things.

Jun 12, 2023 10:20pm

Charles Ferguson (8243) 427 posts

Right, as I suggested, I’ve now written tests for this, which you can find in the riscos-tests (https://github.com/gerph/riscos-tests/) repository. If you’ve got any further tests you want to add to that, as usual, I’ll state that you can raise a PR for it, or just suggest them in an issue. The tests I’ve put into the module all pass on RISC OS Pyromaniac (except for the one that reports an error about thee parameter string), whilst on RISC OS Classic, there are two failures due to mismatches reported – for the case where the first string is quoted and GSTrans’d, which has to be a bug.

The whole module command parsing is a bit of a mess when it comes to the GSTrans; I’d not appreciated how screwed up it was until I compared them… I don’t believe it’s usable the way it is, but … hey-ho.

Reply

To post replies, please first log in.

Forums → General →

Parameters to module commands

Reply

Search forums

Social

ROOL Store

Donate! Why?

RISC OS IPR

Description

Voices

Options