<notextile>
Date: | 16-Mar-2001 |
Document number: | 2501,846/FS |
Change number: | ECO 4428 |
Master format: | HTML |
Issue: | 3 |
Last release: | 2 |
Associated project: | 310 “STB-400” |
Authors: | A.Hodgkinson |
Status: | Confidential (secret) © Pace Micro Technology plc |
Modified for RISC OS Open Limited Wiki by ADH, 30-May-2007. Adapted for Textile
engine with layout-related changes only, except for removal of redundant external
links.
Software on the STB 400 often needs to find
out whether a given uniform resource locator
(URL [1, 2, 3]) meets some comparison criteria, so
that it can behave differently depending on the URL in use.
For example, a match may activate various extensions in the web browser depending on
the page being fetched or lead to selection of a certain protocol module for video
playback. Check URL provides a central interface through which its clients
may perform this task.
© Pace Micro Technology plc. All trademarks are acknowledged.
This document is aimed at a technical audience intending to incorporate calls
to Check URL in client software.
Check URL matches a given URL against one or many
URL fragments and reports whether or not there
is a match. The fragments can lead to matches based on complete host names or
partial domains, ports, and paths. The fragments can be supplied in a central
configuration file read at module initialisation (in RAM
builds) or given to the module at run-time.
Matches are searched for within an area and only
URL fragments within that area will be checked. This
allows many different applications to use Check URL simultaneously.
It also allows them to share their required fragments within one Check
URL configuration file, though they may dynamically add and remove areas
if they wish.
Client software decides what area titles to use. It is strongly recommended that
area titles are based on the software’s allocated name, an underscore, then any
more specific title that the client may wish to associate. This helps avoid
area namespace collision (for example, VideoControl_ProtocolModules).
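The naming recommendation above can be sketched in C as follows. This is a minimal illustration only: `make_area_title` is a hypothetical helper, not part of Check URL, and the fixed-size output buffer is an assumption of the sketch. It also rejects white space, since area titles read from a configuration file cannot contain any.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "<software>_<specific>" in 'out'.       */
/* Returns false if the result would not fit or contains white space. */
bool make_area_title(char *out, size_t out_size,
                     const char *software, const char *specific)
{
  int written;

  assert(out != NULL && software != NULL && specific != NULL);

  written = snprintf(out, out_size, "%s_%s", software, specific);
  if (written < 0 || (size_t) written >= out_size) return false;

  /* Area titles cannot contain white space */
  if (strpbrk(out, " \t\r\n") != NULL) return false;

  return true;
}
```

Called with "VideoControl" and "ProtocolModules", the helper produces the area title used in the example above.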
There are no outstanding issues at present.
Fragment and area descriptions can be provided in a central configuration file
as well as dynamically. The configuration file is read from
Choices:CheckURL at module initialisation in RAM builds
or just before the first Check URL SWI is invoked after
initialisation in ROM builds. The file is plain text; LF or CR
are taken as line endings, and blank lines are ignored (so DOS
style text files will work fine). All other white space is treated as a field
separator. The body of each area within the file is based on two fields; one
describes hosts, domains, or partial URLs (without any fetch
scheme specified) and the other gives a parameter string.
If a hash character is encountered at the start of a new line, the line
is treated as a comment and its contents are ignored, save for scanning for a
subsequent LF or CR character marking the start of the next line. Otherwise, the
data is treated as the host, domain or URL fragment.
After any amount of white space not including CR or LF, Check URL
expects to see a parameter string. This is any combination of characters
terminated by CR or LF. When clients find a match, this string (NUL terminated) is
given back as part of the match details. Clients can thus encode any information
they need statically associated with the match fragment in this parameter string.
If a parameter string is not seen, the data read thus far is assumed to be an area
name instead of a URL fragment. Consequently area titles cannot
contain white space. If you do not want to associate any parameter string with a
URL, just include (say) a single non-white space character as a place holder; for
example, a single hyphen.
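The line rules described above can be illustrated with a small classifier. This is a sketch under stated assumptions rather than the module's parser: `classify_line` and `line_kind` are hypothetical names, only space and tab are treated as field separators, leading white space is skipped, and fixed-size buffers (with silent truncation) stand in for the module's memory-limited strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { LINE_BLANK, LINE_COMMENT, LINE_AREA, LINE_FRAGMENT } line_kind;

/* Copy 'len' characters into 'dst', truncating to fit, NUL terminated */
static void copy_span(char *dst, size_t dst_size, const char *src, size_t len)
{
  if (len >= dst_size) len = dst_size - 1;
  memcpy(dst, src, len);
  dst[len] = '\0';
}

/* Classify one line; 'field' receives the area name or fragment and */
/* 'param' receives the parameter string, if there is one.           */
line_kind classify_line(const char *line,
                        char *field, size_t field_size,
                        char *param, size_t param_size)
{
  size_t len;

  assert(line != NULL && field != NULL && param != NULL);
  assert(field_size > 0 && param_size > 0);

  field[0] = param[0] = '\0';

  /* A hash at the very start of the line marks a comment */
  if (*line == '#') return LINE_COMMENT;

  /* Assumption: leading spaces and tabs are skipped */
  line += strspn(line, " \t");
  len = strcspn(line, " \t\r\n");
  if (len == 0) return LINE_BLANK;
  copy_span(field, field_size, line, len);

  /* Skip the separator; anything up to CR or LF is the parameter */
  line += len;
  line += strspn(line, " \t");
  len = strcspn(line, "\r\n");
  if (len == 0) return LINE_AREA; /* No parameter: an area name */

  copy_span(param, param_size, line, len);
  return LINE_FRAGMENT;
}
```

A line such as "webpool.isp.com -" is therefore a fragment whose parameter is the single hyphen place holder, while "VideoControl_ProtocolModules" on its own is an area name.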
The number of fragments, the length of those fragments and the length of their
parameter strings is limited by available memory only. The length of an area title
is also limited by available memory only. The internal usage of area IDs limits
the number of areas to 2^24 (16,777,216), memory permitting.
Within each area, matches are carried out from the bottom up. Put the most specific
matches, if required, last, and the most general matches first. For example, you may
wish to match a path of /this/specific/path/ for one thing, otherwise match
/this/ for everything else – in that case put the more general rule first in
the file.
All string matches, without exception, are case-sensitive.
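The effect of bottom-up ordering can be seen in a short sketch. The matching test here is an assumption: a plain case-sensitive path prefix comparison stands in for the module's real host, domain, port and path matching, and `rule` and `match_bottom_up` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct
{
  const char *fragment; /* Path fragment, in file order */
  const char *param;    /* Associated parameter string  */
}
rule;

/* Case-sensitive prefix test, standing in for the real matching */
static int path_has_prefix(const char *path, const char *prefix)
{
  return strncmp(path, prefix, strlen(prefix)) == 0;
}

/* Scan from the last rule to the first; the first hit wins */
const char *match_bottom_up(const rule *rules, size_t count, const char *path)
{
  size_t i;

  assert(path != NULL && (rules != NULL || count == 0));

  for (i = count; i-- > 0; )
  {
    if (path_has_prefix(path, rules[i].fragment)) return rules[i].param;
  }

  return NULL;
}
```

With /this/ listed first and /this/specific/path/ listed last, a URL under /this/specific/path/ reaches the specific rule before the general one, and a path of /THIS/ matches neither because comparisons are case-sensitive.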
Whenever an attempt to open a file in Check URL fails when a SWI
requires it, error &818601 is generated. This isn’t raised if the
central configuration file can’t be opened at module initialisation. If the format of any
file appears to be invalid, error &818602 is generated. This
does include the central configuration file.
# Video Control protocol module selection.<notextile>
VideoControl_ProtocolModules
jupiter.eng.acorn.co.uk 53580
.eng.acorn.co.uk/testvideos/ 53540
.eng.acorn.co.uk/testvideos/2/ 535C0
# JavaScript video control extension security.
NCFresco_JavaScript_VideoSecurity
webpool.isp.com -
users.isp.com -
CheckURL_Check | (SWI &54140) |
See if a URL matches any fragments.
On entry
R0 | = | Flags: bit 0 set if R1 holds an area ID, else clear; bit 1 set if R2 points to a URL descriptor block, else clear |
R1 | = | Pointer to a NUL-terminated area name string if R0:0 clear, else an area ID |
R2 | = | Pointer to a NUL-terminated URL string if R0:1 clear, else pointer to a URL descriptor block |
On exit
R0 | = | Flags: bit 0 set if a match was found, else clear |
R1 | = | Pointer to the parameter string associated with the matched fragment, NUL terminated, if R0:0 on exit is set, else preserved |
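As an illustration of the R0 conventions for CheckURL_Check, the sketch below builds the entry flags and tests the exit flags. The `SKETCH_` constants and both function names are hypothetical; only the bit positions (R0:0 and R0:1 on entry, R0:0 on exit) come from the register descriptions. Real clients would use the definitions exported in CheckURL.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; bit positions follow the register descriptions */
#define SKETCH_CHECK_GIVEN_AREA_ID        (1u << 0) /* Entry: R1 is an area ID          */
#define SKETCH_CHECK_GIVEN_URL_DESCRIPTOR (1u << 1) /* Entry: R2 is a descriptor block  */
#define SKETCH_CHECK_MATCH_FOUND          (1u << 0) /* Exit: R1 holds the parameter     */

/* Build the value to pass in R0 */
unsigned int check_entry_flags(bool have_area_id, bool have_descriptor)
{
  unsigned int flags = 0;

  if (have_area_id)    flags |= SKETCH_CHECK_GIVEN_AREA_ID;
  if (have_descriptor) flags |= SKETCH_CHECK_GIVEN_URL_DESCRIPTOR;

  /* Only the two bits described above are ever set by this sketch */
  assert((flags & ~3u) == 0);

  return flags;
}

/* Decide whether R1 on exit may be read as a parameter string pointer */
bool check_found_match(unsigned int exit_flags)
{
  return (exit_flags & SKETCH_CHECK_MATCH_FOUND) != 0;
}
```

Because R1 is preserved when no match is found, a client should consult the exit flags before treating R1 as a string pointer.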
CheckURL_ReadAreaID | (SWI &54141) |
Find out the quick reference ID for a given area name string or vice versa.
On entry
R0 | = | Flags: bit 0 set if R1 holds an area ID, else clear |
R1 | = | Pointer to a NUL-terminated area name string if R0:0 clear, else an area ID |
On exit
R0 | = | Flags: all bits currently reserved (must be zero) |
R1 | = | An area ID if R0:0 on entry was clear, else pointer to a NUL-terminated area name string |
CheckURL_ReadFile | (SWI &54142) |
Read a new configuration file.
On entry
R0 | = | Flags: all bits currently reserved (must be zero) |
R1 | = | Pointer to NUL-terminated filename string |
If, for example, the following area has already been loaded:
VideoControl_ProtocolModules
/ 53580
.eng.acorn.co.uk/testvideos/ 53540
and a client attempts to load this file:
VideoControl_ProtocolModules
/ 53A00
/multicast 53540
the result is the same as loading the following in one go:
VideoControl_ProtocolModules
/ 53580
.eng.acorn.co.uk/testvideos/ 53540
/ 53A00
/multicast 53540
So with bottom-up matching, the / entry for 53580 would never get matched. Adding new areas or adding new fragments to existing areas does not alter the validity of area IDs.
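The append behaviour can be modelled with a simple list. This sketch is not the module's data structure: `area`, `entry`, `area_append` and `winning_param` are hypothetical, and the fixed capacity is an assumption. It shows why, once a second / fragment is appended, a bottom-up scan reaches the newer entry first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16 /* Arbitrary limit for the sketch only */

typedef struct { const char *fragment; const char *param; } entry;
typedef struct { entry entries[MAX_ENTRIES]; size_t count; } area;

/* Append new entries after the existing ones; returns 0 if they don't fit */
int area_append(area *a, const entry *extra, size_t n)
{
  size_t i;

  assert(a != NULL && (extra != NULL || n == 0));

  if (n > MAX_ENTRIES - a->count) return 0;
  for (i = 0; i < n; i++) a->entries[a->count++] = extra[i];

  return 1;
}

/* Scanning bottom-up, return the parameter of the first entry whose */
/* fragment text is identical to 'fragment', or NULL if none is.     */
const char *winning_param(const area *a, const char *fragment)
{
  size_t i;

  assert(a != NULL && fragment != NULL);

  for (i = a->count; i-- > 0; )
  {
    if (strcmp(a->entries[i].fragment, fragment) == 0) return a->entries[i].param;
  }

  return NULL;
}
```

After appending the second file's entries to the first, looking up / yields 53A00; the original 53580 entry is still present at the top of the list but is shadowed.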
CheckURL_AddArea | (SWI &54143) |
Add a new area, or a fragment to an existing area.
On entry
R0 | = | Flags: bit 0 set if R1 holds an area ID, else clear; bit 1 set if R2 points to a filename string, else clear |
R1 | = | Pointer to a NUL-terminated area name string if R0:0 clear, else an area ID |
R2 | = | Pointer to a NUL-terminated set of CR or LF separated fragment and parameter pairs if R0:1 clear, else pointer to a NUL-terminated filename string; zero if no fragments are to be added to the area at this time |
On exit
R1 | = | Area ID of the (possibly new) area in use |
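Before passing a string of fragment and parameter pairs in R2, a client may wish to check that every line has both fields. The following is a sketch only: `count_fragment_pairs` is a hypothetical client-side helper, it treats only space and tab as field separators, and it is not a description of the validation Check URL itself performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool is_eol(char c) { return c == '\r' || c == '\n'; }
static bool is_sep(char c) { return c == ' '  || c == '\t'; }

/* Count the fragment and parameter pairs in a CR or LF separated */
/* string. Returns -1 if any non-blank line lacks a parameter.    */
int count_fragment_pairs(const char *text)
{
  int pairs = 0;

  assert(text != NULL);

  while (*text != '\0')
  {
    /* Skip line endings, which also skips blank lines */
    if (is_eol(*text)) { text++; continue; }

    /* Fragment: runs up to the first separator or line end */
    while (*text != '\0' && !is_eol(*text) && !is_sep(*text)) text++;

    /* Separator: spaces and tabs */
    while (is_sep(*text)) text++;

    /* Parameter: must be present, and runs to the line end */
    if (*text == '\0' || is_eol(*text)) return -1;
    while (*text != '\0' && !is_eol(*text)) text++;

    pairs++;
  }

  return pairs;
}
```

A string such as "/ 53A00" followed by LF and "/multicast 53540" holds two pairs, whereas "/multicast" alone is missing its parameter.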
CheckURL_DeleteArea | (SWI &54144) |
Remove one or all areas.
On entry
R0 | = | Flags: bit 0 set to remove all areas, else clear; bit 1 set if R1 holds an area ID, else clear |
R1 | = | If R0:0 set, ignored. If R0:0 and R0:1 clear, pointer to a NUL-terminated area name string. If R0:0 clear and R0:1 set, an area ID |
The Check URL module is allocated one error block at &818600:
Error no. | Meaning |
---|---|
&818600 | Area not known. A client has passed an unknown area name string or ID to SWI CheckURL_Check, CheckURL_ReadAreaID, or CheckURL_DeleteArea. |
&818601 | Cannot open configuration file. An attempt to open a configuration file has failed. This is only raised in response to any SWI that calls for a file to be read. |
&818602 | Invalid configuration file. Raised whenever any file read by Check URL is of an apparently invalid format. This includes finding an area name field if a file is read in SWI CheckURL_AddArea. |
&818603 | Invalid fragments. The URL fragments and parameters string given to SWI CheckURL_AddArea is of an apparently invalid format (for example, a fragment may be missing a parameter). |
&818604 | Check URL could not claim enough memory. Memory was exhausted during some allocation operation being performed by Check URL. |
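A client that wants to log these errors by number could keep a small lookup table. The sketch below is illustrative: `checkurl_error_text` and `SKETCH_ERROR_BASE` are hypothetical names, and the strings are simply the meanings listed in the table. In practice the error block returned by the SWI already carries a message.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ERROR_BASE 0x818600u /* Start of the Check URL error block */

/* Return the documented meaning for an error number, or NULL */
/* if the number lies outside &818600 to &818604.             */
const char *checkurl_error_text(unsigned int errnum)
{
  static const char *const text[] =
  {
    "Area not known",                          /* &818600 */
    "Cannot open configuration file",          /* &818601 */
    "Invalid configuration file",              /* &818602 */
    "Invalid fragments",                       /* &818603 */
    "Check URL could not claim enough memory"  /* &818604 */
  };
  const size_t count = sizeof text / sizeof text[0];

  /* Keep the table in step with the five documented errors */
  assert(count == 5);

  if (errnum < SKETCH_ERROR_BASE)          return NULL;
  if (errnum - SKETCH_ERROR_BASE >= count) return NULL;

  return text[errnum - SKETCH_ERROR_BASE];
}
```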
For URL canonicalisation, Check URL will call
the URL_ParseURL SWI and thus URL Fetcher 0.43
or later must be present.
Final code size of the version described by this document should be inside
24K. Memory claimed will depend on the number of areas and their fragments
and parameters, and the lengths of the strings involved. No memory will be
claimed at run-time if all areas are deleted. Since area IDs work as indices
into an array of pointers, the array itself may remain at a high water mark
size if an area sits at the top of the array even after all others are
deleted; however, once that area is removed the whole array will be freed.
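The high water mark behaviour of the pointer array can be modelled as follows. This is a sketch under assumptions, not the module's implementation: `area_table`, `table_add` and `table_delete` are hypothetical, IDs are taken to be the array index plus one so that zero can mean "no ID", and new areas are always placed at the top of the array.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct
{
  void   **slots; /* slots[id - 1] points at an area, or is NULL */
  size_t   size;  /* Number of slots currently in use            */
}
area_table;

/* Add an area at the top of the array; returns its ID, or 0 on failure */
unsigned int table_add(area_table *t, void *area)
{
  void **grown;

  assert(t != NULL && area != NULL);

  grown = realloc(t->slots, (t->size + 1) * sizeof *grown);
  if (grown == NULL) return 0;

  t->slots          = grown;
  t->slots[t->size] = area;

  return (unsigned int) ++t->size;
}

/* Delete an area; the array only shrinks once the top slot is empty */
void table_delete(area_table *t, unsigned int id)
{
  assert(t != NULL && id >= 1 && id <= t->size);

  t->slots[id - 1] = NULL;

  /* Trim empty slots from the top down */
  while (t->size > 0 && t->slots[t->size - 1] == NULL) t->size--;

  /* Nothing left: release the array itself */
  if (t->size == 0)
  {
    free(t->slots);
    t->slots = NULL;
  }
}
```

With three areas added, deleting the first two leaves the array at three slots because the third still occupies the top; deleting the third then frees the whole array.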
The SWI interface will be tested to ensure it performs as
documented in a debug build, with various combinations of valid and invalid
configuration files or configuration file fragments.
The SWIs must perform as documented and the
performance targets must be met.
Referring to an area by its area ID is significantly faster than referring
to it by name. However, multiple clients may be using Check URL, and
it is possible that one of them could delete and recreate an area you are
using, e.g. to ensure that a known set of fragments exist in that area and
no others. At this point the old ID becomes stale. Whilst you may consider
it legitimate to allow this stale ID to fault depending on the nature of
the client you are writing, if possible it is better to make a single
attempt to re-read the ID and continue if this attempt succeeds. An example
piece of C code might read as follows:
</notextile>
</notextile><notextile>
#include <stdlib.h>
#include <stdbool.h>
#include <kernel.h> /* For _kernel_oserror */
#include <swis.h>   /* For _swix and the _INR, _OUT and _OUTR macros */

/* This is exported via Check URL's !MkExport */
#include <CheckURL.h>

/* Area name to use */
#define AreaName "VideoControl_ProtocolModules"

/* For any URL, this can hold a more complete description than    */
/* strings, and makes comparing two URLs in a valid manner easier. */

typedef struct url_description
{
  char * full;     /* Complete, canonicalised URL       */
  char * protocol; /* Such as 'http' or 'mailto'        */
  char * host;     /* E.g. 'www.acorn.com'              */
  char * port;     /* For example '8080'                */
  char * user;     /* E.g. 'ahodgkin'                   */
  char * password; /* E.g. 'NotMine'                    */
  char * account;  /* As in ftp://user:pass:account@host/ */
  char * path;     /* Speaks for itself                 */
  char * query;    /* CGI info - after a '?' in a URL   */
  char * fragment; /* Anchor info - after a '#' in a URL */
}
url_description;

/**************************************************************/
/* url_match()                                                */
/*                                                            */
/* Match a given url_description in the area recorded in      */
/* 'AreaName' through the Check URL module. Caches the area   */
/* ID for speed and will attempt to re-cache if this ID       */
/* appears to become invalid later via a recursive call.      */
/*                                                            */
/* Parameters: Pointer to the url_description to match;       */
/*                                                            */
/*             Pointer to a char * to take a pointer to the   */
/*             match parameter (will be NULL on exit if the   */
/*             match fails);                                  */
/*                                                            */
/*             true to support the stale ID recovery attempt  */
/*             else false.                                    */
/**************************************************************/

static _kernel_oserror * url_match(url_description * d,
                                   const char     ** param,
                                   bool              allow_fail)
{
  static unsigned int area_id = 0;

  _kernel_oserror * e;
  unsigned int      match;

  if (param == NULL) return NULL;
  *param = NULL;

  /* Ensure we have an area ID */
  if (area_id == 0)
  {
    allow_fail = false; /* Make sure we don't try and reread it in a moment */

    e = _swix(CheckURL_ReadAreaID,
              _INR(0,1) | _OUT(1),

              0,
              AreaName, /* See near top of file */

              &area_id);

    if (e != NULL) return e;
  }

  /* Try the match */
  e = _swix(CheckURL_Check,
            _INR(0,2) | _OUTR(0,1),

            CU_Check_OnEntry_GivenAreaID | CU_Check_OnEntry_GivenURLDescriptor,
            area_id,
            d,

            &match,
            param);

  if (e == NULL)
  {
    /* If no match, clear "param" (R1 is preserved on exit for no match) */
    if ((match & CU_Check_OnExit_MatchFound) == 0) *param = NULL;
  }
  else if (e->errnum == cu_ERROR_AREA_NOT_KNOWN && allow_fail)
  {
    /* Since allow_fail is true, we are allowed to fail on an area ID */
    /* lookup. This is because we know IDs can become stale. In this  */
    /* case, try again, but only once.                                */
    area_id = 0;
    e = url_match(d, param, false);
  }

  return e;
}
History
Issue A | 06-Mar-2000 | First draft completed and checked (ADH) |
Issue B | 09-Mar-2000 | Released following review (ADH) |
Issue 1 | 21-Mar-2000 | Corrected a description of R1 that didn't match the description of flags in R0 and corrected description of the way in which fragments are added to existing areas (ADH) |
Issue 1A | 22-May-2000 | Extended CheckURL_AddArea to return an area ID in R1. ROM builds have to defer loading of the central configuration file. Gave example code for caching an area ID. A few typing errors fixed (ADH) |
Issue 1B | 05-Jul-2000 | Corrected example code which wasn't checking on-exit flags of the call to CheckURL_Check (ADH) |
Issue 2 | 04-Aug-2000 | Released following review (AMR 5390, ADH) |
Issue 3 | 16-Mar-2001 | Updated automatic section numbering to handle subsections. Used full subsection numbers for SWIs. Updated references section using links into the drawing office search engine and added validation footer. Overall, all changes were internal. ECO allocated for release (ECO 4428, ADH) |
References
The following may prove useful:
- RFC 1630: Uniform Resource Identifiers in WWW
(‘http://www.faqs.org/rfcs/rfc1630.html’ – April 1998).
Official location: ‘ftp://ftp.isi.edu/in-notes/rfc1630.txt’
- RFC 1738: Uniform Resource Locators
(‘http://www.faqs.org/rfcs/rfc1738.html’ – April 1998).
Official location: ‘ftp://ftp.isi.edu/in-notes/rfc1738.txt’
- RFC 1808: Relative Uniform Resource Locators
(‘http://www.faqs.org/rfcs/rfc1808.html’ – April 1998).
Official location: ‘ftp://ftp.isi.edu/in-notes/rfc1808.txt’
- URL Fetcher API
Specification (1215,220/FS issue 3, 12-Nov-1998, ECO 4131)</notextile>