RISC OS Open: Forum: Self-extracting archives

Feb 12, 2010 2:57am

Steve Revill (20) 1361 posts

There are now self-extracting versions of the UnTarBZ2 tool and HardDisc4 images on the download pages. These simply have to be downloaded onto a RISC OS system, given the ‘Utility’ filetype (&FFC) and then run. They will decompress their contents into the same location as the archive.

The important benefit of doing things this way is that these archives are stand-alone. You do not require any additional software to decompress them – unlike zip, tar and bz2 archives.

There is also a self-extracting version of the tool which we used to create these archives. This is supplied by 7th software and is free for anyone to download and use to create their own archives.

ROOL will probably move over to using self-extracting archives for all RISC OS binary releases in the near future. Eventually, we hope to supplement our downloads pages with a proper software packaging system.

Feb 12, 2010 11:31am

James Lampard (51) 120 posts

ROOL will probably move over to using self-extracting archives for all RISC OS binary releases in the near future.

Er, why? Surely the universally supported PKZIP format is a far smarter choice than a file that you have to set to type &FFC and then run. Any of the standard archive types can be read on any platform, your oddball format would seemingly be limited to RISC OS.

Feb 12, 2010 11:57am

Theo Markettos (89) 919 posts

I haven’t tried CreateSEC, but if it outputs in PKZIP format with a self-extracting header it should be OK. Infozip is capable of skipping any extraneous stuff and finding an embedded PKZIP. I don’t know if other tools like WinZip do, though.

I do poke around RISC OS Zips on other platforms from time to time, so something that was only decompressable on RISC OS would be a pain.

Feb 12, 2010 1:02pm

Jeffrey Lee (213) 6048 posts

I’m in agreement with James and Theo here – if you’re going to replace everything with self-extracting archives then PKZIP would be a better internal format (even if it would most likely add a dependency on the Shared C Library being present). Apart from allowing people on other platforms to extract the archives, and providing a better compression ratio than Squash, it would also provide a handy solution to the problem of being unable to extract archives that are larger than RAM (since you should just be able to load them into any half-decent RISC OS zip program instead).

Judging by Wikipedia, it looks like PKZIP’s “Shrink” format uses LZW - so in theory you could create a PKZIP-compatible self extractor that can still be decompressed using the Squash module (thus keeping the current system’s advantage of a small header size and no SCL dependency)

Feb 12, 2010 1:40pm

Andrew Hodgkinson (6) 465 posts

The reason for doing this is sanity.

Recently, both Steve and I tried to set up RPCEmu from clean using public components only, to test the IOMD ROM. Before trying an unstable ROM, though, it made sense to try RISC OS 3.71 first. So I got hold of a 3.71 ROM, downloaded RPCEmu and:

Whether you use a blank HDD image + HostFS or just HostFS, RPCEmu has no boot sequence.
Thus you have no Internet etc. (and it doesn’t work on all platforms anyway). So HostFS is the only way to import data.
If I decompress a Zip archive into HostFS on the host OS, I will not get filetypes, so this doesn’t work.
If I write a Zip file with ”,xyz” filename extensions and decompress that on the host OS into HostFS it’ll work, but it would fail if decompressed on RISC OS.
Thus we need a RISC OS specific Zip tool which understands embedded filetypes.
Only SparkPlug is available for free AFAICT.

So…

David Pilling provides a self-extracting SparkPlug but this is 26-bit only.
Download and extract it, assuming you have a 26-bit safe OS you can run it on (if you were on IOMD ROM 32-bit, you’d now be completely stuck and unable to proceed).
Having extracted SparkPlug 2.26, use it to unpack the RISC OS filetype extended Zip file of 32-bit SparkPlug 2.28…
...except you can’t because the !Run file ensures CLib but we have no Boot/System yet.
Comment this out, find it runs OK, get SparkPlug 2.28 (assuming you’re going to try a 32-bit OS).
We can’t run 2.28 because it has even more dependencies, so let’s ignore that for now and carry on in 26-bit mode under RISC OS 3.71.
Unpack UnTarBZ2’s Zip archive from the ROOL site.
Now we can use UnTarBZ2 to unpack the HD4 .tar.bz2 image…
...except we can’t, because the tool has numerous module dependencies; it requires a populated !System.
The skeleton !System bundle on the downloads page does not have all the required modules, in particular no updated CLib.
You can examine !Run and download each module from the ROOL site; tedious but possible.
These are in Zip files, so we need SparkPlug for that. Some dancing around with 26-bit or 32-bit versions may be necessary, especially if you inadvertently load the 32-bit CLib, e.g. by running UnTarBZ2 only to find you missed out some dependency…
...since softloading 32-bit CLib on RISC OS 3.71 is necessary to run UnTarBZ2, but if you eventually unpack the HD4 image, note that !Boot (and many other pieces of software) only run with the 26-bit CLib. If you load the 32-bit library, you get “Message token C72 not found” errors all over the place.
It’s a good thing RPCEmu reboots quickly.
Now we can finally run UnTarBZ2 to unpack archives.
We discover that trying to unpack the HD4 image to HostFS:$ doesn’t work, output pathnames being badly scrambled, presumably a Unixlib filename bug I’ve mentioned in another thread.
We try going to ADFS via a HD4 image downloaded from the Spoon site, but this being RISC OS 3.71, we don’t have long filenames. “HardDisc4/tar/bz2” gets truncated to “HardDisc4/” so UnTarBZ2 refuses to touch it.
Rename the file to something short with ”/tar/bz2” on the end, finally get the thing unpacked.
Doesn’t work because it’s for RISC OS 5, but that was the intended target anyway. For RISC OS 3.71 it’s easier; we can use a Zip archive from acorn.riscos.com. Crusty but functional, though as I say, won’t run with 32-bit CLib present.
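
The filename truncation step in the list above can be illustrated with a minimal Python sketch. This is not code from any RISC OS tool: it assumes the 10-character leaf name limit of discs without long filename support, and that dots in the host name appear as "/" on RISC OS. The helper name `riscos371_leafname` is hypothetical.

```python
MAX_LEAF = 10  # assumed leaf name limit on RISC OS 3.71 ADFS (no long filenames)

def riscos371_leafname(host_name: str) -> str:
    """Return the leaf name an old-format disc would keep for a host-style name."""
    # '.' is the directory separator on RISC OS, so dots in the name show up as '/'.
    swapped = host_name.replace(".", "/")
    # Anything beyond the limit is silently dropped.
    return swapped[:MAX_LEAF]

print(riscos371_leafname("HardDisc4.tar.bz2"))  # HardDisc4/
print(riscos371_leafname("HD.tar.bz2"))         # HD/tar/bz2
```

A name short enough to keep its "/tar/bz2" ending after the swap survives intact, which is why renaming the file works.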

It’s not quite so bad just for RISC OS 3.71 because you could ignore UnTarBZ2, mostly; in fact you’d probably ignore the ROOL site altogether since you already have your own ROM and thus presumably your own RISC OS machine anyway. If you want RISC OS 5, though, you have this ridiculous bootstrapping process and all sorts of trouble with 26-bit and 32-bit variants, with much of the trouble boiling down to filetypes and HostFS.

The self-extractor solves all of this
Just run it and it works
It even gets Text and Data filetypes right on HostFS, despite HostFS having horrible bugs here
The same cannot be said of other tools

Something like UnTarBZ2 which instead unzipped things has been discussed in another thread. The CLI zip tool would need to understand RISC OS filetype information. Having no dependency on an external CLib is critical because of the problems and incompatibilities that can arise due to a half-brought-up system, just when you don’t need that kind of pain.

It might perhaps be possible to have some kind of Zip-compatible thing which self-extracts on RISC OS. If someone wants to write and test a tool which creates such things and can demonstrate them working on HostFS then great! We’ll use it. You’ll need to pay close attention to filetypes, particularly text and data, and ideally dealing with hard spaces when executing under <= RISC OS 3.71 if possible – our current code doesn’t manage that and HostFS barfs.

Otherwise, this new tool certainly saves me huge amounts of pain trying to actually use components from the ROOL site, which is half the point of them being there.

There is nothing to say we might not put Zip and self-extracting archives side by side so other operating systems could be used to examine contents in the rare case where this is necessary.

Since the most likely use case for self-extracting components at the current time is via HostFS, having ”,ffc” on the end of filenames might be wise also. Harmless on ADFS-like filesystems, very handy on HostFS-like filing systems. Means the end user can just download and run.

Feb 12, 2010 2:40pm

Martin Bazley (331) 379 posts

Only SparkPlug is available for free AFAICT.

Ahem. Infozip?

Feb 12, 2010 2:55pm

Andrew Hodgkinson (6) 465 posts

Ahem. Infozip?

The self-extractor requires a 32-bit C library which you can’t obtain without being able to decompress a Zip file :D

Feb 12, 2010 3:02pm

Steve Revill (20) 1361 posts

Given the mixed response, I have no problem with supplying the binary downloads (note: because they are BINARY downloads, they only work in RISC OS anyway!) in both zip and self-extracting formats. It’d just be one extra step in the process (which is already not fully automated and a PITA right now) for creating the binary downloads.

Feb 12, 2010 3:04pm

Steve Revill (20) 1361 posts

Oh, and the reason I didn’t use zip as a self extraction file format is because that would have been no fun – I wanted to do something fun for a change. So ner! :p

Feb 12, 2010 9:03pm

James Woodcock (307) 32 posts

I have written a very quick and dirty tool to extract these self extracting archives on a Linux box.

The code can be improved greatly.

File types are appended to file names in the usual way, so there is a small advantage to using this tool rather than unzip when extracting under Linux.

I haven’t packaged it up yet, but code is at github: git://github.com/mjwoodcock/unsec.git

Feb 12, 2010 9:45pm

Andrew Hodgkinson (6) 465 posts

git://github.com/mjwoodcock/unsec.git

Oooh, git. Very modern :-)

Nice tool, many thanks – we may not rely on the self-extractor for everything but it’ll certainly be used for a few things, and having a cross-platform tool to access the archives will be handy.

You shouldn’t need to reverse engineer the file format, though (which is what you’ve done, if the Github readme is to be believed). It’s documented in the Help file. Steve – hope you don’t mind – I’ve extended the Wiki documentation for the software to include a large chunk copy & pasted from the software.

https://www.riscosopen.org/wiki/documentation/pages/Software+information%3A+CreateSEC

Feb 12, 2010 10:00pm

James Woodcock (307) 32 posts

Oh, yes. Read the documentation. I must admit, I hadn’t thought of that.

Thanks for the link. I’ll have a look sometime shortly.

Feb 13, 2010 12:01am

Andrew Hodgkinson (6) 465 posts

I’ll have a look sometime shortly.

I note reading it again that it gives details of the archive format, but does not go into details about squash format. That’ll be in the PRMs. The Squash API is described in volume 4 from page 103 and the file format in volume 4 from page 499. You can find these in PDF format here:

http://foundation.riscos.com/Private/manuals/PRMs/

...though given the URL I’m not sure they’re meant to be open for public access. Still, they are, so you may as well grab a copy.

The PRM information, looking at it right now, is actually unusually poor. Even more annoyingly, we have not been able to secure rights to release the source code to this component. Looking at your code, most of the work you’ve done seems to have been on the actual decompression side, so in fact you may well have had no choice but to reverse engineer the lion’s share of it anyway.

Feb 13, 2010 9:05am

James Woodcock (307) 32 posts

Thanks for the link. I did manage to find a copy when I was developing the tool.

squash_compress gives the most relevant information: the algorithm is 12-bit LZW as used by Unix compress command. I had some LZW code from nspark (arcfs and spark dearchiver for various OSs) that was nearly there – I just had to cater for the unix compress header in the stream.
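
For readers following along, a minimal Python sketch of recognising the Unix compress header mentioned above. The three-byte layout (magic bytes 0x1F 0x9D, then a flags byte holding the maximum code width and a block-mode bit) is the standard compress format. The function name and the sample flags value 0x8C are illustrative assumptions, not taken from unsec or the Squash module.

```python
COMPRESS_MAGIC = b"\x1f\x9d"  # standard Unix compress (.Z) magic number

def parse_compress_header(stream: bytes) -> dict:
    """Read the 3-byte Unix compress header at the start of a stream."""
    if len(stream) < 3 or stream[:2] != COMPRESS_MAGIC:
        raise ValueError("not a Unix compress stream")
    flags = stream[2]
    return {
        "max_bits": flags & 0x1F,          # widest LZW code the encoder used
        "block_mode": bool(flags & 0x80),  # True: code 256 means "clear table"
        "data_offset": 3,                  # LZW codes start right after the header
    }

# Illustrative stream: flags 0x8C means block mode with 12-bit codes.
header = parse_compress_header(bytes([0x1F, 0x9D, 0x8C]) + b"payload")
print(header)
```

An LZW decoder that already works on headerless data would then be started at `data_offset` with `max_bits` as its code width limit.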

Feb 13, 2010 3:54pm

Steve Revill (20) 1361 posts

I have tweaked the format of the binary so that you have an offset to the start of the compressed data structure near the start of the file. Thus, reading the word at offset 24 (bytes) gives you an offset (bytes) from the start of the file to the start of the compressed data structure (i.e. the “rsqs” word).

Note: the word immediately preceding the “rsqs” ID word is the size of the structure (bytes), if you care – which you probably don’t because you’ve already loaded the file so know how big it is. Still, it’s a useful extra sanity check.
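
A minimal Python sketch of the lookup described above, for anyone writing a cross-platform reader. It assumes 32-bit little-endian words (as on ARM) and that the ID is stored as the four ASCII bytes "rsqs" in file order; neither detail is spelled out in the post. The function name `locate_rsqs` and the synthetic test file are hypothetical.

```python
import struct

def locate_rsqs(data: bytes) -> tuple:
    """Return (offset of the 'rsqs' word, size word stored just before it)."""
    if len(data) < 28:
        raise ValueError("file too short to hold the word at offset 24")
    # Word at byte offset 24: offset from file start to the "rsqs" word.
    (offset,) = struct.unpack_from("<I", data, 24)
    if offset < 4 or offset + 4 > len(data):
        raise ValueError("offset word points outside the file")
    if data[offset:offset + 4] != b"rsqs":
        raise ValueError("no 'rsqs' ID at the stated offset")
    # Word immediately before the ID: size of the structure in bytes.
    (size,) = struct.unpack_from("<I", data, offset - 4)
    return offset, size

# Synthetic file for demonstration only (not a real archive).
buf = bytearray(64)
struct.pack_into("<I", buf, 24, 40)   # offset word
struct.pack_into("<I", buf, 36, 24)   # size word
buf[40:44] = b"rsqs"                  # ID word
print(locate_rsqs(bytes(buf)))
```

Checking the ID and bounds before trusting the offset gives the sanity check the post mentions without needing to know anything else about the extraction code.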

I’ve also added a -n switch to CreateSEC so that you can build an archive using our format that doesn’t include the self-extraction code.

Finally, I rebuilt the self-extracting code downloads that were on our site (and noticed that the HardDisc4 one was broken, oops!).

Feb 13, 2010 7:29pm

James Woodcock (307) 32 posts

That sounds useful – thanks a lot. I’ll update my code to deal with that at some stage soon.

Feb 14, 2010 3:08pm

James Lampard (51) 120 posts

Ahem. Infozip?

The self-extractor requires a 32-bit C library which you can’t obtain without being able to decompress a Zip file :D

Then why don’t you produce your own version, using your self extractor without the dependency. You could throw in any additional required modules. I’ve also seen the InfoZip back end binaries compiled with GCC.

If I write a Zip file with ”,xyz” filename extensions and decompress that on the host OS into HostFS it’ll work, but it would fail if decompressed on RISC OS.

I’ve written a program called LM98Util (available from http://www4.webng.com/resurgam/) which on RISC OS will strip these and set the filetypes.
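
As an illustration of the ",xyz" convention discussed here, a small Python sketch that splits a host filename into its RISC OS name and filetype. It is not LM98Util's code and does not set any filetype itself. The function name `split_filetype` is hypothetical, and the sketch assumes the suffix is always a comma followed by exactly three hex digits.

```python
import re

# Assumed form of the suffix: a comma followed by exactly three hex digits.
_SUFFIX = re.compile(r"^(?P<name>.+),(?P<type>[0-9a-fA-F]{3})$")

def split_filetype(host_name: str) -> tuple:
    """Split 'name,xyz' into (name, filetype); filetype is None if absent."""
    match = _SUFFIX.match(host_name)
    if match is None:
        return host_name, None
    return match.group("name"), int(match.group("type"), 16)

print(split_filetype("UnTarBZ2,ffc"))  # Utility filetype &FFC
print(split_filetype("ReadMe,fff"))    # Text filetype &FFF
print(split_filetype("plain.txt"))     # no suffix, so no filetype
```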

Feb 14, 2010 4:30pm

Steve Revill (20) 1361 posts

Then why don’t you produce your own version

Erm, because we don’t have to?

Feb 14, 2010 11:23pm

Theo Markettos (89) 919 posts

Don’t get me wrong, self-extracting archives are a good idea. I’ve suffered the RPCEmu !Boot shuffle enough times to be fed up with it. So your solution is welcomed from that perspective.

My only worry was about suggesting that distribution would switch to SEAs. Given the aim for greater cross-compile support, which means things like manipulating archives on other platforms, I was concerned that this would be made more difficult. Filetypes are an annoyance but, given the right unzip tool with support for the ”-,” option (append ,xxx types) I find it easier to unpack on the host system (or NFS server) than in the emulator. So having both possibilities is good.

(which reminds me, anyone know the status of -, going into mainline infozip?)

Feb 15, 2010 9:39pm

W P Blatchley (147) 247 posts

Steve, I think this is going to be a huge help for people setting up RPCEmu. Nice work. I did the ‘setup shuffle’ a few days back, and as Andrew describes above in great detail, it’s not a fun dance to do!

Is there any mileage in the suggestions to switch to ZIP format? It seems like the compression algorithms used could be compatible (though I’m not sure if Squash’s particular brand of LZW is supported by PKZIP), and the following suggests that a RISC OS executable header could be appended without straying outside the ZIP spec.:

http://en.wikipedia.org/wiki/ZIP_%28file_format%29#Combining_ZIP_with_other_file_formats

Seems like, if ROOL intend to start distributing SEAs, that would allow you to just put one file for each component up on the website – which could be self-extracted on RISC OS, or just accessed as a regular ZIP archive on other OSes – possibly saving some hassle in the long run?

Feb 15, 2010 11:27pm

Jeffrey Lee (213) 6048 posts

Is there any mileage in the suggestions to switch to ZIP format? It seems like the compression algorithms used could be compatible (though I’m not sure if Squash’s particular brand of LZW is supported by PKZIP)

I’m looking at the code now, and it looks like there are some important differences between the two LZW formats – specifically, how they handle code 256. Unix ‘compress’ 2.0 and below treats it as a standard data token, while >2.0 treats it as a “clear code tree” command. PKZIP, on the other hand, expects any 256 code to either be followed by a 1 (for “increase code size”) or a 2 (for “partial clear code tree”). So unfortunately it doesn’t look like there’s any sensible way of getting files which can be decompressed by both Squash and PKZIP.
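
The three behaviours described above can be restated as a small Python lookup. This is only a summary of the paragraph in code form, not a decoder; the format labels and the function name are made up for the sketch.

```python
def meaning_of_code_256(fmt: str, next_code=None) -> str:
    """Say what LZW code 256 means in each of the formats compared above."""
    if fmt == "compress-old":      # Unix compress 2.0 and below
        return "ordinary data code"
    if fmt == "compress-block":    # Unix compress above 2.0
        return "clear code tree"
    if fmt == "pkzip-shrink":      # PKZIP: 256 is an escape, next code selects
        if next_code == 1:
            return "increase code size"
        if next_code == 2:
            return "partial clear code tree"
        raise ValueError("Shrink expects 1 or 2 after code 256")
    raise ValueError("unknown format label: " + fmt)

for label, following in [("compress-old", None), ("compress-block", None),
                         ("pkzip-shrink", 1), ("pkzip-shrink", 2)]:
    print(label, following, "->", meaning_of_code_256(label, following))
```

Because a single stream cannot satisfy more than one of these interpretations once code 256 appears, the same compressed data cannot be valid for both decoders.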

Of course, I’ve only just discovered that to create a self-extracting zip archive all you need to do is prepend a RISC OS build of ‘unzipsfx’ to the zip file and then run ‘zip -A’ to correct the zip header. So, how about this for a compromise:

I get off my fat arse and submit my ‘unzip’ patches to InfoZip (including fixing unzipsfx, since the version I’ve built doesn’t seem to work, whereas Chris Bazley’s does)
For any downloads which ROOL want to make self-extracting, they use self-extracting unzipsfx-based zip files (which only adds ~46k onto the file size – presumably less if it’s possible to use Norcroft’s Squeeze utility to compress the code without having it trash the zip data on decompression)
For the two people who use machines without 32bit compatible C libraries, and who still want to make use of ROOL’s archives, and who don’t have a better decompression option available, ROOL provide a CreateSEA-ified copy of the C library

Unfortunately a quick check suggests that SparkFS handles self-extracting zip archives, but SparkPlug doesn’t – which could make life a bit annoying for people who don’t have SparkFS and want to browse the zipfiles as image filing systems.
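
A minimal Python sketch of the "prepend a stub to a zip" idea, showing that a tolerant reader still finds the archive. The stub here is just filler bytes standing in for an unzipsfx build, and the ‘zip -A’ step is not performed; Python's zipfile module locates the central directory from the end of the file, so it copes with the prepended data on its own. The helper names are hypothetical.

```python
import io
import zipfile

def make_zip(files: dict) -> bytes:
    """Build an ordinary zip archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

def prepend_stub(stub: bytes, zip_bytes: bytes) -> bytes:
    """Put executable code in front of the zip data, as an SFX builder would."""
    return stub + zip_bytes

plain = make_zip({"ReadMe,fff": b"hello from the archive"})
fake_stub = b"\0" * (46 * 1024)  # placeholder only, not real unzipsfx code
sfx = prepend_stub(fake_stub, plain)

# The reader works out how much leading data to skip.
with zipfile.ZipFile(io.BytesIO(sfx)) as zf:
    print(zf.namelist())
    print(zf.read("ReadMe,fff"))
```

Readers that trust the stored offsets literally would not cope with this file as-is, which is the reason for running ‘zip -A’ on the result afterwards.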

Feb 21, 2010 4:21pm

Peter Howkins (211) 236 posts

Just a quick note to let you know that the self extracting archives have been compiled with strh (ARMv4) instructions in, that don’t work on ARM6/7/7500 and don’t work reliably on the SA (when used in a RPC).

Feb 21, 2010 4:56pm

Jeffrey Lee (213) 6048 posts

Are you sure? I can’t see any sign of STRH in the CreateSEC archive, nor in the BASIC program that generates the archive headers. The only place where I do see STRH is the occasional one inside the compressed data stream – which should obviously never get executed.

Feb 21, 2010 5:07pm

Matthew Howkins (373) 3 posts

I have tried the self-extracting archives in RPCEmu. They always fail when attempting the opcode 0xe18c10b3. This disassembles as an STRH instruction.

I can’t find any references to this opcode in ‘CreateSEC.util’, so it is probably not at fault.

However I can find one example in the IOMD ROM - is it possible this instruction is present in the ROM, and just happens to get called when running one of the self-extracting utilities?
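
For reference, a short Python sketch that checks whether a 32-bit word has the ARM STRH encoding and prints a rough disassembly. The bit layout used (bits 27-25 clear, load bit 20 clear, bits 7-4 equal to 1011) is the ARMv4 halfword store form. The condition field is not decoded, and the function names are hypothetical.

```python
def is_strh(word: int) -> bool:
    """True if the word matches the ARM halfword store (STRH) encoding."""
    # bits 27-25 = 000, bit 20 (load) = 0, bits 7-4 = 1011
    return (word & 0x0E1000F0) == 0x000000B0

def describe_strh(word: int) -> str:
    """Rough disassembly of an STRH word (condition code not shown)."""
    if not is_strh(word):
        raise ValueError("not an STRH encoding")
    rn = (word >> 16) & 0xF                    # base register
    rd = (word >> 12) & 0xF                    # register being stored
    sign = "" if word & (1 << 23) else "-"     # U bit: add or subtract offset
    if word & (1 << 22):                       # immediate split over two nibbles
        offset = "#%s%d" % (sign, ((word >> 4) & 0xF0) | (word & 0xF))
    else:                                      # register offset in bits 3-0
        offset = "%sr%d" % (sign, word & 0xF)
    if word & (1 << 24):                       # P bit: pre-indexed
        writeback = "!" if word & (1 << 21) else ""
        return "STRH r%d, [r%d, %s]%s" % (rd, rn, offset, writeback)
    return "STRH r%d, [r%d], %s" % (rd, rn, offset)

print(describe_strh(0xE18C10B3))  # STRH r1, [r12, r3]
```

Scanning a whole file for words where `is_strh` is true will also flag compressed data that happens to match the pattern, so hits need checking against where the code actually lives.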

Feb 21, 2010 5:15pm

Jeffrey Lee (213) 6048 posts

Yeah, it looks like it’s the squash module that’s at fault. Due to source licensing issues the version in CVS is just a binary blob, and it looks like the IOMD ROM builds are using the newer >=ARMv5 version of the module instead of the older <=ARMv5 version.

Feb 12, 2010 2:57am Steve Revill (20) 1361 posts	There are now self-extracting versions of the UnTarBZ2 tool and HardDisc4 images on the download pages. These simply have to be downloaded onto a RISC OS system, given the ‘Utility’ filetype (&FFC) and then run. They will decompress their contents into the same location as the archive. The important benefit of doing things this way is that these archives are stand-alone. You do not require any additional software do decompress them – unlike zip, tar and bz2 archives. There is also a self-extracting version of the tool which we used to create these archives. This is supplied by 7th software and is free for anyone to download and use to create their own archives. ROOL will probably move over to using self-extracting archives for all RISC OS binary releases in the near future. Eventually, we hope to supplement our downloads pages with a proper software packaging system.

Feb 12, 2010 11:31am James Lampard (51) 120 posts	ROOL will probably move over to using self-extracting archives for all RISC OS binary releases in the near future. Er, why? Surely the universally supported PKZIP format is a far smarter choice than a file that you have to set to type &FFC and then run. Any of the standard archive types can be read on any platform, your oddball format would seemingly be limited to RISC OS.

Feb 12, 2010 11:57am Theo Markettos (89) 919 posts	I haven’t tried CreateSEC, but if it outputs in PKZIP format with a self-extracting header it should be OK. Infozip is capable of skipping any extraneous stuff and finding an embedded PKZIP. I don’t know if other tools like WinZip do, though. I do poke around RISC OS Zips on other platforms from time to time, so something that was only decompressable on RISC OS would be a pain.

Feb 12, 2010 1:02pm Jeffrey Lee (213) 6048 posts	I’m in agreement with James and Theo here – if you’re going to replace everything with self-extracting archives then PKZIP would be a better internal format (even if it would most likely add a dependency on the Shared C Library being present). Apart from allowing people on other platforms to extract the archives, and providing a better compression ratio than Squash, it would also provide a handy solution to the problem of being unable to extract archives that are larger than RAM (since you should just be able to load them into any half-decent RISC OS zip program instead). Judging by Wikipedia, it looks like PKZIP’s “Shrink” format uses LZW - so in theory you could create a PKZIP-compatible self extractor that can still be decompressed using the Squash module (thus keeping the current system’s advantage of a small header size and no SCL dependency)

Feb 12, 2010 1:40pm Andrew Hodgkinson (6) 465 posts	The reason for doing this is sanity. Recently, both Steve and I tried to set up RPCEmu from clean using public components only, to test the IOMD ROM. Before trying an unstable ROM, though, it made sense to try RISC OS 3.71 first. So I got hold of a 3.71 ROM, downloaded RPCEmu and: Whether you use a blank HDD image + HostFS or just HostFS, RPCEmu has no boot sequence. Thus you have no Internet etc. (and it doesn’t work on all platforms anyway). So HostFS is the only way to import data. If I decompress a Zip archive into HostFS on the host OS, I will not get filetypes, so this doesn’t work. If I write a Zip file with ”,xyz” filename extensions and decompress that on the host OS into HostFS it’ll work, but it would fail if decompressed on RISC OS. Thus we need a RISC OS specific Zip tool which understands embedded filetypes. Only SparkPlug is available for free AFAICT. So… David Pilling provides a self-extracting SparkPlug but this is 26-bit only. Download and extract it, assuming you have a 26-bit safe OS you can run it on (if you were on IOMD ROM 32-bit, you’d now be completely stuck and unable to proceed). Having extracted SparkPlug 2.26, use it to unpack the RISC OS filetype extended Zip file of 32-bit SparkPlug 2.28… ...except you can’t because the !Run file ensures CLib but we have no Boot/System yet. Comment this out, find it runs OK, get SparkPlug 2.28 (assuming you’re going to try a 32-bit OS). We can’t run 2.28 because it has even more dependencies, so let’s ignore that for now and carry on in 26-bit mode under RISC OS 3.71. Unpack UnTarBZ2’s Zip archive from the ROOL site. Now we can use UnTarBZ2 to unpack the HD4 .tar.bz2 image… ...except we can’t, because the tool has numerous module dependencies; it requires a populated !System. The skeleton !System bundle on the downloads page does not have all the required modules, in particular no updated CLib. 
You can examine !Run and download each module from the ROOL site; tedious but possible. These are in Zip files, so we need SparkPlug for that. Some dancing around with 26-bit or 32-bit versions may be necessary, especially if you inadvertently load the 32-bit CLib, e.g. by running UnTarBZ2 only to find you missed out some dependency… ...since softloading 32-bit CLib on RISC OS 3.71 is necessary to run UnTarBZ2, but if you eventually unpack the HD4 image, note that !Boot (and many other pieces of software) only run with the 26-bit CLib. If you load the 32-bit library, you get “Message token C72 not found” errors all over the place. It’s a good thing RPCEmu reboots quickly. Now we can finally run UnTarBZ2 to unpack archives. We discover that trying to unpack the HD4 image to HostFS:$ doesn’t work, output pathnames being badly scrambled, presumably a Unixlib filename bug I’ve mentioned in another thread. We try going to ADFS via a HD4 image downloaded from the Spoon site, but this being RISC OS 3.71, we don’t have long filenames. “HardDisc4/tar/b2” gets truncated to “HardDisc4/” so UnTarBZ2 refuses to touch it. Rename the file to something short with ”/tar/bz2” on the end, finally get the thing unpacked. Doesn’t work because it’s for RISC OS 5, but that was the intended target anyway. For RISC OS 3.71 it’s easier; we can use a Zip archive from acorn.riscos.com. Crusty but functional, though as I say, won’t run with 32-bit CLib present. It’s not quite so bad just for RISC OS 3.71 because you could ignore UnTarBZ2, mostly; in fact you’d probably ignore the ROOL site altogether since you already have your own ROM and thus presumably your own RISC OS machine anyway. If you want RISC OS 5, though, you have this ridiculous boostrapping process and all sorts of trouble with 26-bit and 32-bit variants, with much of the trouble boiling down to filetypes and HostFS. 
The self-extractor solves all of this Just run it and it works It even gets Text and Data filetypes right on HostFS, despite HostFS having horrible bugs here The same cannot be said of other tools Something like UnTarBZ2 which instead unzipped things has been discussed in another thread. The CLI zip tool would need to understand RISC OS filetype information. Having no dependency on an external CLib is critical because of the problems and incompatibilities that can arise due to a half-brought-up system, just when you don’t need that kind of pain. It might perhaps be possible to have some kind of Zip-compatible thing which self-extracts on RISC OS. If someone wants to write and test a tool which creates such things and can demonstrate them working on HostFS then great! We’ll use it. You’ll need to pay close attention to filetypes, particularly text and data, and ideally dealing with hard spaces when executing under <= RISC OS 3.71 if possible – our current code doesn’t manage that and HostFS barfs. Otherwise, this new tool certainly saves me huge amounts of pain trying to actually use components from the ROOL site, which is half the point of them being there. There is nothing to say we might not put Zip and self-extracting archives side by side so other operating systems could be used to examine contents in the rare case where this is necessary. Since the most likely use case for self-extracting components at the current time is via HostFS, having ”,ffc” on the end of filenames might be wise also. Harmless on ADFS-like filesystems, very handy on HostFS-like filing systems. Means the end user can just download and run.

Feb 12, 2010 2:40pm Martin Bazley (331) 379 posts	Only SparkPlug is available for free AFAICT. Ahem. Infozip?

Feb 12, 2010 2:55pm Andrew Hodgkinson (6) 465 posts	Ahem. Infozip? The self-extractor requires a 32-bit C library which you can’t obtain without being able to decompress a Zip file :D

Feb 12, 2010 3:02pm Steve Revill (20) 1361 posts	Given the mixed response, I have no problem with supplying the binary downloads (note: because they are BINARY downloads, they only work in RISC OS anyway!) in both zip and self-extracting formats. It’d just be one extra step in the process (which is already not fully automated and a PITA right now) for creating the binary downloads.

Feb 12, 2010 3:04pm Steve Revill (20) 1361 posts	Oh, and the reason I didn’t use zip as a self extraction file format is because that would have been no fun – I wanted to do something fun for a change. So ner! :p

Feb 12, 2010 9:03pm James Woodcock (307) 32 posts	I have written a very quick and dirty tool to extract these self extracting archives on a Linux box. The code can be improved greatly. File types are appended to file names in the usual way, so there is a small advantage to using this tool rather than unzip when extracting under Linux. I haven’t packaged it up yet, but code is at github: git://github.com/mjwoodcock/unsec.git

Feb 12, 2010 9:45pm Andrew Hodgkinson (6) 465 posts	git://github.com/mjwoodcock/unsec.git Oooh, git. Very modern `:-)` Nice tool, many thanks – we may not rely on the self-extractor for everything but it’ll certainly be used for a few things and having a cross-platform tool to access the archives. You shouldn’t need to reverse engineer the file format, though (which is what you’ve done, if the Github readme is to be believed). It’s documented in the Help file. Steve – hope you don’t mind – I’ve extended the Wiki documentation for the software to include a large chunk copy & pasted from the software. https://www.riscosopen.org/wiki/documentation/pages/Software+information%3A+CreateSEC

Feb 12, 2010 10:00pm James Woodcock (307) 32 posts	Oh, yes. Read the documentation. I must admit, that hadn’t thought of that. Thanks for the link. I’ll have a look sometime shortly.

Feb 13, 2010 12:01am Andrew Hodgkinson (6) 465 posts	I’ll have a look sometime shortly. I note reading it again that it gives details of the archive format, but does not go into details about squash format. That’ll be in the PRMs. The Squash API is described in volume 4 from page 103 and the file format in volume 4 from page 499. You can find these in PDF format here: http://foundation.riscos.com/Private/manuals/PRMs/ ...though given the URL I’m not sure they’re meant to be open for public access. Still, they are, so you may as well grab a copy. The PRM information, looking at it right now, is actually unusually poor. Even more annoyingly, we have not been able to secure rights to release the source code to this component. Looking at your code, most of the work you’ve done seems to have been on the actual decompression side, so in fact you may well have had no choice but to reverse engineer the lion’s share of it anyway.

Feb 13, 2010 9:05am James Woodcock (307) 32 posts	Thanks for the link. I did manage to find a copy when I was developing the tool. squash_compress gives the most relevant information: the algorithm is 12-bit LZW as used by Unix compress command. I had some LZW code from nspark (arcfs and spark dearchiver for various OSs) that was nearly there – I just had to cater for the unix compress header in the stream.

Feb 13, 2010 3:54pm Steve Revill (20) 1361 posts	I have tweaked the format of the binary so that you have an offset to the start of the compressed data structure near the start of the file. Thus, reading the word at offset 24 (bytes) gives you an offset (bytes) from the start of the file to the start of the compressed data structure (i.e. the “rsqs” word). Note: the word immediately preceding the the “rsqs” ID word is the size of the structure (bytes), if you care – which you probably don’t because you’ve already loaded the file so know how big it is. Still, it’s a useful extra sanity check. I’ve also added a -n switch to CreateSEC so that you can build an archive using our format that doesn’t include the self-extraction code. Finally, I rebuilt the self-extracting code downloads that were on our site (and noticed that the HardDisc4 one was broken, oops!).

Feb 13, 2010 7:29pm James Woodcock (307) 32 posts	That sounds useful – thanks a lot. I’ll update my code to deal with that at some stage soon.

Feb 14, 2010 3:08pm James Lampard (51) 120 posts	Ahem. Infozip? The self-extractor requires a 32-bit C library which you can’t obtain without being able to decompress a Zip file :D Then why don’t you produce your own version, using your self extractor without the dependency. You could throw in any additional required modules. I’ve also seen the InfoZip back end binaries compiled with GCC. If I write a Zip file with ”,xyz” filename extensions and decompress that on the host OS into HostFS it’ll work, but it would fail if decompressed on RISC OS. I’ve written a program called LM98Util (available from http://www4.webng.com/resurgam/) which on RISC OS will strip these and set the filetypes.

Feb 14, 2010 4:30pm Steve Revill (20) 1361 posts	“Then why don’t you produce your own version” Erm, because we don’t have to?

Feb 14, 2010 11:23pm Theo Markettos (89) 919 posts	Don’t get me wrong, self-extracting archives are a good idea. I’ve suffered the RPCEmu !Boot shuffle enough times to be fed up with it. So your solution is welcomed from that perspective. My only worry was about suggesting that distribution would switch to SEAs. Given the aim for greater cross-compile support, which means things like manipulating archives on other platforms, I was concerned that this would be made more difficult. Filetypes are an annoyance but, given the right unzip tool with support for the “-,” option (append ,xxx types), I find it easier to unpack on the host system (or NFS server) than in the emulator. So having both possibilities is good. (Which reminds me, does anyone know the status of -, going into mainline infozip?)

Feb 15, 2010 9:39pm W P Blatchley (147) 247 posts	Steve, I think this is going to be a huge help for people setting up RPCEmu. Nice work. I did the ‘setup shuffle’ a few days back, and as Andrew describes above in great detail, it’s not a fun dance to do! Is there any mileage in the suggestions to switch to ZIP format? It seems like the compression algorithms used could be compatible (though I’m not sure if Squash’s particular brand of LZW is supported by PKZIP), and the following suggests that a RISC OS executable header could be appended without straying outside the ZIP spec.: http://en.wikipedia.org/wiki/ZIP_%28file_format%29#Combining_ZIP_with_other_file_formats Seems like, if ROOL intend to start distributing SEAs, that would allow you to just put one file for each component up on the website – which could be self-extracted on RISC OS, or just accessed as a regular ZIP archive on other OSes – possibly saving some hassle in the long run?

Feb 15, 2010 11:27pm Jeffrey Lee (213) 6048 posts	“Is there any mileage in the suggestions to switch to ZIP format? It seems like the compression algorithms used could be compatible (though I’m not sure if Squash’s particular brand of LZW is supported by PKZIP)” I’m looking at the code now, and it looks like there are some important differences between the two LZW formats – specifically, how they handle code 256. Unix ‘compress’ 2.0 and below treats it as a standard data token, while >2.0 treats it as a “clear code tree” command. PKZIP, on the other hand, expects any 256 code to either be followed by a 1 (for “increase code size”) or a 2 (for “partial clear code tree”). So unfortunately it doesn’t look like there’s any sensible way of getting files which can be decompressed by both Squash and PKZIP. Of course, I’ve only just discovered that to create a self-extracting zip archive all you need to do is prepend a RISC OS build of ‘unzipsfx’ to the zip file and then run ‘zip -A’ to correct the zip header. So, how about this for a compromise: (1) I get off my fat arse and submit my ‘unzip’ patches to InfoZip (including fixing unzipsfx, since the version I’ve built doesn’t seem to work, whereas Chris Bazley’s does). (2) For any downloads which ROOL want to make self-extracting, they use self-extracting unzipsfx-based zip files (which only adds ~46k onto the file size – presumably less if it’s possible to use Norcroft’s Squeeze utility to compress the code without having it trash the zip data on decompression). (3) For the two people who use machines without 32-bit-compatible C libraries, and who still want to make use of ROOL’s archives, and who don’t have a better decompression option available, ROOL provide a CreateSEC-ified copy of the C library. Unfortunately a quick check suggests that SparkFS handles self-extracting zip archives, but SparkPlug doesn’t – which could make life a bit annoying for people who don’t have SparkFS and want to browse the zipfiles as image filing systems.
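
The "prepend a stub to the zip file" idea from the post above can be tried out with a short Python sketch. The stub here is just placeholder bytes standing in for an extractor such as unzipsfx, and the helper names are invented for the example. The sketch does not perform the ‘zip -A’ offset correction; it relies on Python's standard zipfile reader tolerating data in front of the archive, which other tools may or may not do.

```python
import io
import zipfile


def make_prefixed_zip(stub: bytes, members: dict) -> bytes:
    """Build a zip archive in memory and put `stub` in front of it."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return stub + buf.getvalue()


def read_members(blob: bytes) -> dict:
    """Read every member back from a (possibly prefixed) zip archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


# Placeholder "extractor code"; a real stub would be the unzipsfx binary.
blob = make_prefixed_zip(b"STUB" * 256, {"ReadMe": b"hello"})
print(read_members(blob))  # {'ReadMe': b'hello'}
```

This is the property that would let one file act as both a RISC OS self-extractor and an ordinary zip archive on other platforms.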

Feb 21, 2010 4:21pm Peter Howkins (211) 236 posts	Just a quick note to let you know that the self-extracting archives have been compiled with STRH (ARMv4) instructions in them, which don’t work on ARM6/7/7500 and don’t work reliably on the SA (when used in an RPC).

Feb 21, 2010 4:56pm Jeffrey Lee (213) 6048 posts	Are you sure? I can’t see any sign of STRH in the CreateSEC archive, nor in the BASIC program that generates the archive headers. The only place where I do see STRH is the occasional one inside the compressed data stream – which should obviously never get executed.

Feb 21, 2010 5:07pm Matthew Howkins (373) 3 posts	I have tried the self-extracting archives in RPCEmu. They always fail when attempting the opcode 0xe18c10b3. This disassembles as an STRH instruction. I can’t find any references to this opcode in ‘CreateSEC.util’, so it is probably not at fault. However I can find one example in the IOMD ROM – is it possible this instruction is present in the ROM, and just happens to get called when running one of the self-extracting utilities?
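
For anyone wanting to check the disassembly by hand, the following Python sketch tests the bit pattern of an ARM halfword store and pulls out the register fields. It only covers what is needed to recognise STRH; the function names are made up for the example and this is not how RPCEmu decodes instructions.

```python
def is_strh(opcode: int) -> bool:
    """True if a 32-bit ARM opcode encodes STRH (store halfword)."""
    cond = (opcode >> 28) & 0xF
    return (
        cond != 0xF                          # exclude the unconditional space
        and ((opcode >> 25) & 0b111) == 0    # bits 27-25 are 000
        and ((opcode >> 4) & 0xF) == 0b1011  # bits 7-4 are 1011: unsigned halfword
        and ((opcode >> 20) & 1) == 0        # L bit clear: store, not load
    )


def strh_registers(opcode: int):
    """Return (Rd, Rn, Rm) register numbers for a register-offset STRH."""
    return (opcode >> 12) & 0xF, (opcode >> 16) & 0xF, opcode & 0xF


print(is_strh(0xE18C10B3), strh_registers(0xE18C10B3))  # True (1, 12, 3)
```

Under this decoding the reported opcode is a store of the low halfword of R1 to the address R12 + R3.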

Feb 21, 2010 5:15pm Jeffrey Lee (213) 6048 posts	Yeah, it looks like it’s the squash module that’s at fault. Due to source licensing issues the version in CVS is just a binary blob, and it looks like the IOMD ROM builds are using the newer >=ARMv5 version of the module instead of the older <=ARMv5 version.