Building RISC OS Online at ROUGOL, Mon 16th Nov 2020
David Boddie (1934) 222 posts |
Very interesting! Thanks to Gerph for the presentation, and to Bryan and Leo for setting things up and posting the video. |
David J. Ruck (33) 1636 posts |
AndrewR is going to have to pull something really special out of the bag this month to follow that one! |
Jon Abbott (1421) 2651 posts |
I hope he’s changed his ways since he was last around. I recall the hail of abuse he gave me over ADFFS several years ago – I didn’t say anything at the time, but I very nearly walked away because of it. Do we really need that sort of individual back in our community? |
David Feugey (2125) 2709 posts |
He clearly said he did not consider himself part of the RISC OS 5 community… while working on RISC OS projects. At the same time, it seems he was very happy to talk to ROUGOL people, and he made a few very important bug reports here. You remind me that I should run some tests of HOMM2 on ADFFS :) |
Chris Evans (457) 1614 posts |
I’d say yes. Justin has done an awful lot for RISC OS. No one should be abusing anyone. I know Justin has been on the receiving end of a lot of abuse in the past but yours is the first report I’ve heard of him abusing anyone. If RISC OS is to develop we need people with Justin’s and your experience and skills. |
Richard Coleman (3190) 54 posts |
In the video he did also acknowledge that he had given out some abuse as well and he clearly regretted that happening. So it would seem to me Jon, that he has changed. And it is great to see him back and doing something with RISC OS, and I look forward to seeing how what he’s producing develops and may even be a help in moving to 64bit. |
Steve Pampling (1551) 8172 posts |
It’s difficult to stay calm when the reaction to you pointing out a significant failing[1] is a hail of abuse from the guilty party and then being stuck in the middle of an all-out war. It happened before some of our newer users were even born. Peace.
[1] There is a reason everyone is very very touchy about GPL |
Charles Ferguson (8243) 427 posts |
I don’t have any specific recollection of that. My only recollection of ADFFS is a strong feeling that it was the wrong way to solve the problem of retaining compatibility with earlier software: rather than emulating the obsolete hardware and interfaces that they use, the applications and games should be modified to use standardised interfaces, to ensure future compatibility. Almost certainly I was headstrong in my view, because I’m very sure of myself when it comes to RISC OS. I’m sorry that you felt offended at the time, and hope that you’ll understand that there were a lot of other things going on that were contributory.
To be honest, if that is the kind of welcome I receive, then it reinforces my wish not to be involved. And my first reaction is one of extreme anger at you for criticising me. And I am angry, because that’s how I’ve become when seeing criticism in the RISC OS world. But that’s my problem. And I don’t need that sort of reaction. You don’t need it. It’s not good for anyone. I did a quick search to see if I could find a specific reference to what I might have said to try to explain, or to understand how much I offended you and to apologise directly for that, but nothing immediately turned up. So I can’t give you a direct response, either to provide justification or apology. But honestly, it doesn’t matter. Whatever may have been said was said, and whatever may have been has been. I can’t change that slight, and for that I’m sorry too. I hope you’ll accept that without specifics, but there’s nothing I can do to change the past. I’ll no doubt be angry now for a day or two at this, just because you’ve said that you don’t feel you want me around. And that’s not your fault for expressing how you felt about what went before – you’re quite entitled to that, obviously. But I have real difficulty with that, and I take things very personally – more so when I know I was probably in the wrong. As part of my preparation for the talk, I stumbled upon a few posts from the dim past. Towards the end, I could be abrasive. It was clear that I was one of those people who was making the community untenable. I can argue that, as I’m more sensitive to these things, this was probably not so bad but, again, it doesn’t matter. It’s sad to find that you’re one of the type of people that you dislike. All you can do is apologise, and try to be a better person. |
Grahame Parish (436) 481 posts |
I’m happy to move forward and put the past behind us. This community will cease to exist if we alienate others that come here to help us, especially when they bring a wealth of experience with them. Please guys, don’t bite the hands that are offering help – we need all the help we can get for RISC OS to develop and survive. |
David J. Ruck (33) 1636 posts |
:thumbsup: Yes I know that doesn’t work |
Andrew Rawnsley (492) 1445 posts |
Thanks for the measured response, Charles – forums can be bitter places. I am strongly of the opinion that the past is best left there (ie. in the past), especially when it comes to forums/newsgroups and other impersonal spaces where it is so easy for people to type things without thinking about the ramifications of those reading. RISC OS was a different world 20 years ago, and I hope it’ll be a different (better!) world in 20 more, but with the same RISC OS-y heart of our beloved OS. I don’t believe anyone can have the right answers all the time, but it is very hard to see that at the time, and often we only see how we can handle things better years later. Even then, it can be a bitter pill. Anyway, these days the thing to dwell on is that we’re all wanting the same positive goal for a better OS future, and the key is to welcome all contributions from all directions, because who knows where the next great idea/breakthrough will come from? Oh, and for the record, I’m not even going to try and top Gerph’s talk! Or Stuart’s, or Jason’s or… |
Steve Pampling (1551) 8172 posts |
On the basis that in-person meetings will eventually resume I think user groups should continue with presentations from virtual speakers as well as on-site. Hopefully a developer will see the reason to port a conferencing solution (writing from scratch isn’t really viable) |
David J. Ruck (33) 1636 posts |
As long as there is beer and curry to hand, it shouldn’t matter where the speaker is! |
Steve Pampling (1551) 8172 posts |
Indeed, there is beer aplenty here, pre-made curry paste and other makings, a good selection of Scotch, so I can watch anything with proper accompaniment. Seriously, the RO groups should consider providing for virtual attendees long term. |
David Feugey (2125) 2709 posts |
:) |
Charles Ferguson (8243) 427 posts |
It seems that the shell server at shell.riscos.online (and the build.riscos.online service) has been down since I scaled it back down at the start of December, and I only just discovered this when one of the CI jobs on GitHub failed. Anyhow, given that nobody’s mentioned it to me I suspect that nobody had tried it in that time – or they’d moved on assuming that I’d turned it off. No, I just made a mess of scaling it down. It’s now got monitoring to tell me when it’s broken. |
Andreas Skyman (8677) 170 posts |
I tried it, but assumed I’d made a mistake. Looking forward to trying it again! |
Kees Grinwis (3528) 18 posts |
I did try it as well after seeing your talk yesterday; I did notice it but had the idea that it was a temporary error. Thanks for fixing it, as I mentioned your excellent work to the members of the Big Ben Club (I expect that they will read about it in 1 or 2 days, and now it will work if they do try it). I did test traceroute (just curious how much of the network was implemented); that did not work – OK, no problem. Next attempt: ping.
*ping www.bigbenclub.nl
I’m not saying that ping or traceroute should work, but in case they should, then you should know of the error of course ;-) Anyway keep up the good work and have fun – that is the most important part… |
Charles Ferguson (8243) 427 posts |
Ping fails for two reasons, one of which is worked around, but the other one causes it to fail. The one you see here first is because there’s no InetDBase:Services file present, so it cannot look up what the protocol ‘icmp’ is. It then goes, ‘ok, I know this one’ and assumes that the protocol number is 1. Fair enough. Then it tries to create such a socket and is told ‘no, you can’t do that’ (EPERM) because it’s running as a non-root user – without the right set of permissions you cannot create ICMP sockets on Unix-like systems. RISC OS, without any permissions model, just lets you do it. But Unix won’t, and I didn’t want to give the docker container (and the user within it) any higher level of permissions than was necessary to run things. So yeah, I knew this was broken, but I’d forgotten that I left `ping` in, as it’s not actually that useful without root permissions (or suitable capabilities). But it demonstrates a vaguely interesting part of the system and the limitations imposed by running on a different host.
Quick summary on those terms in case anyone reading is unaware of what those do, or why the lack of permissions is relevant…
ICMP is the ‘Internet Control Message Protocol’, which is just a concise way of saying that it manages the lower level interactions when hosts communicate with one another. There are two versions, one for IPv4 and one for IPv6, with similar but not identical features.
When you want to talk to another host you start a connection (or just send a message if the thing you’re sending is a ‘fire-and-forget’ protocol like UDP). Both UDP and TCP are the same at this level – a packet goes to the remote system, addressed to the target IP and port number. But it might not get there – along the way a router might firewall it. It can be firewalled a number of ways and whilst it might seem handy to just ignore the message, that means that the machine sending the message may never know that it could never get through, and so would try again.
That generates more traffic for something that can never succeed. This is where ICMP comes in. A response is sent from the router to say ‘Destination unreachable’ (‘addressee not known’, in postal terms). The sender should receive this and go ‘huh, that communication isn’t going to work’ and report the error to the calling program. The router could also say ‘ok, I can do that but, if you send it here instead it’ll get there faster’; a ‘Redirect’. It can also say ‘Um… that packet’s been travelling a long time, we cannot deliver it’, which is known as ‘Time exceeded’. That message prevents packets going round and round in circles because someone configured things wrongly (or maliciously). These types of control messages allow you to make a very bad day for not only yourself, but other people using the machine (consider if Alice is trying to send a message to Bob on his system, but Mallory who’s a regular user on that system can keep sending out ‘Destination unreachable’ messages – Bob would never get any communication from Alice, and Alice would think that Bob wasn’t online). Because of this, ICMP is a restricted protocol, for which you need special permissions to access. On Unix-like systems, the ‘ping’ tool has usually been given special permissions so that it can do these operations. Ping doesn’t use those messages, but a separate one – ICMP Echo Request. For Echo Request, whatever is received by the remote system is just sent back to the sender as an ICMP Echo Reply.
This allows the sender to check that the content was not corrupted (which was a bigger problem with earlier transports for data in the early Internet), that different sizes of packets work bidirectionally (router bugs caused problems with this in the past, but also some parts of the path may not allow long packets so your message might not get through), and to use the data in the body to report how long it was between packet send and receive (the ping time that is the most commonly used part of the ping response). Traceroute uses the ICMP Time Exceeded and Destination unreachable messages, not for sending but receiving. You might say that receiving isn’t a problem because you can’t break other people by just listening, and should be allowed. However, it’s a security issue if you can learn what other people using your machine are doing (consider Alice trying to browse the internet, and getting back some ICMP messages redirecting data, or saying that she cannot connect to machines; if Eve was on the same machine – another regular user – listening to the ICMP responses she could know what systems someone was trying to access, without any special privilege). Traceroute works by sending a UDP packet with a special ‘Time To Live’ set on it. The ‘Time To Live’ (TTL) is a number that starts out big when you send it, and then with every system that it passes through (‘hop’, and it’s not necessarily for each system, but that’s good enough for this) the number is decremented. If it reaches 0, the machine that decremented it says ‘That’s too far, I couldn’t deliver it’ by replying with an ICMP Time Exceeded response. Traceroute sends these UDP packets with increasing TTLs and listens for the response. If it gets back a ‘Time exceeded’, it knows what machine said that (because the response has that machine’s IP address in it). So it now knows one of the machines along the way to the destination. So it sends a UDP packet with a TTL one higher, and tries again.
When it finally gets back an ICMP Destination unreachable message, it knows that it can’t go any further (either because it’s been blocked or because it’s reached the target machine and the port it was trying to communicate with isn’t listening). All the ICMP messages are the domain of the internet stack, and generally users don’t interact with them – the permissions model prevents them from doing damage to the system or to other users. |
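For anyone who wants to see the Echo Request side of this concretely, here is a minimal Python sketch of building an ICMPv4 Echo Request and checking whether the current user is allowed to open a raw ICMP socket. It is only an illustration of the explanation above, not the code used by RISC OS `ping` or by Pyromaniac; the function names are made up for the example, and the checksum follows the usual one's-complement Internet checksum.

```python
import socket
import struct

ICMP_ECHO_REQUEST = 8  # ICMPv4 type for Echo Request (Echo Reply is type 0)


def internet_checksum(data: bytes) -> int:
    """One's-complement sum over 16-bit big-endian words, then inverted."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Header is type, code, checksum, identifier, sequence; payload follows."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, ident, seq)
    return header + payload


def can_open_icmp_socket() -> bool:
    """Hypothetical helper: True if this user may create a raw ICMP socket.

    On Unix-like hosts an unprivileged user typically gets PermissionError
    (EPERM) here, which is the failure described in the post above.
    """
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    except OSError:
        return False
    sock.close()
    return True
```

A receiver can validate a packet by running the same checksum over the whole message: a correctly built packet sums to zero.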
Charles Ferguson (8243) 427 posts |
A little more interesting (and equally non-functional) is the ‘pong’ tool, which I tried just now to see whether it would work. I forget what it actually does, but that’s not relevant as it doesn’t work here anyhow. Just running ‘pong’ on its own gives a debug report about an invalid access of zero page. Since the purpose of the OS is to debug, this is pretty helpful – and it’s really clear in how it fails (at least to me). It also seems to show that, bizarrely, the command line arguments are being treated as an array of signed chars, which is a bit weird too. Anyhow, if anyone were interested in how failures are reported, that gives an example of the watchpoints in user mode. There’s another example of a watchpoint triggering which happens on boot (which you very briefly see as you connect – blink and you’ll miss the noisy ROM startup), which you can manually trigger by |
Steve Pampling (1551) 8172 posts |
A response CAN be sent. Some network/firewall people play dirty :) and silently drop the packet…
The theory is that nothing is more than 255 hops away. Fine for ICMP, but the default for normal TCP is different on different systems e.g. 128 on modern Windows, as I recall BSD is/was low still
Which, usefully, also allows you (in most incarnations) to do the trace with a selected protocol and port number specified – important if the firewalls along the way only have specific ports open, e.g. TCP 443 open and everything else closed, so you select -P TCP -p 443. The FreeBSD implementation has: traceroute [-adDeFISnrvx] [-f first_ttl] [-g gateway] [-M first_ttl] [-m max_ttl] [-P proto] [-p port] [-q nprobes] [-s src_addr] |
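The TTL-stepping that traceroute does, including the "silently dropped" case, can be sketched without touching the network at all. The following is a toy simulation in Python, not a real traceroute: the path is just a list of made-up addresses, `send_probe` stands in for sending a UDP probe and waiting for an ICMP reply, and a hop written as `"*"` is assumed to be one that never sends the ICMP response.

```python
from typing import List, Optional, Tuple

TIME_EXCEEDED = "time-exceeded"
DEST_UNREACHABLE = "destination-unreachable"


def send_probe(routers: List[str], target: str, ttl: int) -> Optional[Tuple[str, str]]:
    """Simulate one probe: each router decrements the TTL, and whoever
    takes it to 0 replies with Time Exceeded (or stays silent if it is "*")."""
    for hop in routers:
        ttl -= 1
        if ttl == 0:
            return None if hop == "*" else (TIME_EXCEEDED, hop)
    # The probe reached the target, where nothing listens on the probe port.
    return (DEST_UNREACHABLE, target)


def traceroute(routers: List[str], target: str, max_ttl: int = 30) -> List[str]:
    """Send probes with TTL 1, 2, 3... and record who answered each one."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        reply = send_probe(routers, target, ttl)
        if reply is None:
            hops.append("*")  # no answer for this TTL; try the next one
            continue
        kind, address = reply
        hops.append(address)
        if kind == DEST_UNREACHABLE:
            break  # cannot get any further than this
    return hops
```

For example, `traceroute(["10.0.0.1", "*", "192.0.2.1"], "203.0.113.9")` walks out one hop at a time and stops when the simulated target answers with Destination unreachable.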
Theo Markettos (89) 919 posts |
Just watched Gerph’s talk – utterly brilliant! I did some CI/testing-related things a few years ago – the aim was to build RISC OS ROMs and things in Jenkins and test them on Raspberry Pis. The build side worked, but I never got very far with testing: at the time you couldn’t netboot Pis so you had to do messy things with SD card switching and all the plumbing was a bit annoying. In addition, as Gerph says, you would have only got a ‘boots’ / ’didn’t boot’ which is a fairly coarse grained thing, although I’d hoped to get some more output from running applications. There’s still merit in testing the low level things like device drivers, but that’s tricky when everything is so easy to wedge. Pyromaniac is a whole other level of impressiveness, well done Gerph! |
Charles Ferguson (8243) 427 posts |
“Just watched Gerph’s talk – utterly brilliant!”
Thank you! It’s a lot of fun, and I get the opportunity to choose not to implement things…
A finding from earlier this week… FontManager does some weird stuff with vectors which I knew about vaguely but never really investigated – it falls into the ‘more magic’[1] category of things. Trying to work out what OS_Plot 214 does, I re-discovered that FontManager has UKPLOTV and UKVDU23V[2] claimed and can subsequently claim VDUXV[2] based on those. VDU23,25 will change the font transfer thresholds. VDU23,26 will effectively do a Font_FindFont and make it current. I think. I didn’t probe much deeper because I don’t really care what eldritch horrors lie beneath. OS_Plot 208-215 and VDU23,25 will change the VDU destination to VDUXV, claim VDUXV and then process text output and plot it through the font system, terminating on a control code it doesn’t understand. So that lets you plot text with a font by just writing to the VDU stream. It’s actually quite cute in some respects. Try
I am not implementing these interfaces in Pyromaniac ’cos life truly is too short to care about that kind of craziness. Here’s how this is documented in the PRM:
That’s it. If you don’t tell people what they are, they can’t be tempted to use them? As I’m writing this I’ve realised why the test code I was using used
One of the interesting things about the testing though, is that you can do a lot of testing without actually having a real machine. With, as you suggest, the limitations on device drivers. But it makes a lot of sense to try booting a real device with a netboot, although it’d also be possible to make the default image a bootstrap image that boots from SD and then boots the test image from a known network location (even without PXE). That way you could kick off arbitrary tests. I’d pretty much expect Pis to have a watchdog that can reboot them if they hang, and if not, a lack of output/response from the network could always be met with a swift kick to the power supply (network controlled power supply resets were something we’d done at Bromium for testing our highly invasive system that could hang the MacMinis and cause them to require a full power-cycle). Doable, but tedious. Although I’ve not thought about it enough, it should also be possible to have a hypervisor that can essentially police the system-under-test – RISC OS – to report useful information from the system on request from (say) Jenkins, or if it goes Very Bad. That sort of thing might be fun.
Recently, though, I wanted to try out ‘how easy is it to design and implement a module from scratch in Pyromaniac’, i.e. using Pyromaniac for the intended purpose of prototyping and testing. This is a module that lets multiple hardware drivers register with it and allows callers to enumerate the devices and query them in a structured way. The drivers implement just the hardware interfaces, and the controller module implements the interface in a structured way and exposes it nicely to the user. I created a basic spec in text form, and played around until the interface looked right and then implemented the module in Python.
Tested it out, changed some SWIs around, and reworked the registration interface. The Python’s littered with
I converted the documentation to PRMinXML so that it’s more nicely readable, and in the process found some omissions and a few tweaks to the controller module. Then I translated the whole thing from Python to C, to create a real RISC OS module for the controller, and another for the dummy driver. Either could be loaded in place of the Python version and the behaviour checked to see that it’s still doing things in a sensible way. The C module was dead easy because even though I had to replace a few of the structures – no dictionaries in C unless I write them myself, string manipulation is fiddly, etc – it was just the same code, but in C. According to the commit messages, the C implementation took only a day for both modules. Then I gave the C code to a friend and they pointed out implementation bugs and a couple of better design choices, which were easy to fix (although I had to fix them in the C version, the Python version and the documentation by this point – should have got them to look at it earlier!).
There’s a comment in The Mythical Man-Month about ‘write prototype code to throw it away’ (something like that) which I believe is generally accepted as great advice, and (in my experience of real software development) never the way that things go. Prototype code generally makes it to production because there just isn’t time to redo everything. That problem aside, this exercise was in that spirit. The prototype wasn’t thrown away, but it was intentionally not the final goal of the product (accepting for a moment that the /actual/ goal was to find out if the process worked).
What did I learn from this? Well… designing in a text spec first is something that I’d have done before this anyhow, and that helps to get some things right. Implementing in Python as a high level language is fast.
Faster than it would have been if I’d implemented it in C as an application that simulated the SWI interfaces so that I could see how it worked, etc (which is what I would have done before). However, the Python implementation was used directly as SWIs, and I could call it from BASIC so it was much more like what I wanted to use. If I’d done the C-application-like-the-module route, I wouldn’t have been able to try it from BASIC in that nice way. BUT, as the C-application-like-the-module, the translation to an actual module would have been merely sticking the module header on top of the C code that I’d written. Instead I had to re-code the whole thing, converting from the Python implementation to a C implementation. So that slowed things at that point – but again, that only took about a day anyhow, so it’s not an amazing benefit. The Python code was significantly easier to restructure when I found that I had handled part of the structure wrongly. There’s no faffing with
Similarly, I could mix-and-match controller and driver modules between the C and Python versions once they were complete. That meant that I had a known reference point to see how they should react and contrast the behaviour. In some cases the Python reacted better, and the C code could be improved to match (and vice-versa).
Is it something that is worth doing in the future? Maybe. The thing is that it was beneficial, but then I’m a pretty good module author, and I know Pyromaniac well, so writing things in both is pretty easy for me. Did it make the design and implementation process more efficient? I think so for the prototype. I didn’t have to worry about getting the design as right early on, because changing the Python module is way easier than changing the C module. Writing it in Python first also had an effect of solidifying some of the structures that would have been used in the C version but in a less nice way.
The list of devices is actually an object in Python, with iterators and item accessors, and functions for registration and deregistration. In C, that just became a set of functions that operated on structures, in a separate file – pretty much how I would have written it before, but the object in Python makes the operations a lot easier to work with and it’s more obvious what you need up front. Some things did not transfer across from Python to C, and I could not translate the code directly. Each device is an object in the Python and we have calls to (essentially) read information, write information, and configure the device. In the Python, the interface from the caller to the driver was through a method
But other than that, it wasn’t especially difficult to convert between the two. Dictionaries became simple linked lists (because I’m not expecting large numbers of registrants, and it can be fixed later if that becomes an issue), and exceptions became return codes.
Anyhow, it was interesting to me to step through the process and actually use the system for what I had meant it for, rather than just implementing an OS because It’s Not Finished Yet! I think it was kinda successful in that. There were a few times when I wanted more debug facilities in Pyromaniac, so I’ve got some more notes on things to add. Actually one of the things I noted was that when running the application it would be handy to be able to turn on certain debug whilst it was running – you can use the
Um… oops, that turned into a bit of a digression.
[1] http://catb.org/jargon/html/magic-story.html |
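To make the "device list as an object" idea a little more concrete, here is a rough Python sketch of what such a registry might look like. It is not the actual Pyromaniac module: the class names, the error type, the numeric error codes and the `swi_read` wrapper are all invented for illustration. It shows registration and deregistration, iteration and item access, and how an exception on the Python side might map onto the kind of return code a C module would use.

```python
class DeviceError(Exception):
    """Hypothetical error raised for bad registrations or lookups."""


class Device:
    """A registered driver, holding some readable information."""

    def __init__(self, name, info=None):
        self.name = name
        self.info = dict(info or {})

    def read(self, key):
        return self.info[key]


class DeviceList:
    """Registry with registration, deregistration, iteration and lookup."""

    def __init__(self):
        self._devices = {}  # a dictionary here; a linked list in the C version
        self._next_id = 1

    def register(self, device):
        if any(d.name == device.name for d in self._devices.values()):
            raise DeviceError("device %r already registered" % device.name)
        device_id = self._next_id
        self._next_id += 1
        self._devices[device_id] = device
        return device_id

    def deregister(self, device_id):
        if device_id not in self._devices:
            raise DeviceError("no such device %d" % device_id)
        del self._devices[device_id]

    def __getitem__(self, device_id):
        try:
            return self._devices[device_id]
        except KeyError:
            raise DeviceError("no such device %d" % device_id) from None

    def __iter__(self):
        return iter(sorted(self._devices.items()))

    def __len__(self):
        return len(self._devices)


def swi_read(devices, device_id, key):
    """Return (error_code, value) as a C module might; 0 means success."""
    try:
        return 0, devices[device_id].read(key)
    except DeviceError:
        return 1, None  # made-up code: unknown device
    except KeyError:
        return 2, None  # made-up code: unknown information key
```

In a C translation, the class would become a set of functions over a structure, and the `try`/`except` in `swi_read` would become plain checks on return values.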
Richard H (8675) 100 posts |
Your “digression” is one of the most fascinating and informative things I’ve read recently, on this or any other forum. Please, continue to digress. |
Theo Markettos (89) 919 posts |
This is interesting because when writing stuff for RISC OS often you’re doing two things: writing against an API (Wimp, filing system, networking, printing, whatever) and writing code to fit within the limits of the environment (application, module; BASIC, assembler, C, C++). It’s nice to be able to separate working out the API from what is in reality a relatively low level programming environment (given RISC OS’ limited memory/process management, robustness, debugging, etc). I wonder if there’s any language that can interact with the Pyromaniac Python code, and yet not have to be rewritten to run natively. There is an existing common denominator – ARM machine code – but it’s not a friendly one. Is there anything higher level than that? Anything that might be compiled down to (for example) C or C++ that can then be built for RISC OS? |