Possible bug in AcornHTTP
Matthew Phillips (473) 721 posts |
I am struggling with some weird behaviour in an application I am developing that uses AcornHTTP to fetch material from an API over HTTPS. The application fetches information from the API every 30 seconds and plots it on a map. After a period of time, it fails. I am testing on an Iyonix, RISC OS 5.16, with the following modules:

This morning it failed on the 52nd fetch. This evening it failed on the 51st. That seems too close to be a coincidence. I still have 166K free in the module area and plenty of free RAM.

The process calls URL_Register successfully. We then call URL_GetURL, and then call URL_ReadData repeatedly, waiting for material to be returned. We get the following sequence of status word values back from URL_ReadData (see URL_Status).
In each case R4 is zero, so no bytes received. But on the next call we get a SWI error “Exit called”. My code then disposes of the fetch using URL_Deregister.

The interesting thing is that, after this has occurred, other applications which normally work wholly reliably, like Recce, have trouble fetching too, but not entirely consistently. Recce uses exactly the same code to do its fetching. After my new application has encountered the mysterious “Exit called” error, I get exactly the same “Exit called” error from Recce when searching Flickr, CycleStreets Photomap and STATS19 collisions. OpenWeatherMap and the Met Office report no results but don’t trigger “Exit called”. Google StreetView and the planning application search work fine. This was repeatable: exactly the same features failed this morning as this evening. And in both cases, OpenWeatherMap worked on a second attempt but all the other services continued to fail (or work). Of those, only OpenWeatherMap uses HTTP. All the others are HTTPS, but why Google StreetView and the planning applications are happy is a mystery. Rerunning my new application fails straight away.

It looks to me as though something in AcornHTTP has got jammed up, but it’s certainly failing in a very peculiar way. If I do RMReInit AcornHTTP and then re-run my application, it can fetch fine again. Recce, which was still sitting on the iconbar, was then able to fetch from Flickr, CycleStreets Photomap and STATS19. The Met Office weather returned no results, which was odd.

I have verified that the API is still behaving normally by fetching the same URL, after my application has failed, using NetSurf, and that was fine. It’s not a perfect test as a different cookie will be set. Does anyone have any ideas as to what I might try next to debug this one? |
Julie Stamp (8365) 474 posts |
“Exit called” is printed by the SharedCLibrary when exit() is called in supervisor mode. |
Rick Murray (539) 13850 posts |
More specifically, it looks like if we’re in a privileged mode then:
Essentially, this is saying that something is calling exit() in a privileged mode; whatever is doing that had better be something that lives in the RMA. |
Dave Higton (1515) 3534 posts |
What is the date of the AcornHTTP module file? Could it be from the period when faulty modules were being distributed? |
Matthew Phillips (473) 721 posts |
The AcornHTTP module is from 4 December 2021. It includes the fix for allowing leading zero in the chunk lengths for chunked transfer encoding which I contributed recently. I first started writing this application in April or thereabouts, and I’ve had this issue all along — it’s only now that I’ve had time to investigate thoroughly. Perhaps I’ll have to try to compile a debug build of the module to investigate further. From Julie and Rick’s explanation of the meaning of the error, it doesn’t sound like it can be my application’s fault. I’m very interested in the fact that some connections still work and others are broken until the RMReInit. That may give us some clues if I can work out commonalities in transfer encoding or suchlike. |
Matthew Phillips (473) 721 posts |
Of course, I have a debug build from November when I was fixing that fault with the chunked transfer encoding. I cannot remember where the debugging gets put though! Or if you have to turn it on somehow other than by running AcornHTTPD rather than AcornHTTP. |
Matthew Phillips (473) 721 posts |
Ah – got it. It depends on build options. You can use SysLog or just output to a file. In the latter case the filename is specified in Fetchers.HTTP.c.protocol as RAM:$.HTTP_Trace. I’d better make sure the RAM disc has lots of space. |
Matthew Phillips (473) 721 posts |
Behaviour on my ARMX6 is slightly different: instead of a SWI error “Exit called” I just get the application crashing with a Type 5 error. I’ve looked at the module’s debug output and I think I know what triggers the fault. After 51 requests to the API, each with response headers including “Connection: close”, the 52nd request gets the response “Connection: Upgrade, close”. The debug says:

** End of headers detected
Processing the headers
Connection header found - scanning it for tokens
Filtering header named in connection: Upgrade
Deleting `Upgrade' header
Filtering header named in connection: close

and then the output ceases and the module must have called exit() somehow. When I then try a fetch of data through Recce, the debugging resumes as normal. The Connection header is “close”, and we get the following debugging:

** End of headers detected
Processing the headers
Connection header found - scanning it for tokens
Filtering header named in connection: close

and then the output ceases and the crash occurs. Whatever it is doing in response to the Upgrade token, it causes subsequent handling of “close” to fail. I think Upgrade is a hint from the server that it would be nice to switch to a keep-alive connection. I don’t know whether AcornHTTP is supposed to support this. I think that’s enough progress for this evening. I don’t really want to look at the module code just yet! |
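The filtering step visible in the trace can be sketched in plain C. This is a hypothetical reconstruction, not the module’s actual code: the `Header` type, `filter_connection_tokens` and the list layout are all invented for illustration. Scanning “Upgrade, close” yields two passes, one per token, and each pass unlinks any header whose name matches.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical header-list node; names and layout are made up for
 * illustration, not taken from the AcornHTTP sources. */
typedef struct Header {
    char name[32];
    struct Header *next;
} Header;

/* Scan a Connection header value such as "Upgrade, close" and delete
 * every header whose name matches one of its tokens.  Tokens like
 * "close" name connection options rather than headers, but deleting a
 * header that does not exist is harmless. */
static void filter_connection_tokens(Header **list, const char *connection)
{
    char copy[128];
    char *tok;

    strncpy(copy, connection, sizeof copy - 1);
    copy[sizeof copy - 1] = '\0';

    for (tok = strtok(copy, ", \t"); tok != NULL; tok = strtok(NULL, ", \t")) {
        Header **h = list;
        while (*h != NULL) {
            if (strcmp((*h)->name, tok) == 0) {  /* case sensitivity simplified */
                Header *dead = *h;
                *h = dead->next;                 /* unlink through the handle */
                free(dead);
            } else {
                h = &(*h)->next;
            }
        }
    }
}
```

Unlinking through a `Header **` handle rather than a plain node pointer is what keeps the list consistent when the matched header is at the head.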
Dave Higton (1515) 3534 posts |
Keep-alive connections came in with HTTP 1.1. AcornHTTP is stuck at HTTP 1.0. |
Dave Higton (1515) 3534 posts |
You may just find that the far end is keeping all the sockets open. Try ResRec or anything else to show you. You might get away with sending an extra header line “Connection: close” (IIRC) to change the far end’s behaviour. |
Matthew Phillips (473) 721 posts |
No, AcornHTTP does have some handling of HTTP/1.1, but it translates all the material to 1.0 before passing it back to the client.

Looks like I was wrong in my analysis. The connection which has the response “Connection: Upgrade, close” was actually from the first request made by Recce. The last request made by my other application ends in:

** End of headers detected
Processing the headers
Connection header found - scanning it for tokens
Filtering header named in connection: close

The previous, successful attempt continues:

Need 688 bytes to hold the entire header
Writing `HTTP/1.0' header to buffer (value `200 OK')

... and continues with the rest of the header lines, except Connection which has been filtered out. I’m wondering therefore whether it crashes on a memory allocation attempt.

I’m looking at the function http_filter_connection_tokens which I think is supposed to delete any headers named after tokens it finds in the Connection header, so it might delete the “Upgrade” header from the response. It should then return to http_process_all_response_headers, delete the Connection header, and move on to doing something with cookies, if COOKIE is defined (and I don’t know if it is). The function concludes with http_construct_header_buffer. That opens by running through the header structure to count up the length needed, and before the malloc outputs debugging to say how much it needs. The crash happens before the debugging though. Maybe it is not working through the headers properly to do the count. I’d better find out if COOKIE is defined, because it could be going wrong there. I can always add some more debugging. |
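The two-pass pattern described for http_construct_header_buffer can be sketched like this. It is a hypothetical reconstruction, with the `Hdr` type and function body invented for illustration: walk the list once to total the space needed, then allocate and write. A list left corrupted by faulty unlinking would crash in the first walk, before any “Need N bytes” debug output could appear.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative header record; not the module's real structure. */
typedef struct Hdr {
    const char *name;
    const char *value;
    struct Hdr *next;
} Hdr;

/* Two-pass construction: first count, then allocate once and write
 * the "Name: value" lines with CRLF terminators. */
static char *construct_header_buffer(const Hdr *list)
{
    const Hdr *h;
    size_t need = 1;                   /* terminating NUL */
    char *buf, *p;

    for (h = list; h != NULL; h = h->next)
        need += strlen(h->name) + 2 + strlen(h->value) + 2;  /* ": " + CRLF */

    buf = malloc(need);
    if (buf == NULL)
        return NULL;

    p = buf;
    for (h = list; h != NULL; h = h->next)
        p += sprintf(p, "%s: %s\r\n", h->name, h->value);
    return buf;
}
```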
Matthew Phillips (473) 721 posts |
I’ve found a very stupid bug in add_cookie_to_queue but I’m not sure if that’s the only problem. More testing to follow. Here’s an excerpt from add_cookie_to_queue in Networking.Fetchers.HTTP.c.cookie, with my changes noted in the comments.

static void add_cookie_to_queue(Cookie *cookie)
{
  /* Enforce an upper limit on the number of cookies that may be held in
   * the pending queue
   */
  if (cookie_queue_root != NULL) {
    Cookie *list;
    int count = 0;
    for (list=cookie_queue_root; list->next; list=list->next) ++count;
    if (count > MAX_COOKIES_IN_QUEUE) {
      if (loading_cookie_file) {
        /* Periodic flush of the queue when loading our cookie file! */
        move_cookies_from_queue_to_list(0);
      }
      else {
        // Amend faulty line: list->next will be NULL
        // and cookie_unlink_cookie expects pointer
        // to the head of the list as first argument
        // cookie_unlink_cookie(&list, list->next);
        cookie_unlink_cookie(&cookie_queue_root, list);
        // Similar rubbish here!
        // destroy_cookie(list->next);
        destroy_cookie(list);
      }
    }
  }
  ... |
Matthew Phillips (473) 721 posts |
After running the application for a couple of hours, it seems to be fine, so I think that was the only fault. I’ll try and get a pull request on GitLab later on. Exploring the code made me realise there are several features relating to cookies including some AcornHTTP SWIs that appear to be undocumented outside the source. The URL Fetcher API specification mentions that AcornHTTP provides cookie management code, but I’ve never seen any documentation for this. If I can get to grips with the wiki, I will try and document it from the source. |
Rick Murray (539) 13850 posts |
That’s because the documentation is a mess split across several files. |
Matthew Phillips (473) 721 posts |
Merge request now created. Thanks, Rick, for the pointer to the text documentation. It’s quite sparse. What is really missing is an explanation of how you are supposed to use these calls.

I’ve not studied the cookie code fully, but it seems to me that if you are to read the cookies sent in response to the request you’ve just made, you have to do this when the status word has bit 16 set, and you need to do it before any Wimp_Polls, because I cannot see how the module can otherwise distinguish between cookies for you and cookies for any other client. I think the cookies are global across all clients of the module. Is that your understanding?

I had not even realised that AcornHTTP did any cookie handling at all, and so I had written my own code to record and transmit back any cookies relating to the URL_Get requests I was making. We definitely need a better wiki page explaining it. Where is the best place to create one, I wonder? |
Rick Murray (539) 13850 posts |
I’m not at the computer right now (in bed with heated blanket, having just watched The Expanse finale), but as far as I can remember there is no “this document uses these cookies” call. I simply enumerate them all and nuke the ones with the relevant URL. I suppose one could enumerate before and after a fetch to see what is different, but remember the spec was drawn up sometime in the late 90s. I doubt anybody would have foreseen quite what our friendly little HTML 3.2 (and less friendly tableset) world would turn into a quarter century down the line… |
Dave Higton (1515) 3534 posts |
How about here? |
Matthew Phillips (473) 721 posts |
I meant to explain how the bug is triggered. There is a queue into which cookies are written as they come in from the server. I think it is intended that clients can read them from there, and that they are also put in a store organised by domain, etc. The queue holds a maximum of 100 cookies, and when it’s full, the oldest is supposed to get forgotten. That was the code that was rather broken.

In my case, the application was fetching a URL which had two cookies set in the response, so it conked out after 50 requests. I’m surprised it ever worked well enough to pass any tests, but perhaps it was mainly tested with Browse, which probably empties the queue (if it works how I think it does). While my application is now happy, I’d like to study the code a bit more to try to understand how it was intended to be used. |
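The bounded-queue eviction that the broken code was attempting can be sketched as follows. This is a simplified stand-in, assuming new entries are pushed at the head so that the tail is the oldest; `Node` and `push_bounded` are invented names, not the module’s.

```c
#include <stdlib.h>

/* Minimal stand-in for the module's cookie queue node. */
typedef struct Node {
    int id;
    struct Node *next;
} Node;

/* Bounded queue in the spirit of the corrected add_cookie_to_queue:
 * new entries are pushed at the head, so the tail is the oldest.  When
 * the queue is full we walk to the tail and unlink it starting from
 * the head pointer - unlinking via a pointer to the tail itself was
 * the original bug. */
static void push_bounded(Node **root, int id, int max)
{
    Node *n;

    if (*root != NULL) {
        Node *last = *root;
        int count = 0;
        while (last->next != NULL) {
            last = last->next;
            ++count;
        }
        if (count >= max) {
            Node **p = root;
            while (*p != last)      /* find the link that points at the tail */
                p = &(*p)->next;
            *p = NULL;              /* drop the oldest entry */
            free(last);
        }
    }

    n = malloc(sizeof *n);
    if (n == NULL)
        return;
    n->id = id;
    n->next = *root;
    *root = n;
}
```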
Chris Mahoney (1684) 2165 posts |
If you can edit the commit message (I’m not sure offhand whether this is possible) then you may wish to include the URL of this thread, just for documentation purposes. Might help whoever at ROOL ends up reviewing the change too :) Apart from that, good work! I’ve previously dived into AcornHTTP and know it can be a bit of a black box at times, so well done tracking down the issue. |
Matthew Phillips (473) 721 posts |
Thanks for the suggestion. I have edited the commit message. I have also now documented the client-facing SWIs for AcornHTTP. I still need to put in some cross-references from the URL_Fetcher documentation so that people are alerted to the cookie handling features. I suspect there is a minor bug in the calculation of the total number of cookies in HTTP_EnumerateCookies. |
Matthew Phillips (473) 721 posts |
It’s not totally clear what the intention is from the source code, but I think that the total cookies returned in R2 by HTTP_EnumerateCookies is supposed to be the number of cookies in the global store, which is used to automatically add cookies to requests made through URL_GetURL, and that the value in R3 is supposed to be the number of queued incoming cookies which have yet to be accepted or rejected via HTTP_ConsumeCookie. Unfortunately my tests show that discarding a cookie from the incoming queue will decrement the total cookies tally. There may be other circumstances in which the total count goes awry: checking how the code handles duplicate cookies is one thing I would like to do.

I was concerned that the HTTP_AddCookie call might not handle NULL values for the pointer to the expiry string. It doesn’t seem to crash, but I am not quite sure what it does do.

The other confusion is where the cookies are persisted to disc. It seems they are written to WWW.AcornHTTP.Cookies in Choices, but there is an environment variable, Browse$CookieFile, which allows them to be read from somewhere else. It’s a bit of a muddle. The wiki is now up to date with my understanding of the module. Please feel free to chip in if you have further knowledge! |
Matthew Phillips (473) 721 posts |
There is another thing I would like to correct. In HTTP_AddCookie there are five parameters which are pointers to NULL-terminated strings. Most of them are handled carefully, and if they are NULL pointers or pointers to empty strings then the cookie is not added. The exception is the expiry date/time. If NULL is passed, then the module does not take any special action, even though creating a session cookie would seem a sensible thing to do. Instead it merrily skips spaces, looks for matching day and month strings and digits, and if all this fails returns the current time. This must be a potential cause of ZeroPain errors. What’s the expected standard in terms of SWIs guarding against the caller being stupid? |
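The defensive check argued for above might look something like this sketch. Everything here is hypothetical: `SESSION_COOKIE` is an invented sentinel, `cookie_expiry_time` is not a real module function, and `parse_rfc1123` is a simplified stand-in for the module’s hand-rolled date scanner, keeping its fall-back-to-now behaviour on parse failure.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Invented sentinel meaning "no expiry time": a session cookie. */
#define SESSION_COOKIE ((time_t)-1)

/* Simplified stand-in for the module's date scanner, accepting the
 * RFC 1123 form "Sun, 06 Nov 1994 08:49:37 GMT" and falling back to
 * "now" when parsing fails, as the post describes the module doing. */
static time_t parse_rfc1123(const char *s)
{
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    struct tm tm;
    char mon[4];
    const char *p;

    memset(&tm, 0, sizeof tm);
    if (sscanf(s, "%*3s, %d %3s %d %d:%d:%d", &tm.tm_mday, mon, &tm.tm_year,
               &tm.tm_hour, &tm.tm_min, &tm.tm_sec) != 6)
        return time(NULL);
    p = strstr(months, mon);
    if (p == NULL)
        return time(NULL);
    tm.tm_mon = (int)(p - months) / 3;
    tm.tm_year -= 1900;
    return mktime(&tm);
}

/* The guard argued for in the post: a NULL or empty expiry string
 * produces a session cookie instead of being dereferenced blindly. */
static time_t cookie_expiry_time(const char *expiry)
{
    if (expiry == NULL || *expiry == '\0')
        return SESSION_COOKIE;
    return parse_rfc1123(expiry);
}
```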
Matthew Phillips (473) 721 posts |
I’ve added another merge request because the module was failing to keep count accurately of the number of cookies, and these values are returned to the user in HTTP_EnumerateCookies.

I’m now thinking about HTTP_AddCookie, which was intended to allow a browser to implement Javascript that sets cookie values. At present cookies which are set via this route do not get saved in the cookie file in Choices:WWW.AcornHTTP.Cookies, which is not helpful: as far as I know Javascript can create persistent cookies.

At present, if a NULL pointer is passed for the domain, path, name or value, then the call silently does not create a new cookie. But if these values point to an empty string, there’s no problem. I think a cookie which has no domain or no name should surely be faulted, though I’m not sure whether it should be silently discarded or whether an error should be generated. The SWI currently generates no errors. I’m pretty sure weird things happen if you pass a NULL pointer or a pointer to an empty string as the expiry date. I think a pointer to an empty string should probably have the effect of creating a session cookie, but it might be quite useful if a NULL pointer set the expiry to the current time, as this would give an easier means of deleting a cookie from the store.

While I’m looking at this call, I wondered about adding support for setting Max-Age as an alternative to the expiry date of the cookie. The Max-Age attribute is supported in cookies received from a server, but not in the HTTP_AddCookie call. Does anyone who has used this call have any thoughts about what would be desirable? |
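One possible shape for Max-Age support is sketched below. The function and parameter names are invented, not a proposal for the real SWI interface; the semantics follow RFC 6265, where Max-Age is a delta in seconds from when the cookie is received and takes precedence over Expires when both are present.

```c
#include <stdlib.h>
#include <time.h>

/* Resolve a cookie's expiry from an optional Max-Age string and an
 * already-parsed Expires time.  Per RFC 6265, Max-Age wins when
 * present, and a zero or negative value means "expire immediately",
 * i.e. delete the cookie from the store. */
static time_t cookie_expiry(time_t now, const char *max_age, time_t expires)
{
    if (max_age != NULL && *max_age != '\0') {
        long delta = strtol(max_age, NULL, 10);
        if (delta <= 0)
            return now;            /* already expired: remove the cookie */
        return now + (time_t)delta;
    }
    return expires;                /* no Max-Age: fall back to Expires */
}
```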
Matthew Phillips (473) 721 posts |
Thoughts on the above still welcome.

While I am familiar with the code, I am wondering about adding support for more cookie attributes to HTTP_EnumerateCookies, because the call does not support HttpOnly or SameSite at present. The SameSite attribute is a bit of a problem for the URL Fetcher and AcornHTTP model, as there is no way at present of passing information on the browser’s current URL into a URL_GetURL call to allow the AcornHTTP module to determine whether a cookie should be transmitted or not. But being able at least to read the attribute values for queued or stored cookies would be of potential use.

I think ultimately the system needs to be redesigned so that each application can maintain its own cookie environment, or even multiple environments (e.g. when implementing a “private browsing” mode). There are other big changes needed too, like support for persistent connections. Has anyone done any thinking about possible designs? |
Andrew Hodgkinson (6) 465 posts |
I didn’t write the module cookie code, but must’ve integrated it into !Browse – that’s more than two decades ago, though, and I certainly don’t remember details, apart from there definitely being a few “fun” bugs in the module in the early days, so I’m not too surprised to hear of possible missing null pointer checks or similar. The best thing I can recommend is looking for the relevant SWI calls in the !Browse application source to determine what it does, at least as a guideline for one possible approach. |