RISC OS Open: Forum: Weak TCP/IP stack

Apr 19, 2015 8:05pm

David Feugey (2125) 2709 posts

I increasingly think the two problems are the same. I did have a problem with disk accesses, but I also have a problem with non-responding sockets. Network blinks (= request received), disk does not (= nothing in the WebJames log). The problem seems to have been here for months or years, as the RC12a ROM has this bug too.

To be honest, I don’t know what to do. I think I’ll simply shut down the whole riscos.fr website while I find an alternative, as I am sinking in Google rank. The curious point is that nobody noticed the problem until today. I probably have no visitors :)

Note: perhaps this problem is the same as the “no sockets left” one that I get after around one month of WebJames use.

(Would be cool to see this solved for ROS 5.22 :) )

Apr 19, 2015 8:24pm

David Feugey (2125) 2709 posts

I went back to my fastest setup: PandaBoard ES. It’s the fastest to wake up, but also the fastest to see its sockets die. The strange point is that once active, sockets don’t die. I should make an app to ping my server. Does anyone have simple BASIC code to make an HTTP call? :)

Apr 19, 2015 8:55pm

David Feugey (2125) 2709 posts

My batch file on a PC

:START
wget http://www.riscos.fr
del index.html
sleep 5
GOTO START

Guess what: no more lag on the server. All sockets stay active. If some sockets are dead, you must wait several minutes for them to revive, but everything gets back to normal later… or almost (sometimes a request fails). Problem: my SD card will be dead in two days with this ‘fix’.

Apr 19, 2015 8:56pm

David Feugey (2125) 2709 posts

Now I need to code this in BASIC, otherwise I will need two servers: one for RISC OS and one to revive it.
That’s completely crazy :)
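
A minimal sketch of the same keep-alive idea without the disk writes, in C rather than BASIC. It assumes POSIX-style socket calls (`getaddrinfo`, `socket`, `connect`, `send`, `recv`); header names and available calls on RISC OS may differ, and the function names `build_get_request` and `poke_server` are made up for illustration. The reply is read into a buffer and thrown away, so nothing touches the SD card.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Build a minimal HTTP/1.0 GET request.
   Returns its length, or -1 if the buffer is too small. */
int build_get_request(char *buf, size_t size, const char *host, const char *path)
{
    int n;
    assert(buf != NULL && host != NULL && path != NULL);
    n = snprintf(buf, size,
                 "GET %s HTTP/1.0\r\nHost: %s\r\nConnection: close\r\n\r\n",
                 path, host);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Connect, send the request, then read and discard the reply.
   Returns 0 if the request was sent, -1 on any failure. */
int poke_server(const char *host, const char *port, const char *path)
{
    struct addrinfo hints, *res, *ai;
    char buf[512];
    int fd = -1, len, ok = -1;

    len = build_get_request(buf, sizeof buf, host, path);
    if (len < 0)
        return -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Try each resolved address until one accepts the connection. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd < 0)
        return -1;

    if (send(fd, buf, (size_t)len, 0) == len) {
        /* Drain the reply in memory; nothing is written to disk. */
        while (recv(fd, buf, sizeof buf, 0) > 0) {
        }
        ok = 0;
    }
    close(fd);   /* always release the socket, success or not */
    return ok;
}
```

Calling `poke_server("www.riscos.fr", "80", "/")` in a loop with `sleep(5)` between calls would mirror the batch file above, minus `index.html`.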

Apr 19, 2015 8:57pm

Rick Murray (539) 13840 posts

I think I’ll simply shut down the whole riscos.fr website while I find an alternative, as I am sinking in Google rank.

Two things.

One – Google’s rank is unimportant. You will sink anyway because you are not an https site and Google prefers that these days. I’m the same, but I’ve never cared about my Google status. I got disillusioned with the whole thing when I took a peek in the “SEO” world and how obsessed people were with “monetizing” their website, talking about pulling content that was not turning a profit. These people aren’t interested in sharing information, they just want to cash in. And thanks to that, a lot of Google’s search results are crap.

Two – perhaps you could write a short program to (periodically, like maybe on the hour?) shut down the webserver task, close open files, then call OS_Reset to kick the entire machine. I think a shutdown-reset cycle should take around 30s (a total guess!). That might not cure your problems, but may at least stop them from being show stoppers. I am looking at your site right now and sometimes it doesn’t respond, other times everything comes up quickly.

If you want to make a basic HTTP call, you could perhaps call wget via OSCLI? Tell it to fetch something. ;-)

Apr 19, 2015 9:16pm

David Feugey (2125) 2709 posts

One – Google’s rank is unimportant. You will sink anyway because you are not an https site and Google prefers that these days.

Not only that. The fact that the website crashes each evening when Google’s robots come by is not good. It also means that there is no cache and no indexing of the content.

that might not cure your problems

No. The problem appears only a few seconds/minutes after the launch of WebJames.

If you want to make a basic HTTP call, you could perhaps call wget via OSCLI? Tell it to fetch something. ;-)

As expected, it’s not really useful. Only one socket is kept alive, so another connection could make everything fail. The script is only useful to revive sockets faster (or to take the timeouts instead of you, if you prefer).

The strange point is that ShareFS, which uses sockets too, works perfectly. Ping too. So it’s really linked to sockets called via the C interface.

Apr 19, 2015 10:18pm

Colin (478) 2433 posts

Have you tried any of the ROMs in my omap4 network test thread? They have changes to EtherUSB which may be relevant.

Apr 20, 2015 5:46am

David Feugey (2125) 2709 posts

After 30 minutes, I failed to connect to the web server. The connection does not want to come back. Ping still works, as does ShareFS.

Apr 20, 2015 12:40pm

David Feugey (2125) 2709 posts

I switched to the company server, as it’s no longer possible to use it like this :)
You should see “new server” in the title bar.
More to follow.

Apr 22, 2015 7:29pm

Rick Murray (539) 13840 posts

To follow this up…

I have been running a small site on my Pi for two days now without problems, using the current WebJames, the latest firmware, and the most recent (at the time) version of RISC OS (5th April). Yuck, all my times are an hour out according to CLib. :-/

I have just made a clone of riscos.fr and testing over the LAN using my phone shows no unexpected problems, even though the pages are somewhat more complex.

I did see David’s oflafofla (or, these days, dofla, that’s surely “ofla” in a post Homer Simpson world):

192.168.1.11 - - [22/Apr/2015:19:56:37 +0000] "dôﬂådôﬂådôﬂådôﬂådôﬂådôﬂådôﬂå[…]" 200 0 "" ""
192.168.1.11 - - [22/Apr/2015:19:56:43 +0000] "dôﬂådôﬂådôﬂådôﬂådôﬂådôﬂådôﬂå[…]" 200 0 "" ""

Looking in the other log, I see this:

22/04/2015 19:56:37 : CLOSE 
22/04/2015 19:56:43 : CLOSE

Does WebJames support KeepAlive? I didn’t think it did, but maybe…?

Just been at it with the iPad and doubled the size of the log file. ;-)

I can’t say the server has been taking it easy due to few people knowing it is available, as my logfile shows:

1.214.119.227 - - [21/Apr/2015:15:55:01 +0000] "GET /muieblackcat HTTP/1.1" 404 274 "" ""
1.214.119.227 - - [21/Apr/2015:15:55:01 +0000] "GET //phpMyAdmin/scripts/setup.php HTTP/1.1" 400 213 "" ""
1.214.119.227 - - [21/Apr/2015:15:55:02 +0000] "GET //phpmyadmin/scripts/setup.php HTTP/1.1" 400 213 "" ""
1.214.119.227 - - [21/Apr/2015:15:55:03 +0000] "GET //pma/scripts/setup.php HTTP/1.1" 400 213 "" ""
1.214.119.227 - - [21/Apr/2015:15:55:04 +0000] "GET //myadmin/scripts/setup.php HTTP/1.1" 400 213 "" ""
1.214.119.227 - - [21/Apr/2015:15:55:05 +0000] "GET //MyAdmin/scripts/setup.php HTTP/1.1" 400 213 "" ""

and so on. Lots of rubbish like this. Is the phpMyAdmin link broken? Surely the “//” can’t be right?
Also… muieblackcat? That ought to be nyankuroneko.

Right – I’m off to watch a cute zombie called Liv Moore… ho ho.

Apr 22, 2015 10:44pm

Rick Murray (539) 13840 posts

Tomorrow I’ll revert back to my modified older RISC OS. An unexpected consequence of using a timezone other than the UK one it expects seems to be that NetTime is getting things horribly wrong.¹ My computer thinks it is 0h53 (it is 0h28) and the status says NetTime last synced a day ago. What? ² :-/

Update: well . . . I entered NetTime_Kick in a task window and the machine has frozen. Since I have to reset, I’ve reverted back to my build of RISC OS.

Well, I guess that’s one way to clear all the crap in the log files, huh?

¹ Surely I can’t be the only person with Timezone +1 & DST ?

² Half an hour in a day is kind of poor. Could NetTime be a little more intelligent here and disable slewing if it can’t check the time for whatever reason? Surely it would be best to try to keep the time as it is rather than continue to slew and have it get more and more incorrect?

Apr 23, 2015 8:11pm

Malcolm Hussain-Gambles (1596) 811 posts

Talking of NetTime, this guy is to blame – Genius at work. No that’s not sarcasm.

Apr 25, 2015 11:05am

Rick Murray (539) 13840 posts

What’s today? The 25th? The server has been running through the week and my server has needed a couple of restarts (plus the code being updated requiring the update to be loaded).
WebJames was restarted twice. Once to alter the config to not list the contents of directories by default, and once when NetTime froze the machine, at which point I reverted to my earlier ROM image.
Other than that, it has been reliable. I have linked to the clone of riscos.fr from the landing page, so it is getting regularly spidered by the Israelis, Egyptians, and Chinese; not to mention random attempts to access php/sql admin pages and the like.
David mentioned that he uses VHOST so maybe there is a problem within WebJames? I’ve not yet tracked down that dofla issue either, or I’d have a go at fixing it…

Apr 25, 2015 11:19am

David Feugey (2125) 2709 posts

David mentioned that he uses VHOST so maybe there is a problem within WebJames? I’ve not yet tracked down that dofla issue either, or I’d have a go at fixing it…

No, as I also used HTTPServ without vhost, with the same problems.
Probably more a problem with ShareFS.
Or something else.

Apr 25, 2015 11:23am

Rick Murray (539) 13840 posts

I don’t use ShareFS. Did you ever try without it active?

Apr 25, 2015 11:37am

Steve Pampling (1551) 8170 posts

David mentioned that he uses VHOST so maybe there is a problem within WebJames?

I did suggest that as I’ve had problems with WebJames stability on Iyonix in the past. Adding the opportunity for alignment issues to the mix isn’t going to make it more stable.

Apr 25, 2015 12:52pm

Rick Murray (539) 13840 posts

Can you please be more specific about what you mean by the stability issues, namely did you notice if the problem was something repeatable? (I wonder if it is related to the dofla – that looks like a null pointer being used).
I don’t think alignment is a problem – that would bomb out on my system (exceptions enabled) and it is possible to build from source as well (though I have not).

A question for both of you – are you using the simple version or the PHP build? I didn’t need PHP so I’m using the simpler one. I did notice that the resolve IPs option is extremely crashy – logging in from 127.0.0.1 shouldn’t cause the server to instantly die. ;-)

Apr 25, 2015 1:06pm

Colin (478) 2433 posts

The dofla string will be caused by printing a NULL pointer – try printf("%s\n", (void*)0); As it happens when the command to fetch a page is printed to the logfile, it would appear that the command is not always set at that point. It doesn’t mean there’s a problem – other than the display in the logfile. The command string may be checked for null after the logfile output.
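
Passing a null pointer to `%s` is undefined behaviour in C, so what ends up in the log depends on the platform. A minimal sketch of guarding the log output follows. The names `safe_str` and `format_log_line` and the field layout are hypothetical, not taken from the WebJames source.

```c
#include <assert.h>
#include <stdio.h>

/* Return a visible placeholder instead of a null string pointer. */
const char *safe_str(const char *s)
{
    return s != NULL ? s : "-";
}

/* Hypothetical log formatter, loosely modelled on the access-log lines
   quoted in the thread. Returns the number of characters written. */
int format_log_line(char *out, size_t size, const char *client,
                    const char *request, int status, long bytes)
{
    assert(out != NULL && size > 0);
    return snprintf(out, size, "%s - - \"%s\" %d %ld",
                    safe_str(client), safe_str(request), status, bytes);
}
```

With a guard like this, a request that was never set would be logged as `"-"` rather than whatever happens to sit at address zero.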

Apr 25, 2015 1:17pm

Steve Pampling (1551) 8170 posts

Can you please be more specific about what you mean by the stability issues, namely did you notice if the problem was something repeatable?

It was a while back, but as I recall it was leaking sockets¹ when clients had intermittent connections (ropy old laptop) and randomly crashed or froze after a few hours or days of use.
PHP made things worse.

¹ After a period of time it would crash and any attempt to restart or run anything else using IP sockets would report a socket in use

Apr 25, 2015 4:57pm

Rick Murray (539) 13840 posts

it was leaking sockets¹ when clients had intermittent connections (ropy old laptop)

It appears to be a little more stable in this respect.

*inetstat -a
Active Internet connections (including servers)
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp        0      0  raspberrypi.home.http  akirei.home.38648      ESTABLISHED
tcp        0      0  raspberrypi.home.telne akirei.home.52736      ESTABLISHED
tcp        0      0  raspberrypi.home.http  akirei.home.46667      CLOSE_WAIT
tcp       37      0  raspberrypi.home.49196 91.203.57.172.443      CLOSE_WAIT
tcp       37      0  raspberrypi.home.49195 91.203.57.172.443      CLOSE_WAIT
tcp       37      0  raspberrypi.home.49194 91.203.57.172.443      CLOSE_WAIT
tcp       37      0  raspberrypi.home.49193 91.203.57.172.443      CLOSE_WAIT
tcp       37      0  raspberrypi.home.49192 91.203.57.172.443      CLOSE_WAIT
tcp        0      0  raspberrypi.home.49189 68.232.35.121.http     LAST_ACK
tcp        0      0  *.http                 *.*                    LISTEN
tcp        0      0  *.telnet               *.*                    LISTEN
udp        0      0  *.49152                *.*                   
udp        0      0  *.netbios-ns           *.*                   
udp        0      0  *.bootpc               *.*

Akirei is my phone, I just did a port scan to make sure RISC OS isn’t responding to anything else (though locally; the Livebox is only allowing telnet and http through from outside).
The 443 (https) is this very site. ;-) The LAST_ACK http is gravatar.
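
In listings like this, CLOSE_WAIT normally means the remote end has closed its side and the local program has not yet closed its own socket. A program that never does so ends up leaking sockets. A minimal sketch of a read helper that releases the descriptor as soon as the peer goes away follows. It assumes POSIX-style `recv` and `close`, and the name `read_or_close` is made up for illustration.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Read once from a connected socket.
   Returns the number of bytes read (> 0),
   -1 if there is nothing to read yet (socket stays open), or
    0 after closing the socket because the peer closed or a real error
      occurred; *fd is set to -1 so it cannot be reused by mistake. */
long read_or_close(int *fd, char *buf, size_t size)
{
    ssize_t n;
    assert(fd != NULL && *fd >= 0);

    n = recv(*fd, buf, size, 0);
    if (n > 0)
        return (long)n;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
        return -1;

    close(*fd);   /* without this the socket would sit in CLOSE_WAIT */
    *fd = -1;
    return 0;
}
```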

And since the last reboot, for tcp:

        70 connection requests
        293 connection accepts
        0 bad connection attempts
        0 listen queue overflows
        307 connections established (including accepts)
        367 connections closed (including 117 drops)

And for all:

        5812 packets for this host
        5952 packets for unknown/unsupported protocol
        0 packets forwarded
        0 packets not forwardable
        31171 packets received for unknown multicast group
        0 redirects sent
        4636 packets sent from this host

any attempt to restart or run anything else using IP sockets would report a socket in use

I got that when the IP lookup failed (as it frequently did). It seems that WebJames neither attempts to trap the exception and try to deal with it sensibly, nor does it then try to close open ports.
Granted, backtrace code is a pain, but it should at least attempt to longjmp() to a situation where it can tidy up after itself, if nothing else… :-/
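
A minimal sketch of that longjmp() tidy-up idea in standard C. Everything here is hypothetical (the handle registry, `close_handle`, `bail_out`, `run_guarded`, and the demo function); none of it is taken from WebJames. Wiring `bail_out` to a signal or exception handler is left out, since jumping out of a handler has caveats of its own.

```c
#include <assert.h>
#include <setjmp.h>

#define MAX_OPEN 16

int open_handles[MAX_OPEN];     /* hypothetical registry of open socket handles */
int n_open = 0;
int closed_count = 0;           /* demo only: counts calls to the close stand-in */
jmp_buf recovery_point;
volatile int failure_code = 0;

/* Stand-in for the real socket close call (an assumption for this sketch). */
void close_handle(int h)
{
    (void)h;
    closed_count++;
}

/* Remember a handle so it can be released if things go wrong. */
void track_handle(int h)
{
    assert(n_open < MAX_OPEN);
    open_handles[n_open++] = h;
}

/* Release every tracked handle, most recent first. */
void close_all_handles(void)
{
    while (n_open > 0)
        close_handle(open_handles[--n_open]);
}

/* Called from deep inside request handling on an unrecoverable failure. */
void bail_out(int code)
{
    failure_code = code;
    longjmp(recovery_point, 1);
}

/* Run one unit of work; on failure, tidy up and return the failure code. */
int run_guarded(void (*work)(void))
{
    if (setjmp(recovery_point) != 0) {
        close_all_handles();
        return failure_code;
    }
    work();
    return 0;
}

/* Demo: opens two handles, then fails as a broken IP lookup might. */
void failing_lookup(void)
{
    track_handle(3);
    track_handle(4);
    bail_out(42);
}
```

Here `run_guarded(failing_lookup)` returns 42 with both tracked handles released, so a later restart would not find them still in use.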
