Wiki offline
Andrew Hodgkinson (6) 465 posts |
I’ve been forced to take the Wiki offline. The PostgreSQL database supporting it has become corrupted at the filesystem level. The cause is unknown. Our hosting service is looking into possible issues with the new server we’re using. I’m attempting to recover data, but recent changes may be lost. I do not know when the Wiki will become available again. Apologies for the inconvenience. |
Andrew Hodgkinson (6) 465 posts |
OK, hopefully everything is running properly now and data has been either restored or reconstructed where necessary. We still don’t know what prompted the problem – the hardware had been running without a problem for quite a while before the ROOL account moved to it, for example, though it’s always possible that a new fault has developed. We’ll be keeping an eye on it anyway. If you added pages to the Wiki or edited any in the last couple of days, please make sure they’re in the page list and have your changes. If you spot any problems, please e-mail us. Thanks. |
Jan Rinze (235) 368 posts |
Yet another case for having good backups… |
Andrew Hodgkinson (6) 465 posts |
Backing up every single change to the database is impractical; instead, backups are taken daily. With the precise point of filesystem corruption unknown and a window of roughly two days, simply restoring from backup would have lost all Wiki changes made since the backup was taken. Instead, I elected to reconstruct the corrupted data by comparing the missing elements against their out-of-date equivalents in the backup, then producing an up-to-date replacement object. It was then further necessary to dump and re-import all databases in case the corruption had spread elsewhere (propagated indexing faults, for example, are a possibility). Creating new databases and importing the exported SQL data should clean up any such problems. This is the first fault of that nature seen since the site started in 2006, and there was nothing of use in any of the logs, so it’s the usual frustrating “once in a blue moon” bug which is hard to diagnose. We were actually lucky that the bit of the database it took out was quite easy to replace. In a more severe case, we might be forced to roll back some or all of the databases to an earlier backup, losing some or all changes to all applications since that point. |
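[Editor's note: the dump-and-reimport step described above can be sketched with the standard PostgreSQL client tools (`pg_dump`, `dropdb`, `createdb`, `psql`). The database names, backup path and `postgres` role below are hypothetical placeholders, not ROOL's actual setup; treat this as an illustrative outline only, and keep a copy of the dumps before dropping anything.]

```shell
#!/bin/sh
# Sketch of a dump-and-reimport cycle: export each database to plain SQL,
# then rebuild it from scratch so that any corrupted on-disk structures
# (damaged indexes, for example) are recreated rather than carried over.
set -eu

BACKUP_DIR=/var/backups/pg          # hypothetical location
STAMP=$(date +%Y%m%d)

for db in rool_wiki rool_forum; do  # hypothetical database names
    # 1. Export the live database as plain SQL.
    pg_dump -U postgres -f "$BACKUP_DIR/${db}-${STAMP}.sql" "$db"

    # 2. Drop and recreate the database, then re-import the SQL dump.
    #    All tables and indexes are rebuilt cleanly from the exported data.
    dropdb -U postgres "$db"
    createdb -U postgres -O postgres "$db"
    psql -U postgres -d "$db" -f "$BACKUP_DIR/${db}-${STAMP}.sql"
done
```

Rebuilding from a logical (SQL-level) dump, rather than copying database files, is what clears filesystem-level corruption: the new cluster files are written fresh from the exported rows.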
Andrew Hodgkinson (6) 465 posts |
It’s just kind of ironic that we move to a new server with a “this’ll help!” announcement, then promptly lose a bit of DB to a mysterious failure mode :-) Been one of those days really. We’re trying to organise new merchandise for Wakefield but the fire in the BT exchange this morning took out the payment processor for the company we were using, so when we came to pay, the process failed. We’re trying BACS now. The delivery window was already quite tight so fingers crossed that everything comes through in time. |
W P Blatchley (147) 247 posts |
My changes to *MemoryI and *MemoryA were fairly recent, and they’re still there. Hopefully you recovered everything. Sorry you’re having a bad day – it’ll all come good in the end! The Wiki in particular is becoming extremely useful. In my opinion, it’s set to become by far the most useful RISC OS reference we have. Keep up the good work! |