T-SQL Tuesday is a monthly event on the second Tuesday of each month; various SQL Server bloggers write about a specific topic, and *you* can join the party. This month’s topic was chosen by Allen Kinsel (blog|twitter), is “Disasters & Recovery”. If you have a story, then go to Allen’s blog and follow the instructions.
What kind of disaster?
So just what constitutes a disaster? Steve Jones (blog|twitter) has an interesting list of disaster scenarios, most of which you probably wouldn’t initially categorise as a disaster within, as they say, the meaning of the act. However, each of those does provide the opportunity for data loss, corruption, downtime, system crashes, loss of income, tarnished corporate image – all sorts of tangible and non-tangible results.
One from the vaults
One situation I found myself in at a previous job was not a classic disaster – we had power, water, comms etc – but did have the potential to cost the company a fortune from which it may not have recovered.
I was working for a firm that did a lot of conveyancing work – the legal work associated with buying & selling houses. When you buy a house, as well as paying all those lovely legal fees for (what may seem like) very little, you also get stiffed with a big bill from The Man – for “Stamp Duty“. This nasty little bit of taxation is based upon the purchase price of the house you’re buying. And, to add to the fun, it’s banded, but the banding applies to the whole cost.
Anyway, we were working on a system that dealt with thousands of these transactions per month, with hundreds of (financial, time-sensitive) transactions per day. So imagine our shock when we found that, somehow, the property price for every case in the system had been set to the same value – I don’t remember exactly, but let’s say £60000 – something like that, anyway. Given that this was found to be the case after a significant amount of work had been done during that day, rolling back to the previous night’s backup was not an option.
Get out of that one, matey
What to do? We were, in retrospect, quite fortunate. We had a reporting system that was built every night by (basically) dumping out 25 or so tables from the live & archive databases. It wasn’t too difficult to knock together a script to alter the purchase price for every property in the system from the reporting database – but it did take a while to run.
That just left the problem of those properties whose prices either hadn’t been set, or might have been changed during the day. Again, fortunately we were working on a system whereby all incoming documents were scanned and electronically attached to the case, rather than having to run through several thousand incoming documents a day. Again, it was relatively straightforward to work out which cases had had incoming documentation during the day, and (the fun part) generate a list of documents / cases for the staff to check – focusing, of course, on those cases due to complete that day before going through the rest.
Homework: What would you have done
So. What would you have done if it had been one field of your data that was destroyed like that? How would you have got out if it?
And what would you do to make sure it didn’t happen again? (Incidentally, we never did figure out why that problem happened…)