I caught a cold or something last Tuesday that hit me harder than anything else I have had in years. Just as the cold was getting started, I had a migration to complete. This migration had been planned for about 3 months, with the last 4 weeks spent just getting all the users who were going to be impacted ready. As I started to prepare for the migration, I realized the importance of checklists. First off, we were taking a standalone Windows 2003, SQL Server 2005 machine to a clustered Windows R2, SQL Server 2008 environment. This alone is a violation of one of the core rules that I like to live by.
Change only one thing at a time. The more you change at one point in time, the more possibilities there are that something will go wrong, and if things do go wrong, the complexity of fixing them just increases…
Well, to address the rule first. A number of years ago I would have been very aggressive about this rule, and about many of the rules that I think are best practice for database servers. But the reality is really harsh here. If I had stuck to this rule, I would have impacted the clients not once but three times: once for the hardware upgrade, once for the OS upgrade and once for the SQL Server upgrade. So it may be best practice, but it would also be a major impact to customers.
So in a shop where a customer is paying for your service, and that service is to provide a database that stays online with little interruption, is it really best practice to take the server down time and time again to upgrade it? I think it could be argued that the business needs to have a big say in how these systems are upgraded. So when the requirement is that the database is not to go offline for an extended period of time, a new set of challenges arises. This is where the checklist is priceless.
The migration started on Wednesday afternoon last week, and although we did have a bit of a late start, the impact to the customer was as advertised. We had some issues that impacted the users down the road, and if it were not for the great sys admins we would still be working on those.
This fall I hope to present a session on how I managed to move over 300 GB of data across 2 servers, with an upgrade of the OS, hardware and SQL Server, with the database only being down for 14 minutes. (OK, 14 minutes and 12 seconds.) It’s not as hard as you think it is, but it does require special attention to detail.