Disaster Recovery for Exchange
What would be your reaction if I told you I intended to format the C partition on your production Microsoft Exchange server to run a quick test? You would probably grab the nearest blunt object for use on my head before I even got a chance to explain myself. Unfortunately, this is no joke: the dire consequences of viruses, incompatible patches, or other malicious events on a company’s Exchange server are all too often the worst living nightmare of any Exchange admin and their bosses. In this article, I’ll show you how to protect your Exchange server with a method that recovers an Exchange server in 30 minutes! Some of you at this point may wonder, “But I back up my Exchange database, why do I even care about this?” Unfortunately, an Exchange database is useless without the original Exchange server in the original domain environment. If anything happened to the OS or the Exchange application that left the server unrecoverable, the data would be worthless, because you can’t just mount that database on any old Exchange server. Microsoft Exchange recovery is one of the trickiest things to execute, and this article is meant to make it as painless and reliable as possible.
The “Official Microsoft method” for Exchange disaster recovery
If you’ve ever hired a Microsoft certified consultant or worked in a large corporation running Exchange, you’ve probably heard of or are using the “official Microsoft” method for Exchange disaster recovery. Unfortunately, anyone thinking about or running this “official” method is a glutton for punishment. The method involves maintaining a mirrored parallel universe of your Domain/AD and Exchange infrastructure. It involves building a backup domain controller on your existing network, then moving it to an isolated LAN and promoting it to the PDC. Then you must meticulously build an Exchange server on that isolated LAN from scratch, using settings identical to those of your production environment, without making a single spelling mistake. Only then can this parallel universe be used in the event of a catastrophic failure of your production Exchange environment. This procedure is complex, prone to error, and expensive. Without getting further into the details of this clumsy formal methodology, you don’t want to go down this road, because there is a better way.
Background note: I’ve personally seen a highly paid consultant spend two weeks implementing and documenting this disaster recovery plan for my company. In the end, that same consultant could not replicate the environment using the documentation he wrote himself, and finally gave up after a day.
The “fast way” for Exchange disaster recovery
As a result of my personal experiences above, I refused to believe that this is what I must live with in order to have Exchange recoverability. Believe me, I got plenty of flak for it from my colleagues and from the very same consultant who couldn’t follow his own documentation, which he had in turn expected us to use in case of a disaster. They insisted that this is the “official” way and is how all the big corporations do it. At the time, in the year 2000, I was just beginning to use system imaging (a process of copying an entire hard drive or partition into a single file called an image) for a large deployment of new workstations running Windows 2000 and all new applications. I soon began to wonder: why not use system imaging for servers, and Exchange in particular? It seemed a daunting challenge because these were high-end servers running complex SCSI RAID 5 configurations, and it seemed like imaging wouldn’t work. As it turns out, since hardware RAIDs are completely transparent to the OS and applications, Norton Ghost or PowerQuest Drive Image worked just as well on servers as they did on workstations. Once I had this working, I built a test domain and Exchange server, took an image of the OS and Applications partition, and managed to restore it in less than 30 minutes after I formatted the C drive (data resided elsewhere). I was confident that this would work to restore the basic Exchange server, but I wondered whether the re-imaged Exchange server would recognize a more recent database if additional emails were sent after the system image was taken. I tried this in the lab and indeed it was possible: the image-restored Exchange server would even mount a newer database with more recent data. What this meant was that even if a month’s worth, or any amount, of updates had been made to the database through everyday usage, the restored Exchange server would come up and bring up all the old and new data.
I realized I had come across the ultimate disaster recovery procedure for servers, and all that was needed was some refinement in the process.
The following refinements are what I came up with:
Exchange Server architecture
To make this recovery method feasible, a fundamental design must be followed on the Exchange Server. Data must reside on a separate partition (physical preferred, but logical is OK) from the OS and Application partition. The last thing you want to image is 1 gigabyte of OS/Apps plus 100 gigabytes of data on a single partition. You lose the granularity and convenience of being able to recover the OS and Applications without affecting the data. For existing servers that already have data mixed in with the OS and Applications, you could add an additional storage device and move the data store and log files to separate partitions on the newly added device. Ideally, one would go by the following guidelines for maximum safety, scalability, manageability and performance.
Disk imaging methodology
Before you start, be sure your Exchange server is in full operational order with the complete database store structure, anti-virus, backup agents, and anything else installed. To image a system, you generally need to dump your image onto a separate physical partition from DOS mode, because you cannot image a boot partition while that partition is loaded. The easiest way to do this is to dump an image to a network file share. To avoid writing an entire 10-page chapter on how to create a TCP/IP network boot disk with SMB client capabilities, and to save you a ton of time, I’m going to say just one word: bootdisk.com. www.bootdisk.com is the one-stop place where you can freely download pre-made bootable images that work with pretty much all common network adapters. From there, you only need to make a few minor modifications to the drive mapping batch file to mount your network share onto a drive letter. I would then recommend that you make a bootable CD-ROM image of the modified floppy disk for vastly improved boot times. Then you simply boot from the CD, which loads the network drivers and automatically maps the network share. That network share should also contain a recent copy of Norton Ghost or PQDI (PowerQuest Drive Image). From there, you simply run Ghost or PQDI and dump the C partition onto the network share. Additionally, I would create an image backup of all the log file and data partitions as well, with just the database structure and no data. Although imaging the data partitions is not mandatory, it will save you a lot of trouble by recreating the entire partition and database directory structures when doing a bare metal restore (starting from scratch with a new piece of identical or flushed hardware) of your Exchange Server. Note: The bare-bones database structure partition images will be very small because they are almost entirely compressible.
Newer versions of Ghost and PQDI support a hot backup feature that tracks changes to the OS and Applications while the system is operational. Note that you must first create an initial image of the C partition in DOS mode. After that, you can track any changes to a production server even while the system is running by backing up all deltas to the OS and Applications with this new feature. This is a valuable feature because it is not feasible to take your server down just to do a cold image backup every other week. Once all of this is in place, you can put your Exchange server into production and start populating the Exchange data stores.
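The idea of tracking deltas against a baseline can be illustrated at the file level. The Python sketch below is not how Ghost or PQDI implement their hot backup feature (this article does not describe their internals); it only shows the concept of recording a baseline and later listing what was added, changed, or removed. The function names `build_manifest` and `find_deltas` are hypothetical, and the sketch reads each file fully into memory, which is fine for illustration but not for very large files.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to a SHA-256 digest of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def find_deltas(baseline, current):
    """Compare two manifests and return (added, changed, removed) path lists."""
    added = sorted(p for p in current if p not in baseline)
    changed = sorted(
        p for p in current if p in baseline and current[p] != baseline[p]
    )
    removed = sorted(p for p in baseline if p not in current)
    return added, changed, removed
```

You would build one manifest right after the initial cold image, build another later, and pass both to `find_deltas` to see which files have drifted from the baseline.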
Up to this point, everything has been focused on backing up the OS and Application partition using disk imaging software. Backing up the data is equally important for an Exchange server. If you are serious about performing hot backups of an Exchange server, you must use a reliable third-party enterprise solution that hooks into Exchange with a backup and restore agent. One of the better solutions I have seen is from Legato. Some other solutions that I have worked with were utter nightmares: they never worked consistently and couldn’t always restore successfully, which meant someone’s head was going to roll. Since this article is not specifically about data backups, I will move on. For large enterprises that can’t afford to lose even an hour’s worth of data, I would recommend going the extra mile of continuously copying the log files onto a separate physical device with some automated hourly batch process. Tape backup covers you up to the previous night, and log files can cover you up to the last few minutes just before a database disaster.
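As a rough illustration of the hourly log copy, here is a minimal Python sketch. It assumes the log files sit in a single directory and end in `.log`, and that the destination is a folder on a separate physical device; the function name `copy_new_logs` and any paths you pass to it are hypothetical. It copies only files that are missing or different at the destination, and it makes no attempt to deal with a file that the running server is holding open.

```python
import shutil
from pathlib import Path


def copy_new_logs(source_dir, dest_dir, pattern="*.log"):
    """Copy log files that are missing or changed at the destination.

    Returns the sorted list of file names that were copied.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for src_file in sorted(source.glob(pattern)):
        if not src_file.is_file():
            continue
        dst_file = dest / src_file.name
        src_stat = src_file.stat()
        if dst_file.exists():
            dst_stat = dst_file.stat()
            # Same size and a destination copy at least as new: skip it.
            unchanged = (
                dst_stat.st_size == src_stat.st_size
                and int(dst_stat.st_mtime) >= int(src_stat.st_mtime)
            )
            if unchanged:
                continue
        # copy2 preserves the modification time that the check above relies on.
        shutil.copy2(src_file, dst_file)
        copied.append(src_file.name)
    return copied
```

Scheduling a call to `copy_new_logs` once an hour, with your own log directory and backup location as arguments, gives you the automated batch process described above.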
There are several types of disasters that can happen to an Exchange Server, such as database corruption and corruption of the server’s OS or Application. In some cases you may be able to recover from a database corruption by running the repair operation on the affected data store, or fix a severe OS or Application error on the Exchange server by calling Microsoft. However, in the event that a system completely dies and neither of the above options works, or you simply lose an entire machine including all of the OS, Applications, log files, and data due to some catastrophic disaster, following the guidelines mentioned above is what is going to save you. In the event of a catastrophe, you would run the following procedures.
Note that it is rare that you would need to do a bare metal recovery of an Exchange server, but following the above procedure ensures you have maximum recoverability of your business-critical Exchange servers in any event. On a last note, you should keep an off-site tape copy of your images and database; this goes for any other critical server or application as well.
The above procedure is the easiest and most reliable way of recovering from an Exchange disaster. However, you should note that the concepts apply to any other server or application. Some may ask, “But Microsoft doesn’t support this, do they?” or say, “Microsoft does not support disk imaging”. The truth is, I use disk imaging on all my servers and I have never been turned down for support at Microsoft, nor do they even ask if you are using disk imaging at all. I even maintain a library of generic system images of every type of system configuration I have so that I can deploy or redeploy any new server rapidly. Microsoft support is one of the more reasonable solutions out there: where else can you get unlimited attention until an issue is resolved for $250? Additionally, Microsoft has already announced their own image deployment strategy with a new API called ADS (Automated Deployment Services), for which major players have already pledged their support. Bottom line, disk imaging makes eminent sense and can be applied to any type of server or workstation.