Developing a Disaster Recovery strategy is one of those topics that seems to stay on the to-do list longer than it should. Whether the cause is the effort required to put one together, a lack of resources or funding, or simply too many other priorities, it shouldn’t be ignored. Going without a disaster recovery plan doesn’t hurt you while everything is running perfectly, but the minute a disaster strikes, having one can be the difference between a speedy recovery and a costly failure.
Start with a Plan
The following may sound basic, but you must start somewhere. Creating a Disaster Recovery (DR) plan from scratch can be a daunting task, but don’t allow that to paralyze you. There are plenty of templates available to get you started and several professional services organizations that can provide help. If you can’t outsource the work, break it down into smaller, actionable items to make it easier to accomplish. You don’t have to get it all done at once. Set up a team, establish a meeting cadence, and assign small tasks. You will be amazed at how easily it will come together.
Be sure to include key stakeholders and business leaders from your organization in the planning stages.
Include the Basics
At a minimum, the following items can serve as an outline to get your plan started.
Invoking the Plan & Declaring an Emergency
It is imperative to determine who in the company has the authority to declare an emergency, how that declaration is communicated to the entire staff, and how and when the plan is invoked.
Critical Hardware & Software Identified
Create a list of critical hardware and software to ensure that the right services are part of the DR process. Doing this allows you to establish priority on these items for restoration and business continuity.
Application Flow Diagrams
Building out application flow diagrams for your critical applications makes it easier to understand the complex interactions between all systems involved in the application. The visual representation of the components ensures that you don’t miss any key elements when planning your disaster recovery options.
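One practical use of a flow diagram is working out the order in which systems have to come back online. The sketch below is only an illustration: the system names and their dependencies are hypothetical, and it assumes Python 3.9 or later for the standard-library `graphlib` module. Each system is listed with the systems it depends on, and a topological sort yields an order in which every dependency is restored before the thing that needs it.

```python
from graphlib import TopologicalSorter

# Hypothetical application flow: each key depends on the systems in its set.
dependencies = {
    "web_frontend": {"app_server"},
    "app_server": {"database", "auth_service"},
    "auth_service": {"database"},
    "database": {"storage"},
    "storage": set(),
}


def restore_order(deps):
    """Return systems ordered so that dependencies come before dependents."""
    return list(TopologicalSorter(deps).static_order())


print(restore_order(dependencies))
# ['storage', 'database', 'auth_service', 'app_server', 'web_frontend']
```

If the diagram contains a circular dependency, `TopologicalSorter` raises `graphlib.CycleError`, which is itself a useful finding to resolve before a real disaster.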
RPO & RTO: Recovery Point and Recovery Time Objectives
These are two critical metrics to consider when putting your plan together. They are often confused or interchanged, but each one should be planned out carefully.
- Recovery Point Objective (RPO): the maximum amount of data loss you can tolerate, measured as time going back BEFORE the disaster. Determining this metric upfront will help you plan your backup frequency. For example, if your RPO is 4 hours and your backups are nightly, you will not meet your objective, because up to a day of data could be lost.
- Recovery Time Objective (RTO): the maximum amount of time AFTER a disaster that your business can tolerate before your systems are fully restored.
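The 4-hour RPO example above comes down to simple arithmetic. Here is a minimal sketch, assuming the worst-case data loss is roughly equal to the time between backups (it ignores how long the backup itself takes to run). The function name `meets_rpo` is made up for illustration.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """A schedule meets the RPO if the gap between backups does not exceed it.

    Assumption: worst-case data loss is approximately the backup interval.
    """
    return backup_interval <= rpo


rpo = timedelta(hours=4)
print(meets_rpo(timedelta(hours=24), rpo))  # nightly backups: False
print(meets_rpo(timedelta(hours=1), rpo))   # hourly backups: True
```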
Backup and Replication to DR Site
Whether you decide to replicate your backup data to the cloud or an offsite data center, make sure you consider geographic locations when making your selection. Sufficient distance between sites reduces the risk of both sets of data being impacted by the same disaster.
Sufficient Bandwidth for Backups
Ensure that you have adequate bandwidth to support transferring data from the offsite location at rates that will allow you to meet your recovery time objectives.
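A rough estimate shows whether a link is even in the right range. The sketch below is back-of-the-envelope math, not a sizing tool: the data size, link speed, and the 80% efficiency factor are all assumed numbers, and transfer time is only one part of a full restore.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move data_gb gigabytes over a link_mbps link.

    efficiency is an assumed fraction of the nominal rate actually achieved.
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, rate > 0, efficiency in (0, 1]")
    megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return megabits / (link_mbps * efficiency) / 3600


def fits_rto(data_gb: float, link_mbps: float, rto_hours: float,
             efficiency: float = 0.8) -> bool:
    """True if the estimated transfer alone finishes within the RTO."""
    return transfer_hours(data_gb, link_mbps, efficiency) <= rto_hours


# Hypothetical case: 2 TB of backup data and an 8-hour RTO.
print(f"{transfer_hours(2000, 1000):.2f} hours")  # 5.56 hours on a 1 Gbps link
print(fits_rto(2000, 1000, rto_hours=8))          # True
print(fits_rto(2000, 100, rto_hours=8))           # False on a 100 Mbps link
```

Because the estimate covers the transfer only, leave headroom in the RTO for rebuilding servers, restoring databases, and validating the results.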
Procedures should be formally documented, and the backup schedule published.
Frequency and retention of backups should be aligned with RPO.
Do you run test restore jobs on a monthly or quarterly basis? Backups are only valuable if you can restore the data. ZAG recommends that test restores be done either quarterly or monthly. An untested backup should never be trusted.
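One simple way to check a test restore is to compare checksums of the original file and the restored copy. The sketch below assumes you can still read both files at test time. The function names are illustrative and not part of any backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True if the restored file is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)
```

A matching checksum shows the file came back intact. It does not prove that an application or database can actually start from the restored data, so a full test restore should still include bringing the service up.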
- Maintenance Plans: You should establish a database maintenance plan to be implemented and maintained by an application developer.
- SQL Aware Backup: Backup software that is SQL aware can properly back up logs and other live file services while they are running, and can restore individual portions of databases.
- Transaction Logs: SQL transaction logs should be truncated after each successful backup so that they do not grow unchecked.
Document and Test
It’s not enough to have a plan; it needs to be documented and tested. Two things to keep in mind are the creation of runbooks and a formalized annual testing schedule.
DR Documentation and Procedures
- Runbooks: A runbook documents the procedures required to get your organization back up and running in the event of a disaster.
- Change Management: Make runbook updates part of your organization’s change management process to ensure they are kept up to date.
- Storage: In addition to electronic copies in multiple locations, you may also consider making hard copies and distributing them to key individuals quarterly. Doing this will ensure that at least one copy is readily available during a crisis.
- Annual Testing of DR Plan: Running through a mock disaster once a year can expose missing components of your plan. Just reading through your plan doesn’t have the same impact as actually walking through it.
- Quarterly Updates to the DR Plan: Processes and procedures change constantly, and the DR plan should be updated as well. Putting a quarterly review on the calendar ensures that your plan remains relevant.
In summary, if you are one of the lucky ones with both budget and resources, then kick off a project and make a concentrated effort to complete your Disaster Recovery plan immediately. Accomplish this by either outsourcing the work or setting up a dedicated in-house team. If you don’t have that luxury, then at least get the ball rolling. Start the discussions, set up regularly scheduled checkpoints, and break the work down into realistic chunks. In time, your Disaster Recovery plan will come together. Taking no action now is only going to make recovering from a disaster more costly, lengthy, and detrimental to your business success.