So, you backup up your databases, but when was the last time you tried to restore them? Did you include the system databases which hold all your logins/passwords, SQL agent jobs, security, and server settings? You don’t want to find out your backups are no good, or that you don’t have a good set of instructions on how to restore them during a disaster. What if your backups have not been running? You set them up, started them, and assumed that everything was working. However, when the time comes and you go to look for those backups they are not there. Come to find out the backup job has not been running for over a month, what do you do now?
That is one of many possible scenarios which highlight why it is important to discuss SQL Backups, Restores, and recovery options for Microsoft SQL Server. That is what I do for a living. I am a Production Database Administrator (DBA). What does the “Production” part mean? It means I build, maintain, recover database servers and databases. My main job is to keep your SQL server up, running, and performing at its best. Part of that is recovering your database and server in the event of a disaster. I’ll be asking some questions that will get you thinking about how you are backing up and if you have tested those backups to make sure they are solid and can get you back in business as quickly as possible and with minimal data loss.
How are you backing things up?
There are several ways to backup your databases and your SQL instance:
- SQL Agent Jobs/maintenance plans
- Log Shipping
- Availability Groups
This will determine how you recover your database. Each has its own pros and cons which I won’t go into in this article as the details can fill-up another article on their own.
What are you backing up?
You should be backing up your User Databases of course, but do you also backup your System databases? These are the Master, Model, and MSDB databases you see under the System Database object in Management Studio. These hold all your logins/passwords (for the SQL logins), server settings, and SQL Agent jobs. If you don’t backup the system databases and can’t restore them successfully, you are in for a lot of work. You better have a list of every SQL login you have configured, its password, what database they access and what level of access they have, and have all your SQL agent jobs scripted out. When you go to restore that SQL server after a disaster, you’ll need to re-create those without the original system databases from your old server. Undoubtedly this will increase your recovery time and cost your business productivity and potentially lost data.
Do you have a backup of the file share where all your SSIS packages and scripts reside that are run by your SQL agent jobs? If they are on the SQL server and you lose that entire server in a disaster, you will need these at your DR site or somewhere off-host so that the new server can get to them.
Where are your backups stored?
If you are backing up your databases to the same server that SQL resides on and nowhere else, what will you use when the entire server or the entire Data Center is gone? What will you use for recovery? Best practice is to save your backup files off-host. There are several ways to accomplish this:
- Backup to a file share off-host directly.
- Backup locally and copy the files to a remote server/cloud environment. The enterprise edition of SQL server allows you to write a backup to up to 3 mirror locations so you can do a backup to another on-premise server and a cloud location at the same time.
When are you performing backups?
RPO (Recovery Point Objective) and RTO (Recovery Time Objective) determine how often you backup up your databases and how long it will take to recover. These two metrics will help determine what type of backup you should use. SQL has 4 basic backup types:
- Full backup: Backs up the entire database and enough of the transaction log to get the database recovered to the time the backup was taken.
- Differential backup: Backs up the changed data pages since the last FULL backup.
- Transaction log backup: Backs up all the transactions that have occurred in the database since the last transaction log backup was taken. You need to perform a full backup before you can create any T-log backups.
- Filegroup backups: Backups of just the filegroup of the database. This is a good option if you have large tables that are read-only that you only need to backup every so often, cutting down your backup and recovery times.
Usually the best approach is to do a combination of the first 3. A full backup daily if you can (weekly if you must), differential backups daily, and transaction log backups that coincide with your RPO be that hourly or every 5 minutes. This is going to determine the point in time where you can recover your database to. Guard those backups. If any log backups in the middle of the “chain” (which is the Full backup, differential backup, and any log backups you have) are missing, that is where your recovery stops. How long does your restore take? Minutes? Hours? While your ERP database is offline, your shipping/production is at a standstill. How long will that be for? If you have no idea how long it takes to restore that database, you cannot determine the next steps to keep your business going.
Who last tested your restores?
Who was the last person to test your restores? Was this you, a co-worker, or do you remember? Again, when was the last time you tested your company’s restores? You do not want to find out in the middle of trying to recover your ERP system that the only viable backup is from a week ago. Now you are manually redoing all the transactions from the last week (purchases, shipping, invoicing). Can you do that? While we are at it, do you have a process in place in the event your ERP system is offline for an extended period? These are questions you need to ask.