A Better Way to Back Up DocumentDB Databases
by Shirish Phatak on February 05, 2016
There are two types of organizations in the world. The first type backs up its databases regularly. The second type goes out of business. When data is accidentally corrupted or deleted, an intruder gets into the database, or any number of other problems strike, a backup of your DocumentDB database is the only thing standing between you and disaster. Here is a better way to get that done.
Protecting Your DocumentDB From the "Oops! I Hit Delete!" Plotline
Work the kinks out of your backup and restoration system before a disaster strikes. The cost of not learning this lesson is high, sometimes leading to the loss of most or all of your data.
There are a couple of ways to protect yourself from the accidental delete scenario. First, you can choose to go with in-cloud backup using Azure Data Factory. You can get the step-by-step instructions for using this option here. Your other choice is to use on-premises backups in addition to your cloud-based backup. This allows you to utilize both local and Azure Blob storage for geographic redundancy. In other words, the earthquake, tornado, flood, fire, or other disaster that takes out your primary systems likely won't affect your cloud-based backups, and vice versa.
Use the DocumentDB Migration Tool
DocumentDB comes with a migration tool, which also doubles as a backup and disaster recovery solution. Use the steps laid out here, but when it comes time to execute the export operation from the Summary page, view the command instead (click View Command at the top right of the Data Migration Tool window). You can then copy that command and run it from the command line.
Restoring Data From Azure Blob
Azure Data Factory allows you to restore data that has been backed up in Azure Blob storage. This works well for larger data sets that have a largely uniform structure. Alternatively, you can run the Migration Tool with the source and destination reversed. It's just like backing up in reverse.
What to Do in an Outage Situation
Ideally, you should have both on-premises and cloud-based backups so that you're protected from any number of potential situations.
You can handle outages by performing double writes to a secondary DocumentDB account in another region. For example, design the data access layer of an application so that it transparently double-commits writes to both accounts. Or, run an Azure Web App that exposes a REST interface and passes writes through to all of your DocumentDB accounts. Alternatively, you can run a regularly scheduled backup job in an Azure Cloud Service. Do this by querying for all documents with a "_ts" (timestamp) property greater than the value at the last run. Then create or update every document in the secondary account according to its ID.
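The two approaches above can be sketched as follows. This is a minimal illustration rather than a production implementation: InMemoryCollection is a hypothetical stand-in for a DocumentDB collection (no real SDK calls are made), the function names dual_write and incremental_backup are invented for this example, and the sketch assumes each document carries an "id" and a numeric "_ts" last-modified timestamp.

```python
from typing import Dict, List

Document = Dict[str, object]


class InMemoryCollection:
    """Hypothetical stand-in for a DocumentDB collection, keyed by document id."""

    def __init__(self) -> None:
        self.docs: Dict[str, Document] = {}

    def upsert(self, doc: Document) -> None:
        # Create the document if its id is new, otherwise replace it.
        self.docs[str(doc["id"])] = dict(doc)

    def changed_since(self, last_ts: int) -> List[Document]:
        # Similar in spirit to a query such as:
        #   SELECT * FROM c WHERE c._ts > @last_ts
        return [d for d in self.docs.values() if int(d["_ts"]) > last_ts]


def dual_write(doc: Document,
               primary: InMemoryCollection,
               secondary: InMemoryCollection) -> None:
    """Commit the same write to both accounts (no retry or rollback here)."""
    primary.upsert(doc)
    secondary.upsert(doc)


def incremental_backup(source: InMemoryCollection,
                       target: InMemoryCollection,
                       last_ts: int) -> int:
    """Copy documents changed after last_ts and return the new watermark."""
    changed = source.changed_since(last_ts)
    for doc in changed:
        # Create or update each document according to its id.
        target.upsert(doc)
    # If nothing changed, keep the previous watermark.
    return max((int(d["_ts"]) for d in changed), default=last_ts)
```

A scheduled job would store the value returned by incremental_backup and pass it back in as last_ts on the next run. A real dual-write layer would also need to decide what happens when the write to the secondary account fails, which this sketch deliberately leaves out.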
Whatever methods you decide to employ, conduct thorough and rigorous testing so that you're familiar with the process before an actual disaster strikes. There is no worse time than a disaster to discover that your DR plan is missing an element. Also make sure that you have remote access to your Azure storage environment so that even if your primary systems are destroyed, you can still reach your cloud storage. Learn more in this overview of FAST™.