Here’s a multiple choice question for you. When using continuous data protection (CDP), how much storage does CDP need when compared to the amount of application data that it is protecting? Is it:
(E) 10x or more
I’ll get to the correct answer in a moment, but a common misperception is that protecting your application data with CDP requires a lot of storage. This is not necessarily the case.
Here is what determines how much storage a CDP product needs. CDP initially needs an allotment of storage capacity equal to the size of the volume that holds the protected data. This is needed so the CDP product can make a copy of all of the blocks on the production volume. So if you are protecting 100GB of application data to start, CDP will also need to start with 100GB of storage capacity.
However, the wild cards in how much storage the CDP product requires are based not on the size of the production volume but on two other variables. They are:
- The daily change rate of the data on that volume. CDP’s purpose is to provide organizations with very granular levels of application recovery. To accomplish that, CDP products capture every application write of new or changed data. The faster the data changes, the more capacity CDP consumes, but most organizations will find that the daily change rates for the majority of their applications are in the 2-3% range.
- The retention period of the CDP data. The next question that companies have to answer is how long they intend to retain the data under CDP’s management. Feedback received from CDP providers like InMage indicates that most organizations retain this data for anywhere from 7 to 30 calendar days before it is removed.
So to answer the multiple choice question above, the most correct answer is (A). For most environments, CDP requires roughly 2x the storage capacity of the data it protects, assuming a 30-day retention period.
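To see where the roughly 2x figure comes from, here is a minimal back-of-the-envelope sketch of the arithmetic. It assumes the simplest possible model: one full copy of the protected volume plus one day's worth of changed data for every day of retention, with no compression and no credit for blocks that are overwritten more than once. The function name `cdp_storage_gb` is made up for illustration and is not part of any CDP product.

```python
def cdp_storage_gb(protected_gb, daily_change_rate, retention_days):
    """Rough CDP capacity estimate (assumed model, not a vendor formula).

    baseline: the initial full copy of the production volume
    journal:  changed data captured per day, kept for the retention period
    """
    baseline = protected_gb
    journal = protected_gb * daily_change_rate * retention_days
    return baseline + journal


if __name__ == "__main__":
    # 100GB of application data, 30-day retention, 2-3% daily change rate
    for rate in (0.02, 0.03):
        total = cdp_storage_gb(100, rate, 30)
        print(f"{rate:.0%} daily change: {total:.0f} GB ({total / 100:.1f}x)")
```

Under these assumptions, 100GB of application data lands somewhere between about 160GB and 190GB, which is where "roughly 2x" comes from. Real products may need less or more depending on how they store and compress the captured writes.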
If you compare CDP to the use of deduplication under the same circumstances, the results are almost the same. Backing up a 100GB data set with a 5% change rate over a period of 30 days (assuming 4 fulls and 24 incrementals per month with a 20:1 dedup ratio) requires roughly 150GB of raw storage capacity with either CDP or deduplication.
More encouraging still, CDP can become even more storage efficient going forward. Most organizations only need the granular level of recovery that CDP provides for 24 – 48 hours. After that period, retention policies can be established that decrease the granularity of retained data over time.
For example, organizations can opt to keep full granularity for the first 48 hours, retain one application-consistent recovery point every four hours for the next 72 hours, and then retain one application-consistent recovery point per day thereafter to the end of the 30-day window. Application-consistent recovery points can also serve as the source for the occasional low-cost tape-based copy of data to meet compliance requirements without negatively impacting production servers.
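The tiered policy in that example can be sketched as a quick count of how many discrete recovery points remain once the full-granularity window has passed. This is only an illustration of the schedule described above: the function and parameter names are hypothetical, and it counts recovery points rather than estimating the storage they consume.

```python
def retained_recovery_points(window_days=30, full_hours=48,
                             tier_hours=72, tier_interval_hours=4):
    """Count recovery points kept after the full-granularity window.

    Returns (four_hourly_points, daily_points) for a policy that keeps
    every write for `full_hours`, then one point per `tier_interval_hours`
    for `tier_hours`, then one point per day to the end of the window.
    """
    four_hourly = tier_hours // tier_interval_hours
    remaining_hours = window_days * 24 - full_hours - tier_hours
    daily = max(remaining_hours, 0) // 24
    return four_hourly, daily


if __name__ == "__main__":
    four_hourly, daily = retained_recovery_points()
    print(f"4-hourly points: {four_hourly}, daily points: {daily}, "
          f"total: {four_hourly + daily}")
```

With the defaults taken from the example, the policy keeps 18 four-hourly points and 25 daily points, so only the first 48 hours carry the full write-by-write journal.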
Arriving at the truth as to how much storage capacity CDP products like InMage need does not have to be a guessing game. Documenting how CDP works and then taking into account how factors like daily change rates and retention periods affect its total storage consumption makes it much easier to understand and illustrate its true value proposition. In so doing, organizations can dispel some of the myths around CDP’s costs and take advantage of the recovery benefits that it provides.