Why placing applications in the cloud alone is not enough to provide the high availability and disaster recovery you need.
Business spending on the cloud is increasing and is expected to become the "bulk of new IT spend by 2016," according to Gartner. Many companies moving to the cloud believe they will be reducing downtime. However, this is not always the case. Moving mission-critical applications to the cloud without due diligence, and without meeting the setup requirements for high availability and disaster recovery, can lead to downtime and longer recovery times.
First, many companies are under the false impression that simply placing applications in the cloud, or relying on another business's SaaS or other hosted solution, automatically provides protection, including high availability and disaster recovery. This is not the case.
The average unavailability of cloud services is 10 hours per year or more, and average availability is estimated at 99.9% or less. This is far from the five nines (99.999%) of uptime expected of critical systems. Across the industry, that downtime amounts to costs of more than $70 million U.S. dollars, based on accepted hourly cost figures.
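The gap between "three nines" and "five nines" is easy to quantify. A quick sketch of the arithmetic (the dollar figure above comes from industry surveys; the hours follow directly from the percentages):

```python
# Convert an availability percentage into expected downtime per year.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours

def downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at a given availability fraction."""
    return HOURS_PER_YEAR * (1 - availability)

print(f"99.9%   -> {downtime_hours(0.999):.2f} hours down per year")    # ~8.76
print(f"99.999% -> {downtime_hours(0.99999) * 60:.1f} minutes per year") # ~5.3
```

In other words, a 99.9% SLA still concedes nearly nine hours of downtime a year, roughly a hundred times what five-nines systems are expected to tolerate.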
During 2013, all three of the major cloud providers suffered outages. Microsoft Windows Azure had a 20-hour outage in which a sub-component of the system failed worldwide. Google lost all services for five minutes, Google Drive for a total of 17 hours, and Gmail for a total of 12 hours. Amazon Web Services had connectivity issues that disrupted a portion of internet activity in a single availability zone for three hours.
In 2014 we saw more of the same. Amazon EC2, which had the best uptime for the year, still totaled 2.43 hours of downtime across all regions. Microsoft Azure, which suffered a highly publicized cross-region outage, had the most at 40 hours.
Cloud environments are still susceptible to outages and regional disasters. For mission-critical systems, you need to be certain that you set up the necessary high availability, redundancy, and disaster recovery components, just as you would with any other system.
The Good News
Creating a high availability and disaster recovery environment in the cloud is achievable, though the steps differ by provider. In Azure, for instance, which has some redundancy built in, you will still need to place your web servers in different fault domains and enable load balancing. Other applications will require further configuration adjustments for high availability and disaster recovery.
In some environments, you can add protection by implementing a cloud-based SANLess cluster. This is an excellent, cost-efficient way to protect SQL, SAP, Oracle, and other mission-critical applications within a Windows environment.
Basic cloud configurations within an availability zone (Amazon) or fault domain (Microsoft Azure) are generally designed to protect against hardware and other unexpected failures within that zone; you still need to prepare for regional disasters. This can be achieved by configuring a geographically separated multisite cluster: build a SANLess cluster and add a node in an alternate data center or different geographic location. Adding a third, geographically separated node will provide you with a superior recovery time objective and recovery point objective, with near-zero data loss.
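The intuition behind geographic separation can be sketched numerically. Assuming site failures are independent (a simplification that ignores quorum rules, replication lag, and failover time), the service is only down when every site is down at once, so the probabilities of failure multiply:

```python
# Combined availability of a multisite cluster that stays up
# as long as at least one site is up, assuming independent failures.
def combined_availability(site_availabilities):
    p_all_down = 1.0
    for a in site_availabilities:
        p_all_down *= (1 - a)  # probability this site is also down
    return 1 - p_all_down

two_sites   = combined_availability([0.999, 0.999])         # primary + DR node
three_sites = combined_availability([0.999, 0.999, 0.999])  # third geo node
print(f"two sites:   {two_sites:.6f}")
print(f"three sites: {three_sites:.9f}")
```

Each added independent site multiplies the chance of total outage by another small factor, which is why a third geographically separated node can push a 99.9% building block toward the five-nines range, provided the sites do not share a failure mode.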
Best of Both Worlds
Implementing a hybrid cloud solution may be your best bet. You can do this by using your on-premises data center as your production environment and the cloud as your disaster recovery site. This is far more cost-effective for businesses that do not want to spend the money to build their own redundant disaster recovery site.
For some clients with mission-critical applications, we have added extra layers of redundancy by pairing on-site disaster recovery solutions with tertiary cloud backup systems.
i Gartner Newsroom – "Gartner Says Cloud Computing Will Become the Bulk of New IT Spend by 2016." Published October 24, 2013. Last accessed Aug. 6, 2015. http://www.gartner.com/newsroom/id/2613015
ii TechTarget.com – "Cloud outage report of 13 providers reveals downtime costs." Published June 22, 2012. Last accessed Aug. 6, 2015. http://searchcloudcomputing.techtarget.com/news/2240158511/Cloud-outage-report-of-13-providers-reveals-downtime-costs
iii Business2Community, Tech & Gadgets – "Downtime Report: Top Ten Outages in 2013." Published Dec. 22, 2013. Last accessed Aug. 6, 2015. http://www.business2community.com/tech-gadgets/downtime-report-top-ten-outages-2013-0720582#Z2xCkBHqs5SrECkO.99
iv TechTarget.com – "Cloud outage audit 2014 reveals AWS on top, Azure down." Published Dec. 24, 2014. Last accessed Aug. 6, 2015. http://searchcloudcomputing.techtarget.com/news/2240237323/Cloud-outage-audit-2014-reveals-AWS-on-top-Azure-down