Datrium’s ControlShift, previously known as Project CloudShift (as still seen in some of the screenshots below), is a Disaster Recovery runbook automation engine for virtualized workloads running on the Datrium DVX storage platform. DVX runs on-prem, in the cloud (using Cloud DVX) or in a service provider’s private cloud to provide a consistent experience across environments for data lifecycle management. ControlShift can do some very cool on-demand DR restores to VMware Cloud on AWS, but I’ll leave that for the next post. For now, I want to look at how Datrium actually deploys ControlShift to get a better understanding of the product.

Datrium has done some cool things in how it deploys the product to the cloud. I’ll walk through the deployment of ControlShift in a customer AWS account.

Walking through the deployment

The first step is connecting all of the accounts together in the DVX management plane.

Datrium publishes all of the code and binaries needed to create a ControlShift instance to the customer account. They also have a SaaS version of ControlShift. Its deployment model is identical, except that the entire stack is deployed for each customer inside Datrium’s AWS account, in separate VPCs.

The green-field situation before deployment

Using a set of the customer’s access keys, they create the necessary IAM roles and bootstrap the deployment logic into the customer account. This deployment logic is packaged as Lambda functions, which execute to create a ControlShift-specific VPC, subnets and routes. Once all that is set up, the Lambda launches an EC2 instance from the ControlShift AMI. This instance takes over the rest of the configuration, creating a DynamoDB table, VPC endpoints and an Aurora database.
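To make the ordering concrete, here is a minimal Python sketch of the bootstrap sequence as a dependency graph. It is only an illustration: the step names are my own labels for the resources mentioned above, not identifiers from Datrium’s code, and nothing here calls AWS.

```python
# Hypothetical model of the bootstrap order described in the text.
# Each step lists the steps that must be finished before it can start.
BOOTSTRAP_STEPS = {
    "iam_roles": [],
    "bootstrap_lambda": ["iam_roles"],
    "vpc": ["bootstrap_lambda"],
    "subnets": ["vpc"],
    "routes": ["subnets"],
    "controlshift_instance": ["routes"],
    # The ControlShift instance creates the remaining resources itself.
    "dynamodb_table": ["controlshift_instance"],
    "vpc_endpoints": ["controlshift_instance"],
    "aurora_database": ["controlshift_instance"],
}


def deployment_order(steps):
    """Return the step names so that every step follows its dependencies."""
    ordered = []
    done = set()

    def visit(name, in_progress):
        if name in done:
            return
        if name in in_progress:
            raise ValueError(f"dependency cycle at {name}")
        for dependency in steps[name]:
            visit(dependency, in_progress | {name})
        done.add(name)
        ordered.append(name)

    for name in steps:
        visit(name, frozenset())
    return ordered


for position, step in enumerate(deployment_order(BOOTSTRAP_STEPS), start=1):
    print(position, step)
```

The point of the graph is the hand-off: the Lambda functions only go as far as the network and the instance, and everything after `controlshift_instance` is created by the instance itself.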

In parallel, the Lambda function deploys a second Datrium service, Cloud DVX, as an EC2 instance from its own AMI inside the same VPC. The Cloud DVX creates an S3 bucket to store the actual backups and sets up a Datrium-specific VPN into the Cloud DVX to connect a customer’s datacenter to the VPC. Alternatively, AWS Direct Connect can be used.

Now, the bootstrap deploys all this in a single Availability Zone for simplicity. Of course, all data in the databases is replicated across AZs. The bootstrap Lambda functions keep running, know which services should be running, and monitor the health of those services. If something is wrong, they redeploy the service in another AZ while using the existing databases for configuration. They also maintain the software versions of ControlShift, regardless of where the software runs.
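The monitoring behaviour reads like a classic reconciliation loop: compare what should be running with what is running, and fix the difference. Below is a rough Python sketch of that idea. The `Service` class, the `reconcile` function and the AZ names are all assumptions of mine for illustration. Datrium’s actual Lambda code is not public, and this sketch does not touch AWS.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    az: str
    healthy: bool = True


def pick_other_az(current_az, available_azs):
    """Return the first Availability Zone that differs from the current one."""
    for az in available_azs:
        if az != current_az:
            return az
    raise RuntimeError("no alternative Availability Zone available")


def reconcile(desired, running, available_azs):
    """Bring `running` in line with `desired` and return the actions taken.

    desired: names of the services that should be running.
    running: dict mapping service name to its current Service record.
    """
    actions = []
    for name in desired:
        service = running.get(name)
        if service is None:
            # Service is missing entirely: deploy it in the default AZ.
            running[name] = Service(name, available_azs[0])
            actions.append(f"deploy {name} in {available_azs[0]}")
        elif not service.healthy:
            # Unhealthy: replace it in a different AZ. In the real system the
            # replacement would pick up its configuration from the existing
            # databases; that part is not modelled here.
            new_az = pick_other_az(service.az, available_azs)
            running[name] = Service(name, new_az)
            actions.append(f"redeploy {name} in {new_az}")
    return actions


azs = ["us-west-2a", "us-west-2b"]
running = {
    "controlshift": Service("controlshift", "us-west-2a", healthy=False),
    "cloud-dvx": Service("cloud-dvx", "us-west-2a"),
}
print(reconcile(["controlshift", "cloud-dvx"], running, azs))
```

Because the state lives in databases that are already replicated across AZs, the compute side can stay this simple: throw away the unhealthy instance and start a fresh one somewhere else.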

The final picture looks like this, with the on-prem datacenter replicating forever-incremental snapshots to the Cloud DVX, which stores them on S3.

The final setup

Why this is important

I wanted to show you how ControlShift is deployed into a customer’s AWS account. ControlShift and Cloud DVX are the building blocks for some very cool and well-thought-out on-demand backup and DR to the cloud. Understanding how they are deployed is a fundamental piece of knowledge for the second post in this series, where I’ll show you how ControlShift leverages VMware Cloud on AWS for on-demand (cold) disaster recovery.