Introduction

As a demo/test, I wanted to build a complete SRM setup with both sites being protected. This requires two SRM installations and configurations, and two replicated LUNs. Since I am rather limited in available hardware at the moment, I needed to run every component in a virtual machine on a single laptop.

Luckily, LeftHand Networks has a virtual SAN appliance, which is actually supported by VMware SRM. This way, I can use a supported SAN and still use only one laptop.

All of the software used (VMware ESX, vCenter, LeftHand VSA and Microsoft SQL Server 2005 Express) is available for free (or as evaluation software).

Hardware and host software

I’m running all of this on a Zepto Znote 6324W with an Intel Core 2 Duo T9300 (Penryn, 2.5 GHz, with Intel VT and Execute Disable Bit), 4 GB of RAM and a 200 GB 7,200 RPM SATA II hard disk.

This laptop is running Microsoft Windows Server 2008 (64-bit), with VMware Workstation 6.5 (build 118166).

Network

Since I do not want to make the setup needlessly complex, I’m going to use a single /24 subnet for all hosts. The subnet is 10.10.10.0/24, and the domain name is srm.local.

I’m using 10.10.10.8[0-4] for Site 1 and 10.10.10.9[0-4] for Site 2:


Figure 1. Network layout for SRM setup

ESX COS: 10.10.10.80 and 10.10.10.90
ESX VMKernel: 10.10.10.81 and 10.10.10.91
VC/SRM: 10.10.10.82 and 10.10.10.92
VSA: 10.10.10.83 and 10.10.10.93
VSA VIP: 10.10.10.84 and 10.10.10.94
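
For reference, this translates into a hosts file along these lines on every machine (the hostnames esx1/vc1/vsa1 and their Site 2 counterparts are my own naming choice; the VMkernel ports and VIPs don’t need names):

    # Site 1
    10.10.10.80   esx1.srm.local   esx1
    10.10.10.82   vc1.srm.local    vc1
    10.10.10.83   vsa1.srm.local   vsa1
    # Site 2
    10.10.10.90   esx2.srm.local   esx2
    10.10.10.92   vc2.srm.local    vc2
    10.10.10.93   vsa2.srm.local   vsa2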

Installing VMware Infrastructure 3

As a base for the SRM setup, I’ve installed two VI3 environments, each consisting of an ESX 3.5u3 host and a vCenter 2.5u3 machine.

Installing and configuring VMware ESX

First, I needed to create and modify a VM to get ESX working smoothly inside a VM. I’ve preconfigured it as well as possible, including some modifications to the VMX file. You can download it here.

Note: This VM does not have ESX installed. You’ll need to download the ISO and install it yourself.
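
In case you’d rather build the VM yourself, the important VMX tweaks amount to roughly the following (a sketch from memory, not the literal contents of my download, so double-check against a current nested-ESX guide):

    guestOS = "other-64"                  # ESX is not a selectable guest type
    memsize = "1024"                      # ESX 3.5 wants at least 1 GB
    ethernet0.virtualDev = "e1000"        # ESX has no driver for Workstation's default NIC
    scsi0.virtualDev = "lsilogic"         # ESX needs an LSI Logic SCSI controller
    monitor_control.restrict_backdoor = "true"   # lets the nested hypervisor run VMs of its own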

The installation itself is pretty standard. I’ve even used the installer-proposed partitioning scheme, because for this test setup it doesn’t actually matter much. IP addressing is as stated above, and the time zone is set to Europe/Amsterdam.

After installation, I granted the root user access through SSH, filled /etc/hosts with all applicable hosts, configured time synchronization, added a VMkernel port, changed the name of the ‘VM Network’ portgroup to ‘VMNET1’ and ‘VMNET2’ respectively, and configured the iSCSI initiator. The IQN of the ESX host is iqn.2008-10.local.srm:esx[1-2]. The target IP is the Virtual IP of the corresponding LeftHand VSA Management Group (.84 and .94).
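
For the command-line inclined, the COS side of this boils down to something like the following on esx1 (a sketch from memory; the vmhba32 adapter name may differ on your install, and everything except the SSH part can also be done from the VI Client):

    # allow root logins over SSH
    sed -i 's/^PermitRootLogin no/PermitRootLogin yes/' /etc/ssh/sshd_config
    service sshd restart

    # time synchronization
    echo "server nl.pool.ntp.org" >> /etc/ntp.conf
    esxcfg-firewall -e ntpClient
    service ntpd restart && chkconfig ntpd on

    # VMkernel port (use 10.10.10.91 on esx2)
    esxcfg-vswitch -A "VMkernel" vSwitch0
    esxcfg-vmknic -a -i 10.10.10.81 -n 255.255.255.0 "VMkernel"

    # rename the default VM portgroup (VMNET2 on esx2)
    esxcfg-vswitch -D "VM Network" vSwitch0
    esxcfg-vswitch -A "VMNET1" vSwitch0

    # software iSCSI initiator, pointing at the Site 1 VIP (use .94 on esx2)
    esxcfg-firewall -e swISCSIClient
    esxcfg-swiscsi -e
    vmkiscsi-tool -I -n iqn.2008-10.local.srm:esx1 vmhba32
    vmkiscsi-tool -D -a 10.10.10.84 vmhba32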

Installing and configuring VMware vCenter

I’ve used a default installation of Microsoft Windows Server 2003 R2 Enterprise SP2 as a base for both vCenter machines, and edited the hosts file to contain all applicable hosts.

The VM at Site 1 will be running a Microsoft SQL Server 2005 Express instance to accommodate four databases: one for vCenter and one for SRM at each site. I’ve used ICT-Freak’s blog post to correctly configure SQL for VMware. The VM at the second site only has the SQL Native Client installed. I am aware that running one SQL instance for both sites is not a recommended configuration at all.
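
The databases themselves are nothing special; something along these lines does it (database names, login and password are my own picks, and the blog post above covers the finer points):

    CREATE DATABASE VCDB1;   -- vCenter, Site 1
    CREATE DATABASE VCDB2;   -- vCenter, Site 2
    CREATE DATABASE SRMDB1;  -- SRM, Site 1
    CREATE DATABASE SRMDB2;  -- SRM, Site 2
    GO
    CREATE LOGIN vmware WITH PASSWORD = 'Secret123!', CHECK_POLICY = OFF;
    GO
    USE VCDB1;
    CREATE USER vmware FOR LOGIN vmware;
    EXEC sp_addrolemember 'db_owner', 'vmware';
    GO
    -- repeat the last batch for VCDB2, SRMDB1 and SRMDB2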

This VM will also run LeftHand’s CMC (Centralized Management Console) to manage the VSAs. Installation of this tool is important for this howto, but too simple to actually document.

Installation of vCenter is default as well, although I chose to install only the vCenter role, skipping the client, Update Manager and Converter.

At each site, I’ve created a Datacenter (DC1 and DC2) and a Cluster (C1 and C2), and added the ESX host to the cluster.

Installing and configuring LeftHand’s VSA

I’m using the slimmed-down laptop version of the VSA. This version requires only 224 MB of RAM, which is a good thing in my case, as I only have 4 GB of RAM.

Download and unpack the vsa_demo ZIP file, and copy the extracted folder so you’ll have two VMs. Change the network settings to match the other VMs, and change the display name while you’re at it. Boot both VMs, and change the hostname and IP address.

Open the CMC and start the ‘Find Modules’ Wizard, entering both VSA IP addresses.

Start the ‘Management Groups, Clusters and Volumes’ Wizard, and create a new Management Group.

  • Management Group Name: MG1
  • Storage Module: VSA1
  • Username: administrator
  • Password: password
  • NTP: 81.171.44.131 (an IP address from the nl.pool.ntp.org pool)
  • Cluster: Standard
  • Cluster Name: C1
  • Virtual IP: 10.10.10.84/24
  • Volume Name: V1 (replication level: none, size 500 MB, provisioning: thin)

Repeat the wizard, reusing the previous choices, but change MG1, VSA1, C1, 10.10.10.84 and V1 to MG2, VSA2, C2, 10.10.10.94 and V2 respectively.

Next, run the ‘Access Volume’ Wizard.

  • Management Group: MG1
  • Volume: V1
  • Choose ‘New Volume List’
  • Volume List Name: VL1
  • Permission Level: Read and Write Access
  • Choose ‘New Authentication Group’
  • Authentication Group Name: AG1
  • Keep the load-balancing option disabled
  • Initiator Node Name: iqn.2008-10.local.srm:esx1

Repeat the wizard, reusing the previous choices, but change MG1, V1, VL1, AG1 and iqn.2008-10.local.srm:esx1 to MG2, V2, VL2, AG2 and iqn.2008-10.local.srm:esx2.

We’ve now created two iSCSI LUNs. Each ESX host has read/write access to the LUN at its own site. SRM will take care of giving either ESX host access to the other LUN when running a Recovery Plan, so there’s no need to give the ESX hosts access to the other site’s LUN yourself.
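
After both wizards have run, a rescan on each ESX host should make its LUN appear (the ‘Rescan’ link on the storage adapter in the VI Client does the same thing):

    # rescan the software iSCSI adapter and list the discovered paths
    esxcfg-swiscsi -s
    esxcfg-mpath -l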

We will need LUN-based asynchronous replication. LeftHand Networks uses ‘Scheduled Remote Snapshots’ for this. First, we need to create two additional volumes to accommodate the replicated data from the other site. Right-click the ‘Volumes’ icon in MG1, and choose ‘New Volume’.

  • Volume Name: RS2
  • Size: 500 MB
  • Advanced / Type: Remote

Repeat this step for MG2, creating a volume called ‘RS1’. Now we need to set the time zone: click on MG1, choose the Time tab, choose ‘Time Zone’ in the tasks menu, and set it to your local time zone. Do the same for MG2.

Now we can setup the replication itself:

Navigate to V1, and click the ‘Schedules’ tab. Under ‘Schedule Tasks’, choose ‘New Scheduled Remote Snapshot’.

  • Name: V1->RS1
  • Start at: pick a time about 30 minutes into the future; this gives us enough time to create a VMFS and install a VM on the LUN.
  • Recur: every 30 minutes
  • Remote Snapshot Management Group: MG2
  • Select RS1
  • Retain a maximum of 2 snapshots for both volumes.

Repeat the schedule on V2 in MG2, this time replicating to RS2 in MG1 (name it V2->RS2).

Creating a test environment

First, I needed to create two VMFS datastores on the raw LUNs created earlier. The VMFS on ESX1 is named ‘iSCSI1’, the VMFS on ESX2 is named ‘iSCSI2’. I’ve created two VMs, named VM1 (on the iSCSI1 datastore) and VM2 (on iSCSI2), and installed FreeNAS in them, as its installation is dead simple and doesn’t use too much disk space.
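
I created the datastores through the VI Client, but for completeness: from the COS it would look roughly like this (the vmhba32:0:0:1 path is an example; check the output of esxcfg-vmhbadevs for the real device, which also needs a partition first):

    # map vmhba names to devices, then create a VMFS3 volume labelled iSCSI1
    esxcfg-vmhbadevs
    vmkfstools -C vmfs3 -S iSCSI1 vmhba32:0:0:1

By now, the SAN replication should be starting pretty soon, so we’ll continue with installing SRM.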

Installing VMware Site Recovery Manager

Installing SRM is quite easy: just fill in the correct details for the vCenter server (see figure 2), the site (see figure 3) and the database (see figure 4).


Figure 2. Screenshot of SRM installation – vCenter


Figure 3. Screenshot of SRM installation – Site


Figure 4. Screenshot of SRM installation – Database

After the SRM installation, you’ll need to install the LeftHand Networks Storage Replication Adapter. This is as easy as ‘Next’ – ‘Next’ – ‘Finish’. Just remember to restart the VMware SRM service after installation.
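
Restarting that service can be done through services.msc, or from a command prompt (the exact service name may vary per build; check the services list if this one doesn’t match):

    net stop "VMware Site Recovery Manager Server"
    net start "VMware Site Recovery Manager Server"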

Configuring VMware Site Recovery Manager

Note: I’m using Microsoft Windows Server 2008 (64-bit) as my primary OS. Configuring SRM gave me weird errors while I was creating Recovery Plans. Using Windows Server 2003 (one of the VC/SRM hosts) with the VI Client worked like a charm.

We’ve now got almost all bits and pieces in place; we only need to configure SRM. Be sure that the two VMFS volumes contain VMs and have synchronized.

Launch the VI Client and install the SRM plugin. Restart the client and enable the plugin. Now you’re ready to get going, but first I want to explain how the SRM product is configured: we’re creating a slightly more complex SRM setup, because each site will be protected by the other.

Let’s start by connecting both VC/SRM sites. Notice that this will automatically create a reverse connection on the other SRM.

Figure 5. Screenshot of Remote Site connection

Now we can add an Array Manager.

Figure 6. Add Array Manager

Note that the protection array manager at site 1 is ‘10.10.10.83’, the VSA at site 1. The recovery array manager is ‘10.10.10.93’. When configuring this for SRM at site 2, reverse the IP addresses. Since I’ve only got one VSA per site, I have to fill in the same IP address twice to get it working.

You’ll see a green ‘OK’ icon when SRM correctly identifies the connected LUN and its replicated snapshot.

Figure 7. Correctly configured Array Manager.

Now configure the Inventory Mappings:

Figure 8. Inventory Mappings

The last entry in the ‘Protection Setup’ is the ‘Protection Group’. Add one:

Figures 9, 10 and 11. Adding a Protection Group.

We’ve now configured Site 1. Repeat the above steps to configure Site 2: the site connection has already been made, so configure the Array Manager, the Inventory Mappings and a Protection Group, reversing IP addresses and such.

Now we will create the Recovery Plans. A Recovery Plan for Site 1 is configured at Site 2, and vice versa. This ensures you can execute the plan when the VC/SRM at the other site is down (which is why you bought SRM in the first place!).

Figures 12, 13 and 14. Creating a Recovery Plan.

Conclusion

I’ve shown you a way to completely set up two datacenters, with VMware ESX, vCenter and Site Recovery Manager roles, as well as a LeftHand Networks VSA SAN. All data is replicated between the sites, with SRM protecting both Virtual Infrastructures; this way, either of your locations can burn down. I’ve shown you all the steps necessary to get your SAN replication in working order, as well as a (very simple) Site Recovery Manager setup. You’re now ready to test the Recovery Plans on both sites.

Please let me know if you’ve got your setup working using this guide! I’ll soon be posting the results of a test of one of the Recovery Plans, and of a complete failover, so everyone can see what this post is eventually all about.