Wednesday, August 14, 2013

De-mystifying the multi-site SRM installation



There are many complexities to a single-site VMware vCenter Site Recovery Manager (SRM) installation, so when most people hear "multi-site SRM installation" their eyes roll back and they foam at the mouth. This shouldn't be the case. Unfortunately, there isn't much SOLID documentation on what is needed for a multi-site install and exactly how to do it (there is a link at the bottom of this page to the VMware-supplied documentation). I aim to fix that. This post covers not only the theory behind the multi-site install but also a step-by-step walk-through of the entire installation.

The Theory

What do I need for a multi-site configuration? Can I fail over from all sites to all sites? How do I connect to all of my sites? These are all questions that, unfortunately, the documentation out there doesn't cover very well. This is by no means an exhaustive list of the questions you might have, but I am aiming to hit the big ones.

What do I need for a multi-site SRM configuration?

Well, the first thing you need is a better term. "Multi-site" implies that vCenter is going to communicate with more than 2 sites at a time. This is wrong. The current limitation is that vCenter can communicate with 1 and only 1 pair of SRM servers at a time. That being said, vCenter CAN be paired to multiple pairs of SRM servers, hence "multi-site".

The illustration below is a typical 3 site SRM configuration:




In this configuration, we have 7 servers: Production VC, Production SRM Alpha, Production SRM Beta, DR VC Alpha, DR SRM Alpha, DR VC Beta and DR SRM Beta. This can be any mix of physical and virtual servers you like; the only limitation is that you can't have the 2 Production SRMs on the same box. (The illustration shows the best practice, which is to deploy every service on its own server, be it virtual or physical. Some people like to consolidate by putting the vCenter and SRM services on the same server. This WILL work; the only limitation is, as stated above, that you can't have the 2 Production SRMs on the same box.)

As you can see, the production vCenter is connected to both SRM pairs through 1 line. This is an important observation because, as I said before, the vCenter can only be connected to 1 pair of SRM servers at a time. 
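If it helps to see the same layout as data, here is a rough Python sketch of the 7 servers and the 2 SRM pairs. The dictionary layout and the `pairs_at_site` helper are my own illustration and have nothing to do with any VMware tooling.

```python
# The seven boxes from the 3-site example, mapped to the site they live at.
SERVERS = {
    "Production VC": "Prod",
    "Production SRM Alpha": "Prod",
    "Production SRM Beta": "Prod",
    "DR VC Alpha": "DR Alpha",
    "DR SRM Alpha": "DR Alpha",
    "DR VC Beta": "DR Beta",
    "DR SRM Beta": "DR Beta",
}

# Which two SRM servers make up each pair (illustrative structure only).
PAIRS = {
    "Alpha": ("Production SRM Alpha", "DR SRM Alpha"),
    "Beta": ("Production SRM Beta", "DR SRM Beta"),
}


def pairs_at_site(site):
    """Return the names of the SRM pairs that have a member at the given site."""
    return sorted(
        name
        for name, members in PAIRS.items()
        if any(SERVERS[member] == site for member in members)
    )


print(len(SERVERS), "servers in total")
print("Pairs at Prod:", pairs_at_site("Prod"))
print("Pairs at DR Alpha:", pairs_at_site("DR Alpha"))
```

The production site is a member of both pairs, while each DR site only belongs to one.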

Can I fail over from all sites to all sites?

Purple (yes, no, sort of). Since vCenter can only connect to 1 pair of SRM servers, you can't share a connection. In the example above, this means that you cannot fail over a VM directly from DR Alpha to DR Beta. You can fail a VM over from DR Alpha to Prod, from DR Beta to Prod, from Prod to DR Alpha and from Prod to DR Beta. This means that, in a way, you could still get a VM from DR Alpha to DR Beta. To do this, you would need to fail over from DR Alpha to Prod and then from Prod to DR Beta. One might ask, "Is there a better way to do this?" Don't worry, we are getting there.
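To make the "sort of" a bit more concrete, here is a small Python sketch that treats the sites as a graph and looks for the shortest chain of failovers between two of them. The `LINKS` table and the `failover_path` function are purely my own illustration of the reasoning above; SRM itself does nothing like this for you.

```python
from collections import deque

# Direct failover links in the 3-site example: each DR site pairs only with Prod.
LINKS = {
    "Prod": {"DR Alpha", "DR Beta"},
    "DR Alpha": {"Prod"},
    "DR Beta": {"Prod"},
}


def failover_path(links, source, target):
    """Breadth-first search for the shortest chain of sites, or None if unreachable."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in sorted(links.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(failover_path(LINKS, "DR Alpha", "DR Beta"))
```

For DR Alpha to DR Beta, the only route goes through Prod, which is exactly the two-hop failover described above.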

So how do I fail over directly from all sites to all sites?

A MULTI multi site configuration (don't worry, it's not as bad as it sounds).

The illustration below is how you would accomplish this:



In this configuration, all sites have a direct link to each other. This means that you can fail over directly from any site to any site. This would be a great model if you have multiple production sites and you want them all to be able to protect each other. While this isn't a typical configuration, the potential here is great. You can greatly increase the flexibility by adding only 2 more SRM servers and doing one more install. In my eyes, this is the best multi-site SRM configuration.
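For a rough feel of how this scales, the Python sketch below counts the pairs and SRM servers in a fully linked layout. It assumes every pair of sites gets its own 2 SRM servers, the same way the Alpha and Beta pairs do in the 3-site example; the `full_mesh` function is just my own illustration.

```python
from itertools import combinations


def full_mesh(sites):
    """Return every site-to-site pair and the SRM server count (2 per pair, by assumption)."""
    pairs = list(combinations(sites, 2))
    return pairs, 2 * len(pairs)


pairs, srm_servers = full_mesh(["Prod", "DR Alpha", "DR Beta"])
for a, b in pairs:
    print(f"{a} <-> {b}")
print("SRM servers needed:", srm_servers)
```

With 3 sites that works out to 3 pairs and 6 SRM servers, which is 2 more than the 4 SRM servers in the typical 3-site layout.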

WARNING** The information above has NOT been tested in my labs, so I cannot guarantee it will work. When I get the chance, I will test it. If somebody already has this running, let me know. Until then, use this method at your own risk.

I'm sick of theory, let's get to the nitty gritty install


Alright, you asked for it. Below is a step-by-step walk-through for the install of the SRM multi-site configuration. Below each picture is an explanation of exactly what is going on in the step, as well as what to note for the next install. Keep in mind that this is one out of 4 installs, and each one will NOT be identical. Hopefully, once you have walked through the first one, the next ones will make more and more sense (and if you have questions, tweet them to me @SRM_Guru). Also, for security purposes, I have blurred out any IP addresses, FQDNs or anything else that may contain confidential information. I describe any fields that are not self-explanatory in the description of the image.



Step 1.
To run the multi-site install, you need to run the installer from the command line. Use the command:

#VMware-srm-5.1.0-941848.exe /V"CUSTOM_SETUP=1"

Step 2.


Step 3.


Step 4.


Step 5.
vSphere Replication is not required, but you might as well install it and try it out unless you really don't want it.


Step 6.
The vCenter Server Address should be the Fully Qualified Domain Name (FQDN) rather than the IP address, if at all possible, to avoid issues in the future.

Step 7.

This security warning is normal as long as you are using self-signed (not custom) certificates.

Step 8.


Step 9.
This can be anything and doesn't make a difference when you are pairing sites. Make it something that makes sense, but don't fret over what you make it.


Step 10.
The local site name should be the name of the site. Most people use the name of the vCenter here. The local host name should be the FQDN instead of the IP address.


Step 11.
Make sure you use the Custom SRM Plugin Identifier here. This is the "multi-site" option.


Step 12.
This SRM ID is what is shared between sites. You can see these in the images above under theory. Make sure you write these down because each pair of SRM servers MUST share the same SRM ID.


Step 13.
User name and password should be the credentials for the SRM DB.


Step 14.
And you're done!

Step 15 is to lather, rinse, repeat. You will need to run the installer on all 4 SRM servers the same way. Each production SRM server and its DR partner should share an SRM ID. In the graphic below, you can see that the 2 SRM servers for Alpha share one SRM ID and the 2 SRM servers for Beta share another (this is the value entered in step 12). This is really the only difference between the multi-site install and the regular install.
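Before running the 4 installs, it can help to write the plan down and sanity check it. The Python sketch below does that against the rules from this walk-through: each SRM ID is used by exactly 2 servers, no 2 SRM installs land on the same box, and hosts are given as FQDNs rather than IPs. The host names, SRM ID values and the `check_plan` function are all placeholders I made up for illustration.

```python
import ipaddress
from collections import Counter

# Hypothetical install plan: (SRM server host, site, SRM ID). Every value is a placeholder.
PLAN = [
    ("srm-prod-a.example.com", "Prod", "alpha"),
    ("srm-prod-b.example.com", "Prod", "beta"),
    ("srm-dr-a.example.com", "DR Alpha", "alpha"),
    ("srm-dr-b.example.com", "DR Beta", "beta"),
]


def is_ip(host):
    """True if the host string parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def check_plan(plan):
    """Return a list of human-readable problems; an empty list means the plan looks sane."""
    problems = []
    # Each SRM ID should be shared by exactly one pair of servers.
    for srm_id, count in Counter(srm_id for _, _, srm_id in plan).items():
        if count != 2:
            problems.append(f"SRM ID {srm_id!r} is used by {count} server(s), expected 2")
    # Two SRM installs should not share a box.
    for host, count in Counter(host for host, _, _ in plan).items():
        if count > 1:
            problems.append(f"{host} is listed for {count} SRM installs")
    # The walk-through recommends FQDNs over IP addresses.
    for host, _, _ in plan:
        if is_ip(host):
            problems.append(f"{host} is an IP address; use the FQDN instead")
    return problems


print(check_plan(PLAN) or "Plan looks fine")
```

This only checks the plan on paper. It does not talk to vCenter or SRM in any way.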



Hope this helps somebody. I know I wished this was out there my first time installing it. Don't forget to follow me on Twitter @SRM_Guru. Thanks, all!

VMware documentation:
**********************************************Disclaimer**********************************************
This blog is in no way sponsored, supported or endorsed by VMware. Any configuration or environmental changes are to be made at your own risk. Casey, VMware, and any other company and/or persons mentioned in this blog take no responsibility for anything.  















4 comments:

  1. Is this a supported configuration? For example, if I had EMC RecoverPoint doing 3-site replication, would I be able to fail over with SRM like this: A --> B, then B --> C, and then C --> A?

  2. Hi Marcin,

    I have never had a chance to actually deploy this in my lab, but in theory this should be supported, and yes, if you set it up like this you could theoretically do this. However, you would need 1 intermediary step between each of these transfers: a storage vMotion (or cold migration) of the VM from the replicated storage between the first 2 sites to the replicated storage between the second 2 sites, as these can't be shared with any current SAN technology (that I know of, at least).

    For example, let's say you have TestVM on site A, on datastore ReplicatedLUN-AtoB. You could fail TestVM over from site A to site B. Now that you are on site B, though, the storage that the VM is sitting on isn't replicated to site C (remember, it's only replicated between sites A and B). To get over this hump, you would need to storage vMotion TestVM from ReplicatedLUN-AtoB to ReplicatedLUN-BtoC (which is replicated between sites B and C). Once the storage vMotion was completed and the SAN replication had completed, THEN you could migrate the VM from site B to site C. Once at site C, to get to site A you would need to complete a similar process. This is obviously using SAN-based replication, but vSphere Replication would have very similar requirements: the need to replicate from the destination site of the first migration to the destination site of the subsequent migration. I hope this answers your question, and if you do deploy this, please let me know how it goes!

    Replies
    1. Thank you for your reply. Looks like it might work, but it requires a lot more storage and some manual work. Hope VMware will come up with something in SRM to support 3 to 4 sites. Thanks!!!

  3. This has helped me set up multi-site SRM with 2 protected sites backing up to one DR site. I do have one question. When I did the install, I installed version 5.1 because our vCenter was 5.1. Now we are going to upgrade our vCenter to 5.5, so when I run the upgrade on the SRM servers, should I use the command-line custom install?
