Tuesday, August 20, 2013

Re-sync when everything is outa sync



Since vSphere Replication hit last year, I have had to walk countless people through a vSphere Replication re-deploy. The re-deploy is for another post but what I want to cover here is the BEST way to re-create your replications with the least amount of hassle.

The setup:


My VM, cLevingerAD, is replicating successfully from our production site to the DR site. I have shown it here in SRM, but it could be in the 5.1 web client as well.

The problem:


I need to stop replication and re-start it. This could be for a million reasons. Some of the common ones: you need to make a change to the VM, you need to re-deploy vSphere Replication, you failed over and now need to reverse replication, or you need to stop replication for some business reason but want to enable it later. Whatever the reason, you have a need, so let's give you a solution.

So, how do I go about this? Well, you could hit the "Remove Replication" button and then just re-replicate everything AGAIN but this isn't the best way to go about this. Instead, we can preserve the remote VMDKs and use them as initial seeds. This means we don't have to replicate any of the already-replicated information. vSphere Replication will go through the disks at the Production and DR sites and compare them. It will figure out what is different and then only replicate the changes made while replication was off.
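To make the seed idea concrete, here is a minimal Python sketch of a checksum-based comparison. It is only a conceptual illustration: the block size, the function names and the use of SHA-256 are my assumptions, not a description of what vSphere Replication does internally.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a disk image (here just a bytes object) into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(source: bytes, seed: bytes, size: int = BLOCK_SIZE) -> list[int]:
    """Return the indexes of source blocks that differ from the seed copy."""
    src = split_blocks(source, size)
    dst = split_blocks(seed, size)
    changed = []
    for i, block in enumerate(src):
        seed_block = dst[i] if i < len(dst) else None
        # Compare checksums rather than raw data, as a remote compare would.
        if seed_block is None or (
            hashlib.sha256(block).digest() != hashlib.sha256(seed_block).digest()
        ):
            changed.append(i)
    return changed


def sync(source: bytes, seed: bytes, size: int = BLOCK_SIZE) -> tuple[bytes, int]:
    """Patch the seed so it matches the source; return (result, blocks sent)."""
    src = split_blocks(source, size)
    dst = split_blocks(seed, size)[:len(src)]
    dst.extend([b""] * (len(src) - len(dst)))
    to_send = changed_blocks(source, seed, size)
    for i in to_send:
        dst[i] = src[i]  # only these blocks would cross the WAN
    return b"".join(dst), len(to_send)


if __name__ == "__main__":
    production = b"AAAABBBBCCCCDDDD"
    dr_seed = b"AAAAXXXXCCCCDDDD"  # one block changed while replication was off
    result, sent = sync(production, dr_seed)
    print(result == production, sent)
```

Every block still has to be read and checksummed on both sides, but only the blocks that differ get sent.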

The procedure:


So how do we do this magic? Easy. First, we want to pause replication.
This ensures that no operations go through while we make changes to the back-end storage.

Next, we need to change the name of the VM's folder at the DR site. I usually add "(hold)" to the end of the folder name. When we remove replication later, the vSphere Replication Appliance looks for the folder name that was used when replication was initially created. Since that folder is no longer there (cLevingerAD ≠ cLevingerAD(hold)), vSphere Replication will leave the renamed folder alone.





After changing the folder name, we can safely remove replication.


Now we can make all the changes we want to the VM. Once we are done messing around with it, we can re-enable replication using the old disks as initial full seeds. The first thing we want to do is change the folder name back to the original name. This isn't absolutely necessary but a good idea so that all of the folder names are the same. 



After re-re-naming the folder, we can enable replication for the VM. Click the VM, then click the vSphere Replication tab and click "Configure Replication". This will bring up the vSphere Replication configuration page. 




Here is where we work our magic. When it asks for the destination, we are going to specify the old datastore. 




Hit OK and if you did everything right, a message should pop up saying that an initial seed was found and do you want to use it. Duh, of course we want to use it.




 Finish the configuration (note that on the summary page next to "initial seed found" we see "yes").






So you finish this and you expect to see a regular sync going through. WRONG. You will see "initial full sync" just like you would if you replicated from scratch. 




So what was the point of all of that?! This is normal. The initial full sync is mapping the 2 VMDKs and only replicating the changes made. It will take a little longer than a regular sync because of the mapping process, but not nearly as long as replicating ALL the data again.

All in all, this process can be used for a lot of different reasons (re-deploying is one of them, and I will cover it in another post). Hopefully this sheds some light on how to avoid re-replication and makes your vSphere Replication experience a little better (and faster!).

Thanks for reading and don't forget to follow me on Twitter! @SRM_Guru

**********************************************Disclaimer**********************************************
This blog is in no way sponsored, supported or endorsed by VMware. Any configuration or environmental changes are to be made at your own risk. Casey, VMware, and any other company and/or persons mentioned in this blog take no responsibility for anything.  

Wednesday, August 14, 2013

De-mystifying the multi-site SRM installation



There are many complexities to a single-site VMware vCenter Site Recovery Manager (SRM) installation so when most people hear "multi-site SRM installation" their eyes roll back and they foam at the mouth. This shouldn't be the case. Unfortunately, there isn't much SOLID documentation on what is needed for a multi-site install and exactly how to do it (there is a link at the bottom of this page to the VMware-supplied documentation). I aim to fix that. This post is going to cover not only the theory behind the multi-site install but also a step-by-step walk through of the entire installation.

The Theory

What do I need for a multi-site configuration? Can I fail over from all sites to all sites? How do I connect to all of my sites? These are all questions that, unfortunately, the documentation out there doesn't cover very well. This is, by no means, an exhaustive list of all the questions you might have but I am aiming to hit the big ones. 

What do I need for a multi-site SRM configuration?

Well the first thing you need is a better term. "Multi-site" implies that vCenter is going to communicate with more than 2 sites at a time. This is wrong. The current limitation is that vCenter can communicate with 1 and only 1 pair of SRM servers at a time. That being said, vCenter CAN be paired to multiple pairs of SRM servers, hence "Multi-site". 

The illustration below is a typical 3 site SRM configuration:




In this configuration, we have 7 servers: Production VC, Production SRM Alpha, Production SRM Beta, DR VC Alpha, DR SRM Alpha, DR VC Beta and DR SRM Beta. This can be any mix of physical and virtual servers you like; the only limitation is that you can't have the 2 Production SRMs on the same box. (This shows the best practice, which is to deploy every service on its own server, be it virtual or physical. Some people like to consolidate by putting the vCenter and SRM services on the same server. This WILL work; the only limitation is, as stated above, that you can't have the 2 Production SRMs on the same box.)

As you can see, the production vCenter is connected to both SRM pairs through 1 line. This is an important observation because, as I said before, the vCenter can only be connected to 1 pair of SRM servers at a time. 

Can I fail over from all sites to all sites?

Purple (yes, no, sort of). Since vCenter can only connect to 1 pair of SRM servers, you can't share a connection. In the example above, this means that you cannot fail over a VM directly from DR Alpha to DR Beta. You can fail a VM over from DR Alpha to Prod, from DR Beta to Prod, from Prod to DR Alpha and from Prod to DR Beta. This means that, in a way, you could still get a VM from DR Alpha to DR Beta: you would fail over from DR Alpha to Prod and then from Prod to DR Beta. One might ask, "Is there a better way to do this?" Don't worry, we are getting there.
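If you want to reason about which failovers are possible, you can treat each SRM pair as a link between two sites and search for a route. The sketch below is plain Python and has nothing to do with any SRM API; the `PAIRS` list and the function name are made up for illustration, and the site names come from the example above.

```python
from collections import deque

# Each tuple is one SRM pair from the 3-site example (Alpha pair, Beta pair).
PAIRS = [("Prod", "DR Alpha"), ("Prod", "DR Beta")]


def failover_path(pairs, start, goal):
    """Return the shortest chain of sites from start to goal, or None."""
    neighbors = {}
    for a, b in pairs:
        # A pair works in both directions (fail over and fail back).
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for site in sorted(neighbors.get(path[-1], ())):
            if site not in seen:
                seen.add(site)
                queue.append(path + [site])
    return None


if __name__ == "__main__":
    # No direct pair exists, so the route has to go through Prod.
    print(failover_path(PAIRS, "DR Alpha", "DR Beta"))
```

A path with more than two sites in it means more than one failover, which is exactly the two-hop route described above.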

So how do I fail over directly from all sites to all sites?

A MULTI multi site configuration (don't worry, it's not as bad as it sounds).

The illustration below is how you would accomplish this:



In this configuration, all sites have a direct link to each other. This means that you can directly fail over from any site to any site. This would be a great model if you have multiple production sites and you want them all to be able to protect each other. While this isn't a typical configuration, the potential here is great. You can greatly increase the flexibility by only adding 2 more SRM servers and doing one more install. In my eyes, this is the best multi-site SRM configuration.
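As a quick sanity check on the server count, here is a small Python sketch of the arithmetic. It assumes every SRM pair needs its own 2 SRM servers (one at each end), which is how the diagrams in this post are drawn; the function names are mine.

```python
def srm_servers_hub(sites: int) -> int:
    """One production site paired with every other site: one pair per DR site."""
    pairs = sites - 1
    return 2 * pairs


def srm_servers_full_mesh(sites: int) -> int:
    """Every site paired directly with every other site."""
    pairs = sites * (sites - 1) // 2
    return 2 * pairs


if __name__ == "__main__":
    hub = srm_servers_hub(3)
    mesh = srm_servers_full_mesh(3)
    print(hub, mesh, mesh - hub)
```

For 3 sites the difference is the 2 extra SRM servers mentioned above. The gap grows quickly with more sites, so do the math before committing to a full mesh.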

WARNING** The information above has NOT been tested in my labs, so I cannot guarantee it will work. I will test it when I get the chance, and if somebody already runs this setup, let me know. Until then, use this method at your own risk.

I'm sick of theory, let's get to the nitty gritty install


Alright, you asked for it. Below is a step-by-step walk-through for the install of the SRM multi-site configuration. Below each picture is an explanation of exactly what is going on in the step as well as what to note for the next install (remember, you are going to do this 4 times). One thing to keep in mind is that this is one out of 4 installs, and each one will NOT be identical. Hopefully, once you have walked through the first one, the next ones will make more and more sense (and if you have questions, tweet them to me @SRM_Guru). Also, for security purposes, I have blurred out any IP addresses, FQDNs or anything else that may contain confidential information. I describe any fields that are not self-explanatory in the description of the image.



Step 1.
To run the multi-site install, you need to run the installer from the command line. Use the command:

#VMware-srm-5.1.0-941848.exe /V"CUSTOM_SETUP=1"

Step 2.


Step 3.


Step 4.


Step 5.
vSphere Replication is not required, but you might as well install it and try it out unless you really don't want it.


Step 6.
vCenter Server Address should be the Fully Qualified Domain Name (FQDN) rather than IP if at all possible to avoid issues in the future.

Step 7.

This security warning is normal as long as you are using self-signed (not custom) certificates.

Step 8.


Step 9.
This can be anything and doesn't make a difference when you are pairing sites. Make it something that makes sense, but don't fret over what you make it.


Step 10.
Local site name should be the name of the site. Most people use the name of the vCenter here. Local host name should be FQDN instead of IP.


Step 11.
Make sure you use the Custom SRM Plugin Identifier here. This is the "Multi site" option.


Step 12.
This SRM ID is what is shared between sites. You can see these in the images above under theory. Make sure you write these down because each pair of SRM servers MUST share the same SRM ID.


Step 13.
User name and password should be the credentials for the SRM DB.


Step 14.
And you're done!

Step 15 is to rinse, lather, repeat. You will need to run the installer on all 4 SRMs the same way. The production SRM and the DR SRM in each pair should share the SRM ID. In the graphic below, you can see that the line between the 2 SRM servers for Alpha and the line between the 2 SRM servers for Beta each carry a single SRM ID (this is the ID entered in step 12). This is really the only difference in the multi-site install versus the regular install.
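Before you start clicking through 4 installers, it can help to write the plan down and check it. The sketch below is a hypothetical Python helper, not something that ships with SRM: the host names, site names and SRM IDs in `PLAN` are invented, and the checks only cover the rules from this post (FQDN instead of IP, each SRM ID shared by exactly one pair at two different sites, and no two SRM installs on the same box).

```python
import ipaddress
from collections import defaultdict

# Hypothetical install plan; every value here is made up for illustration.
PLAN = [
    {"server": "prod-srm-alpha.example.com", "site": "Prod", "srm_id": "alpha"},
    {"server": "dr-srm-alpha.example.com", "site": "DR Alpha", "srm_id": "alpha"},
    {"server": "prod-srm-beta.example.com", "site": "Prod", "srm_id": "beta"},
    {"server": "dr-srm-beta.example.com", "site": "DR Beta", "srm_id": "beta"},
]


def looks_like_ip(host: str) -> bool:
    """True if the host is a literal IPv4/IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def check_plan(plan) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    by_id = defaultdict(list)
    for entry in plan:
        by_id[entry["srm_id"]].append(entry)
        if looks_like_ip(entry["server"]):
            problems.append(f"{entry['server']}: use an FQDN instead of an IP")

    for srm_id, entries in by_id.items():
        if len(entries) != 2:
            problems.append(
                f"SRM ID {srm_id!r}: expected 2 servers, found {len(entries)}"
            )
        elif entries[0]["site"] == entries[1]["site"]:
            problems.append(f"SRM ID {srm_id!r}: both servers are at the same site")

    hosts = [entry["server"] for entry in plan]
    for host in sorted(set(hosts)):
        if hosts.count(host) > 1:
            problems.append(f"{host}: more than one SRM install on the same box")
    return problems


if __name__ == "__main__":
    for problem in check_plan(PLAN) or ["plan looks consistent"]:
        print(problem)
```

Swap in your own server names and IDs; the point is simply to catch a mismatched SRM ID on paper instead of after the fourth install.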



Hope this helps somebody. I know I wished this was out there my first time installing it. Don't forget to follow me on Twitter @SRM_Guru. Thanks, all!

VMware documentation: