Friday, April 19, 2013

Migrate vSphere 4.1 to new host and fresher infra

For the past few weeks one of my vSphere installations has been having major problems that I have not been able to track down properly. I set it up a few years ago and then handed it over to one of my coworkers for management; since then it has become overutilized and poorly managed, and it started acting weird. Recently it has been crashing every three days or so because the transaction log file fills up. The infrastructure (Essentials license) runs on SQL Express 2005 (the one bundled with vSphere 4.1 U2), the DB runs at its default settings (simple recovery model, one transaction log file, auto grow, max size 2 GB) and there is a poorly configured backup (my bad, I was a bit of a greenhorn back then). What happens now is that a transaction running on the DB fills up the log file and then crashes VPXD. Not ideal.
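To catch a filling log before it takes VPXD down, the output of `DBCC SQLPERF(LOGSPACE)` (database name, log size in MB, log space used in percent, status) can be checked against the 2 GB cap. The following is only a rough sketch, not something I ran against this installation: the rows are assumed to have been fetched already by whatever SQL client is at hand, and the database name `VIM_VCDB` and the 80 % threshold are assumptions.

```python
from typing import Iterable, List, NamedTuple


class LogSpace(NamedTuple):
    database: str
    log_size_mb: float
    used_percent: float


def parse_logspace_rows(rows: Iterable[tuple]) -> List[LogSpace]:
    """Convert rows shaped like DBCC SQLPERF(LOGSPACE) output into records.

    Only the first three columns (name, size in MB, percent used) are read.
    """
    return [LogSpace(str(r[0]), float(r[1]), float(r[2])) for r in rows]


def logs_near_limit(entries: Iterable[LogSpace],
                    max_size_mb: float = 2048.0,
                    threshold: float = 0.8) -> List[LogSpace]:
    """Return the entries whose used log space is at or above threshold * max_size_mb."""
    flagged = []
    for entry in entries:
        used_mb = entry.log_size_mb * entry.used_percent / 100.0
        if used_mb >= max_size_mb * threshold:
            flagged.append(entry)
    return flagged


# Example rows; the names and numbers are made up for illustration.
sample_rows = [("VIM_VCDB", 2000.0, 95.0, 0), ("master", 1.0, 40.0, 0)]
for entry in logs_near_limit(parse_logspace_rows(sample_rows)):
    print(f"{entry.database}: log is close to its maximum size")
```

Run from a scheduled task, a check like this would at least give a warning before the log hits its limit.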

Because there are a few other things that bother me about this environment, I figured it's about time I moved it to a separate VM and reinstalled it properly, leveraging all the experience of the past 3 years or so and taking into account the best practices from my vSphere Install, Configure and Manage course and certification last year.

The new setup

First I rolled out a Windows 2008 R2 server (64-bit), configured the hostname and DNS domain, added it to our Samba domain (remember to configure this), installed all available updates and gave it a few reboots for good measure. Once the box was up and running smoothly I installed a fresh SQL Express 2008 with Management Tools from here (I actually installed a few other versions before that, only to keep finding compatibility issues with W2K8R2 and so forth, but this one is fully supported). Once that was out of the way I imported an older backup of the production database, and all went smoothly.
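For reference, taking a backup on the old server and importing it on the new one boils down to one `BACKUP DATABASE` and one `RESTORE DATABASE` statement. The sketch below only builds the T-SQL text; it is not a record of the exact commands I used. The database name `VIM_VCDB` and the file path are assumptions, and the statements still have to be run through Management Studio or another SQL client.

```python
def quote_identifier(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value: str) -> str:
    """Quote a T-SQL Unicode string literal, doubling any single quote."""
    return "N'" + value.replace("'", "''") + "'"


def backup_statement(database: str, backup_file: str) -> str:
    """Build a full-backup statement for the given database."""
    # WITH INIT overwrites any backup sets already present in the target file.
    return (
        f"BACKUP DATABASE {quote_identifier(database)} "
        f"TO DISK = {quote_literal(backup_file)} WITH INIT;"
    )


def restore_statement(database: str, backup_file: str) -> str:
    """Build a restore statement that replaces an existing database."""
    # WITH REPLACE overwrites an existing database of the same name.
    return (
        f"RESTORE DATABASE {quote_identifier(database)} "
        f"FROM DISK = {quote_literal(backup_file)} WITH REPLACE;"
    )


# Hypothetical database name and path, for illustration only.
print(backup_statement("VIM_VCDB", r"C:\backup\vcdb.bak"))
print(restore_statement("VIM_VCDB", r"C:\backup\vcdb.bak"))
```

If the data and log file paths on the new server differ from the old one, the restore additionally needs `MOVE` clauses, which this sketch leaves out.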

The next step was to create a data source. Following the documentation (which I did not need for this project, I might proudly say) I created a System DSN that points to the newly created and populated database and tested it. The appropriate SQL client libs had already been installed along with the SQL Server Express Edition. Keep in mind that for a client/server setup you need to install a supported SQL client; the OS's preinstalled client has a beard longer than Santa Claus... well, you get the point. The Express Edition of SQL 2008 R2 seems to have TCP/IP enabled by default, which I seem to remember is different on the full-blown servers. Also, 2008 does install a SQL agent, but don't bother trying to activate it; it will not work. Very confusing.
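The DSN test from the ODBC dialog can also be repeated from a script, which is handy for checking the data source again later. This is a minimal sketch assuming Python with the third-party `pyodbc` package on the vCenter box; the DSN name `vcenter_dsn` is made up, and I used the ODBC administrator's own test button, not this script.

```python
from typing import Optional


def dsn_connection_string(dsn: str, user: Optional[str] = None,
                          password: Optional[str] = None) -> str:
    """Build an ODBC connection string for an existing System DSN."""
    if user is None:
        # Windows authentication; keyword understood by the SQL Server ODBC drivers.
        return f"DSN={dsn};Trusted_Connection=yes"
    # SQL authentication; passwords containing ';' or braces would need
    # extra escaping, which this sketch does not handle.
    return f"DSN={dsn};UID={user};PWD={password or ''}"


def server_version(conn_str: str, login_timeout: int = 5) -> str:
    """Connect through the DSN and return the server's version banner."""
    import pyodbc  # third-party; imported lazily so the helper above works without it

    connection = pyodbc.connect(conn_str, timeout=login_timeout)
    try:
        return connection.cursor().execute("SELECT @@VERSION").fetchone()[0]
    finally:
        connection.close()


# Hypothetical DSN name; server_version() needs a reachable SQL Server.
print(dsn_connection_string("vcenter_dsn"))
```

Calling `server_version(dsn_connection_string("vcenter_dsn"))` should either print the version banner or raise a `pyodbc` error that says what is wrong with the data source.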

With SQL Express and the data source working smoothly it was time to install vCenter. At this point I was merely doing a test installation, so I had to be careful not to have it connect to my ESX hosts right away. The installation is pretty straightforward. Just choose your precreated DSN, let the installer use the existing database instead of wiping it and choose to manually connect your hosts and update the agents. I found, however, that the installation should be done as local admin, not domain admin. This might have something to do with our Samba domain, might be related to something else or might even be in the documentation. It did cost me a few hairs, but in the end it was nothing that could not be conquered.

Once the installation was finished, I connected to the newly created vCenter and found everything in place: hosts, VMs, folder structures, resource pools, permissions and so on and so forth. Tests finished, ready to roll.

The day of the migration


During off hours this morning (a 6-hour time shift can be very helpful sometimes) I went ahead and stopped the production and newly created VPXDs, web services and so on, created a new backup of the database and moved it to the newly created environment. After restoring the database I confidently went to start the VPXD, only to find it crashing right away. What went wrong?

For the newly created installation I had used the latest (and greatest?) release of 4.1, Update 3. The database schema of U3 is not compatible with that of U2, naturally. There might be a quicker way to solve this, but I opted for a quick reinstallation of vSphere, as this would update the database accordingly. After all, I had no reason to expect failure and, in the unlikely event something did actually go wrong, I could still fall back to the old and tested (and buggy) environment. Everything went as expected though.

Once the setup was finished I connected to the vCenter, reconnected my three hosts and found everything to be working just like it should. Playing around with the vSphere client I noticed only a few things to be off.

  1. vSphere Service Status display was not working
  2. Two of the plugins available for installation were not working as expected.

I had an idea what these issues might be related to. The old setup was not based on FQDNs; everything was hardwired to IP addresses. The Service Status module would thus try to connect to my old environment, which had been shut down. This post, however, quickly resolved issue number one. Just replace the mentioned variables' values with the correct ones, restart vCenter for good measure and reconnect with the vSphere client. Now the status is working.
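The fix itself is nothing more than swapping the old IP address for the new server's FQDN in a handful of URL-valued settings. As an illustration only, here is a small sketch of that substitution. The setting names, the IP address and the FQDN are placeholders (the real variable names are the ones from the linked post), and I changed the values by hand in the vSphere client, not with a script.

```python
from typing import Dict
from urllib.parse import urlsplit, urlunsplit


def repoint_url(url: str, old_host: str, new_host: str) -> str:
    """Replace old_host with new_host in a URL, keeping scheme, port and path.

    Values that are not URLs, or that point at a different host, are returned
    unchanged. Any user info embedded in the URL is dropped.
    """
    parts = urlsplit(url)
    if parts.hostname != old_host:
        return url
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def repoint_settings(settings: Dict[str, str], old_host: str, new_host: str) -> Dict[str, str]:
    """Return a copy of the settings with every URL moved to the new host."""
    return {key: repoint_url(value, old_host, new_host) for key, value in settings.items()}


# Placeholder keys and addresses, for illustration only.
old_settings = {
    "example.ApiUrl": "https://192.0.2.10:443/sdk",
    "example.Unrelated": "some other value",
}
for key, value in repoint_settings(old_settings, "192.0.2.10", "vcenter.example.com").items():
    print(f"{key} = {value}")
```

Parsing the URL instead of doing a plain string replace avoids accidentally rewriting an address that merely contains the old IP as a substring.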

As for the plugins, one is the converter plugin. I tried installing it through the vSphere client and it pointed to my old IP address. Just install the converter on your vSphere host and it will prompt for the connection data of the new environment, thus registering the installation properly. The second plugin is vcIntegrity, which, as far as I can tell, is Update Manager related. I could just go and install the Update Manager on my new box, but I don't want that. For such a small environment I opt to manage updates manually in the future, so I will have to look into how to fix that minor issue (is it really minor?). vcIntegrity also shows up as an error in vSphere Service Status.

Cleanup

I have an Operations Manager Appliance running, which wouldn't reconnect to the new environment. I was running very short on time and had always treated it as fire and forget, so I didn't know how to access the admin UI. I guess I would have been able to rerun the setup wizard, but I just redeployed the appliance. The HP Virtualization Performance Viewer, however, had already been using an FQDN, and moving it to the new environment worked seamlessly.

I'm happy to see that the migration was rather simple, with very few caveats. Log file usage so far is good; I will have to wait and see how it develops in the days to come. The DB itself is rather resilient: I recently found that it has self-repairing capabilities when I was trying to work around the issues I had. All in all I'm quite happy and think I can go on vacation tonight and have sound nights of sleep without having to worry about a crashing vSphere environment any more.
