Friday, April 12, 2013

Raw Device Mappings, SCSI Bus Sharing and VMotion

I keep bumping into this issue time and again, and it seems I never use exactly the right terminology to explain it. Just today I was talking to Ben and again we disagreed on the topic, at least to some extent. We did not end up arguing, as I once did during a job interview, but settled for a draw.

So once and for all (and mostly just for my brain to remember the terminology by writing it down): VMs using Raw Device Mappings (this applies to both physical and virtual compatibility mode) together with SCSI Bus Sharing (option "Physical" for the SCSI controller, which reads: "Virtual disks can be shared between virtual machines on any server.") cannot be vmotioned! See also KB1003797.

The reason (correct me if I'm wrong, storage is not my strongest side) is that the VM's virtual SCSI controller is mapped through to the host's physical SCSI controller, or rather its HBA, giving the VM exclusive and direct access to the SCSI device.

When to use this configuration?

In order to run certain configurations of a few clustering products, such as Oracle RAC, on VMware ESXi you may need a shared storage device. If you want to run a two-node cluster of any sort by putting VM1 on Host A and VM2 on Host B to maximize your failover capacity, you have quite a few options for setting up your shared storage devices.

A shared VMDK comes to mind: just add a VMDK to VM1 and reuse the same one for VM2. However, this setup does not support concurrent write access (for O10R2 RACs on RHEL4 and 5 this means node crashes).

Software iSCSI inside the VM can also be used and will give you full VMotion capability, as it only relies on a network connection, but you may not get the performance you want or need.

Lastly, adding a raw device mapping on a separate virtual SCSI controller to maximize performance is an option. The SCSI controller's bus sharing has to be set to "physical". While the first two configurations still allow you to migrate the VMs, this setup will greet you with an error message saying that the VM is configured with a device that prevents migration.
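To spot the bus sharing setting that blocks migration without clicking through every VM, you can look at the VM's .vmx file. The following is only a rough sketch in Python: it assumes the controller setting shows up as a line like scsi1.sharedBus = "physical" (which is how I know it from .vmx files), and the function name and the sample text are made up for illustration.

```python
import re

# Matches lines such as: scsi1.sharedBus = "physical"
# (assumed .vmx syntax; check against one of your own files)
_SHARED_BUS = re.compile(
    r'^\s*(scsi\d+)\.sharedBus\s*=\s*"([^"]*)"\s*$', re.IGNORECASE
)


def find_bus_sharing(vmx_text):
    """Return {controller: mode} for every controller whose bus sharing is not 'none'."""
    shared = {}
    for line in vmx_text.splitlines():
        match = _SHARED_BUS.match(line)
        if match and match.group(2).lower() != "none":
            shared[match.group(1).lower()] = match.group(2).lower()
    return shared


# Hypothetical excerpt of a .vmx file, for illustration only.
sample = '''
scsi0.present = "TRUE"
scsi0.virtualDev = "lsilogic"
scsi1.present = "TRUE"
scsi1.sharedBus = "physical"
scsi1:0.fileName = "shared_rdm.vmdk"
'''

print(find_bus_sharing(sample))
```

Any controller reported here is a candidate for the "device that prevents migration" error described above.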

Light at the end of the tunnel!

There is, however, a fully supported way of doing things now. With the introduction of Fault Tolerance it became a necessity to be able to simultaneously write to a VMDK file.

Enter the multi-writer flag (KB1034165).

Disabling the concurrent write access protection of a VMDK solves the problem of cluster nodes blocking VMotion and thus DRS. That problem creates a nightmare scenario for host maintenance, where you have to go through the full length of your change management process, including shutting down the VMs on the host in question. I have heard numerous positive reports about this mechanism but have yet to give it a whirl myself. In any case, VMFS is a capable cluster file system, and provided the underlying storage system did not fall off a dump truck, you should be good to go with this scenario.
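As far as I remember from KB1034165, the flag is set per virtual disk with a .vmx entry of the form scsiX:Y.sharing = "multi-writer". Treat that key name as my recollection and check the KB before relying on it. A small Python sketch that prints the entries for a given controller and its disks (the helper name and the controller/unit numbers are just examples):

```python
def multi_writer_entries(controller, units):
    """Build one .vmx line per shared disk on the given SCSI controller.

    The key format scsiX:Y.sharing is an assumption based on KB1034165.
    """
    return [f'scsi{controller}:{unit}.sharing = "multi-writer"' for unit in units]


# Example: two shared disks on the second SCSI controller.
for line in multi_writer_entries(1, [0, 1]):
    print(line)
```

The same entries have to go into the configuration of every VM that shares the disks.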

Happy clustering!
