Isolation mode on VMware ESX servers | Rickard Andersson

Isolation mode on VMware ESX servers

Last week at work, we had a network engineer look into some issues we’d been having with one of the switches in the server room. The switch in question was the main LAN switch to which all our servers are connected. Turns out, the switch needed to be rebooted to solve the problem. We didn’t think anything of it apart from the fact that the servers would be off the net for a few minutes.

The morning after, as we logged in to a few of the servers, we discovered that all virtual machines had shut down unexpectedly the night before. My first instinct was that the network engineer must have disconnected the power to the VMware servers for some odd reason and that this was the cause of the disturbance. But after some digging around, we found that none of the ESX servers had rebooted. So why had all the virtual machines shut down?

Turns out, there’s a feature in ESX called isolation mode. When you’ve set up your ESX cluster for HA (High Availability) and an ESX server loses contact with both the other ESX servers and its gateway, the server considers itself isolated, and the default action when isolated is to shut down all virtual machines! In other words, if you have to disconnect your ESX servers from the network, look into isolation mode first.
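
To make the decision concrete, here’s a rough sketch of it in Python. This is only a toy model of the behaviour described above, not VMware’s code or API: the names `NetworkView`, `IsolationResponse`, `is_isolated` and `vm_states_after_check` are all made up for illustration, and the real HA agent works with heartbeats and timeouts that I’m ignoring here.

```python
from dataclasses import dataclass
from enum import Enum


class IsolationResponse(Enum):
    """Hypothetical names for what a host does with its VMs once isolated."""
    POWER_OFF = "power_off"
    LEAVE_POWERED_ON = "leave_powered_on"


@dataclass(frozen=True)
class NetworkView:
    """What a single host can see of the network (illustrative only)."""
    reachable_peers: int      # other cluster hosts that still respond
    gateway_reachable: bool   # whether the host's gateway still responds


def is_isolated(view: NetworkView) -> bool:
    # Isolated means: no contact with any other host AND no contact with the gateway.
    return view.reachable_peers == 0 and not view.gateway_reachable


def vm_states_after_check(view, vm_names, response):
    # VMs are only touched if the host is isolated and the response says power off.
    power_off = is_isolated(view) and response is IsolationResponse.POWER_OFF
    state = "off" if power_off else "on"
    return {name: state for name in vm_names}


# A rebooting LAN switch cuts the host off from both its peers and its gateway.
switch_reboot = NetworkView(reachable_peers=0, gateway_reachable=False)
print(vm_states_after_check(switch_reboot, ["web01", "db01"], IsolationResponse.POWER_OFF))
# {'web01': 'off', 'db01': 'off'}
```

The point of the sketch is that every host in the cluster runs this check on its own. When the shared switch goes down, all of them see zero peers and no gateway at the same time, so all of them power off their VMs, which matches what we saw.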

I’m still not sure why the default is to shut down the machines. I mean, won’t forcibly shutting down a virtual machine potentially cause more problems than letting it run without a network connection?

5 comments

  1. David
    Posted February 26, 2009 at 12:55

    “I’m still not sure why the default is to shut down the machines. I mean, won’t forcibly shutting down a virtual machine potentially cause more problems than letting it run without a network connection?”

    Oh yes. Definitely. No doubt at all in my mind.

    /David

  2. Theo
    Posted February 26, 2009 at 21:10

    The single isolated VMware host has no way to tell whether the same thing happened to the other servers, so it seems very logical.

    The only thing the VMware host can safely assume is that the other VMware hosts in the cluster will mark it as failed and start the guests that were running on it.

  3. Posted February 26, 2009 at 22:17

    You have a point there. The scenario would be similar to when a host has failed.

    David: Hohoho :)

  4. Biju
    Posted March 5, 2009 at 12:26

    Yes, Theo is right. In isolation mode the ESX server will shut down all the VMs to prevent a split-brain scenario. That is, one ESX host won’t know what the other ESX hosts in the cluster are doing with its VMs.

    Hence it shuts down the VMs (though not gracefully). This ensures that the VM files are unlocked on the VMFS and the VMs can be started from another ESX server.

  5. Paul
    Posted December 31, 2009 at 22:32

    Good news, probably. VMware KB article 1003165 dated 8-Dec-09 says “For VMware VirtualCenter 2.5.x and later, the default isolation response is Leave VM Powered ON.”
