Some things to think about when upgrading to vSphere 4

The main things to think about:

Upgrading to vSphere 4 is supposed to be a walk in the park. I prefer to scratch (wipe and reinstall) all the ESX hosts instead of hoping that the update process works 100% as it should. Upgrading vCenter is a "next, next, finish" process (hopefully).

One thing to remember: if you are scratching your vCenter, leave your old vCenter running until all your old hosts are migrated to ESX 4.0. Because the licensing system has changed, you will need to keep the license server running on your old vCenter and configure the new vSphere vCenter to delegate licensing for the old ESX hosts to the old license server (Administration > VirtualCenter Management Server Configuration > License Server). You could also install an old license server on your new vCenter and transfer the licenses. However, this is not worth the effort, because once all the hosts are "upgraded" you don't need it anymore. A small remark here: if you have tools that integrate with vCenter, you will need a separate game plan for them. For example, VMware View is tightly integrated with vCenter.

If you scratched your vCenter, remember that you will have to move all the settings from the old vCenter to the new one. I'm talking about:

  • VirtualCenter Management Server Configuration (such as SMTP and SNMP)

  • Resource pool settings and cluster settings (HA - DRS - DPM). For example, I have a rule that keeps the vCenter DB and vCenter on the same host so they can communicate locally

  • Virtual Machines and templates folders

  • Alarms and automated tasks

  • Permissions

  • Advanced settings, although mostly you don't change these

  • Copy the sysprep files from the old vCenter to the new one. Just a reminder, they are under "[X]:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\sysprep\"

Don't kill me if this list is incomplete.

Next you will need to disconnect and remove the ESX hosts from the original cluster and add them to the new cluster. This will not stop the running VMs on your ESX hosts. However, disabling HA might be a good thing while moving hosts around. If you don't disable DRS you will keep your resource pools, but they will get ugly on the other side. So make sure you have overcapacity.

Some things you should know about your ESX hosts before you scratch them.

  • FQDN / IP / Subnet / Gateway / DNS / NTP server. You will need these settings while installing ESX. You will also need to know the MAC address of the physical NIC that is used for your service console, plus the VLAN your service console is in

  • Partitioning. This matters especially if you have custom agents installed. I like to oversize my ESX console partitions a lot if the local disks are not being used for anything else. Just because I can. Here is an example of an "oversized" ESX console:
    /dev/sdh9 4.9G 1.6G 3.1G 34% /
    /dev/sdg1 1.1G 75M 952M 8% /boot
    /dev/sdh6 2.0G 36M 1.8G 2% /home
    /dev/sdh8 2.0G 88M 1.8G 5% /opt
    /dev/sdh7 3.9G 72M 3.6G 2% /tmp
    /dev/sdh5 3.9G 211M 3.5G 6% /var
    /dev/sdh2 3.9G 74M 3.6G 2% /var/log
    The boot partition was configured automatically. I also create a larger swap partition (1600 MB). Oh, and remember: if you had agents installed, you will have to reinstall them :)

  • Time zone

  • Have a valid license key

  • Make scripts or a clear layout of how your virtual networking is done. An important setting to think about is how your NICs are bound to your vSwitches. Write down the MAC addresses so you can link each MAC address to its vSwitch. I have seen NIC numbering (vmnicX vs. MAC) change while scratching. Also check your port groups: they might override your vSwitch settings. For example, I like to put vMotion on an active/standby NIC configuration, and I mostly override this in the port group if I don't have a lot of physical NICs. Also make sure you have consistent port group names!

  • Know the firewall setup; it is there under advanced settings. For example, you might need to open up the SSH client so that you can scp/ssh between hosts

Writing down all the settings and configuration before you begin helps. You can then upgrade the hosts one by one, following your plan.
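
Since NIC numbering can change while scratching, it helps to compare the vmnic-to-MAC list you wrote down with what the fresh install reports. Below is a minimal Python sketch of that comparison, meant to run on your admin workstation. The function name and the sample MAC addresses are made up for illustration, and how you collect the two lists (by hand, from a script, from the client) is left open.

```python
def find_renumbered_nics(before, after):
    """Compare vmnic -> MAC mappings recorded before and after a reinstall.

    Both arguments are dicts such as {"vmnic0": "00:1a:64:aa:bb:01"}.
    Returns {mac: (old_name, new_name)} for every MAC whose vmnic name
    changed. new_name is None when the MAC is missing after the reinstall.
    """
    # Invert the "after" mapping so the new name can be looked up by MAC.
    name_by_mac_after = {mac.lower(): name for name, mac in after.items()}
    changed = {}
    for old_name, mac in before.items():
        new_name = name_by_mac_after.get(mac.lower())
        if new_name != old_name:
            changed[mac.lower()] = (old_name, new_name)
    return changed


if __name__ == "__main__":
    # Hypothetical values written down before scratching the host.
    before = {"vmnic0": "00:1A:64:AA:BB:01", "vmnic1": "00:1A:64:AA:BB:02"}
    # Hypothetical values read back after the fresh install.
    after = {"vmnic0": "00:1a:64:aa:bb:02", "vmnic1": "00:1a:64:aa:bb:01"}
    for mac, (old, new) in sorted(find_renumbered_nics(before, after).items()):
        print(f"{mac}: was {old}, now {new}")
```

Any MAC that shows up in the result needs its vSwitch uplink checked before the host goes back into the cluster.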

And watch out for these:

  • If you are able to disconnect the SAN, do so! If not, check carefully that you don't overwrite a LUN while installing

  • On IBM HS blades, you will need to enable the VT instructions and the No-Execute (Execute Disable) bit in the BIOS. You'll find them under advanced settings -> CPU

  • Check if the local datastore is empty!

Before you add your scratched hosts to the cluster, MAKE SURE THE NETWORK IS SET UP CORRECTLY. You don't want VMs to end up in VLANs that are not connected to the right uplinks. I also love to ping and vmkping from the console. Also check that you can look up your hostname, your FQDN and your reverse IP.
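
The forward and reverse lookup check can be scripted. Here is a rough Python 3 sketch, meant for an admin workstation rather than the ESX console. The function name, the host name and the IP address in the usage note are invented for illustration. The two resolver arguments default to the standard library calls socket.gethostbyname and socket.gethostbyaddr and are only parameters so the function can be tried out without a real DNS server.

```python
import socket


def check_name_resolution(fqdn, expected_ip,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return a list of problems found; an empty list means both lookups agree."""
    problems = []

    # Forward lookup: the name should resolve to the IP you wrote down.
    try:
        ip = forward(fqdn)
        if ip != expected_ip:
            problems.append(f"{fqdn} resolves to {ip}, expected {expected_ip}")
    except OSError as exc:  # socket.gaierror and socket.herror are OSErrors
        problems.append(f"forward lookup of {fqdn} failed: {exc}")

    # Reverse lookup: the IP should point back to the same name.
    try:
        name = reverse(expected_ip)[0]
        if name.lower().rstrip(".") != fqdn.lower().rstrip("."):
            problems.append(f"{expected_ip} reverses to {name}, expected {fqdn}")
    except OSError as exc:
        problems.append(f"reverse lookup of {expected_ip} failed: {exc}")

    return problems
```

Calling check_name_resolution("esx01.example.local", "10.0.0.11") for each host and printing whatever comes back gives a quick sanity check before the host rejoins the cluster. It does not replace ping and vmkping, since it only looks at DNS.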

Some fun extras here:

Of course our job wouldn't be any fun if you didn't have troubles. Before you start, check that all the VMs are using hardware version 4 and VMware Tools. You cannot VMotion between 3.5 and 4 if the hardware is version 3. So before you do anything, check the hardware version and tools of all the VMs. (I learned the hard way that upgrading from 3 to 7 once everything has been transferred can be a bummer.)
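
One way to check the hardware version in bulk is to read it from each VM's .vmx file, where it is stored in the virtualHW.version key. The Python sketch below works on the text of a .vmx file. The function names are invented for illustration, and how you get at the files (datastore browser, scp, ...) is up to you. It also assumes the key uses the usual key = "value" layout.

```python
import re

# Matches a line such as: virtualHW.version = "4"
_HW_VERSION = re.compile(r'^\s*virtualHW\.version\s*=\s*"(\d+)"\s*$',
                         re.IGNORECASE | re.MULTILINE)


def hardware_version(vmx_text):
    """Return the virtual hardware version as an int, or None if not found."""
    match = _HW_VERSION.search(vmx_text)
    return int(match.group(1)) if match else None


def needs_upgrade_first(vmx_text, minimum=4):
    """True if the VM is below the minimum hardware version.

    A VM whose version cannot be read is also flagged, so that it gets
    checked by hand instead of being silently skipped.
    """
    version = hardware_version(vmx_text)
    return version is None or version < minimum


if __name__ == "__main__":
    sample = 'config.version = "8"\nvirtualHW.version = "3"\n'
    print(hardware_version(sample), needs_upgrade_first(sample))
```

This only covers the hardware version; the VMware Tools state still has to be checked separately.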

While moving a VM, I also encountered this nice error: "Troubleshooting Migration compatibility error: Virtual machine has CPU and/or memory affinities configured, preventing VMotion". Disabling the CPU affinity in the advanced settings of the VM didn't fix the issue. The VM was still starting on the wrong host. Manually removing the following lines from the .vmx file fixed the problem:
sched.cpu.affinity = "0"
sched.cpu.htsharing = "any"
sched.mem.affinity = "all"
Of course this will require downtime of the VM :(
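
If more than one VM is affected, the manual edit can be scripted. This is a small Python sketch that removes exactly those three keys from the text of a .vmx file. The function name and the sample content are invented for illustration. Reading and writing the actual file is left out on purpose: do that on a copy, with the VM powered off.

```python
# The three keys removed by hand in the post, lower-cased for comparison.
AFFINITY_KEYS = ("sched.cpu.affinity", "sched.cpu.htsharing", "sched.mem.affinity")


def strip_affinity(vmx_text):
    """Return vmx_text without the affinity-related lines; other lines are kept as-is."""
    kept = []
    for line in vmx_text.splitlines(keepends=True):
        # Everything before the first "=" is the key.
        key = line.split("=", 1)[0].strip().lower()
        if key in AFFINITY_KEYS:
            continue
        kept.append(line)
    return "".join(kept)


if __name__ == "__main__":
    sample = (
        'displayName = "db01"\n'
        'sched.cpu.affinity = "0"\n'
        'sched.cpu.htsharing = "any"\n'
        'sched.mem.affinity = "all"\n'
        'memsize = "2048"\n'
    )
    print(strip_affinity(sample), end="")
```

Other sched.* settings, such as shares or reservations, are deliberately left alone.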

Installing an rpm via yum is possible. I always used rpm -ivh, but you can also use yum localinstall --disablerepo=* followed by the rpm file. This is nice when you are installing VMware Tools on a Red Hat family VM and are using the VMware Tools ISO (when you choose "install guest tools" in the menu).

PCoIP can frack up your VM's resolution. We had a user who started a PCoIP session with a VM. We are not sure whether the session was shut down cleanly, but when we connected with the console in vCenter, our screen size was fracked. Rebooting the VM didn't help. The only thing that worked was logging on with PCoIP and resizing the screen with the PCoIP client.