An interesting issue came up recently with a customer doing a CX300 > CX3-40 migration, while keeping the hosts the same.
All VMs were shut down and the LUNs were SANCOPY’d across to the new SAN. The servers were then re-attached to the same LUNs, except ESX would not pick them up.
Looking into /vmfs/devices/lvm revealed that the usual host-WWN-serial device descriptors were prefixed by the snap- artefact.
Trawling the VMware forums revealed a few options that can be set per host.
Before we go anywhere, let's go through some simple steps. According to the KB, we need to make sure that the LUN Host IDs on the new SAN match those on the old SAN.
If the above is not possible due to other LUNs occupying those IDs or it doesn’t provide the result required:
On each host, in VirtualCenter/Virtual Infrastructure Client, go into Configuration -> Advanced Settings -> LVM. Set LVM.EnableResignature to 1 and LVM.DisallowSnapshotLun to 0. Then rescan the storage adapters again.
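If you would rather do this from the service console than click through every host, the same change can be sketched as a small shell helper. This is only a sketch: it assumes the esxcfg-advcfg and esxcfg-rescan tools are present on your ESX build, that the option paths mirror the names shown in the client, and that vmhba1 is one of your HBAs. The function names and the DRY_RUN switch are made up for illustration. By default it only prints the commands so you can check them first.

```shell
#!/bin/sh
# Sketch only. ASSUMPTIONS: esxcfg-advcfg / esxcfg-rescan exist on this host,
# and the option paths below match the names shown in the VI Client.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = value for LVM.EnableResignature
# $2 = value for LVM.DisallowSnapshotLun
# remaining arguments = HBAs to rescan afterwards (hypothetical example: vmhba1)
set_lvm_options() {
    resig="$1"
    disallow="$2"
    shift 2
    run esxcfg-advcfg -s "$resig" /LVM/EnableResignature
    run esxcfg-advcfg -s "$disallow" /LVM/DisallowSnapshotLun
    for hba in "$@"; do
        run esxcfg-rescan "$hba"
    done
}
```

Calling set_lvm_options 1 0 vmhba1 prints the three commands; set DRY_RUN=0 once you are happy with what it would run.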
At this point the logical datastore names come up, but with the same prefix problem: a datastore that used to be called T1SAN is now called snap-T1SAN, and renaming it is not possible.
Why does this happen?
It mainly has to do with the .vmx files for each and every VM and Template. This is the structure of the .vmx file:
#!/usr/bin/vmware
config.version = "8"
virtualHW.version = "4"
floppy0.present = "false"
nvram = "eswsdr02.nvram"
powerType.powerOff = "default"
powerType.powerOn = "default"
powerType.suspend = "default"
powerType.reset = "default"
displayName = "eswsdr02"
extendedConfigFile = "eswsdr02.vmxf"
numvcpus = "2"
scsi0.present = "true"
scsi0.sharedBus = "none"
memsize = "3000"
scsi0:0.present = "true"
scsi0:0.fileName = "eswsdr02.vmdk"
scsi0:0.mode = "independent-persistent"
scsi0:0.deviceType = "scsi-hardDisk"
ide0:0.present = "true"
ide0:0.clientDevice = "true"
ide0:0.deviceType = "cdrom-raw"
ide0:0.startConnected = "false"
ethernet0.present = "true"
ethernet0.wakeOnPcktRcv = "false"
ethernet0.networkName = "DRDat"
ethernet0.addressType = "vpx"
ethernet0.generatedAddress = "00:50:56:ad:16:c0"
ethernet1.present = "true"
ethernet1.wakeOnPcktRcv = "false"
ethernet1.networkName = "Oob"
ethernet1.addressType = "vpx"
ethernet1.generatedAddress = "00:50:56:ad:25:45"
ethernet2.present = "true"
ethernet2.wakeOnPcktRcv = "false"
ethernet2.networkName = "DRApp"
ethernet2.addressType = "vpx"
ethernet2.generatedAddress = "00:50:56:ad:5b:57"
guestOS = "winnetenterprise"
uuid.bios = "50 2d b9 14 f3 da 1c d9-22 2b 9c 11 82 1a e3 e3"
log.fileName = "vmware.log"
sched.cpu.min = "0"
sched.cpu.units = "mhz"
sched.cpu.shares = "6384"
sched.mem.minsize = "0"
sched.mem.shares = "3000"
cpuid.80000001.edx = "-----------0--------------------"
cpuid.80000001.edx.amd = "-----------0--------------------"
scsi0:0.redo = ""
tools.syncTime = "FALSE"
vmware.tools.requiredversion = "7202"
workingDir = "."
sched.mem.max = "3000"
uuid.location = "56 4d a7 89 6e 4c 2a b8-0e 1e cf 4b 0b 1f 49 45"
migrate.hostlog = "./eswsdr02-95db24d5.hlog"
sched.swap.derivedName="/vmfs/volumes/474c0c81-b90c9cd0-c6e1-0015173a482a/eswsdr02/eswsdr02-95db24d5.vswp"
sched.cpu.max = "11042"
In the above we can see that the sched.swap.derivedName parameter refers to a volume ID. When we SANCOPY, that ID changes but the metadata remains the same, confusing ESX.
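To see which volume ID a given VM is still pointing at, you can pull it straight out of the .vmx file. The helper below is a minimal sketch: the function name is made up, and it assumes the sched.swap.derivedName line looks like the one in the listing above (a quoted path starting with /vmfs/volumes/).

```shell
#!/bin/sh
# Sketch only: print the VMFS volume ID embedded in sched.swap.derivedName.
# $1 = path to a .vmx file
# Prints nothing if the file has no such line.
vmx_volume_id() {
    sed -n 's|^sched\.swap\.derivedName *= *"/vmfs/volumes/\([^/]*\)/.*|\1|p' "$1"
}
```

Comparing that value against the directory names under /vmfs/volumes on the host shows quickly whether the VM still refers to a volume ID that is no longer there.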
Furthermore, when a machine is registered in ESX, it is registered to a certain LUN by the volume ID.
You cannot remove volumes/LUNs that appear to be in use by the ESX server. That’s where the problem lies.
In VirtualCenter make a note of which VMs lie on which server. Unplug all Fibre cables from the back of the ESX servers that lead to any storage array. You don’t want to make a mistake here.
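If you prefer a written record over a screenshot of VirtualCenter, the note-taking can be done on each service console before any cable is pulled. This is a rough sketch: it assumes vmware-cmd -l on your ESX build lists the .vmx paths of the registered VMs, and both the function name and the output file in the usage line are made up.

```shell
#!/bin/sh
# Sketch only: read .vmx paths on stdin and print "host<TAB>path" lines,
# so the lists from several hosts can be kept in one file.
# $1 = host name to record (defaults to the output of `hostname`)
note_vms() {
    host="${1:-$(hostname)}"
    while IFS= read -r vmx; do
        # Skip blank lines in the input.
        if [ -n "$vmx" ]; then
            printf '%s\t%s\n' "$host" "$vmx"
        fi
    done
}
```

On each host something like vmware-cmd -l | note_vms >> /tmp/vm-inventory.txt would then build the list; copy the file off the host before you start unplugging things.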
Then Remove from Inventory all disconnected VMs and Templates. On each host, in VirtualCenter/Virtual Infrastructure Client, go into Configuration -> Advanced Settings -> LVM. Set LVM.EnableResignature back to 0 and LVM.DisallowSnapshotLun back to 1, which are the defaults. We no longer need the changed values.
Once all VMs and Templates have been removed, in the Virtual Infrastructure Client/VirtualCenter, hit the Ctrl+Shift+D combination on your keyboard. Under Datastores you should see only the local VMFS on the local disks of your servers.
Congratulations, you’ve just cleaned out your ghost LUNs.
Plug in the fibre again and rescan. Voila! Your LUNs should be back without any weird prefixes.
Now comes the fun part, re-registering all your VMs – this can be done by double clicking on every datastore re-found and browsing the VM folders within, adding the .vmx files to the inventory.
However, this can be time consuming, as each addition requires typing in the server name as it should be called in VirtualCenter. This may not always match the folder name in the datastore.
The best way to overcome this is to open a console to each ESX server and work through the list of VMs that you noted down before unplugging the Fibre. On each console, do this for every VM:
vmware-cmd -s register /vmfs/volumes/datastore/vmachine/vmachine.vmx
As an example, if I had a datastore called LUN1 and a VM stored in a folder called ABC I would do:
vmware-cmd -s register /vmfs/volumes/LUN1/ABC/ABC.vmx
This reads the data in the .vmx file and registers the VM under its previously assigned name, which is the correct one.
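With more than a handful of VMs, typing that command for each one gets old quickly. A loop over the datastore can do it for you. The sketch below assumes every VM sits in its own folder directly under the datastore, as in the LUN1/ABC example; the function names and the DRY_RUN switch are made up. It prints the displayName from each .vmx as a comment so you can see the name the VM should come back with, and by default it only prints the register commands rather than running them.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the displayName stored in a .vmx file ($1).
vmx_display_name() {
    sed -n 's/^displayName *= *"\(.*\)" *$/\1/p' "$1"
}

# Register every .vmx found one folder deep under a datastore.
# $1 = datastore path, e.g. /vmfs/volumes/LUN1
register_datastore() {
    for vmx in "$1"/*/*.vmx; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$vmx" ] || continue
        echo "# $(vmx_display_name "$vmx")"
        if [ "$DRY_RUN" = "1" ]; then
            echo "vmware-cmd -s register $vmx"
        else
            vmware-cmd -s register "$vmx"
        fi
    done
}
```

Run register_datastore /vmfs/volumes/LUN1 once to review the output, then again with DRY_RUN=0 to do the actual registration.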
Done.
Great, just the technote I was looking for! You wouldn’t believe it, but I was working at a site just before xmas 2007 and ran into this exact problem!