Leo’s Ramblings

A royal pain: VCB not backing up a VM

We had an issue with a VM not backing up – it kept failing with the following error:

[2009-04-15 13:49:39.230 'vcbMounter' 908 error] Error: Could not back up config file: [VM10_Production01_DC1_T4] FBEAV//vmware-59.log
[2009-04-15 13:49:39.246 'vcbMounter' 908 error] An error occurred, cleaning up...

Now, as anyone can see, this is a wholly annoying and useless error: it names the file that could not be backed up but says nothing about why.

These are the steps I took:

  • rm -f vmware-59.log didn’t work as the file was locked
  • VMotioning the VM did not fix the issue
  • A cold reboot didn’t fix the issue.

I was at my wits’ end and had no idea how to unlock a file on a VMFS datastore.

So here’s what I did:

  • On the server where the VM was running I ran vmkfstools -D /vmfs/volumes/VM10_Production01_DC1_T4/FBEAV/vmware-59.log - this dumps some info into the /var/log/vmkernel log file:
Apr 15 16:21:55 infpevm003g vmkernel: 6:06:19:50.297 cpu0:1049)FS3: 130: <START vmware-59.log>
Apr 15 16:21:55 infpevm003g vmkernel: 6:06:19:50.297 cpu0:1049)Lock [type 10c00001 offset 51625984 v 684, hb offset 3907072
Apr 15 16:21:55 infpevm003g vmkernel: gen 664, mode 0, owner 46c60a7c-94813bcf-4273-0022191524b9 mtime 5537093]
Apr 15 16:21:55 infpevm003g vmkernel: 6:06:19:50.297 cpu0:1049)Addr <4, 94, 184>, gen 666, links 1, type reg, flags 0x0, uid 0, gid 0, mode 644
Apr 15 16:21:55 infpevm003g vmkernel: 6:06:19:50.297 cpu0:1049)len 134434, nb 1 tbz 0, zla 1, bs 4194304
Apr 15 16:21:55 infpevm003g vmkernel: 6:06:19:50.297 cpu0:1049)FS3: 132: <END vmware-59.log>

What we need from the above is the owner value on the third line, specifically its last segment (0022191524b9): it matches the last segment of the system uuid of the ESX server which has a lock on that file.
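Rather than picking the value out by eye, the owner segment can be pulled from the log with sed and awk. This is only a rough sketch: the function name lock_owner_suffix is made up for illustration, and it assumes the "owner <uuid>" layout shown in the dump above.

```shell
# Hypothetical helper (not part of ESX): read vmkernel log lines on stdin and
# print the last dash-separated segment of every "owner <uuid>" field found.
lock_owner_suffix() {
    sed -n 's/.*owner \([0-9a-fA-F-]*\).*/\1/p' | awk -F'-' '{print $NF}'
}

# Example, right after running vmkfstools -D (log path as in the post):
#   tail -n 20 /var/log/vmkernel | lock_owner_suffix
```

Lines without an owner field produce no output, so the rest of the log can be piped through unchanged.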

  • On every ESX server in the cluster, run the following: esxcfg-info | grep -i "system uuid" | awk -F"-" '{print $NF}'
  • On the server where the returned result matches the owner value from the log above, run the following command: ps -elf | grep FBEAV and then kill the resulting process.
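With more than a couple of hosts, the comparison in the two steps above is easy to get wrong by eye. Here is a minimal sketch of a check to run on each host; the function name uuid_matches_owner is hypothetical, and the case-insensitive comparison is my own precaution rather than something the tools are known to need.

```shell
# Hypothetical helper: succeed when the last dash-separated segment of a
# system uuid ($2) equals the lock owner segment ($1), ignoring case.
uuid_matches_owner() {
    owner=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    host=$(printf '%s' "$2" | awk -F'-' '{print $NF}' | tr 'A-F' 'a-f')
    [ -n "$owner" ] && [ "$owner" = "$host" ]
}

# Example, run on each ESX host (the pipeline is the one from the post):
#   host_suffix=$(esxcfg-info | grep -i "system uuid" | awk -F"-" '{print $NF}')
#   uuid_matches_owner 0022191524b9 "$host_suffix" && echo "this host holds the lock"
```

An empty owner value never matches, so a failed log lookup cannot accidentally flag a host.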

However, in my case, the last step produced nothing. After some research, I just gave up and ran: service mgmt-vmware restart

And hey presto! It worked. I can now VCB that server again.

:)

Cheers,

Leo

2 Comments

  1. vamsi says:

    hey

    just love to see such research in vcb

    cheers leo
    vamsi
