Leo’s Ramblings

VMFS3 recoverability

Ever had a datastore go pear-shaped on you? I know I have, back in the days when I didn’t know ESX well. There are really only a few ways this can happen: a failed replication, or accidentally deleting the VMFS partition.

Thankfully, there is a way both to restore virtual machines that have been deleted from VMFS3 and, in some cases, to recover the VMFS3 datastores themselves.

Capturing the VMFS3 datastore header:

Playing with VMFS3 through manual partitioning on the Service Console or errors during LUN replication can corrupt the partition information.

So it pays to back up the partition table and the header information regularly. The way I do it is via a script that rotates over a week-long period, i.e. I keep 7 days of VMFS partition data.

Here’s how to do it:

On your console, create a file called /usr/bin/part_save and enter the following information into it:

#!/bin/bash
# Create an image of every VMFS datastore header/partition table and metadata.
# Each line of "esxcfg-vmhbadevs -m" holds: vmhba path, console device, VMFS UUID
/usr/sbin/esxcfg-vmhbadevs -m | while read hba devpath lunid
do
	# Device name without the directory part, e.g. /dev/sda1 -> sda1
	device=${devpath##*/}
	# Dump the first 20MB of the VMFS partition
	dd if="$devpath" of="/tmp/vmfsmetadump-$device.bin" bs=1024 count=20480
	# Copy the volume header file that lives on the datastore itself
	cp "/vmfs/volumes/$lunid/.vh.sf" "/tmp/$device-vh.sf.bu"
	tar cvzf "/tmp/archive-$device-`date -I`.tar.gz" "/tmp/$device-vh.sf.bu" "/tmp/vmfsmetadump-$device.bin"
	rm -f "/tmp/$device-vh.sf.bu" "/tmp/vmfsmetadump-$device.bin"
done
# Delete the archives that are now 8 days old
rm -f /tmp/archive-*-`date -I --date '8 days ago'`.tar.gz

The above will create one archive-device-date.tar.gz file per datastore, which gives you something to recover your partition information from, with VMware’s support staff helping you. The reason we’re taking 20MB in the dd above is that VMFS3 datastores keep all partition information and most metadata in the first 20MB of the disk and in a file on the datastore called .vh.sf, which the script copies into the archive as device-vh.sf.bu.
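If you ever need one of these archives, it is worth checking that the dump inside it is complete before relying on it. Below is a minimal sketch of such a check. check_dump_size is a hypothetical helper, not something that ships with ESX, and it assumes the bs=1024 count=20480 values used in the dd above.

```shell
#!/bin/bash
# Hypothetical helper (not part of ESX or of the part_save script):
# confirm a header dump has the size dd was asked to write.
expected_bytes=$((1024 * 20480))   # 20971520 bytes = 20MB

check_dump_size() {
	# $1 = path to an extracted vmfsmetadump-*.bin file
	actual_bytes=$(wc -c < "$1" | tr -d ' ')
	if [ "$actual_bytes" -eq "$expected_bytes" ]; then
		echo "OK: $1 is $actual_bytes bytes"
	else
		echo "WARNING: $1 is $actual_bytes bytes, expected $expected_bytes"
		return 1
	fi
}

# Typical use after unpacking an archive somewhere safe
# (tar stores the files under tmp/ because it strips the leading slash):
#   tar xzf archive-sda1-2009-03-01.tar.gz
#   check_dump_size tmp/vmfsmetadump-sda1.bin
```

A short dump usually means dd hit a read error or /tmp ran out of space, so that archive should not be trusted for a restore.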

Now, the way to get this to run:

chmod a+x /usr/bin/part_save

crontab -e

At this point you should see the all-familiar crontab screen – add the following information in:

MAILTO=""
0 23 * * * /usr/bin/part_save

This means that every night at 11pm, your partition information will be dumped into /tmp and old partition information will be deleted.
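One thing to be aware of: the cleanup line in the script only removes the archives dated exactly 8 days ago, so if the host is down one night, that day's files stay in /tmp for good. A possible alternative, sketched here under the assumption that GNU find is available on the console, is to delete by file age instead. prune_old_archives is a made-up name for illustration.

```shell
#!/bin/bash
# Hypothetical alternative to the date-based rm in part_save:
# remove every archive whose modification time is more than 7 full days old.
prune_old_archives() {
	# $1 = directory holding the archives (part_save writes them to /tmp)
	find "$1" -maxdepth 1 -name 'archive-*.tar.gz' -mtime +7 -exec rm -f {} \;
}

# In part_save this would be called as:
#   prune_old_archives /tmp
```

Because it works on file age rather than on the date in the file name, a skipped run gets cleaned up the next time the job fires.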

Protecting VMs with vmfs-undelete:

As of ESX 3.5 update 3, VMware have provided us with a utility that can store data about VMs in case they get deleted, and gives us a way of bringing them back. Which is kind of cool.

Only one problem: the vmfs-undelete utility is interactive and written in Python.

I know nothing of Python – not my skill – I dabble in perl and love my bash scripting. So thank Allah, God and Yahweh that Mike Laspina exists. His excellent post and scripting on Protecting ESX VMFS Stores with Automation ties in neatly with this article and what I’d been planning to do for a while. In fact he uses the same functionality to do something similar to the section above, but without daily archives. The below is taken entirely from that excellent article:

The scripts that were originally developed by VMware are designed to be user interactive and cannot be used as originally coded therefore I have modified them in order to provision some basic automation. You can access the modified scripts named vmfs-undelete-auto-script and menuauto.py here.

The modified menuauto.py script needs to be placed within the /usr/lib/vmware/python2.2/site-packages/vmware/undeletemods directory. While I could have just modified the existing menu.py script it is subject to change so this method prevents potential conflict issues. The vmfs-undelete-auto-script script location is optional and can be placed where ever you find appropriate. I chose to place it in the /root directory. The script requires a single argument which is to direct it output location with a path and file name. Since there is a potential for conflict with other snapshot based services the script should be invoked using a cron job outside of the daily or other predefined jobs. This cron job can be implemented using the crontab facility. Here is an example of how to create it while logged in as root.

The links in the quote above lead to Mike’s page. I’ve also attached the scripts to my blog: download the zip file vmfs-undelete. Please put the vmfs-undelete-auto-script into /usr/bin, since that is where the script below expects to find it.

Then run: chmod a+x /usr/bin/vmfs-undelete-auto-script

Now edit /usr/bin/part_save from the previous section so it looks like this:

#!/bin/bash
# Create an image of every VMFS datastore header/partition table and metadata.
# Each line of "esxcfg-vmhbadevs -m" holds: vmhba path, console device, VMFS UUID
/usr/sbin/esxcfg-vmhbadevs -m | while read hba devpath lunid
do
	# Device name without the directory part, e.g. /dev/sda1 -> sda1
	device=${devpath##*/}
	# Dump the first 20MB of the VMFS partition
	dd if="$devpath" of="/tmp/vmfsmetadump-$device.bin" bs=1024 count=20480
	# Copy the volume header file that lives on the datastore itself
	cp "/vmfs/volumes/$lunid/.vh.sf" "/tmp/$device-vh.sf.bu"
	tar cvzf "/tmp/archive-$device-`date -I`.tar.gz" "/tmp/$device-vh.sf.bu" "/tmp/vmfsmetadump-$device.bin"
	rm -f "/tmp/$device-vh.sf.bu" "/tmp/vmfsmetadump-$device.bin"
done
# Delete the archives that are now 8 days old
rm -f /tmp/archive-*-`date -I --date '8 days ago'`.tar.gz
# Create a VMFS map of every single VMFS datastore
python /usr/bin/vmfs-undelete-auto-script /tmp/vmfs-undelete-`date -I`
# Delete the map that is now 8 days old
rm -f /tmp/vmfs-undelete-`date -I --date '8 days ago'`

Crontab will take care of the rest.

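Now, for restoring the .bin header images, you can call VMware support and they will guide you through it. Or you can attempt it yourself using the reverse process, writing the image back to the same device it was read from, with the same block size and count (the script dumps /dev/sda1 with bs=1024, so that is what goes back):

dd if=vmfsmetadump-sda1.bin of=/dev/sda1 bs=1024 count=20480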
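If you do go it alone, it is a good idea to keep a copy of whatever is currently on the device before overwriting it, so that the restore itself can be undone. The following is only a sketch of that idea, not a VMware-supplied procedure. restore_header and its three arguments are hypothetical, and it assumes the image was taken with bs=1024 count=20480 as in the part_save script.

```shell
#!/bin/bash
# Hypothetical wrapper around the restore dd (not a VMware tool).
dump_bytes=$((1024 * 20480))   # size of a dump made with bs=1024 count=20480

restore_header() {
	image=$1     # e.g. vmfsmetadump-sda1.bin
	target=$2    # e.g. /dev/sda1, the device the image was read from
	backup=$3    # where to keep a copy of what is on the target right now

	if [ ! -r "$image" ]; then
		echo "Refusing to restore: cannot read $image" >&2
		return 1
	fi
	image_bytes=$(wc -c < "$image" | tr -d ' ')
	if [ "$image_bytes" -ne "$dump_bytes" ]; then
		echo "Refusing to restore: $image is $image_bytes bytes, expected $dump_bytes" >&2
		return 1
	fi

	# Save the current first 20MB so the restore can be rolled back
	dd if="$target" of="$backup" bs=1024 count=20480 || return 1

	# conv=notrunc only matters when the target is a regular file,
	# which is handy for a dry run against a copy instead of a real device
	dd if="$image" of="$target" bs=1024 count=20480 conv=notrunc
}

# Example call (device and file names are illustrative):
#   restore_header vmfsmetadump-sda1.bin /dev/sda1 /root/sda1-before-restore.bin
```

Trying it first against an ordinary file rather than a live LUN is a cheap way to convince yourself the command does what you expect.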

I recommend going with the VMware support option though – safer. You’ll need to provide them with the following:

  • A recent vm-support dump from the host
  • Current vm-support dump
  • Backup copy of system files
  • dd dump of first 20MB of Disk

Cheers,

Leo

5 Comments

  1. [...] Leo Raikhman publishes a script for the task scheduler that allows [...]

  2. [...] Leo, or Lenya, treats us to a manual on recovering data from vmfs. Recommended reading [...]

  3. Babuluk says:

    Had a laugh. Nice pictures =))

  4. Great post, i stumbled onto your site and really enjoy the posts. Keep em coming.
    ~ greg

  5. Kesslinsgep says:

    Could you tell me why my previous comment on this topic isn't showing up in the comments (I wrote it three days ago)?
