Leo’s Ramblings

A worthy cause: donate

What seems like a lifetime ago, but is in fact only about 3 years ago, I worked at a small Sydney integrator called ENSTOR.

ENSTOR went to the wall, but that is neither here nor there – what matters is that I worked with two wholly remarkable and unique women.

One is Siobhan Ellis, the other was Kirsty Rae.

Kirsty Rae died very recently after a battle with cancer.

Through the blog of my former boss, Preston de Guise, I found out Siobhan was doing a 2 week ride on a 1964 Lambretta TV 175 Series 2 from Sydney to Perth to raise money for Breast and Prostate Cancer Research – a distance of some 4100km (2550 miles). All the money will go to Sydney’s St. Vincent’s Hospital and not one cent of your donation will go to administrative functions – all expenses will be met by Siobhan and her supporters.

So please, donate a bit of your hard-earned cash to the memory of a friend of mine and, by extension, to friends of yours, because the truth is that we have all been touched by cancer.

Please donate here.

In appreciation,

Leo Raikhman

Powershell: another Dusan Solution(tm) for VCB

The university where I contract has a solution whereby their VCB policies are based on the datastores in which the machines reside, i.e. Netbackup will have a policy that calls a VCB script to snapshot all VMs in a datastore.

Except, there’s a small problem – how do you size the holding tank for the image snapshots if all the VMs in a datastore are small system (C:) volumes and their data disks are spread around other datastores?

The aforementioned colleague of mine – Dusan from DMTECH has (with help from the VMware Communities and a little from yours truly) created a VCB script that will sort VMs per datastore based on which datastore their largest VMDK lies in. This means that the holding tank is always, no matter what, smaller than the previous situation required.

The script outputs to a set of text files, one per datastore (in c:\leo as per below). Each line is the name of a machine prefixed with the holding-tank location (in my case V:\VCBholding) and suffixed with the type of backup (in my case -FullVM):

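# Names of VMs to exclude from VCB, read from the "Name" column
$exclude = @(Import-Csv C:\temp\CustomField.csv | %{ $_.Name })
Get-VM | Where-Object {$exclude -notcontains $_.Name} | %{
  # Largest disk first; its datastore decides which policy file the VM lands in
  $hds = @($_ | Get-HardDisk | Sort-Object -property CapacityKb -descending)
  if ($hds.Count -eq 0) { return }   # VM without disks, nothing to back up
  $dsName = ([regex]"^\[([^\]]+)\]").Match($hds[0].Filename).Groups[1].Value
  $name = "V:\VCBholding\" + $_.Name + "-FullVM"
  $name | Out-File -filePath ("C:\leo\" + $dsName + ".txt") -append
}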

The contents of those text files can then be directly imported into VCB policies in Netbackup.

N.B. You might not want all VMs backed up in VCB, so the script reads in an exclusion file (c:\temp\CustomField.csv). Put the names of the VMs you want to exclude into it in CSV format, i.e.:

Name
virtualmachinename1
virtualmachinename2
virtualmachinename3
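
For anyone who wants to check the grouping logic without a VI Toolkit session to hand, here is a rough sketch of it in plain Python. It is only an illustration: the `vms` inventory, its disk sizes and the function names are made up, and only the holding-tank prefix and the -FullVM suffix follow the example above.

```python
import re
from collections import defaultdict

# Matches the "[datastore]" prefix of a VMDK path such as "[DS_SYS] vm1/vm1.vmdk".
DATASTORE_RE = re.compile(r"^\[([^\]]+)\]")

def datastore_of(vmdk_path):
    """Return the datastore name embedded in a VMDK path."""
    match = DATASTORE_RE.match(vmdk_path)
    if match is None:
        raise ValueError("no datastore prefix in %r" % vmdk_path)
    return match.group(1)

def group_by_largest_disk(vms, exclude=()):
    """Map datastore name -> VCB entries, keyed on each VM's largest disk.

    `vms` maps a VM name to a list of (vmdk_path, capacity_kb) tuples.
    """
    groups = defaultdict(list)
    for name, disks in sorted(vms.items()):
        if name in exclude or not disks:
            continue
        largest_path, _ = max(disks, key=lambda disk: disk[1])
        groups[datastore_of(largest_path)].append(
            "V:\\VCBholding\\" + name + "-FullVM")
    return dict(groups)

# Hypothetical inventory: vm1 has a small system disk and a large data disk.
vms = {
    "vm1": [("[DS_SYS] vm1/vm1.vmdk", 10485760),
            ("[DS_DATA] vm1/vm1_1.vmdk", 104857600)],
    "vm2": [("[DS_SYS] vm2/vm2.vmdk", 10485760)],
    "vm3": [("[DS_SYS] vm3/vm3.vmdk", 10485760)],
}
groups = group_by_largest_disk(vms, exclude={"vm3"})
print(groups)
```

Here vm1 lands in the DS_DATA group even though its system disk lives on DS_SYS, which is the whole point of sorting on the largest VMDK.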

One thing to note – the script appends to existing files, so it will not output correctly if you don’t clear the dump folder (c:\leo in this case) first. I haven’t implemented a delete function.

:)
Cheers,
Leo

Quick heads up re. vShield: RTFM

So… I got bitten by not reading the manual, or in this case – the admin guide for vShield.

See, enabling vShield makes all VMs communicate via the internalised network – vShield will actually inform you of an error during a migration. The error states that the VM is attached to a virtual intranet.

This intranet is the network that the virtual machine connects to through the vSwitch on the protected side of the vShield, and which does not home a physical NIC. In this case, the vShield is bridging traffic to the unprotected network that is connected to a physical NIC.

Disable the virtual intranet check by editing the vpxd.cfg file of the VC server:

  • Locate and edit the vpxd.cfg file on the vCenter Server. This file is typically installed at C:\Documents and Settings\All Users\Application Data\VMware\VMware vCenter by default. Add the following lines as a sub‐level to the config section, and at the same level as the vpxd section:
<migrate>
	<test>
		<CompatibleNetworks>
			<VMOnVirtualIntranet>false</VMOnVirtualIntranet>
		</CompatibleNetworks>
	</test>
</migrate>
  • Save the vpxd.cfg file.
  • Restart the VMware vCenter Server service. You can access the service menu by going to Control Panel > Administrative Tools > Services.
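
If you have more than one vCenter to touch, the edit can be scripted. The following is a minimal Python sketch, assuming the file's root element is `<config>` as described above; the function name and the cut-down sample document are mine, not VMware's. ElementTree discards XML comments when it parses, so run anything like this against a copy of vpxd.cfg rather than the live file.

```python
import xml.etree.ElementTree as ET

def disable_intranet_check(config_xml):
    """Return the config XML with the VMOnVirtualIntranet check set to false."""
    root = ET.fromstring(config_xml)
    if root.tag != "config":
        raise ValueError("expected a <config> root element")
    node = root
    # Walk config/migrate/test/CompatibleNetworks/VMOnVirtualIntranet,
    # creating any element that is missing along the way.
    for tag in ("migrate", "test", "CompatibleNetworks", "VMOnVirtualIntranet"):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    node.text = "false"
    return ET.tostring(root, encoding="unicode")

# Cut-down stand-in for a real vpxd.cfg; a real file has many more entries.
sample = "<config><vpxd><das/></vpxd></config>"
updated = disable_intranet_check(sample)
print(updated)
```

The migrate element ends up as a direct child of config, next to vpxd, and running the function a second time leaves the document unchanged.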

Then, you’ll need to exclude the vShield VMs from being migrated via DRS and make sure to leave the Isolation Response HA settings as:

  • VM Restart Priority: Disabled
  • Host Isolation Response: Leave VM powered on

Cheers,

Leo

Powershell VI3 charge-back scripts

A colleague of mine – Dusan from DMTECH has created two powershell scripts that allow charge-back accounting on an ESX farm that is used to host customer data:

Assuming a CSV file like this:

Name,Business Owner,Chargeable,Commence Date,Cost (Upfront),Cost (Yearly)
Dept1PWAP003,Bob Marley,Y,,"$2,722.41",$815.00
Dept1PWAP004,Bob Marley,Y,,"$1,707.96",$476.85
Dept1PWAP005,Bob Marley,Y,,"$2,722.41",$815.00
Dept1PWAP006,Bob Marley,Y,,"$2,722.41",$815.00
Dept1PWAP007,Bob Marley,Y,,"$2,722.41",$815.00
Dept1PWAP008,Bob Marley,Y,,"$5,199.25","$1,310.37"
Dept1PWAP009,Bob Marley,Y,,"$2,722.41",$815.00
Dept1TWAA010,Bob Marley,Y,,"$1,231.27",$381.51
Dept1TWAA011,Bob Marley,Y,,"$1,231.27",$381.51
Dept1TWAA012,Bob Marley,Y,,"$1,231.27",$381.51
Dept1TWAA013,Bob Marley,Y,,"$2,245.72",$719.66
Dept1TWAA014,Bob Marley,Y,,"$1,231.27",$381.51

The following script will import the data above by matching VM names and attaching data to the annotations/custom attributes you’ve created (as per the headings in the CSV file – “Name”, “Business Owner”, “Chargeable”, “Commence Date”, “Cost (Upfront)”, “Cost (Yearly)”):

# Custom attributes to populate; each must match a CSV column heading
$fields = "Business Owner","Chargeable","Commence Date","Cost (Upfront)","Cost (Yearly)"
$vms = Get-VM
foreach ($f in (import-csv C:\temp\CustomField.csv)) {
   $vm = $vms | Where { $_.Name -eq $f.Name }
   if (-not $vm) { continue }   # no VM with this name, skip the row
   foreach ($cf in $fields) {
      $vm | Set-CustomField -name $cf -value $f.$cf
   }
}
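
The cost columns contain commas, which is why they are quoted in the CSV. As a sanity check on the file before pushing it into custom attributes, a small Python sketch like the one below shows how a CSV parser should split those rows. The function name is hypothetical and the sample is just two rows from the file above.

```python
import csv
import io

# Columns that become custom attributes; "Name" is only used for matching.
FIELDS = ["Business Owner", "Chargeable", "Commence Date",
          "Cost (Upfront)", "Cost (Yearly)"]

SAMPLE = '''Name,Business Owner,Chargeable,Commence Date,Cost (Upfront),Cost (Yearly)
Dept1PWAP003,Bob Marley,Y,,"$2,722.41",$815.00
Dept1PWAP008,Bob Marley,Y,,"$5,199.25","$1,310.37"
'''

def attributes_by_vm(csv_text):
    """Map each VM name to a dict of the attribute values on its row."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {row["Name"]: {field: row[field] for field in FIELDS} for row in rows}

attrs = attributes_by_vm(SAMPLE)
print(attrs["Dept1PWAP008"])
```

If a quoted cost comes back split in two, the quotes in the file are probably curly rather than straight, which is easy to pick up when copying from a web page.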

Now, assuming you have tiered storage, you can charge customers for storage on the basis of tier. For this script to work, your datastore names have to end in _T2, _T3 or _T4:

$report = @()
get-vm | % {
	$vm = $_
	# Filenames look like "[datastore_T2] vm/vm.vmdk", so match the tier suffix just before the closing bracket
	$T2 = ($_ | get-harddisk | Where-Object {$_.Filename -match "_T2\]"} | measure-object -property CapacityKB -sum).Sum
	$T3 = ($_ | get-harddisk | Where-Object {$_.Filename -match "_T3\]"} | measure-object -property CapacityKB -sum).Sum
	$T4 = ($_ | get-harddisk | Where-Object {$_.Filename -match "_T4\]"} | measure-object -property CapacityKB -sum).Sum
	$_ | Get-Datastore | Where-Object {$_.Type -eq "VMFS"} | % {
		$row = "" | Select Name, MemoryGb,NumCPU,"T2 (GB)","T3 (GB)","T4 (GB)","Business Owner","Chargeable","Commence Date","Cost (Upfront)","Cost (Yearly)"
		$row.Name = $vm.Name
		$row.MemoryGB = "{0:f0}" -f ($vm.MemoryMb / 1Kb)
		$row.NumCpu = $vm.NumCpu
		$row.{T2 (GB)} = "{0:f0}" -f ($T2 / 1Mb)
		$row.{T3 (GB)} = "{0:f0}" -f ($T3 / 1Mb)
		$row.{T4 (GB)} = "{0:f0}" -f ($T4 / 1Mb)
		$row.{Business Owner} = $vm.CustomFields.Values[2]
		$row.Chargeable = $vm.CustomFields.Values[3]
		$row.{Commence Date} = $vm.CustomFields.Values[4]
		$row.{Cost (Upfront)} = $vm.CustomFields.Values[5]
		$row.{Cost (Yearly)} = $vm.CustomFields.Values[6]
		$report += $row
	}
}
$report | select -unique Name, MemoryGb,NumCPU,"T2 (GB)","T3 (GB)","T4 (GB)","Business Owner","Chargeable","Commence Date","Cost (Upfront)","Cost (Yearly)" | sort -property Name | Export-Csv "C:\chargeback.csv" -noTypeInformation

This produces the following output which specifies the total amount of storage per VM per tier:

Name          MemoryGb  NumCPU  T2 (GB)  T3 (GB)  T4 (GB)  Business Owner  Chargeable  Commence Date  Cost (Upfront)  Cost (Yearly)
Dept1PWAP003  4         2       0        70       0        Bob Marley      Y                          $2,722.41       $815.00
Dept1PWAP004  4         1       0        0        70       Bob Marley      Y                          $1,707.96       $476.85
Dept1PWAP005  4         2       0        70       0        Bob Marley      Y                          $2,722.41       $815.00
Dept1PWAP006  4         2       0        70       0        Bob Marley      Y                          $2,722.41       $815.00
Dept1PWAP007  4         2       0        70       0        Bob Marley      Y                          $2,722.41       $815.00
Dept1PWAP008  4         2       0        320      0        Bob Marley      Y                          $5,199.25       $1,310.37
Dept1PWAP009  4         2       0        70       0        Bob Marley      Y                          $2,722.41       $815.00
Dept1TWAA010  4         1       0        0        70       Bob Marley      Y                          $1,231.27       $381.51
Dept1TWAA011  4         1       0        0        70       Bob Marley      Y                          $1,231.27       $381.51
Dept1TWAA012  4         1       0        0        70       Bob Marley      Y                          $1,231.27       $381.51
Dept1TWAA013  4         2       0        0        70       Bob Marley      Y                          $2,245.72       $719.66
Dept1TWAA014  4         1       0        0        70       Bob Marley      Y                          $1,231.27       $381.51
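
To make the per-tier arithmetic explicit, here is a small Python sketch of the same calculation: capacities are summed in KB per tier and then converted to whole GB. The datastore names and the split of the 320 GB across two disks are invented for the example; only the _T2/_T3/_T4 suffix convention comes from the script.

```python
import re

# Picks the tier out of a path like "[SAN01_T3] vm/vm.vmdk".
TIER_RE = re.compile(r"^\[[^\]]*_(T[234])\]")

def storage_per_tier(disks):
    """Sum capacity per tier, in whole GB.

    `disks` is a list of (vmdk_path, capacity_kb) tuples for one VM.
    """
    totals_kb = {"T2": 0, "T3": 0, "T4": 0}
    for path, capacity_kb in disks:
        match = TIER_RE.match(path)
        if match:
            totals_kb[match.group(1)] += capacity_kb
    # 1 GB = 1024 * 1024 KB
    return {tier: round(kb / (1024 * 1024)) for tier, kb in totals_kb.items()}

# Hypothetical VM with a 20 GB system disk and a 300 GB data disk, both on tier 3.
disks = [
    ("[SAN01_T3] Dept1PWAP008/Dept1PWAP008.vmdk", 20 * 1024 * 1024),
    ("[SAN02_T3] Dept1PWAP008/Dept1PWAP008_1.vmdk", 300 * 1024 * 1024),
]
totals = storage_per_tier(disks)
print(totals)
```

Disks on datastores that do not follow the naming convention are simply not counted, so an all-zero row is a hint that a datastore needs renaming.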

Cheers,
Leo

EMC Storage Viewer Update – v1.1

EMC have released Storage Viewer 1.1 which now supports Virtual Infrastructure Client 2.5 update 4.

The objective of the EMC® Storage Viewer for Virtual Infrastructure Client is to provide a new tool that facilitates discovery and identification of EMC storage devices which are allocated to VMware ESX servers and virtual machines.
This brings the storage details up to the user through the VI Client interface, merging all of the functionality of several different tools that perform storage mapping-related functions into a single tool that integrates seamlessly with the VI Client.
Using a plug-in to the VI Client management interface, ESX Server Datastores and VMFS virtual disks are resolved to the underlying storage array target and LUN assignments.

Get it here.

Update: In comments Chad Sakac states

Working on vSphere support now – stay tuned, it will be soon!

:)

Cheers,

Leo

Migrating Ubuntu Servers

Update: thanks to Charles (in comments) for reminding me about the extra MAC space VMware uses

Ever cold-migrated an Ubuntu VM, or de-registered it and then re-registered it on another host?

You’ll know that Ubuntu immediately loses network access because the underlying MAC has changed. This is because MAC addresses are persistently bound to interface names in Ubuntu.

ESX generates MACs based on the VM’s UUID, in the format 00:0c:29:* or 00:50:56:*, both of which are VMware OUIs – search 00-0C-29 and 00:50:56 here

What I’ve done previously is to simply edit the /etc/iftab file with the right MAC address as per ethernet0.generatedAddress in the VM’s .vmx file.

Interestingly however, I’ve found another solution – simply remove the persistent MAC bindings – thanks to the Professional VMware blog:

Add the following to /etc/udev/rules.d/70-persistent-net.rules:

# ignore VMware virtual interfaces
ATTR{address}=="00:0c:29:*", GOTO="persistent_net_generator_end"
ATTR{address}=="00:50:56:*", GOTO="persistent_net_generator_end"

That will permanently fix the issue – in fact, I’m adding this to my Linux Template best practices – as it simplifies deployment by a lot.
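
When auditing existing templates, it can help to check whether a NIC's address falls under the two prefixes those rules match. A tiny Python sketch, with a hypothetical helper name:

```python
# The two VMware OUIs that the udev rules above match on.
VMWARE_OUIS = ("00:0c:29", "00:50:56")

def is_vmware_mac(mac):
    """Return True if the MAC address starts with one of the VMware OUIs."""
    normalised = mac.strip().lower().replace("-", ":")
    return normalised.startswith(VMWARE_OUIS)

print(is_vmware_mac("00-0C-29-AB-CD-EF"))
```

The helper accepts both colon and hyphen separators and ignores case, since the address is written differently depending on where you look it up.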

Cheers,

Leo

A real pain in the arse: CDP in ESX 3.5/4.0

So, everyone loves CDP? Hands up if yes?

Well… that’s everybody.

I love CDP too – when VMware implement it correctly. See, here’s the thing – CDP will not work with Intel NICs of a certain type (Intel MT and PT dualport NICs – and some quad ports) if the port they are attached to has a native VLAN that is not VLAN 1.

Now, at this point, all the Cisco people reading this blog would be shaking their heads.

Why?

Link 1 for ESX 4.0: On Cisco switches, the device drivers for Intel MT or PT Dual Port NICs do not update the filter to allow traffic from VLAN 1. The driver drops the Cisco Discovery Protocol (CDP) packet before it gets to the kernel, and so, ESX/ESXi does not see the CDP information.

Link 2 for ESX 3.5: When the native VLAN on a switch is changed from the default of 1, Intel MT and PT dualport NICs drop all traffic with the new native VLAN tag. This results in CDP packets being dropped, due to which CDP appears not to function correctly.

The solution in both cases is to reset the native VLAN on each port you want to see CDP on, to use native VLAN 1.

Except, this is Cisco’s take on the situation:

Precautions for the Use of VLAN 1

The reason VLAN 1 became a special VLAN is that L2 devices needed to have a default VLAN to assign to their ports, including their management port(s). In addition to that, many L2 protocols such as CDP, PAgP, and VTP needed to be sent on a specific VLAN on trunk links. For all these purposes VLAN 1 was chosen.

As a consequence, VLAN 1 may sometimes end up unwisely spanning the entire network if not appropriately pruned and, if its diameter is large enough, the risk of instability can increase significantly. Besides the practice of using a potentially omnipresent VLAN for management purposes puts trusted devices to higher risk of security attacks from untrusted devices that by misconfiguration or pure accident gain access to VLAN 1 and try to exploit this unexpected security hole.

To redeem VLAN 1 from its bad reputation, a simple common-sense security principle can be used: as a generic security rule the network administrator should prune any VLAN, and in particular VLAN 1, from all the ports where that VLAN is not strictly needed.

Therefore, with regard to VLAN 1, the above rule simply translates into the recommendations to:

  • Not use VLAN 1 for inband management traffic and pick a different, specially dedicated VLAN that keeps management traffic separate from user data and protocol traffic.
  • Prune VLAN 1 from all the trunks and from all the access ports that don’t require it (including not connected and shutdown ports).

Similarly, the above rule applied to the management VLAN reads:

  • Don’t configure the management VLAN on any trunk or access port that doesn’t require it (including not connected and shutdown ports).
  • For foolproof security, when feasible, prefer out-of-band management to inband management. (Refer to [3] for a more detailed description of a out-of-band management infrastructure.)

As a general design rule it is desirable to “prune” unnecessary traffic from particular VLANs. For example, it is often desirable to apply VLAN ACLs and/or IP filters to the traffic carried in the management VLAN to prevent all telnet connections and allow only SSH sessions. Or it may be desirable to apply QoS ACLs to rate limit the maximum amount of ping traffic allowed.

If VLANs other than VLAN 1 or the management VLAN represent a security concern, then automatic or manual pruning should be applied as well. In particular, configuring VTP in transparent or off mode and doing manual pruning of VLANs is commonly considered the most effective method to exert a more strict level of control over a VLAN-based network.

So it would seem that VMware, through their inability to fix the Intel driver, are recommending against Cisco’s suggestions. This is simply not on! Especially when the two companies are getting closer and closer with the release of the Nexus 1000V.

VMware, get your arses into gear and fix this up. I refuse to violate my customers’ switch/network security for the purposes of having a feature that you advertise as being available and working.

Cheers,

Leo

Revisiting VMFS 3 Recoverability

Summary:

Hi,

My last post on the matter was rather simplistic, as it did not back up .vmx files or template .vmtx files – this meant that a recovered VMDK (while useful) would require a new VMX/VMTX to make sure there wasn’t too much client OS reconfiguration.

Here’s an updated RPM I have created that will back up .vmtx and .vmx files on each server that it’s deployed on, as well as header, metadata and vmfs-undelete data:

RPM

The RPM will work on ESX 3.5u3 and higher.

Explanation:

As of ESX 3.5u3, VMware have released a Python utility called vmfs-undelete, which interactively backs up the file location data for each VM running on each ESX server. The key problem with this application is that it is interactive – in its native form it cannot be scripted to run automatically.

What this utility does is back up to a file the on-disk location of each VM running on a host; in case of failure, that file can be referenced to bring back the VMDK files of the machine. However, it can only bring back VMDK files, so I (with thanks to Mike Laspina) have designed a solution that is non-interactive and also backs up VMX files. This means that, given a certain amount of time and assuming the VMDKs are not corrupt, we can bring back a VM to full function. The benefits of this are many:

- Metadata corruption (while serious) will no longer cripple us
- Automated deletion of orphaned VMDKs will not cause rebuilds or restores
- Accidental deletions of VMs from disk are recoverable

How it works:

The installation of the attached RPM creates a cron job in /etc/cron.d/ which runs the vmfs-undelete utility non-interactively and backs up the VM disk info to /tmp in the format vmfs-undelete-date. The same cron job backs up all .vmx and .vmtx files to /tmp in the format vmx-undelete-date.tar.gz, and also backs up all metadata and header information for each LUN to /tmp in the format archive-sdu1-date.tar.gz.

The last thing it does is maintain a rotating backup by deleting all backups older than 8 days – meaning that as long as we react to a fault within a week, we’re covered.
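
The rotation step is simple enough to sketch. The Python below is not what the RPM runs; it is an illustration of the same idea, assuming the three filename prefixes described above and using a throwaway directory instead of /tmp. The function name and its parameters are my own.

```python
import os
import tempfile
import time

def prune_old_backups(directory, prefixes, max_age_days=8, now=None):
    """Delete files in `directory` that start with one of `prefixes`
    and were last modified more than `max_age_days` days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not name.startswith(tuple(prefixes)) or not os.path.isfile(path):
            continue
        if os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed

# Demonstration in a temporary directory with artificially aged files.
with tempfile.TemporaryDirectory() as tmp:
    for name, age_days in (("vmfs-undelete-old", 9),
                           ("vmfs-undelete-new", 1),
                           ("unrelated.log", 30)):
        path = os.path.join(tmp, name)
        open(path, "w").close()
        stamp = time.time() - age_days * 86400
        os.utime(path, (stamp, stamp))
    removed = prune_old_backups(tmp, ["vmfs-undelete-", "vmx-undelete-", "archive-"])
    remaining = sorted(os.listdir(tmp))
print(removed, remaining)
```

Matching on the prefixes matters because /tmp is shared; anything that is not one of our backups is left alone regardless of age.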

To restore a VMDK, run the vmfs-undelete utility in interactive mode from the console of the ESX server on which the VM last ran and follow the prompts (the target file for restoring is /tmp/vmfs-undelete-date).

Then restore the VMX file and copy it manually to the restored directory, register the VM and power it up.

Cheers,

Leo

Installing ESX 4.0 and vCenter 4.0: Best Practices

VMware have an article out on best practices for installing vCenter and ESX 4.0.

It’s fairly basic in only stating system requirements and basic suggestions, but it’s not too bad – I will be creating my own best practices extensions when I finish my testing phase.

But for now, the KB is here.

Cheers,

Leo

A long-awaited feature in ESX 3.5

I recently patched my development ESX servers to the latest patch sets for ESX 3.5.

And then I had to reconfigure my ESX firewall to install Dell’s OpenManage.

Imagine my surprise when I saw this:

[root@infdevm001e root]# esxcfg-firewall -h
esxcfg-firewall
-q|--query                                      Lists current settings.
-q|--query                                      Lists setting for the
                                                specified service.
-q|--query incoming|outgoing                    Lists setting for non-required
                                                incoming/outgoing ports.
-s|--services                                   Lists known services.
-l|--load                                       Loads current settings.
-r|--resetDefaults                              Resets all options to defaults
-e|--enableService                              Allows specified service
                                                through the firewall.
-d|--disableService                             Blocks specified service
-o|--openPort                                   Opens a port.
-c|--closePort                                  Closes a port previously opened
                                                via --openPort.
   --ipruleAdd                                  Adds a rule
                                                to block/allow hosts to access
                                                specific COS service;'cport' can
                                                be specified like 'a:b'. For ex:
                                                0:65535 blocks all the ports;
                                                'host' can specified like 'a/b'.
                                                For ex: 0.0.0.0/0 blocks all the
                                                hosts.
   --ipruleDel                                  Deletes the host rule
                                                previously added via --ipruleAdd
   --moduleAdd                                  Loads an iptables module, and
                                                adds it to the peristent
                                                firewall configuration.
   --moduleDel                                  Removes an iptables module, and
                                                removes it from the persistent
                                                firewall configuration.
   --blockIncoming                              Block all non-required incoming
                                                ports  (default value).
   --blockOutgoing                              Block all non-required outgoing
                                                ports (default value).
   --allowIncoming                              Allow all incoming ports.
   --allowOutgoing                              Allow all outgoing ports.
-h|--help                                       Show this message.

The ipruleAdd, ipruleDel, moduleAdd and moduleDel switches are new to ESX. They allow more fine-grained control through host-based firewall rules, and allow iptables modules to be loaded from the script.

The patch that unlocks this functionality is ESX350-200904402-SG. The patch that updates the man page for the usage of the esxcfg-firewall script is ESX350-200904409-BG. Neither of these patches requires a host reboot, so they can be installed at any time as long as dependencies are satisfied.

:)

Cheers,
Leo