One of my ESX servers is hosted in a remote datacenter. I don’t have easy physical access to the device, and I don’t have much in the way of remote access either (no KVM, iDRAC or iLO). There’s no second host, no SAN and no vCenter. Basically, it’s an ESX host surrounded by Xen-powered VPS hosts. Suffice it to say that their average uptime is only 238.7 days…

When I originally installed ESX on the machine, I mistakenly used the default partitioning for the Service Console. This left me with a (roughly) 8GB Service Console disk (wrapped in the ‘esxconsole.vmdk’ file inside a VMFS datastore). Because the server runs some pretty important virtual machines and I can’t just walk into the datacenter, I never really bothered to reinstall the machine with a recommended partitioning scheme.

The machine needs regular updating (using the ‘esxupdate’ Service Console utility), which requires a lot of space in ‘/var/cache/esxupdate’ to store the unpacked binaries. As time passes, I tend to forget this kind of maintenance, and the number of patches to be installed grows and grows.

Finally, after more than seven months, I decided to catch up on patching and other maintenance. The host is running ESX 4.0 build 244038. A quick look in the patch repository shows I have to unleash four patches (five actually, including the esxupdate pre-4.1 upgrade patch) on the host to reach the current level, ESX 4.1 build 260247.
The patches occupy more than 2.5 GB in zipped format and over 4.5 GB when unpacked (using the ‘esxupdate --bundle bundle.zip stage’ command), which is just way more than my meager 8GB Service Console VMDK can handle.
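For reference, the per-bundle workflow looks roughly like this (‘bundle.zip’ is just a placeholder name, and the exact invocation may differ per patch):

esxupdate --bundle bundle.zip stage
esxupdate --bundle bundle.zip update

The ‘stage’ action is what unpacks everything into /var/cache/esxupdate; ‘update’ then applies the staged content.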

OK, so this needed some rethinking. I can’t afford a lot of downtime on the VMs to do every patch separately, and I don’t want to drive to the datacenter to reinstall ESX (incurring even more downtime). So I thought about resizing the Service Console VMDK (which, by the way, motivated @Andrea_Mauro to write about it), but that meant having some semi-physical console access (like KVM, iDRAC, iLO or a serial console). Andrea’s post did get me thinking, though, and I went to the source of his how-to (written up by Toni Westbrook here).

The fix (also known as ‘Dirty Workaround’)

There I found a real gem in the form of a small bash script by the name of ‘/etc/vmware/init/init.d/66.vsd-mount‘. This little fellow mounts the ‘esxconsole.vmdk’ file during boot-up. That got me thinking: if ‘vsd’ can open just about any virtual SCSI device, why not create a new VMDK, open it with ‘vsd’ and mount it on /var/cache/esxupdate?

I started by creating a new VMDK:

vmkfstools -c 20G -a buslogic /vmfs/volumes/lun1/esxconsole-[UUID]/var_cache_esxupdate.vmdk

I then told ‘vsd’ to open the virtual SCSI device:

vsd -scu -f /vmfs/volumes/lun1/esxconsole-[UUID]/var_cache_esxupdate.vmdk

Using fdisk, mkfs.ext3, mount and the right Fatality, I had a usable partition mounted in the right place (/var/cache/esxupdate) in no time.
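For the record, the steps went roughly like this (the device name is an assumption on my part; check ‘fdisk -l’ to see where the vsd-opened disk actually shows up):

fdisk /dev/sdb
mkfs.ext3 /dev/sdb1
mkdir -p /var/cache/esxupdate
mount /dev/sdb1 /var/cache/esxupdate

In fdisk, a single primary partition spanning the whole disk is all that’s needed.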

Now for the tricky bit: opening and mounting the VMDK file automatically after a reboot. For this, I turned back to the 66.vsd-mount file and edited it slightly to incorporate the second VMDK.
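In essence, the edit boils down to adding a second ‘vsd’ call next to the one that already opens ‘esxconsole.vmdk’, so the extra disk is available before /etc/fstab is processed. The exact spot in the script may differ on your system; the added line is simply the same command as above:

vsd -scu -f /vmfs/volumes/lun1/esxconsole-[UUID]/var_cache_esxupdate.vmdk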

Lastly, I edited /etc/fstab so the disk actually gets mounted (use ‘blkid’ to find your disk’s UUID):

UUID=[long UUID] /var/cache/esxupdate  ext3    defaults        0 0
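The entry can be double-checked before the next reboot (again assuming /dev/sdb1 is the new partition) with something like:

blkid /dev/sdb1
umount /var/cache/esxupdate
mount -a

If ‘mount -a’ brings /var/cache/esxupdate back without complaints, the fstab line should survive a reboot.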

Conclusion

While this fix actually did solve my problem, I cannot recommend anyone use this in their production environment, or any environment for that matter. I wanted to show you how I solved my problem by using a totally unsupported and untested solution. I guess I’m just glad I don’t have to schedule an appointment to go to the datacenter and physically reinstall the hypervisor, or at least not in the short term.