For the last couple of days, I’ve been working on a work-around or solution for missing symlinks that make virtual machines unmanageable in the vSphere Inventory. In KB1033591, VMware describes a work-around to re-create the symlink when it has disappeared.

HP (and VMware) support have informed me of a way to prevent the symlink from disappearing. I immediately posted this on Twitter. @vConsult, @gklooste, @Scorpinus and I suspect we were experiencing the disappearing symlink only on hosts installed with the HP customized install CD, although neither VMware nor HP has verified this to be the cause of the bug. I’m not sure whether the bug goes away when reinstalling the hosts with a vanilla ESXi CD or updating the host to 4.1 Update 1. I will try to verify whether hosts installed using the vanilla ESXi installation CD are experiencing this bug as well.

Anyways, I wanted to share the instructions on how to prevent the symlinks from disappearing:

Engineering are aware and are currently investigating this issue. They have suggested the following work around to stop the issue occurring:
To workaround this issue:
Stop SFCBD on the ESXi host with the command:

/etc/init.d/sfcbd-watchdog stop

To make this change persistent on reboot, run these commands:
chkconfig sfcbd-watchdog off
chkconfig sfcbd off

Note: When you disable SFCBD, you cannot view the hardware status on the ESXi host.
If hardware monitoring is an environmental requirement, you can extend the amount of time before the issue recurs:
Note: Depending on the system workload, this change may temporarily resolve the issue.
From the ESXi Shell, edit the /etc/sfcb/sfcb.cfg file using a text editor.
Search for the entry provProcs: 16 and change the value from 16 to 12.
Run this command to restart sfcbd for the changes to take effect:
/etc/init.d/sfcbd-watchdog restart
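
For anyone who has to apply the provProcs change on several hosts, here is a minimal sketch of the same edit done with sed instead of a text editor. The function name set_prov_procs is my own invention (it is not a VMware or HP tool), and I'm assuming the entry starts at the beginning of a line, as shown in the support instructions. It keeps a backup copy next to the original file.

```shell
#!/bin/sh
# Sketch: change the provProcs value in an sfcb config file, keeping a backup.
# set_prov_procs is a hypothetical helper name, not part of ESXi.
set_prov_procs() {
    cfg="$1"      # path to the config file, e.g. /etc/sfcb/sfcb.cfg
    value="$2"    # new provProcs value, e.g. 12

    [ -f "$cfg" ] || { echo "no such file: $cfg" >&2; return 1; }

    # Keep a copy so the original value can be restored later.
    cp "$cfg" "$cfg.bak" || return 1

    # Rewrite only the provProcs line; every other line passes through unchanged.
    # Assumption: the entry starts at column 1, as in "provProcs: 16".
    sed "s/^provProcs:.*/provProcs: $value/" "$cfg.bak" > "$cfg" || return 1

    # Print the resulting line for a quick visual check.
    grep '^provProcs:' "$cfg"
}

# On the host (not executed by this sketch):
#   set_prov_procs /etc/sfcb/sfcb.cfg 12
#   /etc/init.d/sfcbd-watchdog restart
```

The restart is left as a separate manual step on purpose, so you can check the printed line before sfcbd picks up the new value.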

Update: In the comments section below, Michael has pointed out that VMware KB 1035564 contains related information on disabling the sfcbd-watchdog:

“This issue occurs due to exhaustion of VMkernel resources. It occurs more frequently on hosts where OEM CIM providers have been installed, which may be caused by the added load of having CIM Providers under sfcbd.”

Update 2: After evaluating all nine ESXi hosts in this environment, it seems that only the hosts installed with the HP customized version are suffering from this bug. Checking whether the advanced setting ‘UserVars.CIMoemProviderEnabled’ is present on a host is a good way to determine whether the host was installed with an HP customized version (setting is present) or the vanilla version (setting isn’t present).
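
If you'd rather check this from the ESXi shell than click through the advanced settings on each host, something along these lines should work. Treat it as a sketch: the function names are made up, and I'm assuming that esxcfg-advcfg -g exits with a non-zero status when the option does not exist, which I haven't verified on every build. If the command itself is missing, the check also reports "absent".

```shell
#!/bin/sh
# ADVCFG can be overridden for testing; esxcfg-advcfg is the stock ESXi tool.
ADVCFG="${ADVCFG:-esxcfg-advcfg}"

# has_oem_cim_setting is a hypothetical helper name.
# Assumption: "esxcfg-advcfg -g <option>" fails (non-zero exit) when the
# option is not defined on the host.
has_oem_cim_setting() {
    "$ADVCFG" -g /UserVars/CIMoemProviderEnabled >/dev/null 2>&1
}

# Print a one-line verdict based on whether the setting could be read.
report_image_type() {
    if has_oem_cim_setting; then
        echo "setting present: likely HP customized image"
    else
        echo "setting absent: likely vanilla image"
    fi
}

# On the host (not executed by this sketch):
#   report_image_type
```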