Recent Articles
Odd numbers matter (or how I got vExpert 2011)
Jul 3, 2011 Blogs 3 Comments
I just received an e-mail from VMware informing me I made vExpert 2011! This is a great surprise, as I did not expect to make the cut this year. Having missed the boat in 2010, I feel especially honoured this year. I would like to thank my fellow bloggers, VMware employees and the VMware community at large for giving me the opportunity to share and contribute.
Of course, special thanks go out to John Troyer and his social media team for recognizing my effort this year.
StarWind goes freestyle
May 17, 2011 Blogs Leave a comment
Back in the day, I used StarWind iSCSI Target a lot as an instructor in training sessions. I let students configure a simple iSCSI Target to get some storage up-and-running for them to attach to the vSphere cluster they were building, so they could do all the funky VMware-stuff, like vMotion, HA and DRS. StarWind was an ideal solution, because, well, it was free and Windows-based. The effort it took to get the product in working order was minimal enough not to distract the students from focussing on what’s important.

So I was bummed out to see the free version of StarWind disappear. With an equally strong positive reaction, I noticed that the free version has somehow made it back into the wild. As of May 16th, StarWind Free iSCSI SAN has been made available again. It still installs on top of Windows, provides full snapshot and backup capabilities, and even does some dedup and caching magic. Best of all, no license restricts you from using all this in a production environment.
Download it right here.
Locked yourself out of vCenter?
We recently took over operational administration of a big school in the Netherlands. The previous system administrators had fiddled around with permissions and roles a little too much, effectively removing all permissions for all users and groups for the root object in VMware vCenter.
Obviously, this resulted in an unmanageable environment, and actions had to be taken.
First off, I thank the previous system administrator for not messing up the SQL permissions: I could still access the database-instance using my credentials. Also, luckily, a group called ‘VMware Administrators’ still had some permissions in vCenter: this group had ‘Read Only’ permissions on an individual virtual machine. This made editing the database a bit easier, but it is in no way required.
By simply replacing two values in the vCenter database, I changed two settings:
- Changing the permissions from ‘Read Only’ to ‘Administrator’ for the given group
- Changing the object to which these permissions are applied from the virtual machine to the root object
So, how did I do this? Using the database administration tools, I did the following:
- In the VPX_ACCESS table I changed ROLE_ID from ‘-2’ (‘Read Only’) to ‘-1’ (‘Administrator’) for the given PRINCIPAL (which contains the value ‘VMware Administrators’, the group I was looking for)
- In the same table, I changed ENTITY_ID from the given value to ‘1’, which stands for ‘Datacenters’, otherwise known as the root object.
After a quick restart of the vCenter services, I was able to access the environment with proper permissions for the given group again, including all nested objects, indicating that inheritance was set up properly, too.
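The two changes above can be expressed as one UPDATE statement. What follows is a minimal sketch in shell that only prints the SQL for review; it does not connect to any database. The table and column names (VPX_ACCESS, ROLE_ID, ENTITY_ID, PRINCIPAL) and the values -1 and 1 are the ones mentioned in this post. The function name print_vpx_access_fix is made up for illustration, and the sketch assumes the group has exactly one row in the table, as was the case here.

```shell
#!/bin/sh
# print_vpx_access_fix PRINCIPAL
# Prints (does not execute) SQL that gives PRINCIPAL the Administrator
# role (-1) on the root object (1). Hypothetical helper; names and values
# are taken from the post, everything else is an assumption.
print_vpx_access_fix() {
    principal=$1
    # Double any single quotes so the value is safe inside a SQL string literal.
    escaped=$(printf '%s' "$principal" | sed "s/'/''/g")
    cat <<EOF
-- Inspect the current row(s) first; this sketch assumes exactly one.
SELECT * FROM VPX_ACCESS WHERE PRINCIPAL = '$escaped';
-- Administrator role (-1) on the root object (1).
UPDATE VPX_ACCESS SET ROLE_ID = -1, ENTITY_ID = 1 WHERE PRINCIPAL = '$escaped';
EOF
}
```

Calling `print_vpx_access_fix "VMware Administrators"` prints the statements so they can be checked and pasted into the database administration tool of your choice. Taking a backup of the vCenter database before changing it is a sensible precaution.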
Welcoming a new sponsor to my blog
May 11, 2011 Blogs Leave a comment
As of today, Virtual Lifestyle is sponsored by StarWind. I would like to welcome them aboard and thank them for choosing to work together with me!
Taking this opportunity, I would like to invite you to join StarWind in a webinar:
Denny Cherry: Best Practices in Enterprise Storage Virtualization and Backup
Denny Cherry, Microsoft MVP in SQL Server will explain how Active-Active High Availability simplifies your endpoint data protection solution and the importance of Deduplication in your storage virtualization and backup strategy.
Register now to learn about these topics and more:
- Use Active-Active High Availability to simplify storage management and increase continuous data and application availability.
- Benefits and features of High Availability Storage for server failover clustering, eliminating a single point of failure.
- Deduplication – reduce data storage costs by eliminating duplicated blocks of data.
Title: Best Practices in Enterprise Storage Virtualization and Backup
Date: Tue, May 24, 2011 2:00 PM – 3:00 PM EDT - Register Now
Denny Cherry has over a decade of experience managing SQL Server, including MySpace.com’s 175+ million user installation, one of the largest in the world. Denny’s areas of technical expertise include system architecture, performance tuning, replication and troubleshooting. Denny uses these skills on a regular basis in his current role as a Sr. Database Administrator and Architect at Awareness Technologies. Denny currently holds several Microsoft Certifications related to SQL Server as well as being a Microsoft MVP. Denny is a longtime member of PASS and Quest Software’s Association of SQL Server Experts and has written numerous technical articles on SQL Server management.
Oh, and don’t forget to vote for me in the ‘Server Room Beauty Contest‘:

“Various Issues seen on ESXi 4.1 with OEM CIM providers”
Apr 26, 2011 Blogs 2 Comments
VMware Support Insider has just posted an article about a workaround previously posted on Virtual Lifestyle:
We have become aware of a number of customers reporting various seemingly random behaviors on their ESXi hosts, specifically on hardware in which OEM CIM providers have been installed.
You may not be able to power on certain virtual machines. Other virtual machines may enter an invalid state randomly. You might also encounter failed vMotion migrations at 82%.
Your first reaction to this might be to restart the hostd daemon, but in this case the problem does not go away, at least not for long.
Knowledge Base article 1035564 – Virtual machines in an invalid state fail to power on with the error: FoundryVMDirectlyOpenSocketToVMX: Failed to create socket pair, has been written about this issue, and will help you determine whether this is the problem impacting you, and how to deal with it.
While VMware works to create a permanent fix for the problem, there is a work-around which involves stopping the hardware monitoring process sfcbd (details in the KB). Depending on the system workload, this change may only temporarily resolve the issue.
Thanks to Support Engineer Solomon Wen for creating this KB article.
Update on missing symlinks (KB1033591)
Mar 25, 2011 Blogs 5 Comments
I just received an update from my assigned HP support engineer:
I got some info from VMware, the latest on the Problem Report is that engineering are currently testing a fix for the issue. Once complete there will be a patch available for the issue. At the moment VMware cannot give a time line of when the patch will be available. From looking at the SR attached to the bug the issue only impacts ESXi images. Both installable and embedded had encountered the issue.
I’m handling another case where the issue is reported with the vanilla version. So till the time the patch is released we need to use the workaround specified.
I can confirm that this bug is present on all versions of ESXi (vanilla, vendor customized images, installable and embedded versions) and on different storage platforms (local SAS RAID, SD-card). I don’t know exactly which versions of ESXi are affected, but I believe that version 4.1 and 4.1 Update 1 are affected.
Disabling both sfcbd-watchdog and sfcbd on the ESXi-host seems to be the only work-around available, and is confirmed to eliminate the problem of the disappearing symbolic links.
More on VMware KB 1033591
The last couple of days, I’ve been working on a work-around or solution for missing symlinks making virtual machines unmanageable in the vSphere Inventory. In KB1033591, VMware has described a work-around to re-create the symlink when it has disappeared.
HP (and VMware) support have informed me of a way to prevent the symlink from disappearing. I immediately posted this on Twitter. @vConsult, @gklooste, @Scorpinus and I suspect we were experiencing the disappearing symlink only on hosts installed with the HP customized install CD, although neither VMware nor HP has verified this to be the cause of the bug. I’m not sure whether the bug still manifests itself after reinstalling the hosts with a vanilla ESXi CD or updating the host to 4.1 Update 1. I will try to verify if hosts installed using the vanilla ESXi installation CD are experiencing this bug as well.
Anyways, I wanted to share the instructions on how to prevent the symlinks from disappearing:
Engineering are aware and are currently investigating this issue. They have suggested the following work around to stop the issue occurring:
To workaround this issue:
Stop SFCBD on the ESXi host with the command:
/etc/init.d/sfcbd-watchdog stop
To make this change persistent on reboot, run these commands:
chkconfig sfcbd-watchdog off
chkconfig sfcbd off
Note: When you disable SFCBD, you cannot view the hardware status on the ESXi host.
If hardware monitoring is an environmental requirement, you can extend the amount of time before the issue recurs:
Note: Depending on the system workload, this change may temporarily resolve the issue.
From the ESXi Shell, edit the /etc/sfcb/sfcb.cfg file using a text editor.
Search for the entry provProcs: 16 and change the value from 16 to 12.
Run this command to restart sfcbd for the changes to take effect:
/etc/init.d/sfcbd-watchdog restart
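For those who prefer not to edit the file by hand, the provProcs change from the quoted instructions could be scripted roughly as follows. This is a sketch, not something from the KB: the function name set_prov_procs is hypothetical, and it assumes the entry sits at the start of a line in the form `provProcs: 16` and that the shell's sed understands POSIX character classes. It keeps a `.bak` copy of the original file.

```shell
#!/bin/sh
# set_prov_procs FILE VALUE
# Rewrites the "provProcs: <n>" entry in FILE to VALUE and keeps the
# original as FILE.bak. Hypothetical helper; the line format is assumed.
set_prov_procs() {
    cfg=$1
    value=$2
    # Accept only a non-empty, purely numeric value.
    case $value in
        ''|*[!0-9]*) echo "value must be a number: $value" >&2; return 1 ;;
    esac
    cp "$cfg" "$cfg.bak" || return 1
    # Read from the backup and write the modified copy over the original.
    sed "s/^provProcs:[[:space:]]*[0-9][0-9]*/provProcs: $value/" "$cfg.bak" > "$cfg"
}
```

On a host this might be called as `set_prov_procs /etc/sfcb/sfcb.cfg 12`, followed by the `/etc/init.d/sfcbd-watchdog restart` command from the instructions above.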
Update: In the comments section below, Michael has pointed out that VMware KB 1035564 contains related information on disabling the sfcbd-watchdog:
“This issue occurs due to exhaustion of VMkernel resources. It occurs more frequently on hosts where OEM CIM providers have been installed, which may be caused by the added load of having CIM Providers under sfcbd.”
Update 2: After evaluating all nine ESXi-hosts in this environment, it seems that only the hosts installed with the HP customized version are suffering from this bug. Checking if the advanced setting ‘UserVars.CIMoemProviderEnabled’ is present on the hosts is a good way to determine if the host was installed with an HP customized version (value is present) or vanilla version (value isn’t present).
VM becomes unmanageable after vMotion
Mar 7, 2011 Blogs 2 Comments
I’ve experienced weird errors and faults with random virtual machines in a 5-host HA/DRS cluster. During a vMotion or after restarting Management Agents on the ESXi-hosts, the virtual machine becomes unmanageable. Tasks, like vMotion, become stuck at the 82% mark or an ‘invalid’ state is detected by vCenter because it doesn’t know where the VM is actually running anymore.
Symptoms
Let’s take an example I experienced this morning:
- A virtual machine ‘VM01’ was running on ESX04.
- DRS invoked a vMotion of this virtual machine to another host, ESX02.
- The vMotion task remained stuck at 82% for over two hours.
- The VM itself still responded to ping and RDP-connections.
- When viewing the hosts’ inventory (with the vSphere Client connected directly to each of the two hosts), neither had the virtual machine registered and running.
- Using ‘esxtop’, I discovered that the virtual machine was actually running on ESX02, so the vMotion did complete successfully.
- hostd.log on the destination host stated:
['vm:/vmfs/volumes/4c713f9b-9b799538-a0f9-78e7d1f7b584/VM01/VM01.vmx'] connect
to /var/run/vmware/root_0/1299491418874549_26415352/testAutomation-fd:
File not found
Resolution
Duco Jaspars gave me a helping hand and pointed me to VMware KB 1033591.
- The Virtual Machine’s vmware.log stated:
vmx| /vmfs/volumes/4c713f9b-9b799538-a0f9-78e7d1f7b584/VM01/VM01.vmx: Setup
symlink /var/run/vmware/1a79f8eb60648d355323caa9ae6e4cae
-> var/run/vmware/root_0/1299491418874549_26415352
- This VM did have an entry, so I skipped to step 4 of the resolution
Contents of /var/run/vmware/root_0/*
root_0/1299491418874549_26415352:
configFile -> /vmfs/volumes/4c713f9b-9b799538-a0f9-78e7d1f7b584/VM01/VM01.vmx
- I recreated the symlink with
ln -s /var/run/vmware/root_0/1299491418874549_26415352 /var/run/vmware/1a79f8eb60648d355323caa9ae6e4cae
- I restarted the Management Agents on the affected host
- After reconnecting the host in vCenter, the VM is fully operational (the vMotion task failed at the moment I restarted the management agents) again.
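The symlink re-creation shown above can be wrapped in a small guard so it does nothing when the link is already there or when the target directory is missing. This is only a sketch: the function name recreate_vm_symlink is made up, and the two paths have to be taken from the VM's own vmware.log, as in the example above.

```shell
#!/bin/sh
# recreate_vm_symlink TARGET LINK
# Creates LINK -> TARGET, but only if TARGET is an existing directory and
# nothing exists at LINK yet. Hypothetical helper around the ln -s step.
recreate_vm_symlink() {
    target=$1
    link=$2
    if [ ! -d "$target" ]; then
        echo "target directory missing: $target" >&2
        return 1
    fi
    # -L also catches a dangling symlink, which -e alone would miss.
    if [ -L "$link" ] || [ -e "$link" ]; then
        echo "already present, leaving untouched: $link" >&2
        return 0
    fi
    ln -s "$target" "$link"
}
```

With the values from this post, the call would be `recreate_vm_symlink /var/run/vmware/root_0/1299491418874549_26415352 /var/run/vmware/1a79f8eb60648d355323caa9ae6e4cae`, after which the Management Agents still need to be restarted.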
The weird thing about this situation is that it happens with random virtual machines and only every so often. Most vMotion tasks complete successfully, while some (let’s say 1 in 20) experience this problem.
Duco told me that he experienced this problem on HP hosts using B-spec SD-cards for the ESXi hypervisor, but VMware Technical Support stated that they’ve seen this with hosts using local (SAS-)disks as well.
Has anyone experienced this as well? Are you using HP ProLiant DL360/380 G7 hosts? I’m very curious if you found a permanent solution for this.
Dutch VMUG Event 2010
In a couple of days, the sixth Dutch VMUG Event will be held at Nieuwegein’s Business Centre. On December 10th, over 700 attendees will get the chance to meet ‘n’ greet with Duncan Epping and Frank Denneman (but only if you have a good question on VMware HA and/or DRS), see well-known bloggers like Eric Sloof, Gabrie van Zanten and Alan Renouf present a technical session (see the session list here), see what my friends over at Veeam (who are sponsoring both the event and this humble blog) mean when they shout ‘vPower’ and enjoy a day packed with VMware goodness.
Besides the four session tracks (with a whopping 20 sessions total), you can register for workshops by Quest, Veeam and XTG on various topics. If that’s not your thing, why not meet some of the Dutch VMUG Bloggers, or chat up a member of VMware’s PSO team?
No matter what, you should stick around for the reception at the end of the day. Besides a good chance of winning an iPad, you might even bump into a fellow-VMUG-member you don’t recognize until he calls out his VMUG forum name.
When Lab Manager’s SSMove doesn’t cut it anymore…
Nov 29, 2010 Blogs 2 Comments
I’ve been busy migrating a VMware vCenter Lab Manager installation from an EMC AX150i SATA-II iSCSI SAN to a CX3-10c 10k FC disk SAN using SSMove. As I’ve experienced many times before, SSMove isn’t as robust as it should be. In this case, it refused to move a linked clone disk chain from a LUN on the AX150i to the CX3-10c. It kept telling me that some virtual machine folders were already present on the target LUN, no matter how often I deleted them from the target and started the move again. The exact error was
“Host "esx01" has reported an error. Requested file "/vmfs/volumes/4cea58db-75a2ccd2-8774-0019b9e37a1d/lm/978" is already present.”
I went to the VMware Communities site, and stumbled upon a post by IamTHEEvilONE where he hinted at being able to do SSMove’s work manually:
(…) the data changed location (new UUID) so update the DB and VMDK headers. It’s just that we have tools to help recover from this, and is intended to be used when SSMove fails.
Since SSMove had actually already moved the files to the destination, I wanted to try editing the VMDKs, VMXs and DB myself.
