My ESXi machine has been running for a long time, but I never made the effort to get any form of hardware monitoring working. When the network at the datacenter went dark for a while, I thought my machine had physically crashed. Luckily, it was still up and running, but it did scare me. So I decided to tinker with monitoring and settled on Nagios (which I already had set up) plus an add-on script called ‘check_esx_wbem.py’.
Firstly, I needed to enable CIM on my ESXi box. After searching around the advanced settings for a while, I found the following three options, and changed the first two from a ‘0’ to a ‘1’:
- Misc.CimEnabled=1
- Misc.CimOemProvidersEnabled=1
- Misc.CimWatchdogInterval=60
After that, I logged into Tech Support Mode and started the CIM daemon:
/etc/init.d/sfcbd
/etc/init.d/sfcbd-watchdog
Lastly, I went into the VI Client and executed a ‘Reset Sensors’ on the ‘Health Status’ page of the configuration. Afterwards, the ‘Health Status’ page showed me the hardware status of the machine.
With the ESXi configuration done, all I had to do was to configure Nagios, which was simple enough. I used an article on NagiosExchange. Thanks to Michiel Dijcks for helping me out with the Nagios part.
For an example of how this looks in Nagios, please go here.
nice one Joep! I’ve been putting off looking into the esxi cim stack a bit more, now i don’t have to :)
Stu
Joep,
Can you post your plugin conf for Nagios? I have the python script working fine, but can’t get Nagios to execute it w/o returning null as the output.
Ritmo: on the ESXi-side, I haven’t done anything special. On the nagios side, I just loaded the python script:
$USER1$/check_esx_wbem.py https://10.10.100.2:5989 root $USER2$ $ARG1$
$USER2$ is the password. Be sure to backslash-escape the slashes in your ESX(i) server’s address, for instance:
$USER1$/check_esx_wbem.py https://your.esx.server.com:5989 [verbose]
My Nagios status information for this check shows only the first line “20090116 13:09:20 Connection to https://10.10.100.2:5989” and not the 30+ lines shown in “https://www.virtuallifestyle.nl/wp-content/uploads/2009/01/extinfocgi.htm”.
I’m on nagios 2.8, python 2.6.2, pywbem 0.7.
any ideas?
This seems to be a Nagios version limitation.
I ran “$USER1$/check_esx_wbem.py https://your.esx.server.com:5989 [verbose]” on Nagios 3.1.2 and the output contained every CIM check.
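If upgrading is not an option, one possible workaround is to squeeze the plugin output into a single line before Nagios sees it, since older Nagios versions keep only the first line of plugin output. A rough sketch, assuming you wrap the plugin in a script of your own; collapse_output is a hypothetical helper (not part of check_esx_wbem.py) and the 300-character limit is an arbitrary assumption:

```python
def collapse_output(text, sep="; ", limit=300):
    """Join multi-line plugin output into one line for older Nagios versions.

    'limit' is an assumed cap on the output length, not a documented value.
    """
    # Drop empty lines and surrounding whitespace.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Nagios treats '|' as the start of performance data, so replace it.
    joined = sep.join(lines).replace("|", "/")
    return joined[:limit]
```

The wrapper would print collapse_output(...) of the plugin's output and exit with the plugin's own exit code, so the check state is unchanged.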
How do you get Nagios to report on criticals and warnings with this script?
I got the plugin working — when I run it from the command line as the nagios user, it outputs problems with the server (I yanked the redundant power).
However, Nagios still shows a green OK and has checked several times…
Thanks!
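For what it’s worth, Nagios decides the state of a check from the plugin’s exit code, not from the text it prints: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. A minimal sketch of such a mapping, based on the CIM HealthState values (5 = OK, 10 = degraded, 15 and up = failure states). The function names and the exact thresholds are assumptions for illustration, not necessarily what check_esx_wbem.py does:

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Ranking used to pick the "worst" state (an assumed ordering).
_SEVERITY = {OK: 0, UNKNOWN: 1, WARNING: 2, CRITICAL: 3}


def classify(health_state):
    """Map one CIM HealthState value to a Nagios exit code (assumed thresholds)."""
    if health_state == 5:
        return OK
    if health_state == 10:
        return WARNING
    if health_state in (15, 20, 25, 30):
        return CRITICAL
    return UNKNOWN


def nagios_state(health_states):
    """Return the worst Nagios exit code for a list of HealthState values."""
    states = [classify(hs) for hs in health_states]
    if not states:
        return UNKNOWN
    return max(states, key=lambda s: _SEVERITY[s])
```

A plugin would finish with sys.exit(nagios_state(values)). If a script prints the problems but still exits with 0, Nagios will keep showing a green OK, so checking the exit code on the command line (echo $? right after running the plugin) is a quick first test.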
Has anyone configured CIM to talk to HP Insight? I installed the HP integrated ESXi 3.5 u4 version. After searching, I am still unsure where to actually set the community string on the ESXi host. I don’t see anything such as an snmpd.conf.
Thanks!
-Erick
FYI: Misc.CimOemProvidersEnabled became UserVars.CIMOEMProvidersEnabled in ESX 4.0+
Hiya,
I’m using ESXi 4.1 and I get a nice list of devices in the Health Status tab of the vSphere client. But the storage devices aren’t shown… does anyone know what to do?
Is there any way to get the RPM’s from the fans in the “Reading” column to show up in the output?
Thanks!
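In the CIM schema, numeric readings belong to the CIM_NumericSensor class: each instance carries CurrentReading and UnitModifier, and the actual value is CurrentReading × 10^UnitModifier. A minimal sketch using pywbem, assuming the host exposes its fans as CIM_NumericSensor instances with SensorType 5 (Tachometer) in the root/cimv2 namespace. The function names are hypothetical, the URL and credentials are placeholders, and this is not taken from check_esx_wbem.py:

```python
def scaled_reading(current_reading, unit_modifier):
    """Apply the CIM UnitModifier (a power of ten) to a raw sensor reading."""
    return current_reading * 10 ** unit_modifier


def fan_readings(url, user, password):
    """Return (name, reading) pairs for tachometer sensors on a CIM server."""
    import pywbem  # imported here so scaled_reading works without pywbem

    # Newer pywbem versions may need extra options to accept the
    # self-signed certificate most ESXi hosts use.
    conn = pywbem.WBEMConnection(url, (user, password),
                                 default_namespace="root/cimv2")
    results = []
    for inst in conn.EnumerateInstances("CIM_NumericSensor"):
        if inst["SensorType"] != 5:  # 5 = Tachometer in the CIM schema
            continue
        reading = inst["CurrentReading"]
        if reading is None:
            continue
        modifier = inst["UnitModifier"] or 0
        results.append((inst["ElementName"],
                        scaled_reading(reading, modifier)))
    return results
```

Calling fan_readings("https://your.esx.server.com:5989", "root", "secret") should then give one pair per fan, provided the hardware vendor's CIM provider reports fan readings at all.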
Hi,
I’ve enabled CIM in ESXi, but Nagios still shows a null value (signal SEGV, with core dump). Any ideas?