Thursday, May 24, 2012

vCenter Server 5 Service Fails

We recently had the vCenter Server 5 service fail. The VMware VirtualCenter Server service stopped out of the blue, and the following Informational Event ID 1000 was logged in the Application log:

The description for Event ID 1000 from source VMware VirtualCenter Server cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Starting VMware VirtualCenter 5.0.0 build-623373

the message resource is present but the message is not found in the string/message table

Followed by another Info Event 1000:

The description for Event ID 1000 from source VMware VirtualCenter Server cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Log directory: C:\ProgramData\VMware\VMware VirtualCenter\Logs.

the message resource is present but the message is not found in the string/message table

Followed by an Error Event 1000 (logged when the service attempted to auto-restart):

The description for Event ID 1000 from source VMware VirtualCenter Server cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Failed to intialize VMware VirtualCenter. Shutting down...

the message resource is present but the message is not found in the string/message table

My first try was to restart all the vCenter services. Only the VMware VirtualCenter Server service was offline; all other services (like VMware VirtualCenter Management Webservices) were still online. Unsurprisingly, the restart failed with the same error as before.

All of the SQL services were verified online and accessible (these are kept on a separate server due to the size). I ran across a similar discussion thread in VMware Communities, which pointed to issues inside the SQL database. Based on this, I decided the first order of business was to look at my SQL server to see what was going on there. I was able to log in, and verified that everything was up. Then I looked at the logs, and saw Event ID 17053:

C:\VCDB.ldf: Operating system error 112(There is not enough space on the disk.) encountered.

And instantly I knew my problem. Some yahoo (namely the yahoo writing this article) must not've been paying attention when installing the VCDB database, and stuck it in the root of the C drive. Naturally, that yahoo had to fix his own problem...

So I went into Microsoft SQL Server Management Studio, and ran the following commands:

-- Point the VCDB data and log files at their new locations.
-- The new paths take effect the next time the database comes online.
ALTER DATABASE VCDB MODIFY FILE ( NAME = vcdb, FILENAME = 'E:\SQL\MDF\VCDB.mdf' );
ALTER DATABASE VCDB MODIFY FILE ( NAME = vcdb_log, FILENAME = 'E:\SQL\LDF\VCDB.ldf' );
GO

Then I took the database offline (right click, Tasks, Take Offline), and moved the files to their new homes. Then I brought the database back online, and verified that its file paths were correct. You can see the full process of how to move a SQL database here.


Once this was done, I went back to my vCenter Server and was able to bring all my services online without incident. I could once again manage my vCenter 5 Server via the vSphere Client.
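
Since the root cause was a volume quietly filling up, a small free-space check can catch this before the service dies. What follows is only a minimal sketch in Python using the standard library; the 10% threshold and the function name are my own assumptions, not anything vCenter or SQL Server provides.

```python
import shutil


def free_space_report(path, min_free_fraction=0.10):
    """Return (free_bytes, total_bytes, is_low) for the volume holding `path`.

    `min_free_fraction` is an assumed threshold: the volume is flagged as low
    when less than that fraction of its total size is free.
    """
    usage = shutil.disk_usage(path)
    is_low = usage.free < usage.total * min_free_fraction
    return usage.free, usage.total, is_low


if __name__ == "__main__":
    # "." is a placeholder; on the SQL server you would pass the drive that
    # holds the database files instead.
    free, total, is_low = free_space_report(".")
    status = "LOW" if is_low else "ok"
    print(f"{free / 2**30:.1f} GiB free of {total / 2**30:.1f} GiB ({status})")
```

Run from a scheduled task against the drive holding the .mdf and .ldf files, something like this would have flagged the full C drive long before Event ID 17053 showed up.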

------
Dustin Shaw
VCP

Tuesday, March 13, 2012

WNLB and VMware

So I believe I've found a (known) issue with Windows 2003 Network Load Balancing and VMware.
VMware reports that WNLB on Windows 2003 Servers does not behave as expected here. Basically, the article says that the NLB will point to one of the servers, not all of them, when running in unicast. They give you two fixes: use multicast, or reconfigure your Port Groups (or vSwitches) to prevent RARP packet transmissions. The interesting thing is that in the environment these servers are currently hosted in, I am unable to either run multicast or disable Notify Switches. I'll have to take that up with the Network Team, or perhaps investigate some NLB hardware.

So here's where that leaves me: these servers are not supposed to go down (hence the NLB), but they do every day at 4 AM (fun call). This is when backups are running, so I'm adjusting the backup time to see if the issue follows.

What appears to be happening is that when snapshots are taken for backups, the NLB seems to freak out at the one dropped ping. Currently the backups all run at once, which makes them all hiccup at the same time, killing the NLB for a good reported 45 minutes (no idea why so long). If the issue follows the backups, perhaps staggering the backups might solve the problem (let the NLB roll from one server to another).
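
To make the staggering idea concrete, here is a minimal Python sketch that spreads start times across the NLB members. The node names are hypothetical, and the 60-minute gap is just a guess chosen to exceed the reported 45-minute outage; how the times get fed into the backup product is not covered here.

```python
from datetime import datetime, timedelta


def stagger_backups(servers, first_start, gap_minutes):
    """Give each server its own start time, `gap_minutes` apart, in list order."""
    return {
        name: first_start + timedelta(minutes=index * gap_minutes)
        for index, name in enumerate(servers)
    }


if __name__ == "__main__":
    # Hypothetical NLB members; the 4 AM start matches the current backup window.
    schedule = stagger_backups(
        ["nlb-node1", "nlb-node2"],
        first_start=datetime(2012, 3, 14, 4, 0),
        gap_minutes=60,
    )
    for name, start in schedule.items():
        print(name, start.strftime("%H:%M"))
```

The point is simply that no two members snapshot at the same moment, so at least one node should stay responsive while another hiccups.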

I'll keep you posted on what I find.
------
Dustin Shaw
VCP

Thursday, February 23, 2012

Host update fails after updating to 4.1u2

After updating vCenter Update Manager (right after updating my vCenter Server) from 4.1u1 to 4.1u2, I received the following error when trying to update one of my hosts:



Remediation did not succeed for esxihost: SingleHostRemediate: esxupdate error, version: 1.30, operation: 7: ('http://esxihost:9084/vci/hostupdates/hostupdate/vmw/vibs/cross_oem-vmware-esx-drivers-net-vxge_400.2.0.28.21239-1OEM.vib','/var/tmp/cache/-1699692350','[Errno 14] HTTP Error 404: Not Found')

After doing some research, I discovered that Update Manager 4.1u2 is case sensitive, whereas 4.1u1 was not. Any patches downloaded before the 4.1u2 update whose file names contain upper case letters will fail with this error. VMware has a KB article about it here.

The affected patches are listed below, with the correct capitalization. If you go to your patch repository and rename your copies of these files to match the names below, you should then be able to update your hosts. The repositories are located here:
  • Windows 2008: C:\ProgramData\VMware\VMware Update Manager\Data\Hostupdate\vmw\vib\
  • Windows 2003: C:\Documents and Settings\All Users\Application Data\VMware\VMware Update Manager\Data\Hostupdate\vmw\vib


bind-libs-9.3.6-4.P1.el5_5.3.i386.vib
bind-libs-9.3.6-4.P1.el5_5.3.x86_64.vib
bind-utils-9.3.6-4.P1.el5_5.3.x86_64.vib
cross_oem-vmware-esx-drivers-net-vxge_400.2.0.28.21239-1OEM.vib
cross_oem-vmware-esx-drivers-scsi-3w-9xxx_400.2.26.08.036vm40-1OEM.vib
vmware-esx_swMgmt_provider-4x.1.0.1-1.4.348481.vib
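
If you would rather not rename these by hand, the rename can be scripted. This is a rough Python sketch, not anything from the VMware KB: the function name is my own, and it assumes your copies differ from the names above only in letter case. Back up the repository folder before trying it.

```python
import os

# Correctly capitalized file names, copied from the list above.
CORRECT_NAMES = [
    "bind-libs-9.3.6-4.P1.el5_5.3.i386.vib",
    "bind-libs-9.3.6-4.P1.el5_5.3.x86_64.vib",
    "bind-utils-9.3.6-4.P1.el5_5.3.x86_64.vib",
    "cross_oem-vmware-esx-drivers-net-vxge_400.2.0.28.21239-1OEM.vib",
    "cross_oem-vmware-esx-drivers-scsi-3w-9xxx_400.2.26.08.036vm40-1OEM.vib",
    "vmware-esx_swMgmt_provider-4x.1.0.1-1.4.348481.vib",
]


def fix_vib_case(repo_dir, correct_names=CORRECT_NAMES):
    """Rename files in `repo_dir` whose names differ from an expected name only by case.

    Returns a list of (old_name, new_name) pairs for the files that were renamed.
    """
    wanted = {name.lower(): name for name in correct_names}
    renamed = []
    for current in os.listdir(repo_dir):
        target = wanted.get(current.lower())
        if target is not None and current != target:
            os.rename(os.path.join(repo_dir, current), os.path.join(repo_dir, target))
            renamed.append((current, target))
    return renamed
```

You would call fix_vib_case with the repository path for your Windows version from the list above; files that already have the right capitalization, or that are not in the list, are left alone.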



------
Dustin Shaw
VCP

Wednesday, February 22, 2012

Syslog not configured on ESXi 4.1u2

When I updated my ESXi host from 4.1u1 to 4.1u2, I got the following error message:


Configuration Issues
Issue detected on esxihost in datacenter: Warning: Syslog not configured. Please check Syslog options under Configuration.Software.Advanced Settings in vSphere client.

I thought it was odd, since the host had never complained about Syslog before. I compared its settings with those of the other 4.1u2 hosts that I had, and indeed, it was missing the Syslog.Local.DatastorePath setting.

The setting on my hosts was:
[] /scratch/log/messages

Once I copied this into my Syslog.Local.DatastorePath setting on the server, it was happy. I went ahead and copied the setting to my remaining 4.1u1 servers so that they will be happy when updated as well.
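
The comparison I did by eye can be sketched in a few lines of Python. This assumes you have already pulled each host's advanced settings into a plain dictionary of name to value (how you export them is up to you); the function name and sample data are purely illustrative.

```python
def missing_settings(reference, candidate):
    """Return settings that have a value on the reference host but are absent or empty on the candidate."""
    return {
        key: value
        for key, value in reference.items()
        if value and not candidate.get(key)
    }


if __name__ == "__main__":
    # Hypothetical exports: a healthy 4.1u2 host and the one showing the warning.
    good_host = {"Syslog.Local.DatastorePath": "[] /scratch/log/messages"}
    bad_host = {"Syslog.Local.DatastorePath": ""}
    for key, value in missing_settings(good_host, bad_host).items():
        print(f"{key} should be set to: {value}")
```

Anything it reports is a candidate to copy over to the complaining host, which is exactly what fixed the warning here.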

So apparently ESXi 4.1u2 complains when Syslog is left logging to the ramdisk, but ESXi 4.1u1 doesn't.

I found the following VMware KB Article that explains why it complains about it:
Syslog not configured messages on ESXi host console or in logs

------
Dustin Shaw
VCP

Tuesday, February 21, 2012

Installing PowerCLI on Server 2008

When attempting to install VMware vSphere PowerCLI on Windows Server 2008 x64, the following error comes up. This also happens on Server 2008 R2.

Error 1406. Could not write value InstallPath to key \Software\VMware, Inc.\VMware vSphere PSDK Runtime. Verify that you have sufficient access to that key, or contact your support personnel.



The prescribed fix for this is to remove the following registry key (and subkeys):
HKLM\Software\Wow6432Node\VMware, Inc.

*** Please make sure you know what you are doing in the registry before you do anything in there!!!

After that, installation proceeds successfully.

In my case, the only thing under the VMware, Inc. registry key was a "volatile" key with a UUIDHost DWORD under it. Since I don't like to mess with other programs installed on the server (the one I was using has Quest vRanger installed), I exported the key, removed it, installed PowerCLI, and then imported my volatile key back in.

------
Dustin Shaw
VCP

Thursday, February 9, 2012

ESXi Host Unable to Update

I had a VMware ESXi 4.1 host that I was unable to update/patch using Update Manager today.

It was coming up with the following errors when I tried:

VMware vCenter Update Manager had an unknown error. Check the Tasks and Events tab and log files for details.

And when I did that, I found this:

Could not install patches on esxihostname
Remediation did not succeed for esxihostname: SingleHostRemediate: Install error on host: esxihostname, error details: vim.fault.NoHost.

So I did some quick googling, and ran across this website that had the answer I was looking for.

The particular host that I was trying to update did indeed have an Unknown (inaccessible) VM on it (left over from yanking the storage and being sloppy...). I removed the VM from inventory, then was able to successfully patch the host.

------
Dustin Shaw
VCP

Friday, January 27, 2012

Move vRanger Repository

I needed to move a vRanger Repository, but didn't want to start all my backups over, so I found a handy article from Quest that identifies how to do it.

FYI, I believe it requires vRanger 5.3 and up.

The basics: go into SQL Management Studio, expand down to dbo.Repository to identify the repository, and change the host or target directory. Then go to dbo.RepositoryCIFS to change the share name.

Make sure you restart the vRanger Services after making the changes.

------
Dustin Shaw
VCP