2010/12/01

NetExtender corrupts mDNSResponder

A great way to start the day. My boss had some problems with his Mac today: he wasn't able to browse at all. I checked all the layers and got stuck at the highest one :).
  • nslookup www.google.com resolved the domain (same with dig)
  • ping 8.8.8.8 replied
  • ping www.google.com failed (could not resolve the hostname)
  • internet in a VM worked correctly
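
The way these results narrow things down can be sketched as a small script. This is only a sketch under assumptions: the function names diagnose and check_host are made up for illustration, dig has to be installed, and I am assuming that dscacheutil asks the operating system's own lookup path (the one ping and the browser use) rather than the DNS server directly.

```shell
# Classify a name-resolution problem from two observations (0 = worked).
diagnose() {
    direct_dns=$1      # did dig/nslookup get an answer from the DNS server?
    system_lookup=$2   # did a lookup through the OS resolver succeed?
    if [ "$direct_dns" -eq 0 ] && [ "$system_lookup" -ne 0 ]; then
        echo "system resolver"          # DNS server fine, local resolving broken
    elif [ "$direct_dns" -ne 0 ]; then
        echo "dns server or network"    # not even a direct query works
    else
        echo "resolution ok"
    fi
}

# Gather both observations for one hostname on the affected Mac.
check_host() {
    host=$1
    # dig talks to the configured DNS server itself.
    if [ -n "$(dig +short "$host" 2>/dev/null)" ]; then direct=0; else direct=1; fi
    # Assumption: dscacheutil goes through the OS resolver, like ping does.
    if [ -n "$(dscacheutil -q host -a name "$host" 2>/dev/null)" ]; then
        system=0
    else
        system=1
    fi
    diagnose "$direct" "$system"
}

# Usage: check_host www.google.com
```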
I was quite sure that it had something to do with the way Mac OS X resolves domain names: nslookup and dig query the DNS server directly, while ping goes through the system resolver, so the DNS server itself was fine. I also quickly learned that mDNSResponder has been responsible for DNS since 10.6.4. Focusing on that, I stumbled onto an article describing how NetExtender corrupts the XML (plist) file that the service uses to start. Since we were using NetExtender, I thought I should give it a shot. First of all, I copied the plist from my own working Mac to a USB stick
sudo cp /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist /Volumes/stickname/

I then opened a terminal on my boss's computer (Spotlight > Terminal) and ran sudo su to become root; the sh# prompt should appear. I changed to the right directory
  • cd /System/Library/LaunchDaemons/
and checked the files
  • ls -al com.apple.mDNSResponder*
This showed three files: the regular plist file, an nxbk file, and the helper file, which is not important here.

I then used diff to compare both the regular file and the nxbk file with the copy I had taken from my own computer
  • diff com.apple.mDNSResponder.plist /Volumes/stickname/com.apple.mDNSResponder.plist
  • diff com.apple.mDNSResponder.plist.nxbk /Volumes/stickname/com.apple.mDNSResponder.plist
The nxbk file, made by NetExtender before it altered the plist, was identical to the one I had on my USB stick. I concluded that NetExtender had corrupted the plist file, so shutting down the service, putting the backup back in place, and relaunching the service could be a solution. So I executed the following (I actually backed up the files first by copying them to /Users/someuser/)
  • launchctl unload /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist
  • rm com.apple.mDNSResponder.plist
  • mv com.apple.mDNSResponder.plist.nxbk com.apple.mDNSResponder.plist
  • launchctl load /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist
After that everything was working again.
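
If I had to do it again, I would wrap the file handling in a small function so the backup is only restored when it really matches a known-good copy. A minimal sketch, not what I ran that day: restore_plist and its arguments are names I made up, and the safe-keeping directory is whatever you choose. The launchctl unload and load steps stay exactly as above, around the call.

```shell
# Restore a launchd plist from the .nxbk backup found next to it.
# Arguments: directory, plist name, known-good reference copy, safe-keeping dir.
restore_plist() {
    dir=$1
    name=$2
    reference=$3
    keep=$4
    plist="$dir/$name"
    backup="$plist.nxbk"

    # Only trust the backup if it is byte-identical to the known-good copy.
    if ! cmp -s "$backup" "$reference"; then
        echo "backup differs from reference, not restoring" >&2
        return 1
    fi

    # Keep copies of both files before touching anything, then swap.
    cp -p "$plist" "$keep/${name}.corrupt" &&
    cp -p "$backup" "$keep/${name}.nxbk" &&
    mv "$backup" "$plist"
}

# Usage, as root, between the launchctl unload and launchctl load steps:
#   restore_plist /System/Library/LaunchDaemons com.apple.mDNSResponder.plist \
#       /Volumes/stickname/com.apple.mDNSResponder.plist /Users/someuser
```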

BTW here is an output of my plist



2010/11/29

ESX disconnects from vCenter and Corrupt IBM ESXi embedded sticks

Some time ago I was installing a vCenter and connecting the ESX hosts. The host kept disconnecting after about a minute, and reconnecting only helped for another minute. The problem? I thought I had disabled the firewall, but this wasn't the case: it was blocking the incoming connections. Disabling the firewall completely solved the problem. This leads me to believe that the heartbeating is done in all directions, ESX to ESX but also ESX to vCenter and vice versa ... interesting.

Another great thing I encountered is a corrupted ESXi embedded IBM USB stick. The ESX host was asking for a diagnostic dump partition. This was strange because I had worked with these sticks before and they never needed one (they dump to a local partition; vicfg-dumppart -l shows it). Installing the VMware Tools was also giving errors about missing CDs. Looking under the / root partition (unsupported mode, ls -al) did not show the /store partition. After comparing the fdisk -l output of a "working" and a "non-working" ESXi, I noticed that a partition was missing. Reinstalling ESXi with the recovery CD that you get when buying the embedded USB keys fixed the problem.

I attached some screenshots so you can see the differences. The last partition (the store partition, 615 to 900) was missing.
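
Instead of comparing the two fdisk -l listings by eye, the comparison can be scripted. A rough sketch, assuming the output of fdisk -l on each host was first saved to a text file and that partition lines start with /dev/ followed by the start and end columns; missing_partitions and the file names in the usage line are made up for illustration.

```shell
# Print the "start end" range of every partition that appears in the
# listing of the working host but not in the listing of the broken host.
#   $1: saved fdisk -l output of the working host
#   $2: saved fdisk -l output of the broken host
missing_partitions() {
    awk '
        # Range of a partition line; a "*" boot flag shifts the columns by one.
        function range() { return ($2 == "*") ? $3 " " $4 : $2 " " $3 }
        $1 !~ /^\/dev\//    { next }                      # skip headers, blanks
        FILENAME == ARGV[1] { broken[range()] = 1; next } # first file: broken host
        { r = range(); if (!(r in broken)) print r }      # second file: working host
    ' "$2" "$1"
}

# Usage: missing_partitions working-fdisk.txt broken-fdisk.txt
```

Comparing on the start and end columns rather than on the device name is deliberate, since the device names can differ from one host to another.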