VMware DNS bug
While working with VMware ESX 3 lately, a colleague and I discovered a bug: we could not connect the ESX nodes to Virtual Control Center. It had something to do with DNS lookups, so let’s look at what it was.
After a while of head scratching we decided to try resolving the Virtual Control Center from one of the ESX nodes.
$ dig vcc.test.stevenkroon.com
;; Truncated, retrying in TCP mode.
;; Connection to 192.168.7.254#53(192.168.7.254) for vcc.test.stevenkroon.com failed: connection refused.
Aha! That’s strange: it can’t resolve the Virtual Control Center’s IP address. ESX probably tries to resolve the hostname of the Virtual Control Center machine and fails in the same way. But why is that?
If you know a little about DNS (if not, consider reading Dig it!, which gives a basic overview of DNS), you know that queries are sent over UDP by default. If the reply is too large to fit, the server marks it as truncated, and the resolver discards that UDP reply and retries over TCP. Let’s tell dig to ignore the truncation, skip the TCP retry, and simply show whatever arrived in the UDP packet.
$ dig vcc.test.stevenkroon.com +ignore
; <<>> DiG 9.2.4 <<>> vcc.test.stevenkroon.com +ignore
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42395
;; flags: qr aa tc rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 23, ADDITIONAL: 0
;; QUESTION SECTION:
;vcc.test.stevenkroon.com. IN A
;; ANSWER SECTION:
vcc.test.stevenkroon.com. 86400 IN A 192.168.7.2
;; AUTHORITY SECTION:
test.stevenkroon.com. 86400 IN NS ns10.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns11.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns12.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns13.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns14.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns15.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns16.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns17.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns18.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns19.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns20.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns21.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns22.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns23.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns24.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns25.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns26.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns27.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns28.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns29.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns30.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns31.test.stevenkroon.com.
test.stevenkroon.com. 86400 IN NS ns32.test.stevenkroon.com.
;; Query time: 5 msec
;; SERVER: 192.168.7.254#53(192.168.7.254)
;; WHEN: Tue Sep 12 20:11:57 2006
;; MSG SIZE rcvd: 495
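The tc entry in the flags line of this output is the truncation bit from the DNS header. As a rough illustration of what a resolver checks before retrying over TCP, here is a minimal Python sketch. The function names and the sample header are made up for this example; the header is rebuilt by hand from the values dig printed above, not captured from the wire.

```python
import struct

# Bit masks in the 16-bit flags field of a DNS header (RFC 1035, section 4.1.1).
QR = 0x8000  # this message is a response
AA = 0x0400  # authoritative answer
TC = 0x0200  # truncated: the reply did not fit in the UDP message
RD = 0x0100  # recursion desired
RA = 0x0080  # recursion available


def parse_header(message: bytes) -> dict:
    """Unpack the fixed 12-byte DNS header into named fields."""
    if len(message) < 12:
        raise ValueError("DNS message is shorter than the 12-byte header")
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", message[:12])
    return {
        "id": ident,
        "flags": flags,
        "qdcount": qd,
        "ancount": an,
        "nscount": ns,
        "arcount": ar,
    }


def is_truncated(message: bytes) -> bool:
    """Return True when the TC bit is set, i.e. a TCP retry is called for."""
    return bool(parse_header(message)["flags"] & TC)


# Hypothetical header assembled from the dig output:
# id 42395, flags qr aa tc rd ra, QUERY 1, ANSWER 1, AUTHORITY 23, ADDITIONAL 0.
sample_header = struct.pack("!6H", 42395, QR | AA | TC | RD | RA, 1, 1, 23, 0)

if __name__ == "__main__":
    print(parse_header(sample_header))
    print("truncated:", is_truncated(sample_header))
```

With +ignore, dig simply skips the retry that a set TC bit would normally trigger.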
Ah, so that does work, which means the problem is related to DNS queries done in TCP mode. Because this zone has so many NS records, the reply no longer fits in a single UDP DNS message (classically limited to 512 bytes), so our resolver repeats the same query in TCP mode. Let’s look at the network traffic that gets sent. I’ll run tcpdump on the nameserver to see what is happening out there.
$ sudo tcpdump -qnei eth1 port 53 and host 192.168.7.3
1 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
2 listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
3 14:32:40.339322 00:50:56:49:aa:45 > 00:10:5a:37:15:ae, IPv4, length 84:
. IP 192.168.7.3.32769 > 192.168.7.254.53: UDP, length: 42
4 14:32:40.343649 00:10:5a:37:15:ae > 00:50:56:49:aa:45, IPv4, length 550:
. IP 192.168.7.254.53 > 192.168.7.3.32769: UDP, length: 508
While the tcpdump was running I did a normal query for vcc.test.stevenkroon.com. As you can see on lines 3 and 4, these packets are UDP. However, we don’t get a retry in TCP mode; we only see UDP packets on our nameserver. So why isn’t ESX sending us the TCP packets?
Firewalling could prevent us from sending out TCP DNS packets. Since this is a testing environment, we can safely turn off the firewall.
$ sudo /etc/init.d/firewall stop
Stopping firewall [ OK ]
With the firewall turned off, let’s try again.
$ dig vcc.test.stevenkroon.com +short
;; Truncated, retrying in TCP mode.
192.168.7.2
Got ‘em! So something in the firewall prevents TCP DNS packets from going out; we know this because we were not receiving any TCP packets on our nameserver. Let’s start the firewall again and look at the firewall rules.
$ sudo /etc/init.d/firewall start
Starting firewall [ OK ]
$ sudo /sbin/iptables -nL OUTPUT
Chain OUTPUT (policy DROP)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
valid-tcp-flags tcp -- 0.0.0.0/0 0.0.0.0/0
icmp-out icmp -- 0.0.0.0/0 0.0.0.0/0
ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 udp spts:1024:65535 dpt:53
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:902 state NEW
ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 udp spts:67:68 dpts:67:68
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpts:2050:5000 state NEW
ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 udp dpts:2050:5000 state NEW
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpts:8042:8045 state NEW
ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 udp dpts:8042:8045 state NEW
ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 udp spt:427
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp spt:427 state NEW
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:27000 state NEW
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:27010 state NEW
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:21 ctstate NEW,RELATED
ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 udp dpt:902 state NEW
REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
Line number 6 of the output says:
ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 udp spts:1024:65535 dpt:53
This is the rule that allows UDP DNS packets to go out. There is no such rule for TCP packets, so this must be the root cause of why we are not getting any TCP DNS packets on our nameserver. Let’s fix it!
$ sudo /usr/sbin/esxcfg-firewall -o 53,tcp,out,dnsClientTCP
This will add the following line to the OUTPUT firewall chain:
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:53
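The by-eye check of the rule listing can also be scripted. The following Python sketch is only an illustration: the function name and the two sample listings are invented here (the listings reuse rule lines quoted in this post), and it only understands simple ACCEPT rules with an explicit dpt match, so it ignores catch-all rules and port ranges.

```python
def protocols_allowed_to_port(listing: str, port: int) -> set:
    """Return the protocols with an ACCEPT rule matching exactly dpt:<port>.

    `listing` is text in the format printed by `iptables -nL <chain>`.
    Simplification: rules without an explicit dpt match (for example the
    bare "ACCEPT all" rule) and port ranges (dpts:...) are not considered.
    """
    wanted = f"dpt:{port}"
    allowed = set()
    for line in listing.splitlines():
        fields = line.split()
        # Rule lines look like: target prot opt source destination [matches...]
        if len(fields) < 6 or fields[0] != "ACCEPT":
            continue
        if wanted in fields[5:]:
            allowed.add(fields[1])
    return allowed


# Hypothetical excerpts built from rule lines shown in this post.
BEFORE_FIX = """\
Chain OUTPUT (policy DROP)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 udp spts:1024:65535 dpt:53
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:902 state NEW
"""

AFTER_FIX = BEFORE_FIX + "ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:53\n"

if __name__ == "__main__":
    print("before:", sorted(protocols_allowed_to_port(BEFORE_FIX, 53)))
    print("after: ", sorted(protocols_allowed_to_port(AFTER_FIX, 53)))
```

Matching on the whole token keeps dpt:53 from being confused with rules for other ports.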
That’s the theory; let’s see if it actually works. The firewall is still turned on, so we are good to go.
$ dig vcc.test.stevenkroon.com +short
;; Truncated, retrying in TCP mode.
192.168.7.2
There we go! To wrap it up, let’s check the tcpdump output this time.
$ sudo tcpdump -qnei eth1 port 53 and host 192.168.7.3
1 15:06:38.178860 00:50:56:49:aa:45 > 00:10:5a:37:15:ae, IPv4, length 84:
. IP 192.168.7.3.32769 > 192.168.7.254.53: UDP, length: 42
2 15:06:38.183241 00:10:5a:37:15:ae > 00:50:56:49:aa:45, IPv4, length 549:
. IP 192.168.7.254.53 > 192.168.7.3.32769: UDP, length: 507
3 15:06:38.184584 00:50:56:49:aa:45 > 00:10:5a:37:15:ae, IPv4, length 74:
. IP 192.168.7.3.33003 > 192.168.7.254.53: tcp 0
4 15:06:38.184630 00:10:5a:37:15:ae > 00:50:56:49:aa:45, IPv4, length 74:
. IP 192.168.7.254.53 > 192.168.7.3.33003: tcp 0
5 15:06:38.184758 00:50:56:49:aa:45 > 00:10:5a:37:15:ae, IPv4, length 66:
. IP 192.168.7.3.33003 > 192.168.7.254.53: tcp 0
6 15:06:38.184813 00:50:56:49:aa:45 > 00:10:5a:37:15:ae, IPv4, length 110:
. IP 192.168.7.3.33003 > 192.168.7.254.53: tcp 44
7 15:06:38.184846 00:10:5a:37:15:ae > 00:50:56:49:aa:45, IPv4, length 66:
. IP 192.168.7.254.53 > 192.168.7.3.33003: tcp 0
8 15:06:38.189732 00:10:5a:37:15:ae > 00:50:56:49:aa:45, IPv4, length 1235:
. IP 192.168.7.254.53 > 192.168.7.3.33003: tcp 1169
9 15:06:38.190741 00:50:56:49:aa:45 > 00:10:5a:37:15:ae, IPv4, length 66:
. IP 192.168.7.3.33003 > 192.168.7.254.53: tcp 0
10 15:06:38.191177 00:50:56:49:aa:45 > 00:10:5a:37:15:ae, IPv4, length 66:
. IP 192.168.7.3.33003 > 192.168.7.254.53: tcp 0
11 15:06:38.191905 00:10:5a:37:15:ae > 00:50:56:49:aa:45, IPv4, length 66:
. IP 192.168.7.254.53 > 192.168.7.3.33003: tcp 0
12 15:06:38.192035 00:50:56:49:aa:45 > 00:10:5a:37:15:ae, IPv4, length 66:
. IP 192.168.7.3.33003 > 192.168.7.254.53: tcp 0
As you can see, the first two lines are the original UDP packets, and the TCP packets follow after that.
We talked to the guys at VMware, and they agreed that this was indeed a bug. It will be resolved in upcoming versions of ESX 3.
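The payload sizes in this capture can be cross-checked by hand. The Python sketch below builds a plain A query for vcc.test.stevenkroon.com and frames it for TCP, where DNS messages carry a two-byte length prefix (RFC 1035, section 4.2.2). It assumes dig sent a minimal query without EDNS; the capture does not show packet contents, so that is an assumption, and the function names are made up for this example.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label length in {name!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, ident: int = 0) -> bytes:
    """Build a minimal A/IN query: 12-byte header plus one question, no EDNS."""
    # Flags 0x0100 sets only "recursion desired"; one question, no other records.
    header = struct.pack("!6H", ident, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!2H", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its two-byte length, as DNS over TCP requires."""
    return struct.pack("!H", len(message)) + message


if __name__ == "__main__":
    query = build_query("vcc.test.stevenkroon.com")
    print("UDP payload bytes:", len(query))
    print("TCP payload bytes:", len(frame_for_tcp(query)))
```

Under these assumptions the query is 42 bytes over UDP and 44 bytes over TCP, which lines up with the "UDP, length: 42" and "tcp 44" entries in the capture. By the same framing, the "tcp 1169" segment would carry a 1167-byte DNS reply, well past what fit in the UDP packet.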

