Showing posts with label dns. Show all posts
Showing posts with label dns. Show all posts

Tuesday, February 5, 2013

Fun when mail server receives SERVFAIL instead of NXDOMAIN...

Ok, I got log files overflowed with error messages like this one:
Feb 5 11:01:35 mail named[994]: error (host unreachable) resolving 'sbs-music.com/NS/IN': 50.56.243.69#53
In essence, name server for this domain (50.56.243.69) is unreachable from the DNS server used by mail server. Trying to manually query the server, I get:
$ host -t ns sbs-music.com
Host sbs-music.com not found: 2(SERVFAIL)
Note the status, it's SERVFAIL. The result is that mail server thinks it is a temporary error and retries later, with the same results. Trying this on another host (that uses another DNS server) I get:
$ host -t ns sbs-music.com
Host sbs-music.com not found: 3(NXDOMAIN)
Well, this time it tells me that there is no such domain. An error message like this would tell mail server to give up and return error response.

So, why is there discrepancy between the two? Using tcpdump in the first case (i.e. when we get SERVFAIL) the following requests/responses are exchanged (slightly edited for readability):
192.168.x.y.51892 > 206.72.97.238.53: 39504 [1au] NS? sbs-music.com. (42)
206.72.97.238.53 > 192.168.x.y.51892: 39504 4/0/5 NS ns4.shepherdhosting.com., NS ns1.shepherdhosting.com., NS ns2.shepherdhosting.com., NS ns3.shepherdhosting.com. (194)
192.168.x.y.10749 > 50.56.243.69.53: 40104 [1au] NS? sbs-music.com. (42)
timeout
192.168.x.y.63081 > 206.72.97.238.53: 56636 [1au] NS? sbs-music.com. (42)
206.72.97.238.53 > 192.168.x.y.63081: 56636 4/0/5 NS ns1.shepherdhosting.com., NS ns2.shepherdhosting.com., NS ns3.shepherdhosting.com., NS ns4.shepherdhosting.com. (194)
192.168.x.y.31948 > 50.56.243.69.53: 27220 [1au] NS? sbs-music.com. (42)
timeout
So, let me interpret this trace. The first query is to IP address 206.72.97.238 and it asks for the name server of a domain sbs-music.com. Doing reverse DNS query, we get:
# host 206.72.97.238
238.97.72.206.in-addr.arpa domain name pointer sh214.shepherdhosting.com.
So, it's some hosting provider. Now, what we get in response is that name servers for that domain are ns1.shepherdhosting.com through ns4.shepherdhosting.com. Ok, our DNS server choose ns4.shepherdhosting.com with IP address 50.56.243.69. Then, it queried it for sbs-music.com domain. This time the query timed out. So, our DNS server decided to query again 206.72.97.238 for name servers. It again received the same list and then it again queried ns4 which didn't answer the query.

Let us manually try some other server. So, querying ns1.shepherdhosting.com we get:
# host -t ns sbs-music.com. 206.72.97.238
Using domain server:
Name: 206.72.97.238
Address: 206.72.97.238#53
Aliases:

sbs-music.com name server ns2.shepherdhosting.com.
sbs-music.com name server ns3.shepherdhosting.com.
sbs-music.com name server ns4.shepherdhosting.com.
sbs-music.com name server ns1.shepherdhosting.com.
This is the same response we saw in the first part of the trace. Trying ns2:
$ host -t ns sbs-music.com. 206.72.100.134
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached
Well, ns2 doesn't respond. Neither do ns3, nor as was always obvious, ns4. What does the trace looks like (again, slightly edited):
12:20:22.532877 IP 192.168.x.y.34364 > 206.72.100.134.53: 31539+ NS? sbs-music.com. (31)
12:20:27.532864 IP 192.168.x.y.34364 > 206.72.100.134.53: 31539+ NS? sbs-music.com. (31)
12:20:32.533445 IP 192.168.x.y.45322 > 206.72.100.134.53: 5827+ NS? sbs-music.com. (31)
12:20:52.733389 IP 206.72.100.134.53 > 192.168.x.y.34364: 31539 ServFail 0/0/0 (31)
12:20:52.733410 IP 192.168.x.y > 206.72.100.134: ICMP 192.168.x.y udp port 34364 unreachable, length 67
12:20:52.734042 IP 206.72.100.134.53 > 192.168.x.y.45322: 5827 ServFail 0/0/0 (31)
12:20:52.734053 IP 192.168.x.y > 206.72.100.134: ICMP 192.168.x.y udp port 45322 unreachable, length 67
Now, we have interesting situation here. First, DNS server for sbs-music.com takes significant time to answer, and when it answers our local DNS isn't listening any more (thus those ICMP error messages). But, in the end local DNS concludes correctly that something's wrong with the name servers for that domain.

The final piece of puzzle comes from querying com name server for sbs-music.com domain:
$ host -t ns sbs-music.com l.gtld-servers.net.
Using domain server:
Name: l.gtld-servers.net.
Address: 192.41.162.30#53
Aliases:

sbs-music.com has no NS record
Obviously, this domain was removed. It is clear now that this sbs-music.com domain existed for some time, and DNS server that produces error cached IP address of its domain name. If it were to query com domain name server again, it would receive NXDOMAIN error and properly notify mail server.

To see currently cached entries of BIND name server use the following command:
rndc dumpdb
Then look for file cache_dump.db in /var/named/data (or in /var/named/chroot/var/named/data if you are running BIND in chroot). It is a textual file that you can inspect with text editor, less or something similar. In my case there were the following lines there:
; glue
sbs-music.com.          172669  NS      ns1.shepherdhosting.com.
                        172669  NS      ns2.shepherdhosting.com.
                        172669  NS      ns3.shepherdhosting.com.
                        172669  NS      ns4.shepherdhosting.com.

To flush a single entry use the following command
rndc flushname sbs-music.com internal
This removes sbs-music.com from caches in internal view (I configured viewes so that server behaves differently depending on who asks it). Yet, this didn't help. Then I tried to flush everything in internal view using:
rndc flush internal
But this, while helped, didn't actually solve the problem. Namely, looking into packet trace it turns out that  BIND server receives from b.gtld-servers.net. that given domain doesn't exist and from somewhere it pulls the old IP address 206.72.97.238?!

So, I finally decided to look into log and there are a lot of the following messages:
error (FORMERR) resolving 'sbs-music.com/NS/IN': 206.72.97.238#53
along with the one that triggered all this. Ok, I also tried to upgrade bind, but no luck, still SERVFAIL errors.

Now, its time for heavy artillery, or Wireshark. So I saved packet trace and loaded it into Wireshark. Guess what! Wireshark crashed on requests sent by local DNS server!?

Ok, after a bit more fiddling I realised that some com domain name servers do know for this domain. But note the difference between the output of host command (that I've used previously) and nslookup command:
$ nslookup -type=ns sbs-music.com 192.5.6.30
Server: 192.5.6.30
Address: 192.5.6.30#53

Non-authoritative answer:
*** Can't find sbs-music.com: No answer

Authoritative answers can be found from:
sbs-music.com nameserver = ns1.shepherdhosting.com.
sbs-music.com nameserver = ns2.shepherdhosting.com.
sbs-music.com nameserver = ns3.shepherdhosting.com.
sbs-music.com nameserver = ns4.shepherdhosting.com.
ns1.shepherdhosting.com internet address = 206.72.97.238
ns2.shepherdhosting.com internet address = 206.72.100.134
ns3.shepherdhosting.com internet address = 206.72.97.237
ns4.shepherdhosting.com internet address = 50.56.243.69
Ok, if this com server tells me that there is no this domain, why is then it pointing me to those nameservers? dig command is a bit more informative:
$ dig @192.5.6.30 sbs-music.com

; <<>> DiG 9.9.2-P1-RedHat-9.9.2-6.P1.fc18 <<>> @192.5.6.30 sbs-music.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64382
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 5
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;sbs-music.com. IN A

;; AUTHORITY SECTION:
sbs-music.com. 172800 IN NS ns1.shepherdhosting.com.
sbs-music.com. 172800 IN NS ns2.shepherdhosting.com.
sbs-music.com. 172800 IN NS ns3.shepherdhosting.com.
sbs-music.com. 172800 IN NS ns4.shepherdhosting.com.

;; ADDITIONAL SECTION:
ns1.shepherdhosting.com. 172800 IN A 206.72.97.238
ns2.shepherdhosting.com. 172800 IN A 206.72.100.134
ns3.shepherdhosting.com. 172800 IN A 206.72.97.237
ns4.shepherdhosting.com. 172800 IN A 50.56.243.69
xxx
;; Query time: 149 msec
;; SERVER: 192.5.6.30#53(192.5.6.30)
;; WHEN: Tue Feb  5 13:49:22 2013
;; MSG SIZE  rcvd: 194
What a mess?!

Then, I googled a bit to find why BIND is returning SERVFAIL instead of NXDOMAIN. This is something interesting that I found:
  • BIND could have returned SERVFAIL instead of NXDOMAIN responses for nonexistent resource records from the unsigned child zone if the parent zone was signed. (BZ#643012)
Trying to lookup that bug in RedHat's Bugzilla gives be big red square which tells me that I'm not allowed to see it (despite being logged in) so it's some security issue obviously!?

Looking at how different BIND versions behave I get the following results:
  • bind-9.3.6-20.P1.el5_8.6 and bind-9.9.2-6.P1.fc18.x86_64 return NXDOMAIN.
  • bind-9.8.2-0.10.rc1.el6_3.6.x86_64 and bind-9.8.2-0.10.rc1.el6_3.5.x86_64 return SERVFAIL.
Could it be somehow related to DNSSEC?

Ok, let me conclude. The problem is that name servers for domain sbs-music.com aren't correctly configured, while the domain itself is registered with com domain servers. This triggers different behavior from BIND. So, there are two possibilities from here:
  1. Persuade somehow BIND to return NXDOMAIN instead of SERVFAIL.
  2. Find what is causing queries for this domain in the first place.
Stay tuned... :)

Saturday, July 14, 2012

VMWare Workstation DNS server...

I just figured out very interesting thing about DNS server used by VMWare virtual machines. It uses /etc/hosts file on host machine to resolve names. The trick is that those names have to have .localdomain suffix. So, for example, if you have the following entry in /etc/hosts file of host machine:
1.1.1.1        test
and then you ask with nslookup within guest machine for that name, you'll receive:
# nslookup test
Server: 192.168.178.2
Address: 192.168.178.2#53

Name: test.localdomain
Address: 1.1.1.1
Note that localdomain is automatically appended. This is specified with search directive in /etc/resolv.conf of guest machine (there is a line search localdomain or something similar). Otherwise, without localdomain, the name wouldn't be found:
# nslookup test.
Server: 192.168.178.2
Address: 192.168.178.2#53

** server can't find test.: NXDOMAIN
In this example, the dot I appended after the name prevents anything from being appended and only name test is looked for.

To conclude, the rule is simple. If the name to be looked up ends with localdomain, then it is searched within /etc/hosts file of the host operating system.

I discovered this by accident. Namely, I did nslookup within guest machine expecting to receive error message about non-existing host name, but I got good response!? At first, I was confused, and it took me several minutes to figure out what's happening. Well, so the things go when you don't read manuals. ;)

Friday, June 29, 2012

BIND and network unreachable messages...

Sometimes you'll see messages like the following ones in your log file (messages are slightly obfuscated to protect innocent :)):
Jun 29 14:32:11 someserver named[1459]: error (network unreachable) resolving 'www.eolprocess.com/A/IN': 2001:503:a83e::2:30#53
Jun 29 14:32:11 someserver named[1459]: error (network unreachable) resolving 'www.eolprocess.com/A/IN': 2001:503:231d::2:30#53
What these messages say is that network that contains address 2001:503:231d::2:30 is unreachable. So, what's happening?

The problem is that all modern operating systems support IPv6 out of the box. The same is for growing number of software packages, among them is BIND too. So, operating system configures IPv6 address on interface and application thinks that IPv6 works and configures it. But, IPv6 doesn't work outside of the local network (there is no IPv6 capable router) so, IPv6 addresses, unless in local networks, are unreachable.

So, you might ask now: but everything otherwise works, why is this case special! Well, the problem is that some DNS servers, anywhere in hierarchy, support IPv6, but not all. And when our resolver gets IPv6 address in response, it defaults to it and ignores IPv4. It obviously can not reach it so it logs a message and then tries IPv4. Once again, note that this IPv6 address can pop up anywhere in hierarchy, it isn't necessary to be on the last DNS server. In this concrete case name server for eolprocess.com doesn't support IPv6, but some name server for the top level com domain do support it!

To prevent those messages from appearing add option -4 to bind during startup. On CentOS (Fedora/RHEL) add or modify the line OPTIONS in /etc/sysconfig/named so that it includes option -4, i.e.
OPTIONS="-4"

Tuesday, June 26, 2012

Setting up reverse DNS server...

In the post about DNS configuration I skipped reverse DNS configuration. But, it is necessary to have it in some cases, like FreeIPA installation or for mail servers. So, I'm going to explain how to configure reverse DNS server.

While "normal" DNS resolution works by names, from root server down to the authoritative one for the name we are looking for, reverse DNS resolution works within special top-level domain (in-addr.arpa). Within this domain, sub-domains are comprised from octets within IP address in reverse order. Now, if your block of IP addresses ends on byte boundary (e.g. /8, /16, /24) the setup is relatively simple. Otherwise, you upstream provider (the one that holds larger IP address block) has to point to your domain on a per address base.

Let us bring this to more concrete values. Suppose that our public IP address space is 192.0.2.0/24. Also, suppose that your mail server has public IP address 192.0.2.2. In that case, reverse query is sent for name 2.2.0.192.in-addr.arpa and query type is set to PTR, i.e. we are looking for a name 2 within 2.0.192.in-addr.arpa zone.

So, it's relatively easy to setup reverse DNS. You need to define appropriate zones that include only network part of your IP addresses. In our case we have two zones, but IP addresses used for one of them depends on who's asking (client from the local network or client on the Internet). So, we have three zones in effect:
  1. DMZ, when asked by local clients, is in the network 10.0.0.0/24. This means we have reverse zone 0.0.10.in-addr.arpa for local clients.
  2. DMZ, when asked by internet clients, is in the network 192.0.2.0/24. This means that for them reverse zone is 2.0.192.in-addr.arpa.
  3. Finally, clients in local network (non-DMZ one) have IP addresses from a block 172.16.1.0/24 and so they are placed within reverse zone 1.16.172.in-addr.arpa.
So, within internal view you should add the following two zone statements:
zone "0.0.10.in-addr.arpa" {
    type master;
    file "example-domain.com.local.rev";
};

zone "1.16.172.in-addr.arpa" {
    type master;
    file "example-domain.local.rev";
};
And within internet view you should add the following zone statement:
zone "2.0.192.in-addr.arpa" {
   type master;
    file "example-domain.com.rev";
};
Then, you should create the three zone files (example-domain.com.local.rev, example-domain.local.rev, and example-domain.com.rev) with the following content:
# cat example-domain.com.local.rev
 $TTL 1D
@    IN    SOA    @ root.example-domain.com. (
            2012062601 ; serial
            1D         ; refresh
            1H         ; retry
            1W         ; expire
            3H )       ; minimum
           NS    ns1.example-domain.com.

1          PTR    ns1.example-domain.com.
# cat example-domain.local.rev
 $TTL 1D
@    IN    SOA    @ root.example-domain.com. (
            2012062601 ; serial
            1D         ; refresh
            1H         ; retry
            1W         ; expire
            3H )       ; minimum
           NS    ns1.example-domain.com.

1          PTR    test.example-domain.local.
# cat example-domain.com.rev
$TTL 1D
@    IN    SOA    @ root.example-domain.com. (
            2012062601 ; serial
            1D         ; refresh
            1H         ; retry
            1W         ; expire
            3H )    ; minimum
           NS    ns1.example-domain.com.

1          PTR    ns1.example-domain.com.
Don't forget to change permissions on those files as explained in the previous post. Now, restart BIND and test server:
# nslookup ns1.example-domain.com 127.0.0.1
Server:  127.0.0.1
Address: 127.0.0.1#53

Name:    ns1.example-domain.com
Address: 10.0.0.1
[root@ipa ~]# nslookup 10.0.0.1 127.0.0.1
Server:  127.0.0.1
Address: 127.0.0.1#53

1.0.0.10.in-addr.arpa    name = ns1.example-domain.com.
As it can be seen, DNS server correctly handles request for IP addres 10.0.0.1 and returns ns1.sistemnet.hr. Let's try with a name from LAN:
# nslookup test.example-domain.local 127.0.0.1
Server:  127.0.0.1
Address: 127.0.0.1#53

Name:    ipa.example-domain.local
Address: 192.0.2.1

[root@ipa named]# nslookup 192.0.2.1 127.0.0.1
Server:  127.0.0.1
Address: 127.0.0.1#53

1.2.0.192.in-addr.arpa    name = test.example-domain.local
That one is correct too. So, that's it, you have reverse DNS correctly configured. Testing from the outside I'm leaving to you as an exercise. ;)

Monday, June 25, 2012

Setting up DNS server...

In the previous post I described how to install minimal CentOS distribution with some additional useful tools. In this part I'm going to describe how to install DNS server. This DNS server will exhibit certain behavior depending on where the client is, in other words we are going to setup split DNS. Note that since DNS server is accessible from the Internet, it is placed in DMZ network, i.e. 10.0.0.0/24.

Environment and configuration parameters


In the following text we assume network topology from the previous post, but we need some additional assumptions and requirements before going further. First, our public domain is example-domain.com and all hosts within DMZ will belong to that zone, i.e. FQDN of mail server will be mail.example-domain.com. But, when some client from the Internet asks for certain host, e.g. mail.example-domain.com DNS server will return public IP address. On the other hand, when client from a local network, or even DMZ, asks for the same host private IP address will be returned, i.e. 10.0.0.2. The reason for such behavior is more efficient routing. In other words, if client from a local network would receive public IP address, then all the traffic would have to be NAT-ed. We'll also assume that we are assigned a public block of IP addresses 192.0.2.0/24.

There will be  also additional domain example-domain.local in which all the hosts from the local domain will be placed. Additionally, this domain will be visible only from the local network and DMZ, it will not exist for clients on the Internet.

To simplify things we are not going to install secondary DNS server nor we are going to install reverse zones. For a test purposes this is OK, but if you are doing anything serious, you should definitely install slave DNS servers (or use someone's service) and configure reverse zones. Also, I completely skipped IPv6 configuration and DNSSEC configuration.

Installation of necessary packages


There are a number of DNS server implementations to choose from. The one shipped with CentOS is BIND. It is the most popular DNS server, with most features, but, in the past it had a lot of vulnerabilities. Nevertheless, we'll use that one. Maybe sometime in future, I write something about alternatives, too.

So, the first step, is to install necessary packages:
yum install bind.x86_64 bind-chroot.x86_64 bind-utils.x86_64
Server itself is in bind package, bind-chroot is used to confine server to alternative root filesystem in case it is compromised. Finally, bind-utils contains utilities we need to test server. This is 4.1M of download data, and it takes around 7M after installation. Not much.

Note that it is a good security practice to confine server into alternative root. As I already said, in the past bind had many vulnerabilities, and in case new one is discovered and exploited, we want to minimize potential damage.

That's all about installation.

Configuration

Next, we need to configure server before starting it. Main configuration file for bind is /etc/named.conf . So, open it with your favorite editor as we are going to make some changes in it. Remember that our domain is called example-domain.com and that we want private addresses to be returned for internal clients, and public ones for external clients. Additionally, we have a local domain example-domain.local. Here is the content of the /etc/named.conf file:
acl lnets {
    10.0.0.0/24;
    172.16.1.0/24;
    127.0.0.0/8;
};

options {
  listen-on port 53 { any; };
  directory     "/var/named";
  dump-file     "/var/named/data/cache_dump.db";
  statistics-file "/var/named/data/named_stats.txt";
  memstatistics-file "/var/named/data/named_mem_stats.txt";
  allow-update     { none; };
  recursion no;

  dnssec-enable no;
};

view "internal" {
    match-clients { lnets; };
    recursion yes;

    zone "." IN {
        type hint;
        file "named.ca";
    };

    zone "example-domain.local" {
        type master;
        file "example-domain.local";
    };

    zone "example-domain.com" {
        type master;
        file "example.domain.local";
    };

    include "/etc/named.rfc1912.zones";
};

view "internet" {
    match-clients { any; };
    recursion no;

    zone "example-domain.com" {
        type master;
        file "example-domain.com";
    };
};
Let me now explain what this configuration file actually specifies. In global, there are four blocks, i.e. acl, options, and two view blocks.

acl block (or blocks, because there may be more such statements) specifies some symbolic name that will refer to group of addresses.  This is good from maintenance perspective because there is only one place where something is defined. In our case, we define our local networks, i.e. DMZ and local LAN. There is also loopback address because when some application on DNS server itself asks something, DNS server should treat it as if it is coming from a local network.

Next is an option block. In our case we specify that DNS server should listen (listen-on) port 53 on any IP address (any). In case you want to restrict on which addresses DNS server listens, you can enumerate them here instead of any keyword. Better yet, create ACL and use symbolic name. In option block we also disabled DNSSEC (dnssec-enable no), disabled updates to server (allow-update no), and disabled recursion (recursion no). Apart from that we defined directories, e.g. for zone files.

Finally, there are two view statements. They define zones depending on where the client asking is. If it is on the local networks, then first view (i.e. internal) is consulted, and if the client is anywhere else, then the second view is consulted (i.e. internet). match-clients statement classifies clients, and it is here that we use our ACL lnets. When query arrives, the source address from the packet is taken and compared to match-clients. The first one that matches is used. So, it is obvious that the second view matches anything but since it is the last one, then it is a catch-all view. Anyway, for our local clients we allow recursive queries (recursion yes), while we disallow it for a global clients (recursion no). This is a good security practice! Also, note that for internal clients we define example-domain.local zone, which isn't defined for global clients. And, for the same example-domain.com zone, two different files (i.e. databases with names) are used. This reflects the fact that we want local clients to get local IP addresses, while global clients should get globally valid IP addresses.

Zone configurations

There are three zones in our case. First one is a global one, used to answer queries to clients on the Internet. Looking into the main configuration file the name of the file containing this zone has to be example-domain.com and it has to be placed in /var/named directory! The content of this file is:
$TTL 1D
@    IN    SOA    @ root.example-domain.com. (
            2012062501    ; serial
            1D            ; refresh
            1H            ; retry
            1W            ; expire
            3H )          ; minimum
           NS    ns1.example-domain.com.

ns1        A    192.0.2.1
There are three records in the file. The first one defines zone itself, it is called Start of Authority, or SOA. The @ sign is shorthand for zone name and it is taken from the /etc/named.conf file. Then, there is a continuation that defines name server (NS) for the zone. It is continuation because the first column (where @ sign is) isn't defined so it is taken from the previous record. Name server is ns1.example-domain.com. Finally, there is IP address of the name server (A record). You should keep in mind that any name that doesn't end with dot is appended with a domain name. A lot of errors are caused by that omission. For example, if you wrote ns1.example-domain.com (without ending dot) this would translate into ns1.example-comain.com.example-domain.com. which is obviously wrong. But note that we are counting on this behavior in A records since we only wrote a name, without a domain!

The other zone files are similar to the previous one and also have to be within /var/named directory. I.e. for local view of example-domain.com. the content of zone file is:
$TTL 1D
@    IN    SOA    @ root.example-domain.com. (
            2012062501 ; serial
            1D         ; refresh
            1H         ; retry
            1W         ; expire
            3H )       ; minimum
        NS    ns1.example-domain.com.

ns1        A    10.0.0.2
Note that it is almost identical, except that IP address for name server is now from a private range! Finally, here is the content of zone that contains names from a local network:
$TTL 1D
@    IN    SOA    @ root.example-domain.local. (
            2012062501 ; serial
            1D         ; refresh
            1H         ; retry
            1W         ; expire
            3H )       ; minimum
        NS    ns1.example-domain.com.

test        A    172.16.1.1
Again, almost the same. Note that I added one test record with phony IP address. This is so that I can test DNS server if it is correctly resolving IP addresses.

One final thing before starting DNS server! You should change file ownership and SELinux context. To do so, use the following two commands (run them in /var/named directory):
chcon system_u:object_r:named_zone_t:s0 example-domain.*
chown root.named example-domain.*
The first command changes SELinux context, and the second one owner and group of files. So, it is time to start BIND, i.e. DNS, server:
/etc/init.d/named start
If there were some error messages, like the following one:
zone example-domain.com/IN/internal: loading from master file example-domain.com.local failed: permission denied
zone example-domain.com/IN/internal: not loaded due to errors.
Then something is wrong. Look carefully what it says to you as the error messages are quite self explanatory! In this particular case the problem is that DNS server can not read zone file. If you get that particular error message, check that SELinux context and ownership were changed properly.

Testing


Now, if you get just OK, it doesn't yet mean that everything is OK. You have to query server in order to see if it responds correctly. But, if there were any errors, they will be logged in /var/log/messages.

To test DNS server, use nslookup like this:
nslookup <name> 127.0.0.1
Here, I assume you are testing on a DNS server itself. If you are testing on some other host then you should change address 127.0.0.1 with proper IP address of DNS server (public one in case you ask from the internet, local one if you are asking from some computer in DMZ or local network). In any case you should receive correct answers. Here are few examples:
# nslookup ns1.example-domain.com 127.0.0.1
Server:        127.0.0.1
Address:    127.0.0.1#53

Name:    ns1.example-domain.com
Address: 10.0.0.2
# nslookup test.example-domain.local 127.0.0.1
Server:        127.0.0.1
Address:    127.0.0.1#53

Name:    test.example-domain.local
Address: 172.16.1.1
As expected, when internal host sends request internal IP address are returned. You should also check that any valid name is properly resolved (recursive queries). For example, you could ask for www.google.hr:
# nslookup www.google.hr 127.0.0.1
Server:        127.0.0.1
Address:    127.0.0.1#53

Non-authoritative answer:
www.google.hr    canonical name = www-cctld.l.google.com.
Name:    www-cctld.l.google.com
Address: 173.194.70.94
Which, in this case, resolves properly.

Now, there is a question how to test if names are properly resolved for clients on the Internet. The obvious solution is to try from some host on the Internet and see what's happening. In case you don't have host on the Internet, you can cheat using NAT. ;)

Do the following. First, add some random IP address to your loopback interface, i.e.:
ip addr add 3.3.3.3/32 dev lo
Then, configure NAT so that any query sent via loopback gets source IP address changed:
iptables -A POSTROUTING -o lo -t nat -p udp --dport 53 -j SNAT --to-source 3.3.3.3
And now, try to send few queries:
# nslookup www.slashdot.org. 127.0.0.1
Server:        127.0.0.1
Address:    127.0.0.1#53

** server can't find www.slashdot.org: REFUSED
That one is OK. Namely, we don't want to resolve third party names for outside clients (remember, recursive is off!). Next test:
# nslookup ns1.example-domain.local. 127.0.0.1
Server:        127.0.0.1
Address:    127.0.0.1#53

** server can't find ns1.example-domain.local: REFUSED
This one doesn't work either. Which is what we wanted, i.e. our local domain isn't visible to the outside world. Finally, this one:
# nslookup ns1.example-domain.com. 127.0.0.1
Server:        127.0.0.1
Address:    127.0.0.1#53

Name:    ns1.example-domain.com
Address: 192.0.2.1
Which sends us correct outside address.

So, all set and done. Don't forget to remove temporary records (i.e. test), iptables rule, lo address, and also don't forget to modify /etc/resolv.conf so that your new server is used by default by local applications.

Tuesday, December 6, 2011

Problems with resolver library...

I just had a problem that manifested itself in a very strange way. I couldn't open Web page hosted on a local network, while everything else seemingly worked. The behavior was same for Chrome and Firefox. In due course I realized that every application had this problem. On the other hand, resolving with nslookup worked flawlesly. This was very confusing. To add more to the confusion, while running tcpdump it was obvious that there were no DNS requests sent to the network! So, it was obvious that the problem was somewhere in the local resolver. At first, I suspected on nscd that was used as a caching daemon on Fedora, but in Fedora 16 this daemon is not installed. So, how to debug this situation? Quick google query didn't yield anything useful.

Reading manual page of resolv.conf there is section that says that you can use directive option debug. But trying to do that yielded no output! Neither there were any results using the same option but via RES_OPTIONS environment variable. This is strange, and needs additional investigation as why it is so, and more importantly to know how to debug local resolver.

In the mean time I figured out that the ping command behaves the same as browser and since ping command is much smaller it is easier to debug it using strace command. So, while running ping via strace I noticed the following line in the output:
open("/lib64/libnss_mdns4_minimal.so.2", O_RDONLY|O_CLOEXEC) = 3
which immediately rung a bell that the problem could be nsswitch! And indeed, opening it I saw the following line:
hosts:      files mdns4_minimal [NOTFOUND=return] dns myhostname
which basically said that, if mdns4 returns not found dns is not tried. It seems that mdns4 is used whenever the domain name ends in .local, which was true in my case. So, I changed that line into:
hosts:      files dns
and everything works as expected.

Since I didn't install explicitly mdns, I decided to remove it. But then it became clear that wine (Windows Emulator) depends on it. So, I left it.

About Me

scientist, consultant, security specialist, networking guy, system administrator, philosopher ;)

Blog Archive