Chapter 8 – DNS
DNS (the Domain Name System) is a critical part of today’s Internet. Without it, we would have to keep massive (and always out-of-date) directories, like telephone books, in which you could look up the name of some site (such as Dell’s pages about their PCs), find the “telephone number” (IP address) of that site, and then “dial” it (type it into your browser). This is clearly not very practical. DNS is such a complex and critical topic for both TCP/IPv4 and TCP/IPv6 that I have included a chapter just for it.
8.1 – How DNS Evolved
Various schemes have been used to keep track of node names and their corresponding IP addresses. The end result is a remarkably powerful and flexible system called DNS.
8.1.1 – Host files
In the early days of TCP/IP, a file of known “hosts” (nodes) was kept in a special file called hosts in the /etc directory of all UNIX computers (the complete filename was /etc/hosts). This file included one or more lines, each of which contained an IP address, followed by one or more nodenames (e.g. www) or even fully qualified nodenames (e.g. www.ibm.com). If you told your copy of UNIX to use the hosts file for “name resolution”, when you used one of the nodenames listed in your hosts file, it would use the IP address associated with that name. This still works today – you can override DNS with your hosts file if you specify it first in the search order. Even Windows systems have a hosts file, typically located at
C:\Windows\System32\drivers\etc\hosts
A typical hosts file might look like:
172.20.0.11 ws1 ws1.hughesnet.local
172.20.0.12 ws2 ws2.hughesnet.local
172.20.0.13 us1 us1.hughesnet.local
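To make the lookup concrete, here is a minimal sketch in Python of how a hosts file could be read into a name-to-address table. The function name parse_hosts is only an illustration, and the handling of "#" comments is an assumption about the common file format rather than something described above.

```python
def parse_hosts(text):
    """Map every nodename in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # The first field is the address; the rest are names for it.
        address, *names = line.split()
        for name in names:
            # Names are case-insensitive; the first address listed wins.
            table.setdefault(name.lower(), address)
    return table


SAMPLE_HOSTS = """\
172.20.0.11 ws1 ws1.hughesnet.local
172.20.0.12 ws2 ws2.hughesnet.local
172.20.0.13 us1 us1.hughesnet.local
"""

hosts = parse_hosts(SAMPLE_HOSTS)
print(hosts["ws2"])
print(hosts["us1.hughesnet.local"])
```

Both the short name and the fully qualified name end up pointing at the same address, which is all that “name resolution” from a hosts file amounts to.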
8.1.2 – Network Information Service (NIS)
In organizations with many UNIX computers, and especially once people started linking networks together with TCP/IP, it became necessary to keep everyone’s hosts file up to date and synchronized. This was done manually for a while, then NIS (Network Information Service) was created by Sun to automatically distribute copies of the official hosts file (in addition to other important configuration files for UNIX) to every node, on a periodic basis.
8.1.3 – DNS is invented
Soon even this became unwieldy, so in 1983, at the request of Jon Postel, Paul Mockapetris designed DNS as a distributed database engine with distributed data. We are still using this system today. You will find that with minor extensions, it even supports IPv6 and Dual Stack networks. There is a gigantic, worldwide hierarchical system of DNS servers that allows each network administrator to manage the names and IP addresses of nodes in their network that users anywhere in the world might need to know about (e.g. that organization’s web servers, e-mail servers, etc). DNS is also used to keep track of the nodenames and IP addresses of all internal nodes in their network, which only users in their organization need to know about (the organization’s file servers, network printers, intranet web servers, etc).
8.2 – Domain Names
Domain names refer to the hierarchical name space as defined by RFC 1035 (above); RFC 1123 “Requirements for Internet Hosts – Application and Support”, October 1989; and RFC 2181, “Clarifications to the DNS Specification”, July 1997. Briefly, domain names consist of a list of names (e.g. atlanta, usa, exampleco and com) in most specific to least specific order, separated by periods (e.g. atlanta.usa.exampleco.com). Here, com is the TLD (or Top Level Domain) name for commercial organizations. The name exampleco is the name of a hypothetical company, which is a commercial organization, and all parts of it use the domain exampleco.com. Within ExampleCo, there is a branch in the U.S., which uses the subdomain usa.exampleco.com. Finally, there is an office in Atlanta GA, which uses the subdomain atlanta.usa.exampleco.com. If there is a web server named www in that office, it would have a fully qualified domain name of www.atlanta.usa.exampleco.com. There is no way to tell (without more information) whether the first name in such a string is a node’s name or the first component of a domain name.
8.2.1 – Top Level Domain Names
There are a number of TLDs including generic ones that have been in use for a long time:
com    Commercial organization (company)
org    Non-commercial organization
edu    Educational organization
net    Internet related, e.g. ISP
gov    Government related
mil    Military related
There are also many ccTLDs (country code Top Level Domains), each of which generally uses the ISO 3166-1 two letter code for the country (such as us, jp and ph). There are a few exceptions to the ISO code usage, e.g. the ISO code for Great Britain is gb, while their ccTLD is uk. Each country manages subdomains under its ccTLD as it sees fit. Certain ccTLDs appear to have other meanings, like tv for the country of Tuvalu, which sells domains in its space to people who want to use it to mean television. Under ccTLDs, there are usually (but not always) second level domains, such as co for commercial, or for organization, etc. Actual organization names would then be third level domain names. Hence a UK based commercial entity called Warmbeer, Ltd. might have the domain name warmbeer.co.uk. Their web server might be www.warmbeer.co.uk. A few ccTLDs, like ph for the Philippines, use the full three letter code for organization type (com, org, etc.) as the second level domain name, instead of the more common two letter codes. For example, our Philippine domain name is infoweapons.com.ph.
Recently, some new generic TLDs have been introduced, including info, museum, aero, asia, biz, coop, mobi, name, pro, tel and travel. None of those have become really popular yet. The main reason people use them is that they can’t get the domain name they want using the more common TLDs like com, org or net.
8.2.2 – Internationalized Domain Names
There are also internationalized domain names (IDNs) that use Unicode characters to allow domain names in languages that have non-Latin alphabets. Your browser will translate these Unicode domain names into plain ASCII strings (each encoded label begins with xn--), using the punycode algorithm (shown in the last column below). This is defined in RFC 3492, “Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)”, March 2003. For example, the following (believe it or not) are syntactically valid URLs (although they do not currently point to real sites):
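The translation can be tried out with Python’s built-in codecs, as in the short sketch below. The name bücher.example is an illustrative choice of mine, not one of the examples from this chapter, and the built-in "idna" codec implements the original (2003) IDNA rules, so treat this as a demonstration of the idea rather than of what any particular browser does.

```python
# Python's built-in "idna" codec applies Punycode label by label and
# adds the "xn--" prefix that marks an encoded label.
name = "bücher.example"
ascii_name = name.encode("idna").decode("ascii")
print(ascii_name)

# The raw Punycode algorithm (RFC 3492) works on a single label
# and does not add the prefix.
print("bücher".encode("punycode").decode("ascii"))

# Decoding reverses the translation.
round_trip = ascii_name.encode("ascii").decode("idna")
print(round_trip)
```

Only the ASCII form is ever sent in a DNS query; the Unicode form exists purely for display.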
8.3 – DNS Resolver
All operating systems today include a DNS client, called a resolver. All network applications use the resolver to look up nodenames and obtain their corresponding IP addresses, whether those nodes are local (on the organization’s LAN) or external (out on the Internet). The resolver contacts one of the DNS servers specified in the node’s TCP/IP configuration (either local, or at your ISP). If that server is authoritative for the requested domain names, it returns the addresses immediately. Otherwise, that server can either return a hint of where to look (“I don’t have that information, try here”) or do a recursive lookup (“I didn’t have that information, but I went and found it for you”). Of course, the lookup could fail (“I couldn’t find that domain name anywhere”).
8.4 – DNS Server Configuration
The full process of setting up DNS servers (usually two or more) for an organization and populating them with node information is too complicated to cover in this book. If you are using the Microsoft DNS server (included free with Windows Server), see their documentation for details. If you are using BIND (the freeware DNS server from the Internet Systems Consortium), see O’Reilly’s “DNS and BIND, 5th Edition” for details. If you have a DNS appliance, consult its documentation or online help for details.
In general though, you define both “forward zones” that map nodenames to IP addresses, and “reverse zones” that map IP addresses onto nodenames. You also have to inform all client computers of the IP addresses of at least two DNS servers that they can use for resolving nodenames to IP addresses (or vice versa). Client computers can be informed of these DNS server addresses either via manual configuration, or automatically via DHCPv4 or DHCPv6. If your client computer doesn’t know where to find DNS servers, you may have full Internet connectivity but no name resolution. You can still ping (or even surf to) nodes anywhere in the world by specifying their numeric IP addresses (e.g. http://64.170.98.32 – try it!). However, most people would not consider such a computer very useful. This gives you a very good idea of how important DNS is to the Internet (even the Second Internet).
8.5 – DNS Protocol
DNS is an Application Layer protocol. It uses UDP port 53 (for most queries and responses) or TCP port 53 (for zone transfers between DNS servers, and for responses too large to fit in a UDP message). It was originally defined in RFC 882, “Domain Names – Concepts and Facilities”, November 1983, and RFC 883, “Domain Names – Implementation and Specification”, November 1983. Those were replaced by RFC 1034 “Domain Names – Concepts and Facilities”, November 1987, and RFC 1035, “Domain Names – Implementation and Specification”, November 1987. There have been numerous updates to these, including RFCs 1101, 1183, 1348, 1876, 1982, 1995, 1996, 2065, 2136, 2137, 2181, 2308, 2535, 2845, 3425, 3658, 4033, 4034, 4035, 4343 and 4592.
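To give a feel for what travels over port 53, the following Python sketch builds the bytes of a simple query in the message format of RFC 1035: a 12 byte header followed by one question. The function name build_query and the fixed query ID are illustrative assumptions; a real resolver would pick a random ID and also parse the reply.

```python
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a DNS query message for `name` (qtype 1 = A, 28 = AAAA)."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # Header: ID, flags, then counts of question, answer,
    # authority and additional records.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN (Internet)
    return header + question


message = build_query("ws1.hughesnet.org")
print(len(message), message.hex())
```

Sending these bytes in a single UDP datagram to port 53 of a resolving server should produce a reply in the same format, with the answer records appended after the question.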
8.6 – DNS Resource Records
The data in DNS servers is kept in resource records. In forward zones, it is possible to have any of the following resource records (the following list is not comprehensive):
Name     Contents
A        “A” – IPv4 address associated with a domain name
AAAA     “Quad-A” – IPv6 address associated with a domain name
MX       “Mail eXchange” – domain name of a mail server for the domain
SRV      “Service” – domain name of servers for other protocols, such as SIP and LDAP
CNAME    “Alias” – provides an alternative domain name for another domain name
HINFO    “Host Info” – information you want to provide about a host (nominally its CPU type and operating system)
NAPTR    “Naming Authority Pointer” – used mostly in ENUM
NS       “Name Server” – name of a valid DNS server for this domain
SOA      “Start of Authority” – start of a zone in configuration files, includes default TTL
SPF      “Sender Policy Framework” – used in anti-spam technology
TSIG     “Transaction Signature” – signature based on a shared (symmetric) secret key, used to authenticate zone transfers
TXT      “Text” – any arbitrary text information
In reverse zones, typically only the following resource records are found:
NS       “Name Server” – name of a valid DNS server for this domain
SOA      “Start of Authority” – same as in forward zones
PTR      “Pointer Record” – domain name of a specific node, stored under that node’s IP address written in reverse order
The following examples show how typical resource records look:
In general, it is a pain to manually create reverse PTR records, and any change to IP addresses (e.g. from changing ISPs) requires changes to all forward and reverse resource records in DNS. Here again, an appliance with a GUI can help by automatically generating reverse PTR resource records. This is especially useful for IPv6 reverse PTR records.
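The mechanical nature of that job is easy to see in a short Python sketch that derives PTR records from a table of forward mappings. This only illustrates the idea, not how any particular appliance works; the helper name ptr_records and the exact record layout are my own assumptions, while the names and addresses are the examples used in this chapter.

```python
import ipaddress


def ptr_records(forward):
    """Return PTR record lines for a {nodename: address} mapping."""
    lines = []
    for name, address in forward.items():
        # reverse_pointer gives the reversed address under
        # in-addr.arpa (IPv4) or ip6.arpa (IPv6).
        owner = ipaddress.ip_address(address).reverse_pointer
        lines.append(f"{owner}. IN PTR {name}.")
    return lines


forward = {
    "ws1.hughesnet.org": "172.20.0.11",
    "us1.hughesnet.org": "172.20.0.13",
}
for line in ptr_records(forward):
    print(line)

# An IPv6 address expands to 32 reversed hex digits, one per label.
v6_owner = ipaddress.ip_address("2001:418:5403:3000::c").reverse_pointer
print(v6_owner)
```

The IPv6 owner name is long enough that typing it by hand invites mistakes, which is why generating it automatically matters most there.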
In the InfoWeapons SolidDNS appliance, you can define named networks for IPv6. When you define a network (which creates the associated reverse zone), you can assign that network a name, which has the value of the network’s prefix. You can then define node addresses in terms of the network name. First, this fills in the first 64 bits of each address, which reduces errors and saves time. More importantly, if you ever change ISPs, you can simply redefine the prefix for the network, and all forward and reverse resource records created from the nodes specified using that network name will be updated with the new prefix. This is called instant prefix renumbering. There was an A6 resource record created at one point for IPv6 forward resource records that was supposed to accomplish this, but there were so many problems with it that it has now been deprecated (you are not supposed to use it any more). It is much better to do this in an appliance that has a GUI and a database, and generates only the standard AAAA resource records.
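The idea behind prefix renumbering can be sketched in a few lines of Python. This is not the appliance’s implementation, only an illustration under stated assumptions: the function build_addresses, the interface identifier chosen for us1, and the “new ISP” prefix (taken from the 2001:db8::/32 documentation range) are all hypothetical.

```python
import ipaddress


def build_addresses(prefix, interface_ids):
    """Combine a /64 network prefix with per-node interface identifiers."""
    network = ipaddress.IPv6Network(prefix)
    return {
        # Indexing a network by an integer yields the address at that offset.
        name: str(network[interface_id])
        for name, interface_id in interface_ids.items()
    }


# Interface identifiers (the low 64 bits) stay fixed for each node.
nodes = {"ws1": 0xC, "us1": 0xD}

old = build_addresses("2001:418:5403:3000::/64", nodes)
new = build_addresses("2001:db8:1:2::/64", nodes)  # hypothetical new prefix
print(old["ws1"], "->", new["ws1"])
```

Because each node is stored as a name plus an interface identifier, changing one prefix value is enough to regenerate every AAAA record, and the matching PTR records can be derived from those.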
8.7 – DNS Servers and Zones
A given DNS server can have any number of zones defined on it. A given zone can be a forward zone (for mapping domain names to IP addresses) or a reverse zone (for mapping IP addresses to domain names). There is usually one forward zone for each domain for which the DNS server contains information (e.g. hughesnet.org), and one reverse zone for each network that DNS server contains information for (e.g. 172.20.0.0/16). So, the forward zone for hughesnet.org might contain mappings for ws1.hughesnet.org to 172.20.0.11, for us1.hughesnet.org to 172.20.0.13, and so on. The reverse zone for 172.20.0.0/16 might contain mappings from 172.20.0.11 to ws1.hughesnet.org, from 172.20.0.13 to us1.hughesnet.org, 172.20.0.91 to us1.v6home.org, and so on.
Any zone (forward or reverse) can be a primary zone or a secondary zone. A primary zone is one whose contents the DNS administrator manages directly (either via a GUI or by editing BIND configuration files). A secondary zone is one whose contents are automatically transferred from a corresponding primary zone of the same name on a different DNS server (no management is required for a secondary zone once it is created). When you create a secondary zone, you specify the IP address of the DNS server that contains the corresponding primary zone. Usually there is one primary zone (on one DNS server) and one or more secondary zones (each on other DNS servers) for a given set of records. A given DNS server can have any mix of primary zones and secondary zones. Sometimes the terms primary and secondary are used for entire DNS servers, especially if all zones on a server are primary zones or all are secondary zones, but technically the terms refer to zones, not servers.
The transfer of all records from a primary zone on one DNS server to a secondary zone on another DNS server is called a zone transfer. Typically zone transfers from primary zones to secondary zones are done automatically on a periodic basis. A primary zone is usually configured to allow zone transfers only to secondary zones on authorized DNS servers (by IP address). There is also a cryptographic authentication scheme called TSIG that can restrict zone transfers to only authorized secondary zones. Without such restrictions, a hacker could perform a zone transfer from one or more of your primary zones, and obtain information useful in attacking your network (effectively, a “map” of at least part of your network).
If a hacker changes data in a secondary zone, the correct data will be automatically restored at the next zone transfer. If a hacker changes data in a primary zone, the hacker’s changes will be automatically and securely transferred to all secondary zones via the regular zone transfers.
It is very important to secure your primary zones.
It is possible for all of the zones on a given DNS server to be accessible by one or more clients for performing DNS resolutions (lookups), in which case it is a resolving server. A primary server that is not accessible for resolutions by any client (or other DNS servers) is called a stealth server. It is only ever used to do zone transfers to secondary servers (hence need not be very powerful). Access via UDP port 53 can be completely disabled (zone transfers take place over TCP port 53), and even those can be restricted by IP address. Use of a stealth server lowers the possibility of hackers being able to attack your primary DNS server. There would be no real use for a “stealth secondary server”.
8.8 – Different Types of DNS Servers
There are different types of DNS servers based on how they are populated with data.
8.8.1 – Authoritative DNS Servers
A DNS server that contains a primary zone or a secondary zone is said to be authoritative for the domain (or network) defined in that zone. All resolving servers cache (temporarily store) the results of any query they perform on behalf of clients.
If a client makes a query of a resolving server that currently has the required information (either because it is authoritative or because it has cached it from a previous query), the server responds with that information immediately. If a resolving server is asked for information it does not currently have, it can either return a referral (“I don’t know, go ask this server”), or it can do a recursive query on the client’s behalf (“I didn’t know, but I went and found out for you by making client queries myself on your behalf, and here is what I found.”) A recursive query can go through several servers before the requested information is finally obtained and returned to the client that asked for it in the first place. Any server involved in the process typically caches the retrieved information.
Every record published by a DNS server has a Time To Live (TTL) defined for it. When a record is cached, it is kept on the caching server only for the defined Time To Live for that record, after which it is considered stale and is discarded. Once a DNS server discards stale information, if it is asked for it again, it must do another recursive query, at which point it again caches the record. This caching and expiration scheme keeps the data current, but means that a change to authoritative information may take a while to propagate to all other servers (often 24 to 48 hours, depending on Time To Live values chosen).
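The caching rule described here (keep a record only for its Time To Live, then discard it and look it up again) can be sketched as a small Python class. The class name TtlCache and the injectable clock are illustrative assumptions that make the example easy to run; real DNS servers also cache negative answers and handle many record types.

```python
import time


class TtlCache:
    """Keep each record only until its Time To Live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires = entry
        if self._clock() >= expires:
            # Stale: discard it so the caller performs a fresh lookup.
            del self._entries[name]
            return None
        return value


now = [0.0]  # fake clock, so the example runs instantly
cache = TtlCache(clock=lambda: now[0])
cache.put("ws1.hughesnet.org", "172.20.0.11", ttl=3600)
print(cache.get("ws1.hughesnet.org"))  # within the TTL: cached value
now[0] = 3601.0
print(cache.get("ws1.hughesnet.org"))  # past the TTL: None
```

A long TTL reduces query traffic but delays the spread of changes, which is the trade-off behind the propagation delay mentioned above.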
When a client obtains information from an authoritative server, it is reported as an authoritative answer. When it obtains information that has been cached, it is reported as a non-authoritative answer. This doesn’t mean it is any less trustworthy, just that the client obtained the information at “second hand” (out of some DNS server’s cache) instead of directly from the authority on the subject (an authoritative server).
8.8.2 – Caching-Only Servers
A resolving server that has no defined primary or secondary zones is called a caching-only server. Once set up and configured, it typically requires little or no management.
8.9 – Client Access to DNS
In a typical network, every client should have the addresses of at least two valid resolving DNS servers configured. If a connection to one of them fails, the client will automatically try the other configured addresses. This increases the robustness of the network. In a small network (e.g. home connection), the specified servers may be located at and managed by the ISP. In some cases, the DSL or cable modem might provide a DNS proxy function, which allows DNS queries to be submitted to the default gateway address. The modem relays such requests to the DNS servers configured in the modem, and returns the replies to the internal client that made the request.
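The failover behavior between configured DNS servers might look roughly like the following Python sketch. The function lookup_with_failover, the stand-in fake_query and the server addresses (from the documentation ranges 192.0.2.0/24 and 198.51.100.0/24) are all hypothetical; a real resolver also applies timeouts and retries that are not shown here.

```python
def lookup_with_failover(name, servers, query):
    """Try each configured DNS server in turn until one answers."""
    last_error = None
    for server in servers:
        try:
            return query(server, name)
        except OSError as error:  # e.g. a timeout or an unreachable server
            last_error = error
    raise LookupError(f"no DNS server answered for {name}") from last_error


def fake_query(server, name):
    # Stand-in for a real network query; the first server is "down".
    if server == "192.0.2.53":
        raise TimeoutError("no response")
    return "172.20.0.11"


answer = lookup_with_failover(
    "ws1.hughesnet.org", ["192.0.2.53", "198.51.100.53"], fake_query
)
print(answer)
```

With only one server configured, the same failure would leave the client without name resolution, which is why at least two addresses are recommended.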
Any network can have one or more local DNS servers (assuming they can make outgoing queries via UDP port 53). To run an authoritative server on a network, that server must be accessible by relevant clients and other servers. If any of those clients or servers is external, then the authoritative server must have a globally accessible “external” IP address (not a private IP address). For example, I run an authoritative DNS server for my domain hughesnet.org in my home, on a DNS server that has a valid external IP address. I also run other servers (e-mail and web) that have globally accessible external IP addresses (in my case, both IPv4 and IPv6 addresses). I can access these services from anywhere on the Internet, just like using servers at ISPs. This sort of thing is far simpler and less expensive with IPv6 than with IPv4.
8.9.1 – Recursive DNS Queries
A single DNS query (e.g. “look up the IP addresses for node ws1.hughesnet.org”) can actually require several resolutions. If the server already has information for ws1.hughesnet.org, either because it is authoritative for that information, or because it has still-valid cached information, it returns the requested information immediately (“ws1.hughesnet.org has an IPv4 address which is 172.20.0.11 and an IPv6 address which is 2001:418:5403:3000::c”). It is up to the client which of these is used. If it is a dual-stack client (supports both IPv4 and IPv6), it should use the IPv6 address by preference.
If information for ws1.hughesnet.org is not present on the resolving DNS server, that server must find the authoritative server for the domain hughesnet.org. The server that is authoritative for the domain org can tell it this information. To locate that server, the resolving server can ask any root DNS server. To locate a root DNS server, the resolving server can look in its root hints file. Any of these things could already be in cache (and typically are, if any other nodename ending in .org or .hughesnet.org has been looked up by any client recently). If none of them are in cache, then first, the resolving DNS server will ask a root DNS server “who is authoritative for domain org”. It will cache the response it gets, and ask the returned server for org “who is authoritative for domain hughesnet.org”. It will cache that response also, and ask the returned server for hughesnet.org “what is the IP address of ws1.hughesnet.org”. It will cache that response as well, and return the answer to the client, which has been patiently waiting. Most DNS servers have a way to empty (or “dump”) the cache if you would like to watch all of this happen with a network sniffer (this would require root level access on the computer running your DNS server).
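The chain of questions can be simulated without any network access. In the Python sketch below, the SERVERS table is a made-up stand-in for the root, org and hughesnet.org servers, and resolve is a hypothetical helper; real responses carry NS and address records rather than simple strings.

```python
# Hypothetical delegation data: each server knows either which server is
# authoritative for a child domain, or the final answer.
SERVERS = {
    "root": {"org": "org-server"},
    "org-server": {"hughesnet.org": "hughesnet-server"},
    "hughesnet-server": {"ws1.hughesnet.org": "172.20.0.11"},
}


def resolve(name, cache):
    """Walk from the root down to the answer, caching every response."""
    labels = name.split(".")
    result = "root"  # start at a root server (found via the root hints file)
    queries = []     # (server asked, question) pairs actually sent
    # Ask about "org", then "hughesnet.org", then the full name.
    for i in range(len(labels) - 1, -1, -1):
        question = ".".join(labels[i:])
        if question not in cache:
            cache[question] = SERVERS[result][question]
            queries.append((result, question))
        result = cache[question]
    return result, queries


cache = {}
print(resolve("ws1.hughesnet.org", cache))  # three queries are needed
print(resolve("ws1.hughesnet.org", cache))  # answered entirely from cache
```

The second call sends no queries at all, which is why the root servers see so little of the traffic generated by ordinary lookups.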
8.10 – The Root DNS Servers
All DNS queries eventually chain up to one of the 13 root DNS servers (or the cached data from them). In reality, “DNS anycast” is employed so that there are actually quite a few copies of most of the 13 root servers distributed around the world (see table below). The current information on the root servers (from which I made the table below) is always available at:
http://www.root-servers.org
Every DNS server includes a file with the current anycast addresses of the 13 root servers (a.root-servers.net to m.root-servers.net), as summarized in the table below. A copy of the official current file (in BIND format) can always be found at:
http://www.internic.net/zones/named.root
All DNS server operators from time to time obtain the current copy of this file and update their server(s) root hints file with it. The information in this file allows a DNS server to locate a root server when it needs one.
The only thing the DNS root servers publish is the information in a short file that is maintained by IANA that helps other DNS servers locate the DNS hierarchy layer just below that of the DNS root servers (i.e. the servers that are authoritative for the top level domains such as com, net, org, uk, jp, etc.) All root servers publish the same information, so only one ever needs to be asked (typically chosen at random from the 13 available). A copy of the current version of this information (in BIND format) can always be found at:
http://www.isoc.org/briefings/020/zonefile.shtml
Only operators of DNS root servers ever actually need to obtain this file and update their DNS root servers with it. In reality, due to DNS caching, the actual root servers are only rarely involved in a typical DNS query. A typical non-root DNS server only needs to access a root server about once every 48 hours. It would normally have the information published by the root servers cached in memory from previous enquiries. Only once the Time To Live expires for a given resource record obtained from an actual root server would the DNS server have to go back and obtain more up to date information from a DNS root server (which it would again cache to use in future lookups). Most of the time, this new information will be the same as the information that just expired.
Current Root Servers (all in the domain “root-servers.net”)
In the above table, the first number in the count field (before the slash) is the total number of anycast servers for that name, regardless of IP version. The second number (after the slash) is the number of anycast servers for that name that can accept queries over IPv6. Currently, all root servers will accept queries over IPv4 (this may not always be the case). All root servers can return A and/or AAAA records for the servers authoritative for Top Level domains. One of the watershed events for IPv6 happened in February 2008, when VeriSign enabled IPv6 access on enough of the root servers that a client doing queries over IPv6 would always be able to complete a query without having to fall back to IPv4. Since then, clients that access DNS over IPv6 (IPv6-only nodes or dual stack nodes) can resolve names to addresses as effectively as IPv4-only nodes have been able to since the introduction of DNS. Eventually all root servers will probably support queries over IPv6.
Total Root Server Names = 13
Total Root Server Names that accept connections over IPv6 = 8 (61.5% of names)
Total Deployed Root Servers = 202
Total Deployed Root Servers that accept connections over IPv6 = 51 (25.2% of total)
8.11 – MX and SRV records
In addition to providing nodename to IP address lookup (forward resolution) and IP address to nodename lookup (reverse resolution), DNS servers can also advertise the preferred servers for various functions, such as e-mail (SMTP), VoIP (SIP), etc.
The MX (Mail Exchanger) record can advertise one or more e-mail server names, with priorities. When other mail servers want to deliver mail to your domain, they will do a DNS query asking for the MX record(s) for your domain. The sending server will try to make connections over port 25 (SMTP) to the advertised nodenames, starting with the most preferred (the one with the lowest preference value) and working down, until it either has a connection accepted (in which case it will deliver all the mail it has for your domain), or it runs out of advertised nodenames (in which case it will try again on some schedule, until it succeeds, or decides your domain is not currently online). Thus a client can send messages to any name at your domain (fred@hughesnet.org), and it