How do `/etc/hosts` and DNS work together to resolve hostnames to IP addresses?
This is dictated by the `hosts` directive in the NSS (Name Service Switch) configuration file, `/etc/nsswitch.conf`. For example, on my system:
hosts: files mdns4_minimal [NOTFOUND=return] dns
Here, `files` refers to the `/etc/hosts` file, and `dns` refers to the DNS system. The sources are tried from left to right, so, as you can imagine, whichever comes first wins.
Also, see `man 5 nsswitch.conf` for more detail on this.
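To make the left-to-right reading of that line concrete, here is a minimal Python sketch that splits a `hosts:` line into its services and their bracketed actions. The function name `parse_hosts_line` is made up for illustration, and the parser is deliberately simplified (for example, it does not give negated statuses such as `!UNAVAIL` any special treatment); it is not how glibc itself reads the file.

```python
import re

def parse_hosts_line(line):
    """Split an nsswitch.conf 'hosts:' line into (service, actions) pairs."""
    database, _, rest = line.partition(":")
    if database.strip() != "hosts":
        raise ValueError("not a hosts: line")
    entries = []
    # A token is either a service name or a bracketed action list such as
    # [NOTFOUND=return], which applies to the service just before it.
    for token in re.findall(r"\[[^\]]*\]|\S+", rest):
        if token.startswith("["):
            if not entries:
                raise ValueError("action with no preceding service")
            for item in token[1:-1].split():
                status, _, action = item.partition("=")
                entries[-1][1][status.upper()] = action.lower()
        else:
            entries.append((token, {}))
    return entries

order = parse_hosts_line("hosts: files mdns4_minimal [NOTFOUND=return] dns")
for service, actions in order:
    print(service, actions)
```

The services come out in the order they will be consulted, with `files` (that is, `/etc/hosts`) ahead of `dns` for the line shown above.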
As an aside, to follow the NSS host resolution order, use `getent` with `hosts` as the database, e.g.:
getent hosts example.com
To answer just your last question: `/etc/hosts` doesn't apply again immediately because `firefox` is caching the last address it got for `google.com`; if you want it to always fetch it again, you'll have to set `network.dnsCacheExpiration` to `0` in `about:config`. More info (though a bit outdated) here. Sorry if this is offtopic.
As a sidenote, many programs don't use the standard resolver (`getaddrinfo(3)`, `getnameinfo(3)` [1]) because it sucks.
First, the interface is not asynchronous; any moderately complex program will have to spawn a separate thread doing just the `getaddrinfo()` and then invent its own protocol to communicate with it (and let's not even get into `getaddrinfo_a()`, which sends a signal upon completion, so it's even worse).
Second, the resolver implementation in `glibc` (the standard C library on Linux) is horrible, expecting you to let it pull random dynamic objects into the address space via `dlopen()` behind your back, and making it impossible to contain it in any way or use it in statically linked executables.
Since many programs don't use the standard resolver directly, they also don't bother to replicate its behavior exactly, and ignore some or all of `/etc/resolv.conf`, `/etc/hosts`, `/etc/nsswitch.conf` or `/etc/gai.conf`.
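The thread-based workaround mentioned in the first point can be sketched in Python, whose `socket.getaddrinfo()` calls the C library's blocking `getaddrinfo()`. This is only an illustration of the pattern, not what any particular program does; the helper name `resolve` is made up, and the host list uses a numeric literal so the sketch runs without network access (substitute real hostnames to see actual lookups).

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def resolve(host):
    """Blocking lookup through the system resolver; returns unique addresses."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

# Numeric literal chosen so this runs offline; replace with real hostnames.
hosts = ["127.0.0.1"]

# Worker threads absorb the blocking calls so the main thread stays free.
with ThreadPoolExecutor(max_workers=4) as pool:
    pending = {host: pool.submit(resolve, host) for host in hosts}
    # ... the main thread could do other work here ...
    for host, future in pending.items():
        print(host, future.result(timeout=10))
```

Here the futures play the role of the "own protocol" for getting results back from the worker threads.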
[1] and don't even mention the non-reentrant, IPv4-only `gethostbyname()`, which has been deprecated for ages.
The file `/etc/hosts` and the DNS don't work together. They provide independent resolutions of names (network names).
The glue that links them is `/etc/nsswitch.conf` on Linux systems and `/etc/netsvc.conf` on AIX servers. On Windows it is built into the system, and on macOS systems it can be listed with `lookupd -configuration` (search for LookupOrder, similar to: `Cache FF DNS NI DS`).
The actual order becomes complex and usually convoluted, as each name resolution service could (and often does) look inside other levels of resolution. For example, `dnsmasq` (a light DNS server, generally at `127.0.0.1:53`, `::1:53`, or both) usually reads and includes the `/etc/hosts` file contents. Or `systemd-resolved` (a basic resolver that should only resolve un-dotted names like `mycomputer`) directly calls DNS resolution for dotted names (`mycomputer.here.dev.`) under some conditions.
In general, services are called in order, and the first one that doesn't fail wins and is accepted as the correct address. The general basic order is: `/etc/hosts` (file), mDNS (un-dotted names), DNS, NIS, NIS+, LDAP. On some Linux systems there is a last-resort resolution for the computer's own hostname in the `myhostname` service.
For example, on this system (from `cat /etc/nsswitch.conf`):
hosts: files mdns4_minimal [NOTFOUND=return] dns myhostname
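The "first service that doesn't fail wins" rule can be illustrated with a small Python sketch. The dictionaries below are hypothetical stand-ins for real services, every name and address in them is made up, and bracketed actions such as `[NOTFOUND=return]` are left out to keep it short.

```python
def lookup(name, chain):
    """Try each (label, source) pair in order; the first answer wins."""
    for label, source in chain:
        address = source.get(name)
        if address is not None:
            return label, address
    return None, None

# Hypothetical data: "files" models /etc/hosts, "dns" models a DNS server.
files = {"myserver": "192.0.2.10"}
dns = {"myserver": "198.51.100.7", "example.com": "203.0.113.5"}
chain = [("files", files), ("dns", dns)]

print(lookup("myserver", chain))     # answered by files; dns is never asked
print(lookup("example.com", chain))  # files has no entry, so dns answers
print(lookup("nosuchhost", chain))   # nobody answers
```

Swapping the two entries in `chain` makes the DNS answer win for `myserver`, which is the effect of reordering the services on the `hosts:` line.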
Note that the very old (glibc 2.4 and earlier) `order` entry, set in `/etc/host.conf` as:
order hosts,bind,nis
only applies to the files (file `/etc/hosts`) name service.
The effects on this (linux) client computer related to NIS and LDAP are (usually) controlled by the DNS server used (bind, unbound, etc.).
so:
- If a hostname can be resolved in /etc/hosts, does DNS apply after /etc/hosts to resolve the hostname or treat the resolved IP address by /etc/hosts as a "hostname" to resolve recursively?
Neither.
If a hostname can be resolved in `/etc/hosts`, DNS doesn't apply (if `files` comes before `dns`), nor is the resolved IP address treated as a "hostname".
It is simply the resolved address.
browser
A browser could use any method to resolve a name (after it has checked its cache of resolved names). The order given above applies only if it uses a system-provided method. The browser, like any program, could choose to contact any DNS server directly.
If the system order has `/etc/hosts` before DNS, an entry in that file takes precedence over the DNS resolution service.
So:
- ... Does it mean that /etc/hosts overrides DNS for resolving hostnames?
Yes (if the browser uses the system-provided resolution).
Why doesn't `/etc/hosts` apply again, so that I can't connect to the website?
The name is looked up outside the browser again only once the browser's internal cache entry for that specific name has been cleared (or has timed out).
If the browser has a name resolved in its cache, the browser uses it again.
Use this to clear the cache.
Or simply close (wait a while) and re-start the browser.
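The caching behaviour described above can be sketched as a tiny expiring cache in Python. This is an illustration under simplifying assumptions, not the browser's actual implementation: the class `NameCache`, the fake clock, and the addresses are all made up.

```python
import time

class NameCache:
    """Remember resolved names until their time-to-live runs out."""

    def __init__(self, ttl_seconds, resolver, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.resolver = resolver  # callable: name -> address
        self.clock = clock
        self.entries = {}         # name -> (address, expiry time)

    def resolve(self, name):
        now = self.clock()
        entry = self.entries.get(name)
        if entry is not None and entry[1] > now:
            return entry[0]       # still fresh: the resolver is not consulted
        address = self.resolver(name)
        self.entries[name] = (address, now + self.ttl)
        return address

# Hypothetical lookup table standing in for the system resolver.
table = {"google.com": "192.0.2.1"}
now = [0.0]  # fake clock so the example does not have to sleep
cache = NameCache(60, table.__getitem__, clock=lambda: now[0])

first = cache.resolve("google.com")
table["google.com"] = "127.0.0.1"    # simulate editing /etc/hosts
second = cache.resolve("google.com")  # cached answer is reused
now[0] = 61.0                         # the cache entry has now expired
third = cache.resolve("google.com")   # looked up again, sees the new entry
print(first, second, third)
```

With a time-to-live of `0`, every call goes back to the resolver, so a changed entry is picked up immediately.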