Why does putting a dot after the URL remove login information?
Adding the dot to the end of the domain name makes it an absolute fully-qualified domain name instead of just a regular fully-qualified domain name, and most browsers treat absolute domain names as being a different domain from the equivalent regular domain name (I'm not sure why they do this though).
A bit of background:
The domain name system is strictly hierarchical, just like a filesystem, or an X.500/LDAP directory. Unlike filesystems or X.500 though, the hierarchy is listed right-to-left instead of left-to-right. So the rightmost component of a domain name is the top of the hierarchy. Putting a dot to the far right of a domain name makes it absolute, meaning that it's explicitly rooted at the top of the DNS hierarchy. In essence, it's the same as using a full distinguished name instead of a common name in an X.500 lookup, or putting a /
at the beginning of a POSIX path.
Using an absolute FQDN has a few specific implications for how a client system will look up the DNS record for that domain:
- It causes some resolvers to skip any locally defined entries (for example, it will cause some resolvers to ignore
/etc/hosts
on a UNIX-like system). - When used with the
.local
domain, it will force some systems to use mDNS instead of traditional DNS to try and resolve the name. - It causes all resolvers to ignore any configured search domain or local DNS domains when looking up the name.
That last part is the important part, and is the reason that the concept of an absolute FQDN exists. Most systems can be configured with what's called a search domain. When they go to resolve a given domain, they will try looking under any configured search domains first, and only resolve from the top of the hierarchy if they can't find the name in any configured search domains (so, if you had foo.example
configured as a search domain on your system and tried to go to bar.example
in a browser, it would (normally, see below) try to go to bar.example.foo.example
first, and only if it couldn't find that would it try bar.example
directly). Most, but not all, resolvers these days ignore the search domain when resolving a domain that ends with a known top-level domain name (.com
, .net
, etc), so it's not usually necessary for most users to use absolute FQDN's, and thus most people don't know about them.
This is because example.com
and example.com.
are (sometimes!) considered different hosts, for two reasons:
- Because they actually can have different meanings, depending on your specific network configuration.
- Because the Internet Standard RFCs that define the syntax say so, depending on how you interpret them.
If the browser considers them different hosts, it will not share session state (e.g. cookies) between them, so one "host" would not know the other one has you logged in.
Part of this is that the browser may not know, depending on its implementation, that the two actually resolve to the same name. Especially if it has passed DNS resolution off to a remote resolver and only expects an IP address back (rather than the whole expanded record).
Summary
- The two hosts can actually be different, in the real world.
- It's sometimes unclear how the standards view them. It seems many application standards don't explicitly deal with the situation.
- Of the ones that describe domain name canonicalization and comparison, they usually split the name into the individual "labels".
- Then it hinges on whether you consider the absolute form of the domain name to have an additional null label, as the original DNS RFC states.
- In an ideal world, all these comparisons would take place using absolute domain names only, and relative names would not be used past the original lookup. Or browsers can take all names as absolute and disallow relative lookup. But this does not seem to currently be the case, and could introduce other problems.
- While it might not be illegal for a browser to perform the DNS lookup itself (rather than using OS resolvers) and therefore figure out the final absolute domain name, this is also not required by any standard I could find.
Practical differences
The different meanings part is, as Austin noted, a consequence of how DNS lookups work with search suffices. Your typical non-rooted label, e.g. example.com
, will cause your typical DNS resolver to first try any search suffices defined on your system. In a corporate environment this may be your company domain, e.g. if you have mycompany.example.
defined as a search suffix then any lookup for example.com
will first try example.com.mycompany.example.
. This is useful if you wanted to look up a internal server without having to type the whole fully qualified ("complete") domain.
But what if you actually wanted the public example.com
? You can use the trailing .
, in the form example.com.
, in order to tell the resolver that you have entered an absolute ("complete") name and not to try any relative lookups against search suffices.
How Internet Standards view the situation
There's a few places we need to look for how these are standardised, and unfortunately the waters can be a bit muddied. I usually like to look for the most relevant standard first and go back from there, but since this is so scattered it may be easier to start from the bottom.
Domain names
Internet Standard RFC1034 describes domain names in section 3.1 and specifies the "preferred name syntax" for domain names in section 3.5. Note in section 3.1:
Each node has a label, which is zero to 63 octets in length. Brother nodes may not have the same label, although the same label can be used for nodes which are not brothers. One label is reserved, and that is the null (i.e., zero length) label used for the root.
[...]
When a user needs to type a domain name, the length of each label is omitted and the labels are separated by dots ("."). Since a complete domain name ends with the root label, this leads to a printed form which ends in a dot. We use this property to distinguish between:
a character string which represents a complete domain name (often called "absolute"). For example, "poneria.ISI.EDU."
a character string that represents the starting labels of a domain name which is incomplete, and should be completed by local software using knowledge of the local domain (often called "relative"). For example, "poneria" used in the ISI.EDU domain.
Relative names are either taken relative to a well known origin, or to a list of domains used as a search list. Relative names appear mostly at the user interface, where their interpretation varies from implementation to implementation, and in master files, where they are relative to a single origin domain name. The most common interpretation uses the root "." as either the single origin or as one of the members of the search list, so a multi-label relative name is often one where the trailing dot has been omitted to save typing.
URIs
From there we can go to how domain names are used in URIs, Internet Standard RFC3986. In section 3 we see the URI syntax. The part we are interested in is the authority, which contains a host (followed by an optional :
port). This is further defined in section 3.2.2, specifically the part that talks about a registered name:
A registered name intended for lookup in the DNS uses the syntax defined in Section 3.5 of [RFC1034] and Section 2.1 of [RFC1123]. Such a name consists of a sequence of domain labels separated by ".", each domain label starting and ending with an alphanumeric character and possibly also containing "-" characters. The rightmost domain label of a fully qualified domain name in DNS may be followed by a single "." and should be if it is necessary to distinguish between the complete domain name and some local domain.
This brings us back to search suffices and the possibility of a "local domain" matching a different result from the "complete domain". Remember that conceptually, according to RFC1034, example.com.
is equivalent to example.com.<root>
, where <root>
is the special null label.
There is some discussion of normalization in section 6 but nothing about the host, let alone trailing dots.
Proposed Standard RFC 7230, which defines HTTP/1.1, notes that it largely follows RFC3986 for URI definitions in section 2.7.
TLS
This is where things get confusing.
Informational RFC2818 describes HTTP over TLS (HTTPS). It doesn't say anything explicit about host matching, apart from following the rules in RFC2459 (replaced by Proposed Standard RFC5280). This refers back to RFC1034 (the one that defined DNS), but says nothing explicit about absolute addresses or trailing dots.
Proposed Standard RFC6125 is a more modern take on uses of TLS. It talks more about domain name matching, but again does not explicitly address trailing dots -- though, it does say you're supposed to match "fully qualified domain names" only (this is a surprisingly poorly-defined concept). It does say all labels must match - which goes back to RFC1034, and if we consider the null-label to be representing the root then example.com
and example.com.
do have different labels (the latter has 3, example
, com
and <root>
).
There's some discussion over in Mozilla bug 134402 over the different interpretations.
Cookies
Moving away from TLS a bit, we can look at cookies in Proposed Standard RFC6265. There, section 5.1.2 and section 5.1.3 talk about canonicalization and matching of host names. Here, again, we split the host name into individual labels to perform the canonicalization (which essentially converts Unicode domain names to ASCII/punycode lowercase). And, again hinges on whether you consider the null-label representing the root to have been preserved through this canonicalization step. If you do, then they have different labels, and thus are different hosts for the purpose of cookies.
The explanation given by Mokubai is exactly correct, and the problem is in the browser not identifying that this is the same domain and therefore not sending the cookies.
But the situation is even worse: The dot at the end only marks the domain as
fully-qualified (unambiguous), which works quite well with DNS,
as the message does finally get to the right address (superuser.com
).
I have even gotten from Fiddler this dialog for superuser.com.
(with dot):
With some empirical testing, here are the headers sent with these two requests.
https://superuser.com
(sensitive info crossed out)
https://superuser.com.
(with dot, no sensitive info needs to be crossed out)
Conclusion: The problem is with the browser not ignoring a dot at the end of a fully-qualified domain name, as is quite possible by the DNS standard.
Further remark: The browser developers were not the only one to fall in this
trap. I have the NoScript add-on installed to stop all JavaScript, but
superuser.com
(no dot) is allowed through. But NoScript still blocks
superuser.com.
(with dot) as being an unknown website.
I have no doubt that testing would find the same behavior in many other products.
It is strange that the developers from major actors in the Web domain, such as Google Chrome, Firefox and Microsoft's Fiddler, all responsible for many advances in Web standards, have not paid attention to this possibility.