Why does java.net.URL's hashCode() resolve the host to an IP?


There are two reasons. The first is:

  • The URL class was designed to model a URL as a locator of a network-accessible resource. Specifically, equals(Object) and hashCode() were designed so that two URL instances are equal if they locate the same resource. This requires that the DNS name be resolved to an IP address.

With the benefit of hindsight we know the following:

  1. The URL.equals method cannot [1] reliably determine whether two URL strings are locators for the same resource. Reasons include virtual hosting, HTTP 3xx redirects, and server-internal mapping of URLs.

  2. The IP resolution behavior of URL.equals and URL.hashCode is a trap for inexperienced Java programmers, even though it is clearly documented.

  3. Even in cases where it leads to the correct answer, IP resolution by URL.equals can be an unexpected (and unwanted) performance hit.

In short ... that aspect of the design for URL was a mistake.
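To make the host comparison concrete, here is a rough sketch of the rule the URL Javadoc describes: two hosts are equivalent if both names resolve to the same IP address, and if either name can't be resolved, the names are compared ignoring case. The class and method names (UrlHostCompare, hostsLookEqual) are made up for illustration, and this is an approximation of the documented rule, not the JDK's actual code.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class UrlHostCompare {
    // Approximates the documented host rule of URL.equals (hypothetical helper).
    // InetAddress.getByName may block on a DNS lookup for non-literal names.
    static boolean hostsLookEqual(String a, String b) {
        try {
            InetAddress first = InetAddress.getByName(a);
            InetAddress second = InetAddress.getByName(b);
            return first.equals(second);
        } catch (UnknownHostException e) {
            // Resolution failed for at least one name: fall back to the names.
            return a.equalsIgnoreCase(b);
        }
    }

    public static void main(String[] args) {
        // The result depends on how this machine resolves "localhost".
        System.out.println(hostsLookEqual("localhost", "127.0.0.1"));
    }
}
```

Every call with a real host name can hit the resolver, which is where the performance cost in point 3 comes from, and two unrelated sites on the same virtual host would compare as equal hosts, which is the problem in point 1.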

This brings us to the second, more important reason.

  • The behavior of URL.equals(Object) was defined a LONG time ago, and it would be impossible to change now without breaking (possibly) millions of deployed Java applications. This rules out any possibility that Sun (now Oracle) will change it.

Maybe the designers of a (hypothetical) successor to the Java class library could fix this (and other things). Of course, backwards compatibility with existing Java programs would have to be thrown out of the window to achieve this.

And finally, the real answer for Java application developers is to simply use the URI class instead. (Real software engineering is about getting the job done as well as you can, not about complaining about the tools you have been provided with.)


1 - When I say "cannot" above, I mean that it is theoretically impossible. Dealing with some of the more difficult cases would require changes to the HTTP protocol. And even if some (hypothetical) future version of HTTP "fixed" the problem, we'd still be dealing with legacy HTTP servers in 20 years' time ... and URL.equals would therefore still be broken.


A lot of people think this was a very bad idea.

Here's some explanation from the Javadoc of URI. This question is also useful.


Don't use java.net.URL. That's the simple answer to your question. Use java.net.URI instead, which won't do hostname resolution.
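As a minimal sketch of what that looks like in practice, the snippet below uses URI values as keys in a HashSet. URI.equals and URI.hashCode only look at the components of the URI (scheme and host are compared without regard to case), so no DNS lookup happens. The class name UriKeys and the dedupe helper are hypothetical, and calling normalize() is my own addition to collapse "." and ".." path segments.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

class UriKeys {
    // Collects the distinct URIs among the given strings (hypothetical helper).
    // Comparison is purely component-based; nothing is resolved over the network.
    static Set<URI> dedupe(String... links) {
        Set<URI> seen = new HashSet<>();
        for (String link : links) {
            seen.add(URI.create(link).normalize());
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<URI> unique = dedupe(
                "http://Example.com/a",
                "http://example.com/a",
                "http://example.com/a/../a");
        System.out.println(unique.size()); // prints 1
    }
}
```

If you do need a URL object, for example to open a connection, you can keep URI everywhere else and call toURL() only at that point.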

Tags:

Java

Url