Java hashCode(): Override faster than native implementation?
You have misused JMH, so the benchmark scores do not make much sense.
- It's usually not necessary to run the measured code in a loop inside a benchmark. JMH runs the benchmark loop itself, in a way that prevents the JIT compiler from over-optimizing the code being measured.
- Results and side effects of the code being measured need to be consumed, either by calling `Blackhole.consume` or by returning the result from the benchmark method.
- The parameters of the code are typically read from `@State` variables in order to avoid constant folding and constant propagation.
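The last two rules can be sketched as follows. This is an illustrative fragment, not part of the original answer; it assumes the standard JMH annotations from `org.openjdk.jmh` on the classpath, and `Integer.hashCode(x)` stands in for whatever code is being measured:

```java
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
public class ConsumeExample {
    int x = 42;  // parameter read from @State, not a compile-time constant

    @Benchmark
    public int returned() {
        return Integer.hashCode(x);  // result escapes via the return value
    }

    @Benchmark
    public void consumed(Blackhole bh) {
        bh.consume(Integer.hashCode(x));  // or is fed to the Blackhole explicitly
    }
}
```

Either form prevents the JIT compiler from treating the computation as dead code and eliminating it.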
In your case, the `BookWithHash` objects are transient: JIT realizes the objects do not escape, and eliminates the allocation altogether. Furthermore, since some of the object fields are constant, JIT can simplify the `hashCode` computation by using constants instead of reading the object fields.
On the contrary, the default `hashCode` relies on the object identity. That's why the allocation of `Book` cannot be eliminated. So, your benchmark is actually comparing the allocation of 20000 objects (mind the `Double` object) with some arithmetic operations on local variables and constants. No surprise, the latter is much faster.
Another thing to take into account is that the first call of the identity `hashCode` is much slower than the subsequent calls, because the hash code first needs to be generated and put into the object header. This in turn requires a call into the VM runtime. The second and subsequent calls of `hashCode` will just get the cached value from the object header, which is indeed much faster.
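This caching is easy to observe from plain Java. A small standalone demo (not part of the original answer): repeated identity hash queries on the same object always return the same value, because after the first call the hash is read from the object header.

```java
public class IdentityHashCaching {
    public static void main(String[] args) {
        Object o = new Object();
        // The first call generates the identity hash and stores it in the header.
        int first = System.identityHashCode(o);
        // Subsequent calls read the cached value, so it never changes.
        int second = System.identityHashCode(o);
        System.out.println(first == second); // prints "true"
    }
}
```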
Here is a corrected benchmark that compares 4 cases:
- getting (generating) an identity hashCode of a new object;
- getting an identity hashCode of an existing object;
- computing an overridden hashCode of a newly created object;
- computing an overridden hashCode of an existing object.
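The `Book` and `BookWithHash` classes are not shown in the question, so here is a minimal sketch of what the benchmark assumes. The field set comes from the benchmark constructor calls; the `equals`/`hashCode` bodies and the inheritance relation are assumptions for illustration:

```java
import java.util.Objects;

// Hypothetical reconstruction: no hashCode() override,
// so the identity-based Object.hashCode() is used.
class Book {
    final int id;
    final String title;
    final String author;
    final Double price;

    Book(int id, String title, String author, Double price) {
        this.id = id;
        this.title = title;
        this.author = author;
        this.price = price;
    }
}

// Hypothetical reconstruction: overrides hashCode() over the fields.
class BookWithHash extends Book {
    BookWithHash(int id, String title, String author, Double price) {
        super(id, title, author, price);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof BookWithHash)) return false;
        BookWithHash b = (BookWithHash) o;
        return id == b.id && Objects.equals(title, b.title)
                && Objects.equals(author, b.author)
                && Objects.equals(price, b.price);
    }

    @Override
    public int hashCode() {
        // Recomputed on every call; nothing is cached in the object header.
        return Objects.hash(id, title, author, price);
    }
}
```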
```java
@State(Scope.Benchmark)
public class HashCode {
    int id = 123;
    String title = "Jane Eyre";
    String author = "Charlotte Bronte";
    Double price = 14.99;

    Book book = new Book(id, title, author, price);
    BookWithHash bookWithHash = new BookWithHash(id, title, author, price);

    @Benchmark
    public int book() {
        return book.hashCode();
    }

    @Benchmark
    public int bookWithHash() {
        return bookWithHash.hashCode();
    }

    @Benchmark
    public int newBook() {
        return (book = new Book(id, title, author, price)).hashCode();
    }

    @Benchmark
    public int newBookWithHash() {
        return (bookWithHash = new BookWithHash(id, title, author, price)).hashCode();
    }
}
```
```
Benchmark                 Mode  Cnt   Score   Error  Units
HashCode.book             avgt    5   2,907 ± 0,032  ns/op
HashCode.bookWithHash     avgt    5   5,052 ± 0,119  ns/op
HashCode.newBook          avgt    5  74,280 ± 5,384  ns/op
HashCode.newBookWithHash  avgt    5  14,401 ± 0,041  ns/op
```
The results show that getting an identity `hashCode` of an existing object is notably faster than computing `hashCode` over the object fields (2.9 vs. 5 ns). However, generating a new identity `hashCode` is a really slow operation, even compared to an object allocation.
The performance difference is due to the fact that you are creating a new object for each `hashCode()` invocation in the benchmark, and the default `hashCode()` implementation caches its value in the object header, while the custom one obviously does not. Writing to the object header takes a lot of time, since it involves a native call.
Repeated invocations of the default `hashCode()` implementation perform a little better than the custom one.
If you run with `-XX:-UseBiasedLocking`, you will see that the performance difference decreases. Since biased locking information is also stored in object headers, and disabling it affects the object layout, this is additional evidence that the object header is the key factor here.