Why not remove type erasure from the next JVM?
Type erasure is more than just a bytecode feature that you can turn on or off.
It affects the way the entire runtime environment works. If you want to be able to query the generic type of every instance of a generic class, meta information comparable to a runtime Class representation has to be created for each object instantiation of a generic class.
If you write new ArrayList<String>(); new ArrayList<Number>(); new ArrayList<Object>() you are not only creating three objects, you are potentially creating three additional meta objects reflecting the types ArrayList<String>, ArrayList<Number>, and ArrayList<Object>, if they didn’t exist before.
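For contrast, here is a minimal sketch of the status quo. The class and method names are made up for illustration. Under erasure, all three lists report the very same Class object, so no per-parameterization meta object exists today:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of any library.
class SharedRuntimeClass {

    // Returns true when all three differently parameterized lists
    // report the identical runtime Class object.
    static boolean allShareOneClass() {
        List<String> strings = new ArrayList<>();
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();
        return strings.getClass() == numbers.getClass()
                && numbers.getClass() == objects.getClass();
    }

    public static void main(String[] args) {
        System.out.println(allShareOneClass());
        // The only runtime type that exists is the erased one.
        System.out.println(new ArrayList<String>().getClass().getName());
    }
}
```

Without erasure, each of the three getClass() calls (or whatever replaced them) would have to yield something distinct.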
Consider that there are thousands of different List signatures in use in a typical application, most of them never used in a place where such Reflection is required (since the feature does not exist today, we can conclude that all of them currently work without it).
This, of course, multiplies: a thousand different generic list types imply a thousand different generic iterator types and a thousand Spliterator and Stream incarnations, not even counting the internal classes of the implementation.
It even affects places without an object allocation that currently exploit type erasure under the hood: Collections.emptyList(), Function.identity(), Comparator.naturalOrder(), etc. return the same instance each time they are invoked. If you insist on having the particular captured generic type reflectively inspectable, this won’t work anymore. So if you write
List<String> strings = Collections.emptyList();
List<Number> numbers = Collections.emptyList();
you would have to receive two distinct instances, each of them reporting a different type via getClass() or its future equivalent.
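A small sketch of what happens today, assuming an OpenJDK-based runtime. The specification only says that an implementation need not create a separate object per call, so the shared instance is an observed implementation detail rather than a guarantee. The class name is hypothetical:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of any library.
class SharedEmptyList {

    // Returns true when both differently typed empty lists are one and the same object.
    static boolean sameInstance() {
        List<String> strings = Collections.emptyList();
        List<Number> numbers = Collections.emptyList();
        // Cast to Object: the compiler rejects a direct == between
        // List<String> and List<Number> as incomparable types.
        return (Object) strings == (Object) numbers;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance()); // true on OpenJDK
    }
}
```

A single shared object cannot truthfully report both List<String> and List<Number>, which is why such allocation-free factories would no longer be possible.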
It seems that people wishing for this ability have a narrow view of their particular method, where it would be great if they could reflectively find out whether one particular parameter is actually one of two or three types, but never think about the weight of carrying meta information about potentially hundreds or thousands of generic instantiations of thousands of generic classes.
This is the place where we have to ask what we gain in return: the ability to support a questionable coding style (this is what altering the code’s behavior due to information found via Reflection is all about).
The answer so far has only addressed the easy aspect of removing type erasure: the desire to introspect the type of an actual instance. An actual instance has a concrete type, which could be reported. As mentioned in this comment from the user the8472, the demand for removal of type erasure often also implies the wish to be able to cast to (T), create an array via new T[], or access the type of a type variable via T.class.
This would raise the true nightmare. A type variable is a different beast than the actual type of a concrete instance. A type variable could resolve to, e.g., ? extends Comparator<? super Number>, to name one (rather simple) example. Providing the necessary meta information would imply that not only does object allocation become much more expensive; every single method invocation could impose these additional costs, and to an even greater extent, as we are now talking not only about the combination of generic classes with actual classes, but also about every possible wildcarded combination, even of nested generic types.
Keep in mind that the actual type of a type parameter could also refer to other type parameters, turning type checking into a very complex process, and one that has to be repeated not only for every type cast: if you allow creating an array from a type variable, every store into that array has to repeat it as well.
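Java arrays already show what a per-store check looks like, because an array knows its element type at runtime. The following is a minimal sketch with a hypothetical class name. Here the check is a simple class comparison; with reified type variables, the same check would have to deal with full generic signatures:

```java
// Hypothetical demo class, not part of any library.
class ArrayStoreCheck {

    // Returns true when the JVM rejects storing an Integer into a String[].
    static boolean storeIsRejected() {
        Object[] objects = new String[1]; // legal: arrays are covariant
        try {
            // Every store into an Object[] is checked against the
            // array's actual element type, String in this case.
            objects[0] = Integer.valueOf(42);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsRejected());
    }
}
```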
Besides the heavy performance issue, the complexity raises another problem. If you look at the bug tracking list of javac or related questions on Stack Overflow, you may notice that the process is not only complex but also error-prone. Currently, every minor version of javac contains changes and fixes regarding generic type signature matching, affecting what will be accepted or rejected. I’m quite sure you don’t want intrinsic JVM operations like type casts, variable assignments, or array stores to fall victim to this complexity, having a different idea of what is legal in every version, or suddenly rejecting what javac accepted at compile time due to mismatching rules.
Your understanding of backwards compatibility is wrong.
The desired goal is for new JVMs to be able to run old library code correctly and unchanged, even alongside new code. This allows users to upgrade their Java versions reliably, even to versions much newer than the one the code was written for.
To some extent, erasure will be removed in the future with Project Valhalla, to enable specialized implementations for value types.
Or, to put it more accurately, type erasure really means the absence of type specialization for generics, and Valhalla will introduce specialization over primitives.
Specifically I'm asking if there are any technical reasons why type erasure couldn't be removed in the next version of the JVM
Performance. You don't have to generate specialized code for all combinations of generic types; instances or generated classes don't have to carry type tags; polymorphic inline caches and runtime type checks (compiler-generated instanceof checks) stay simple; and we still get most of the type safety through compile-time checks.
Of course there are also plenty of downsides, but the tradeoff has already been made, and the question is what would motivate the JVM devs to change that tradeoff.
It might also be a compatibility issue: there could be code that performs unchecked casts to abuse generic collections, relying on type erasure, and that code would break if the type constraints were enforced.
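As a hedged illustration of the kind of code meant here (a made-up example, not taken from any real library): the following compiles with warnings and runs today, because the list has no runtime knowledge of its element type. If List<String> were enforced at runtime, the add call would presumably have to fail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of any library.
class ErasureDependentCode {

    // Puts an Integer into a List<String> through a raw-typed view
    // and returns the element that was stored.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static Object smuggleInteger() {
        List<String> strings = new ArrayList<>();
        List raw = strings;            // raw view of the same list
        raw.add(Integer.valueOf(42));  // accepted at runtime under erasure
        return raw.get(0);             // read back as Object, so no cast is inserted
    }

    public static void main(String[] args) {
        System.out.println(smuggleInteger()); // prints 42
    }
}
```

Reading the element through the List<String> reference and assigning it to a String would still throw a ClassCastException today, because the compiler inserts a cast at that point.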