Why covariance and contravariance do not support value types

I think everything starts from the definition of the LSP (Liskov Substitution Principle), which states:

if q(x) is a property provable about objects x of type T then q(y) should be true for objects y of type S where S is a subtype of T.

But value types, for example int, cannot be substituted for object in C#. The proof is very simple:

int myInt = new int();
object obj1 = myInt;
object obj2 = myInt;
return ReferenceEquals(obj1, obj2);

This returns false even though we assign the same value to both objects: each assignment boxes myInt into a new, distinct object.
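For contrast, the same test with a reference type prints True, because assigning a string to an object variable copies the reference itself and creates no new object (a minimal sketch):

```csharp
using System;

string myString = "hello";
object obj1 = myString; // identity-preserving reference conversion
object obj2 = myString;
Console.WriteLine(ReferenceEquals(obj1, obj2)); // True
```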


Basically, variance applies when the CLR can ensure that it doesn't need to make any representational change to the values. References all look the same - so you can use an IEnumerable<string> as an IEnumerable<object> without any change in representation; the native code itself doesn't need to know what you're doing with the values at all, so long as the infrastructure has guaranteed that it will definitely be valid.

For value types, that doesn't work - to treat an IEnumerable<int> as an IEnumerable<object>, the code using the sequence would have to know whether to perform a boxing conversion or not.

You might want to read Eric Lippert's blog post on representation and identity for more on this topic in general.

EDIT: Having reread Eric's blog post myself, it's at least as much about identity as representation, although the two are linked. In particular:

This is why covariant and contravariant conversions of interface and delegate types require that all varying type arguments be of reference types. To ensure that a variant reference conversion is always identity-preserving, all of the conversions involving type arguments must also be identity-preserving. The easiest way to ensure that all the non-trivial conversions on type arguments are identity-preserving is to restrict them to be reference conversions.
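The identity-preserving point can be checked directly. A small sketch: the variant conversion succeeds for a sequence of strings and both views see the very same instances, while no such conversion exists for a sequence of ints.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

IEnumerable<string> strings = new[] { "A", "B", "C" };
IEnumerable<object> objects = strings; // variant reference conversion, no boxing

// Identity-preserving: both views see the very same string instances.
Console.WriteLine(ReferenceEquals(strings.First(), objects.First())); // True

// For value types no such conversion exists:
// IEnumerable<object> boxed = new[] { 1, 2, 3 };  // compile-time error
```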


It is perhaps easier to understand if you think about the underlying representation (even though this really is an implementation detail). Here is a collection of strings:

IEnumerable<string> strings = new[] { "A", "B", "C" };

You can think of the strings as having the following representation:

[0] : string reference -> "A"
[1] : string reference -> "B"
[2] : string reference -> "C"

It is a collection of three elements, each being a reference to a string. You can cast this to a collection of objects:

IEnumerable<object> objects = (IEnumerable<object>) strings;

Basically it is the same representation except now the references are object references:

[0] : object reference -> "A"
[1] : object reference -> "B"
[2] : object reference -> "C"

The representation is the same. The references are just treated differently; you can no longer access the string.Length property but you can still call object.GetHashCode(). Compare this to a collection of ints:

IEnumerable<int> ints = new[] { 1, 2, 3 };

You can think of it as having the following representation:

[0] : int = 1
[1] : int = 2
[2] : int = 3

To convert this to an IEnumerable<object> the data has to be converted by boxing the ints:

[0] : object reference -> 1
[1] : object reference -> 2
[2] : object reference -> 3

This conversion requires more than a cast.
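If you do need an IEnumerable&lt;object&gt; from an IEnumerable&lt;int&gt;, one way (a sketch using LINQ) is Cast&lt;object&gt;(), which boxes each element as the sequence is enumerated:

```csharp
using System.Collections.Generic;
using System.Linq;

IEnumerable<int> ints = new[] { 1, 2, 3 };

// Not a variant cast: every int is boxed into a new object on enumeration.
IEnumerable<object> objects = ints.Cast<object>();
```

Select(i => (object)i) would work just as well; either way each element pays the boxing cost, which is exactly the representational change that variance is not allowed to require.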


It does come down to an implementation detail: value types are implemented differently from reference types.

If you force value types to be treated as reference types (i.e. box them, e.g. by referring to them via an interface) you can get variance.

The easiest way to see the difference is simply to consider an array: an array of value types is laid out contiguously in memory (the values are stored directly), whereas an array of reference types stores only the references (pointers) contiguously; the objects being pointed to are allocated separately.
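Array covariance (which predates generic variance) illustrates the same layout constraint; a sketch:

```csharp
string[] strings = { "A", "B", "C" };
object[] objects = strings; // OK: both are arrays of references, same layout

int[] ints = { 1, 2, 3 };
// object[] boxed = ints;   // compile-time error: the layouts differ
```

(Array covariance is famously not type-safe: storing a non-string element into objects above would throw an ArrayTypeMismatchException at run time, which is why generic variance was designed around identity-preserving conversions instead.)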

The other (related) issue(*) is that (almost) all reference types have the same representation for variance purposes, and much code does not need to know the difference between types, so co- and contravariance are possible (and easily implemented, often just by omitting extra type checking).

(*) It may be seen to be the same issue...