What is the difference between a lambda and a method reference at a runtime level?

Getting Started

To investigate this we start with the following class:

import java.io.Serializable;
import java.util.Comparator;

public final class Generic {

    // Bad implementation, only used as an example.
    public static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    public static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    public static Comparator<Integer> explicit() {
        return (Comparator<Integer> & Serializable) (a, b) -> COMPARATOR.compare(a, b);
    }

}

After compilation, we can disassemble it using:

javap -c -p -s -v Generic.class

Removing the irrelevant parts (and some other clutter, such as fully-qualified types and the initialisation of COMPARATOR) we are left with

  public static final Comparator<Integer> COMPARATOR;    

  public static Comparator<Integer> reference();
      0: getstatic     #2  // Field COMPARATOR:LComparator;    
      3: dup    
      4: invokevirtual #3   // Method Object.getClass:()LClass;    
      7: pop    
      8: invokedynamic #4,  0  // InvokeDynamic #0:compare:(LComparator;)LComparator;    
      13: checkcast     #5  // class Serializable    
      16: checkcast     #6  // class Comparator    
      19: areturn

  public static Comparator<Integer> explicit();
      0: invokedynamic #7,  0  // InvokeDynamic #1:compare:()LComparator;    
      5: checkcast     #5  // class Serializable    
      8: checkcast     #6  // class Comparator    
      11: areturn

  private static int lambda$explicit$d34e1a25$1(Integer, Integer);
     0: getstatic     #2  // Field COMPARATOR:LComparator;
     3: aload_0
     4: aload_1
     5: invokeinterface #44,  3  // InterfaceMethod Comparator.compare:(LObject;LObject;)I
    10: ireturn

BootstrapMethods:    
  0: #61 invokestatic invoke/LambdaMetafactory.altMetafactory:(Linvoke/MethodHandles$Lookup;LString;Linvoke/MethodType;[LObject;)Linvoke/CallSite;    
    Method arguments:    
      #62 (LObject;LObject;)I    
      #63 invokeinterface Comparator.compare:(LObject;LObject;)I    
      #64 (LInteger;LInteger;)I    
      #65 5    
      #66 0    

  1: #61 invokestatic invoke/LambdaMetafactory.altMetafactory:(Linvoke/MethodHandles$Lookup;LString;Linvoke/MethodType;[LObject;)Linvoke/CallSite;    
    Method arguments:    
      #62 (LObject;LObject;)I    
      #70 invokestatic Generic.lambda$explicit$df5d232f$1:(LInteger;LInteger;)I    
      #64 (LInteger;LInteger;)I    
      #65 5    
      #66 0

Immediately we see that the bytecode for the reference() method differs from the bytecode for explicit(). That difference is not what matters for now, though; the bootstrap methods are more interesting.

An invokedynamic call site is linked to a method by means of a bootstrap method, which is a method specified by the compiler for the dynamically-typed language that is called once by the JVM to link the site.

(Java Virtual Machine Support for Non-Java Languages, emphasis theirs)

This is the code responsible for creating the CallSite used by the lambda. The Method arguments listed below each bootstrap method are the values passed as the variadic parameter (i.e. args) of LambdaMetafactory#altMetafactory.

Format of the Method arguments

  1. samMethodType - Signature and return type of method to be implemented by the function object.
  2. implMethod - A direct method handle describing the implementation method which should be called (with suitable adaptation of argument types, return types, and with captured arguments prepended to the invocation arguments) at invocation time.
  3. instantiatedMethodType - The signature and return type that should be enforced dynamically at invocation time. This may be the same as samMethodType, or may be a specialization of it.
  4. flags indicates additional options; this is a bitwise OR of desired flags. Defined flags are FLAG_BRIDGES, FLAG_MARKERS, and FLAG_SERIALIZABLE.
  5. bridgeCount is the number of additional method signatures the function object should implement, and is present if and only if the FLAG_BRIDGES flag is set.

In both cases here bridgeCount is 0, so there is no sixth argument, which would otherwise be bridges: a variable-length list of additional method signatures to implement. (Given that bridgeCount is 0, I'm not entirely sure why FLAG_BRIDGES is set.)

Matching the above up with our arguments, we get:

  1. The function signature and return type (Ljava/lang/Object;Ljava/lang/Object;)I, which is the erased signature of Comparator#compare, because of generic type erasure.
  2. The method being called when this lambda is invoked (which is different).
  3. The signature and return type of the lambda, which will be checked when the lambda is invoked: (LInteger;LInteger;)I (note that these aren't erased, because this is part of the lambda specification).
  4. The flags, which in both cases is the composition of FLAG_BRIDGES and FLAG_SERIALIZABLE (i.e. 5).
  5. The number of bridge method signatures, 0.

We can see that FLAG_SERIALIZABLE is set for both lambdas, so the flags are not what distinguishes them.
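To make the flag arithmetic concrete, here is a small sketch that decodes the value 5 using the public constants on java.lang.invoke.LambdaMetafactory. The class name FlagDecoder and the method decode are hypothetical names used only for illustration.

```java
import java.lang.invoke.LambdaMetafactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: names the flags that are set in an altMetafactory flags value.
final class FlagDecoder {

    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & LambdaMetafactory.FLAG_SERIALIZABLE) != 0) {
            names.add("FLAG_SERIALIZABLE");
        }
        if ((flags & LambdaMetafactory.FLAG_MARKERS) != 0) {
            names.add("FLAG_MARKERS");
        }
        if ((flags & LambdaMetafactory.FLAG_BRIDGES) != 0) {
            names.add("FLAG_BRIDGES");
        }
        return names;
    }

    public static void main(String[] args) {
        // 5 is the flags value shown for both bootstrap methods above.
        System.out.println(decode(5)); // [FLAG_SERIALIZABLE, FLAG_BRIDGES]
    }
}
```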

Implementation methods

The implementation method for the method reference lambda is Comparator.compare:(LObject;LObject;)I, but for the explicit lambda it's Generic.lambda$explicit$df5d232f$1:(LInteger;LInteger;)I. Looking at the disassembly, we can see that the former is essentially an inlined version of the latter. The only other notable difference is in the method parameter types (which, as mentioned earlier, is due to generic type erasure).

When is a lambda actually serializable?

You can serialize a lambda expression if its target type and its captured arguments are serializable.

Lambda Expressions (The Java™ Tutorials)

The important part of that is "captured arguments". Looking back at the disassembled bytecode, the invokedynamic instruction for the method reference certainly looks like it's capturing a Comparator (#0:compare:(LComparator;)LComparator;, in contrast to the explicit lambda, #1:compare:()LComparator;).
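One way to check the capture count without reading bytecode is to ask each lambda for its SerializedLambda form. The sketch below does this by reflectively calling the synthetic writeReplace method that serializable lambdas carry; relying on that method is an assumption about the JDK's implementation rather than a documented API, and the counts assume the class was compiled with javac. The names CaptureInspector and capturedArgCount are hypothetical.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.Comparator;

// Hypothetical sketch: reports how many arguments a serializable lambda captured.
final class CaptureInspector {

    static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    static Comparator<Integer> explicit() {
        return (Comparator<Integer> & Serializable) (a, b) -> COMPARATOR.compare(a, b);
    }

    // Assumption: the lambda class declares a private writeReplace() that
    // returns a SerializedLambda. This is an implementation detail.
    static int capturedArgCount(Object lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            SerializedLambda form = (SerializedLambda) writeReplace.invoke(lambda);
            return form.getCapturedArgCount();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Not a serializable lambda?", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reference(): " + capturedArgCount(reference()));
        System.out.println("explicit():  " + capturedArgCount(explicit()));
    }
}
```

With javac, the method reference should report one captured argument (the comparator) and the explicit lambda none, which matches the two invokedynamic descriptors.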

Confirming capturing is the issue

ObjectOutputStream contains an extendedDebugInfo field, which we can set using the -Dsun.io.serialization.extendedDebugInfo=true VM argument:

$ java -Dsun.io.serialization.extendedDebugInfo=true Generic

When we try to serialize the lambdas again, this gives a very satisfactory

Exception in thread "main" java.io.NotSerializableException: Generic$$Lambda$1/321001045
        - element of array (index: 0)
        - array (class "[LObject;", size: 1)
/* ! */ - field (class "invoke.SerializedLambda", name: "capturedArgs", type: "class [LObject;") // <--- !!
        - root object (class "invoke.SerializedLambda", SerializedLambda[capturingClass=class Generic, functionalInterfaceMethod=Comparator.compare:(LObject;LObject;)I, implementation=invokeInterface Comparator.compare:(LObject;LObject;)I, instantiatedMethodType=(LInteger;LInteger;)I, numCaptured=1])
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1182)
    /* removed */
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at Generic.main(Generic.java:27)
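The main method that produced this trace is not shown in the post. A minimal harness along the following lines should be enough to reproduce the behaviour; the class name SerializationCheck and the helper canSerialize are hypothetical, and the result assumes compilation with javac.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Comparator;

// Hypothetical harness: tries to serialize both lambdas from the post.
final class SerializationCheck {

    // Same (deliberately bad) comparator as in Generic; not Serializable.
    static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    static Comparator<Integer> explicit() {
        return (Comparator<Integer> & Serializable) (a, b) -> COMPARATOR.compare(a, b);
    }

    // Returns false if writing the object fails with NotSerializableException.
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reference(): " + canSerialize(reference()));
        System.out.println("explicit():  " + canSerialize(explicit()));
    }
}
```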

What's actually going on

From the above, we can see that the explicit lambda is not capturing anything, whereas the method reference lambda is. Looking over the bytecode again makes this clear:

  public static Comparator<Integer> explicit();
      0: invokedynamic #7,  0  // InvokeDynamic #1:compare:()LComparator;    
      5: checkcast     #5  // class java/io/Serializable    
      8: checkcast     #6  // class Comparator    
      11: areturn

Which, as seen above, has an implementation method of:

  private static int lambda$explicit$d34e1a25$1(java.lang.Integer, java.lang.Integer);
     0: getstatic     #2  // Field COMPARATOR:Ljava/util/Comparator;
     3: aload_0
     4: aload_1
     5: invokeinterface #44,  3  // InterfaceMethod java/util/Comparator.compare:(Ljava/lang/Object;Ljava/lang/Object;)I
    10: ireturn

The explicit lambda is actually calling lambda$explicit$d34e1a25$1, which in turn calls COMPARATOR#compare. This layer of indirection means it's not capturing anything that isn't Serializable (or anything at all, to be precise), and so is safe to serialize. The method reference expression directly uses COMPARATOR (the value of which is then passed to the bootstrap method):

  public static Comparator<Integer> reference();
      0: getstatic     #2  // Field COMPARATOR:LComparator;    
      3: dup    
      4: invokevirtual #3   // Method Object.getClass:()LClass;    
      7: pop    
      8: invokedynamic #4,  0  // InvokeDynamic #0:compare:(LComparator;)LComparator;    
      13: checkcast     #5  // class java/io/Serializable    
      16: checkcast     #6  // class Comparator    
      19: areturn

The lack of indirection means that COMPARATOR must be serialized along with the lambda. As COMPARATOR does not refer to a Serializable value, this fails.

The fix

I hesitate to call this a compiler bug (I expect the lack of indirection serves as an optimisation), although it is very strange. The fix is trivial but ugly: add an explicit cast to the declaration of COMPARATOR:

public static final Comparator<Integer> COMPARATOR = (Serializable & Comparator<Integer>) (a, b) -> a > b ? 1 : -1;

This makes everything perform correctly on Java 1.8.0_45. It's also worth noting that the Eclipse compiler produces that layer of indirection in the method reference case as well, so the original code in this post does not require modification to execute correctly when compiled with it.
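As a rough check of the fix, the following sketch serializes the method reference and reads it back, assuming the comparator is declared with the intersection cast. The names FixedComparator and roundTrip are hypothetical, and the round trip is done in memory within a single JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;

// Hypothetical sketch: with a Serializable COMPARATOR, the captured value can be written too.
final class FixedComparator {

    static final Comparator<Integer> COMPARATOR =
            (Comparator<Integer> & Serializable) (a, b) -> a > b ? 1 : -1;

    static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    // Writes the comparator to a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static Comparator<Integer> roundTrip(Comparator<Integer> comparator) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(comparator);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Comparator<Integer>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Comparator<Integer> copy = roundTrip(reference());
        System.out.println(copy.compare(2, 1)); // 1
        System.out.println(copy.compare(1, 2)); // -1
    }
}
```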


I want to add the fact that there is actually a semantic difference between a lambda and a method reference to an instance method (even when they have the same content as in your case, and disregarding serialisation):

SOME_COMPARATOR::compare

This form evaluates to a lambda object which is closed over the value of SOME_COMPARATOR at evaluation time (that is, it contains a reference to that object). It checks whether SOME_COMPARATOR is null at evaluation time and, if so, throws a null pointer exception right then. It will not pick up changes to the field that are made after its creation.

(a,b) -> SOME_COMPARATOR.compare(a,b)

This form evaluates to a lambda object which will access the value of the SOME_COMPARATOR field when called. It is closed over this, since SOME_COMPARATOR is an instance field. When called, it will access the current value of SOME_COMPARATOR and use that, potentially throwing a null pointer exception at that time.

Demonstration

This behaviour can be seen from the following small example. By stopping the code in a debugger and inspecting the fields of the lambdas one can verify what they are closed over.

Object o = "First";

void run() {
    Supplier<String> ref = o::toString; 
    Supplier<String> lambda = () -> o.toString();
    o = "Second";
    System.out.println("Ref: " + ref.get()); // Prints "First"
    System.out.println("Lambda: " + lambda.get()); // Prints "Second"
}

Java Language Specification

The JLS describes this behaviour of method references in 15.13.3:

The target reference is the value of ExpressionName or Primary, as determined when the method reference expression was evaluated.

And:

First, if the method reference expression begins with an ExpressionName or a Primary, this subexpression is evaluated. If the subexpression evaluates to null, a NullPointerException is raised
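The difference in when the null check happens can be sketched as follows. The class NullTiming and its method names are hypothetical, and the behaviour shown assumes compilation with javac.

```java
import java.util.function.Supplier;

// Hypothetical sketch: compares when each form notices a null receiver.
final class NullTiming {

    Object target = null;

    // The method reference evaluates `target` immediately, so the
    // NullPointerException is expected while the reference is being created.
    boolean referenceFailsAtCreation() {
        try {
            Supplier<String> ref = target::toString;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // The lambda only reads `target` when it is invoked, so creating it
    // succeeds and the exception is expected from get().
    boolean lambdaFailsOnlyWhenCalled() {
        Supplier<String> lambda = () -> target.toString();
        try {
            lambda.get();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        NullTiming demo = new NullTiming();
        System.out.println("reference fails at creation: " + demo.referenceFailsAtCreation());
        System.out.println("lambda fails only when called: " + demo.lambdaFailsOnlyWhenCalled());
    }
}
```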

In Toby's code

This can be seen in Toby's listing of the code of reference, where getClass is called on the value of SOME_COMPARATOR, which will trigger an exception if it is null:

4: invokevirtual #3   // Method Object.getClass:()LClass;

(Or so I think, I'm really not an expert on byte code.)

Method references in code that is compiled with Eclipse 4.4.1 do not throw an exception in that situation, however. Eclipse seems to have a bug here.