Is it possible to evaluate lambda calculus terms efficiently?

At the current state of knowledge, the best way of evaluating lambda terms is the so-called optimal graph reduction technique. The technique was introduced by J. Lamping in his POPL 1990 article "An algorithm for optimal lambda calculus reduction", and was then revised and improved by several authors. You may find a survey in my book with S. Guerrini, "The optimal implementation of functional programming languages", Cambridge Tracts in Theoretical Computer Science no. 45, 1998.

The term "optimal" refers to the management of sharing. In the lambda calculus you have a lot of duplication, and the efficiency of reduction crucially depends on avoiding replicating work. In a first-order setting, directed acyclic graphs (dags) are enough for managing sharing, but as soon as you enter a higher-order setting you need more complex graph structures comprising both sharing and unsharing.

On pure lambda terms, optimal graph reduction is faster than all other known reduction techniques (environment machines, supercombinators, or whatever). Some benchmarks are given in the above book (see pp. 296-301), where we proved that our implementation outperformed both caml-light (the precursor of OCaml) and Haskell (which was really slow). So, if you hear people say that it has never been proved that optimal reduction is faster than other techniques, that is not true: it was proved both in theory and experimentally.

The reason functional languages are not yet implemented this way is that in the practice of functional programming you very seldom use functionals of really high rank, and when you do they are frequently linear. As soon as you raise the rank, the intrinsic complexity of lambda terms can become dangerously high. Having a technique that allows you to reduce a term in time O(2^n) instead of O(2^(2^n)) does not make all that difference in practice: both computations are simply infeasible.

I recently wrote a short article trying to explain these issues: "About the efficient reduction of lambda terms". In the same article, I also discuss the possibility of superoptimal reduction techniques.


There are various approaches to evaluating lambda terms. Depending on whether type information is available or not, you can get evaluators that are more efficient and safer (runtime checks are not needed, as the programs are known to be well-behaved). I'm going to give a crude presentation of some of these techniques.

Big-step evaluators

Rather than repeatedly locating lambda-redexes and firing them (which means traversing the term multiple times), you can devise an algorithm that minimises this work by evaluating the term in "big steps".

E.g. if the term you want to normalise is t u, then you normalise both t and u; if norm t is a lambda abstraction, you fire the corresponding redex and restart normalisation on the term you just obtained.
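
For concreteness, here is a minimal sketch of such a big-step normaliser in Haskell, using de Bruijn indices to avoid variable capture; the datatype and function names are my own choices, not a standard API.

    -- Untyped lambda terms with de Bruijn indices.
    data Term = Var Int | Lam Term | App Term Term
      deriving Show

    -- shift d c t: add d to every variable of t whose index is >= c.
    shift :: Int -> Int -> Term -> Term
    shift d c (Var k)   = Var (if k >= c then k + d else k)
    shift d c (Lam b)   = Lam (shift d (c + 1) b)
    shift d c (App t u) = App (shift d c t) (shift d c u)

    -- subst j s t: replace variable j by s in t.
    subst :: Int -> Term -> Term -> Term
    subst j s (Var k)   = if k == j then s else Var k
    subst j s (Lam b)   = Lam (subst (j + 1) (shift 1 0 s) b)
    subst j s (App t u) = App (subst j s t) (subst j s u)

    -- Big-step normalisation: normalise both sides of an application,
    -- fire the redex when the head turns out to be an abstraction, and
    -- keep going.  (It loops on terms without a normal form, of course.)
    norm :: Term -> Term
    norm (Var k)   = Var k
    norm (Lam b)   = Lam (norm b)
    norm (App t u) =
      case norm t of
        Lam b -> norm (shift (-1) 0 (subst 0 (shift 1 0 (norm u)) b))
        t'    -> App t' (norm u)

For instance, norm (App (Lam (Var 0)) (Lam (Var 0))) evaluates to Lam (Var 0).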

Abstract / Virtual Machines

Now, you can do basically the same work, but a lot faster, using abstract machines. These are small virtual machines with their own set of instructions and reduction rules, which you can implement in your favourite language and compile lambda terms to.

The historical example is the SECD machine.
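
The SECD machine is a bit long to spell out here, so here instead is a minimal sketch of an even simpler one, the Krivine machine, which performs call-by-name weak head reduction; the state representation and names below are my own choices.

    -- Untyped lambda terms with de Bruijn indices.
    data Term = Var Int | Lam Term | App Term Term
      deriving Show

    -- A closure pairs a term with the environment it was built in.
    data Closure = Closure Term [Closure]

    -- Machine state: the term in focus, its environment, and a stack of
    -- pending arguments (themselves closures).
    type State = (Term, [Closure], [Closure])

    step :: State -> Maybe State
    step (Var n, env, stack)
      | (Closure t e : _) <- drop n env = Just (t, e, stack)            -- look up the n-th closure
    step (App t u, env, stack)   = Just (t, env, Closure u env : stack) -- push the argument
    step (Lam b, env, c : stack) = Just (b, c : env, stack)             -- pop it into the environment
    step _                       = Nothing                              -- no rule applies: weak head normal form

    run :: State -> State
    run s = maybe s run (step s)

For example, run (App (Lam (Var 0)) (Lam (Lam (Var 1))), [], []) stops with Lam (Lam (Var 1)) in focus, an empty environment and an empty stack.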

Danvy et al. have shown that well-known abstract machines can be derived from the evaluators mentioned before by a combination of continuation-passing style transformation and defunctionalisation.

To get a lambda term back from an abstract machine, you need to implement a reification / read back function that builds a lambda term from the state the machine is in. Grégoire and Leroy show how this can be done in a setting where the type theory of the language is that of Coq.
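
As a toy illustration (this is not Grégoire and Leroy's algorithm, just a read back for the Krivine machine sketched above), one can rebuild the lambda term denoted by a machine state by unfolding its closures. The sketch assumes the machine was started on a closed term, so every closure decodes to a closed term and no index adjustment is needed when plugging it under binders.

    -- Term, Closure and State are as in the machine sketch above,
    -- repeated here only so that this fragment stands on its own.
    data Term = Var Int | Lam Term | App Term Term
      deriving Show
    data Closure = Closure Term [Closure]
    type State = (Term, [Closure], [Closure])

    -- Decode a closure into a plain term; d counts the binders we have
    -- passed under inside the closure's own term.
    decode :: Closure -> Term
    decode (Closure t env) = go 0 t
      where
        go d (Var n)
          | n < d              = Var n                    -- bound inside this term
          | n - d < length env = decode (env !! (n - d))  -- comes from the environment
          | otherwise          = Var (n - length env)     -- genuinely free (never happens for closed input)
        go d (Lam b)   = Lam (go (d + 1) b)
        go d (App a b) = App (go d a) (go d b)

    -- Read back a whole machine state: decode the term in focus and
    -- re-apply the pending arguments sitting on the stack.
    readBack :: State -> Term
    readBack (t, env, stack) = foldl App (decode (Closure t env)) (map decode stack)

When the machine has halted on a closed term, the term obtained this way is a weak head normal form of the original one.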

Normalisation by Evaluation

Normalisation by Evaluation is the practice of leveraging the evaluation mechanism of the host language in order to implement the normalisation procedure for another language.

Lambda abstractions are turned into functions in the host language, applications become the host language's applications, and so on.

This technique can be used in a typed manner (e.g. normalising the simply-typed lambda-calculus in Haskell or in OCaml) or an untyped one (e.g. the simply-typed lambda-calculus once more or Coq terms compiled to OCaml (!)).
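
To make this concrete, here is a minimal sketch of the untyped variant in Haskell; the names (Sem, eval, quote, nf) and representation choices are mine, but the idea is the standard one: evaluate object-level terms into host-language values, then read the result back into syntax.

    -- Untyped lambda terms with de Bruijn indices.
    data Term = Var Int | Lam Term | App Term Term
      deriving Show

    -- Semantic values: either a genuine host-language function, or a
    -- "neutral" term (a variable applied to arguments) that is stuck.
    data Sem     = SFun (Sem -> Sem) | SNeu Neutral
    data Neutral = NVar Int | NApp Neutral Sem    -- NVar carries a de Bruijn *level*

    -- Evaluation: object-level abstraction becomes a Haskell function,
    -- object-level application becomes Haskell application.
    eval :: [Sem] -> Term -> Sem
    eval env (Var n)   = env !! n
    eval env (Lam b)   = SFun (\v -> eval (v : env) b)
    eval env (App t u) = apply (eval env t) (eval env u)

    apply :: Sem -> Sem -> Sem
    apply (SFun f) v = f v               -- the host language performs the beta-reduction
    apply (SNeu n) v = SNeu (NApp n v)   -- stuck term: just record the application

    -- Read back ("reify") a semantic value into a term; d is the number of
    -- binders we are currently under.
    quote :: Int -> Sem -> Term
    quote d (SFun f) = Lam (quote (d + 1) (f (SNeu (NVar d))))  -- apply f to a fresh variable
    quote d (SNeu n) = quoteNeu d n

    quoteNeu :: Int -> Neutral -> Term
    quoteNeu d (NVar k)   = Var (d - k - 1)                -- turn the level back into an index
    quoteNeu d (NApp n v) = App (quoteNeu d n) (quote d v)

    -- Normalisation of a closed term: evaluate, then read back.
    nf :: Term -> Term
    nf t = quote 0 (eval [] t)

For instance, nf (App (Lam (Var 0)) (Lam (Lam (Var 1)))) gives back Lam (Lam (Var 1)).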