What is Hindley-Milner?
Simple Hindley-Milner type inference implementation in C# :
Hindley-Milner type inference over (Lisp-ish) S-expressions, in under 650 lines of C#
Note the implementation is in the range of only 270 or so lines of C# (for the algorithm W proper and the few data structures to support it, anyway).
Usage excerpt:
// ...
var syntax =
new SExpressionSyntax().
Include
(
// Not-quite-Lisp-indeed; just tolen from our host, C#, as-is
SExpressionSyntax.Token("\\/\\/.*", SExpressionSyntax.Commenting),
SExpressionSyntax.Token("false", (token, match) => false),
SExpressionSyntax.Token("true", (token, match) => true),
SExpressionSyntax.Token("null", (token, match) => null),
// Integers (unsigned)
SExpressionSyntax.Token("[0-9]+", (token, match) => int.Parse(match)),
// String literals
SExpressionSyntax.Token("\\\"(\\\\\\n|\\\\t|\\\\n|\\\\r|\\\\\\\"|[^\\\"])*\\\"", (token, match) => match.Substring(1, match.Length - 2)),
// For identifiers...
SExpressionSyntax.Token("[\\$_A-Za-z][\\$_0-9A-Za-z\\-]*", SExpressionSyntax.NewSymbol),
// ... and such
SExpressionSyntax.Token("[\\!\\&\\|\\<\\=\\>\\+\\-\\*\\/\\%\\:]+", SExpressionSyntax.NewSymbol)
);
var system = TypeSystem.Default;
var env = new Dictionary<string, IType>();
// Classic
var @bool = system.NewType(typeof(bool).Name);
var @int = system.NewType(typeof(int).Name);
var @string = system.NewType(typeof(string).Name);
// Generic list of some `item' type : List<item>
var ItemType = system.NewGeneric();
var ListType = system.NewType("List", new[] { ItemType });
// Populate the top level typing environment (aka, the language's "builtins")
env[@bool.Id] = @bool;
env[@int.Id] = @int;
env[@string.Id] = @string;
env[ListType.Id] = env["nil"] = ListType;
//...
Action<object> analyze =
(ast) =>
{
var nodes = (Node[])visitSExpr(ast);
foreach (var node in nodes)
{
try
{
Console.WriteLine();
Console.WriteLine("{0} : {1}", node.Id, system.Infer(env, node));
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
}
Console.WriteLine();
Console.WriteLine("... Done.");
};
// Parse some S-expr (in string representation)
var source =
syntax.
Parse
(@"
(
let
(
// Type inference ""playground""
// Classic..
( id ( ( x ) => x ) ) // identity
( o ( ( f g ) => ( ( x ) => ( f ( g x ) ) ) ) ) // composition
( factorial ( ( n ) => ( if ( > n 0 ) ( * n ( factorial ( - n 1 ) ) ) 1 ) ) )
// More interesting..
( fmap (
( f l ) =>
( if ( empty l )
( : ( f ( head l ) ) ( fmap f ( tail l ) ) )
nil
)
) )
// your own...
)
( )
)
");
// Visit the parsed S-expr, turn it into a more friendly AST for H-M
// (see Node, et al, above) and infer some types from the latter
analyze(source);
// ...
... which yields:
id : Function<`u, `u>
o : Function<Function<`z, `aa>, Function<`y, `z>, Function<`y, `aa>>
factorial : Function<Int32, Int32>
fmap : Function<Function<`au, `ax>, List<`au>, List<`ax>>
... Done.
See also Brian McKenna's JavaScript implementation on bitbucket, which also helps to get started (worked for me).
'HTH,
You may be able to find the original papers using Google Scholar or CiteSeer -- or your local university library. The first is old enough that you may have to find bound copies of the journal, I couldn't find it online. The link that I found for the other one was broken, but there might be others. You'll certainly be able to find papers that cite these.
Hindley, Roger J, The principal type scheme of an object in combinatory logic, Transactions of the American Mathematical Society, 1969.
Milner, Robin, A Theory of Type Polymorphism, Journal of Computer and System Sciences, 1978.
Hindley-Milner is a type system discovered independently by Roger Hindley (who was looking at logic) and later by Robin Milner (who was looking at programming languages). The advantages of Hindley-Milner are
It supports polymorphic functions; for example, a function that can give you the length of the list independent of the type of the elements, or a function does a binary-tree lookup independent of the type of keys stored in the tree.
Sometimes a function or value can have more than one type, as in the example of the length function: it can be "list of integers to integer", "list of strings to integer", "list of pairs to integer", and so on. In this case, a signal advantage of the Hindley-Milner system is that each well-typed term has a unique "best" type, which is called the principal type. The principal type of the list-length function is "for any
a
, function from list ofa
to integer". Herea
is a so-called "type parameter," which is explicit in lambda calculus but implicit in most programming languages. The use of type parameters explains why Hindley-Milner is a system that implements parametric polymorphism. (If you write a definition of the length function in ML, you can see the type parameter thus:fun 'a length [] = 0 | 'a length (x::xs) = 1 + length xs
If a term has a Hindley-Milner type, then the principal type can be inferred without requiring any type declarations or other annotations by the programmer. (This is a mixed blessing, as anyone can attest who has ever been handled a large chunk of ML code with no annotations.)
Hindley-Milner is the basis for the type system of almost every statically typed functional language. Such languages in common use include
- The ML family (Standard ML and Objective Caml)
- Haskell
- Clean
All these languages have extended Hindley-Milner; Haskell, Clean, and Objective Caml do so in ambitious and unusual ways. (Extensions are required to deal with mutable variables, since basic Hindley-Milner can be subverted using, for example, a mutable cell holding a list of values of unspecified type. Such problems are dealt with by an extension called the value restriction.)
Many other minor languages and tools based on typed functional languages use Hindley-Milner.
Hindley-Milner is a restriction of System F, which allows more types but which requires annotations by the programmer.