What does the `forall` keyword in Haskell/GHC do?

Let's start with a code example:

foob :: forall a b. (b -> b) -> b -> (a -> b) -> Maybe a -> b
foob postProcess onNothin onJust mval =
    postProcess val
    where
        val :: b
        val = maybe onNothin onJust mval

This code doesn't compile (syntax error) in plain Haskell 98. It requires an extension to support the forall keyword.

Basically, there are 3 different common uses for the forall keyword (or at least so it seems), and each has its own Haskell extension: ScopedTypeVariables, RankNTypes/Rank2Types, ExistentialQuantification.

The code above doesn't get a syntax error with either of those enabled, but only type-checks with ScopedTypeVariables enabled.

Scoped Type Variables:

Scoped type variables helps one specify types for code inside where clauses. It makes the b in val :: b the same one as the b in foob :: forall a b. (b -> b) -> b -> (a -> b) -> Maybe a -> b.

A confusing point: you may hear that when you omit the forall from a type it is actually still implicitly there. (from Norman's answer: "normally these languages omit the forall from polymorphic types"). This claim is correct, but it refers to the other uses of forall, and not to the ScopedTypeVariables use.

Rank-N-Types:

Let's start with that mayb :: b -> (a -> b) -> Maybe a -> b is equivalent to mayb :: forall a b. b -> (a -> b) -> Maybe a -> b, except for when ScopedTypeVariables is enabled.

This means that it works for every a and b.

Let's say you want to do something like this.

ghci> let putInList x = [x]
ghci> liftTup putInList (5, "Blah")
([5], ["Blah"])

What must be the type of this liftTup? It's liftTup :: (forall x. x -> f x) -> (a, b) -> (f a, f b). To see why, let's try to code it:

ghci> let liftTup liftFunc (a, b) = (liftFunc a, liftFunc b)
ghci> liftTup (\x -> [x]) (5, "Hello")
    No instance for (Num [Char])
    ...
ghci> -- huh?
ghci> :t liftTup
liftTup :: (t -> t1) -> (t, t) -> (t1, t1)

"Hmm.. why does GHC infer that the tuple must contain two of the same type? Let's tell it they don't have to be"

-- test.hs
liftTup :: (x -> f x) -> (a, b) -> (f a, f b)
liftTup liftFunc (t, v) = (liftFunc t, liftFunc v)

ghci> :l test.hs
    Couldnt match expected type 'x' against inferred type 'b'
    ...

Hmm. so here GHC doesn't let us apply liftFunc on v because v :: b and liftFunc wants an x. We really want our function to get a function that accepts any possible x!

{-# LANGUAGE RankNTypes #-}
liftTup :: (forall x. x -> f x) -> (a, b) -> (f a, f b)
liftTup liftFunc (t, v) = (liftFunc t, liftFunc v)

So it's not liftTup that works for all x, it's the function that it gets that does.

Existential Quantification:

Let's use an example:

-- test.hs
{-# LANGUAGE ExistentialQuantification #-}
data EQList = forall a. EQList [a]
eqListLen :: EQList -> Int
eqListLen (EQList x) = length x

ghci> :l test.hs
ghci> eqListLen $ EQList ["Hello", "World"]
2

How is that different from Rank-N-Types?

ghci> :set -XRankNTypes
ghci> length (["Hello", "World"] :: forall a. [a])
    Couldnt match expected type 'a' against inferred type '[Char]'
    ...

With Rank-N-Types, forall a meant that your expression must fit all possible as. For example:

ghci> length ([] :: forall a. [a])
0

An empty list does work as a list of any type.

So with Existential-Quantification, foralls in data definitions mean that, the value contained can be of any suitable type, not that it must be of all suitable types.


Here is a quick and dirty explanation in plain terms that you're likely to be already familiar with.

The forall keyword is really only used in one way in Haskell. It always means the same thing when you see it.

Universal quantification

A universally quantified type is a type of the form forall a. f a. A value of that type can be thought of as a function that takes a type a as its argument and returns a value of type f a. Except that in Haskell these type arguments are passed implicitly by the type system. This "function" has to give you the same value no matter which type it receives, so the value is polymorphic.

For example, consider the type forall a. [a]. A value of that type takes another type a and gives you back a list of elements of that same type a. There is only one possible implementation, of course. It would have to give you the empty list because a could be absolutely any type. The empty list is the only list value that is polymorphic in its element type (since it has no elements).

Or the type forall a. a -> a. The caller of such a function provides both a type a and a value of type a. The implementation then has to return a value of that same type a. There's only one possible implementation again. It would have to return the same value that it was given.

Existential quantification

An existentially quantified type would have the form exists a. f a, if Haskell supported that notation. A value of that type can be thought of as a pair (or a "product") consisting of a type a and a value of type f a.

For example, if you have a value of type exists a. [a], you have a list of elements of some type. It could be any type, but even if you don't know what it is there's a lot you could do to such a list. You could reverse it, or you could count the number of elements, or perform any other list operation that doesn't depend on the type of the elements.

OK, so wait a minute. Why does Haskell use forall to denote an "existential" type like the following?

data ShowBox = forall s. Show s => SB s

It can be confusing, but it's really describing the type of the data constructor SB:

SB :: forall s. Show s => s -> ShowBox

Once constructed, you can think of a value of type ShowBox as consisting of two things. It's a type s together with a value of type s. In other words, it's a value of an existentially quantified type. ShowBox could really be written as exists s. Show s => s, if Haskell supported that notation.

runST and friends

Given that, how are these different?

foo :: (forall a. a -> a) -> (Char,Bool)
bar :: forall a. ((a -> a) -> (Char, Bool))

Let's first take bar. It takes a type a and a function of type a -> a, and produces a value of type (Char, Bool). We could choose Int as the a and give it a function of type Int -> Int for example. But foo is different. It requires that the implementation of foo be able to pass any type it wants to the function we give it. So the only function we could reasonably give it is id.

We should now be able to tackle the meaning of the type of runST:

runST :: forall a. (forall s. ST s a) -> a

So runST has to be able to produce a value of type a, no matter what type we give as a. To do so, it uses an argument of type forall s. ST s a which certainly must somehow produce the a. What's more, it must be able to produce a value of type a no matter what type the implementation of runST decides to give as s.

So in effect the runST function is saying to you: "you can choose the type a as long as I can choose the type s."

OK, so what? The benefit is that this puts a constraint on the caller of runST in that the type a cannot involve the type s at all, since you don't know what type s is going to be. You can't pass it a value of type ST s [s], for example. What that means in practice is that the implementation of runST can know that any value involving the type s is local to its own implementation, so it's free to do things it otherwise wouldn't be allowed to, like mutate them in place. The type guarantees that the mutation is local to the implementation of runST.

The type of runST is an example of a rank-2 polymorphic type because the type of its argument contains a forall quantifier. The type of foo above is also of rank 2. An ordinary polymorphic type, like that of bar, is rank-1, but it becomes rank-2 if the types of arguments are required to be polymorphic, with their own forall quantifier. And if a function takes rank-2 arguments then its type is rank-3, and so on. In general, a type that takes polymorphic arguments of rank n has rank n + 1.


Can anybody completely explain the forall keyword in clear, plain English?

No. (Well, maybe Don Stewart can.)

Here are the barriers to a simple, clear explanation or forall:

  • It's a quantifier. You have a to have at least a little logic (predicate calculus) to have seen a universal or existential quantifier. If you've never seen predicate calculus or are not comfortable with quantifiers (and I have seen students during PhD qualifying exams who are not comfortable), then for you, there's no easy explanation of forall.

  • It's a type quantifier. If you haven't seen System F and gotten some practice writing polymorphic types, you're going to find forall confusing. Experience with Haskell or ML is not enough, because normally these languages omit the forall from polymorphic types. (In my mind, this is a language-design mistake.)

  • In Haskell in particular, forall is used in ways that I find confusing. (I'm not a type theorist, but my work brings me in contact with a lot of type theory, and I'm quite comfortable with it.) For me, the main source of confusion is that forall is used to encode a type that I myself would prefer to write with exists. It's justified by a tricky bit of type isomorphism involving quantifiers and arrows, and every time I want to understand it, I have to look things up and work out the isomorphism myself.

    If you are not comfortable with the idea of type isomorphism, or if you don't have any practice thinking about type isomorphisms, this use of forall is going to stymie you.

  • While the general concept of forall is always the same (binding to introduce a type variable), the details of different uses can vary significantly. Informal English is not a very good tool for explaining the variations. To really understand what's going on, you need some mathematics. In this case the relevant mathematics can be found in Benjamin Pierce's introductory text Types and Programming Languages, which is a very good book.

As for your particular examples,

  • runST should make your head hurt. Higher-rank types (forall to the left of an arrow) are rarely found in the wild. I encourage you to read the paper that introduced runST: "Lazy Functional State Threads". This is a really good paper, and it will give you a much better intuition for the type of runST in particular and for higher-rank types in general. The explanation take several pages, it's very well done, and I'm not going to try to condense it here.

  • Consider

    foo :: (forall a. a -> a) -> (Char,Bool)
    bar :: forall a. ((a -> a) -> (Char, Bool))
    

    If I call bar, I can simply pick any type a that I like, and I can pass it a function from type a to type a. For example, I can pass the function (+1) or the function reverse. You can think of the forall as saying "I get to pick the type now". (The technical word for picking the type is instantiating.)

    The restrictions on calling foo are much more stringent: the argument to foo must be a polymorphic function. With that type, the only functions I can pass to foo are id or a function that always diverges or errors, like undefined. The reason is that with foo, the forall is to the left of the arrow, so as the caller of foo I don't get to pick what a is—rather it's the implementation of foo that gets to pick what a is. Because forall is to the left of the arrow, rather than above the arrow as in bar, the instantiation takes place in the body of the function rather than at the call site.

Summary: A complete explanation of the forall keyword requires math and can be understood only by someone who has studied the math. Even partial explanations are hard to understand without math. But maybe my partial, non-math explanations help a little. Go read Launchbury and Peyton Jones on runST!


Addendum: Jargon "above", "below", "to the left of". These have nothing to do with the textual ways types are written and everything to do with abstract-syntax trees. In the abstract syntax, a forall takes the name of a type variable, and then there is a full type "below" the forall. An arrow takes two types (argument and result type) and forms a new type (the function type). The argument type is "to the left of" the arrow; it is the arrow's left child in the abstract-syntax tree.

Examples:

  • In forall a . [a] -> [a], the forall is above the arrow; what's to the left of the arrow is [a].

  • In

    forall n f e x . (forall e x . n e x -> f -> Fact x f) 
                  -> Block n e x -> f -> Fact x f
    

    the type in parentheses would be called "a forall to the left of an arrow". (I'm using types like this in an optimizer I'm working on.)


My original answer:

Can anybody completely explain the forall keyword in clear, plain English

As Norman indicates, it is very hard to give a clear, plain English explanation of a technical term from type theory. We're all trying though.

There is only really one thing to remember about 'forall': it binds types to some scope. Once you understand that, everything is fairly easy. It is the equivalent of 'lambda' (or a form of 'let') on the type level -- Norman Ramsey uses the notion of "left"/"above" to convey this same concept of scope in his excellent answer.

Most uses of 'forall' are very simple, and you can find them introduced in the GHC Users Manual, S7.8., particularly the excellent S7.8.5 on nested forms of 'forall'.

In Haskell, we usually leave off the binder for types, when the type is universally quanitified, like so:

length :: forall a. [a] -> Int

is equivalent to:

length :: [a] -> Int

That's it.

Since you can bind type variables now to some scope, you can have scopes other than the top level ("universally quantified"), like your first example, where the type variable is only visible within the data structure. This allows for hidden types ("existential types"). Or we can have arbitrary nesting of bindings ("rank N types").

To deeply understand type systems, you will need to learn some jargon. That's the nature of computer science. However, simple uses, like above, should be able to be grasped intuitively, via analogy with 'let' on the value level. A great introduction is Launchbury and Peyton Jones.