Using return vs. not using return in the list monad

It's easy to see when re-writing the code with bind and return:

[1,2] >>= (\x->        [x,x+1]) === concatMap (\x-> [  x,x+1  ]) [1,2]

[1,2] >>= (\x-> return [x,x+1]) === concatMap (\x-> [ [x,x+1] ]) [1,2]

Your first code is tantamount to calling join on the results of the second, removing one monadic "layer" introduced by return :: a -> m a, conflating the "list" of the monad in use, with the "list" of your value. If you were returning a pair, say, it wouldn't have made much sense to omit the return:

                                     -- WRONG: type mismatch
[1,2] >>= (\x->        (x,x+1)) === concatMap (\x-> (  x,x+1  )) [1,2]
                                     -- OK:
[1,2] >>= (\x-> return (x,x+1)) === concatMap (\x-> [ (x,x+1) ]) [1,2]

Or, we can use a join/fmap re-write:

ma >>= famb === join (fmap famb ma)   -- famb :: a -> m b, m ~ []

join (fmap (\x->        [x,x+1]) [1,2]) = concat [ [  x,x+1  ] | x<-[1,2]]
join (fmap (\x->        (x,x+1)) [1,2]) = concat [ (  x,x+1  ) | x<-[1,2]]  -- WRONG
join (fmap (\x-> return [x,x+1]) [1,2]) = concat [ [ [x,x+1] ] | x<-[1,2]]

                                                         =  [y | x<-[1,2], y<-[ x,x+1 ]]
                                            {- WRONG -}  =  [y | x<-[1,2], y<-( x,x+1 )]
                                                         =  [y | x<-[1,2], y<-[[x,x+1]]]

To see why you get the particular answers that arise, the desugaring explanations are very helpful. Let me supplement them with a little general advice about developing perceptions of Haskell code.

Haskell's type system makes no distinction between two separable "moral" purposes:

[x] the type of values which are lists with elements drawn from x
[x] the type of computations of elements of x which allow prioritized choice

The fact that these two notions have the same representation does not mean that they play the same roles. In f1, the [x, x+1] is playing the role of computation, so the possibilities it generates are merged into the choice generated by the whole computation: that's what the >>= of the list monad does. In f2, however, the [x, x+1] is playing the role of value, so that the whole computation generates a prioritized choice between two values (which happen to be list values).

Haskell does not use types to make this distinction [and you may have guessed by now that I think it should, but that's another story]. Instead, it uses syntax. So you need to train your head to perceive the value and computation roles when you read code. The do notation is a special syntax for constructing computations. What goes inside the do is built from the following template kit:

jigsaw pieces for computations

The three blue pieces make do-computations. I've marked the computation holes in blue and the value holes in red. This is not meant to be a complete syntax, just a guide to how to perceive pieces of code in your mind.

Indeed, you may write any old expression in the blue places provided it has a suitably monadic type, and the computation so generated will be merged into the overall computation using >>= as needed. In your f1 example, your list is in a blue place and treated as prioritized choice.

Similarly, you may write expressions in red places which may very well have monadic types (like lists in this case), but they will be treated as values all the same. That's what happens in f2: as it were, the result's outer brackets are blue, but the inner brackets are red.

Train your brain to make the value/computation separation when you read code, so that you know instinctively which parts of the text are doing which job. Once you've reprogrammed your head, the distinction between f1 and f2 will seem completely normal!

The other answers here are correct, but I wonder if they're not quite what you need... I'll try to keep this as simple as possible, just two points:

Point 1. return is not a special thing in the Haskell language. It's not a keyword, and it's not syntactic sugar for something else. It's just a function that's part of the Monad typeclass. Its signature is simply:

return :: a -> m a

where m is whichever monad we're talking about at the time. It takes a "pure" value and jams it into your monad. (Incidentally, there's another function called pure that's basically a synonym for return... I like it better because the name is more obvious!) Anyway, if m is the list monad, then return has this type:

return :: a -> [a]

If it helps, you could think of the type synonym type List a = [a], which might make it slightly more obvious that List is the thing we're substituting for m. Anyway, if you were going to implement return yourself, the only reasonable way you'd implement it is by taking some value (of whatever type a) and sticking it in a list by itself:

return a = [a]

So I can say return 1 in the list monad, and I'll get [1]. I can likewise say return [1, 2, 3] and I'll get [[1, 2, 3]].

Point 2. IO is a monad, but not all monads are IO. Many Haskell tutorials seem to conflate the two topics largely for historical reasons (incidentally, the same confusing historical reasons that led to return being so poorly named). It sounds like you might have some (understandable) confusion around that.

In your code, you're in the list monad because you wrote do x <- [1, 2]. If instead you had written do x <- getLine for example, you'd be in the IO monad (because getLine returns IO String). Anyway, you're in the list monad, so you get list's definition of return described above. You also get list's definition of >>=, which is just (a flipped version of) concatMap, defined as:

concatMap :: (a -> [b]) -> [a] -> [b]
concatMap f xs = concat (map f xs)

The other posted answers pretty much have it covered from here :) I know I didn't answer your question directly, but I'm hoping these two points instead addressed the fundamental things you might have found confusing.

Using return vs. not using return in the list monad

Tags:

List

Haskell

Monads

Do Notation

Related

Recent Posts