Understanding a recursively defined list (fibs in terms of zipWith)
Let's take a look at the definition of zipWith:
zipWith f (x:xs) (y:ys) = f x y : zipWith f xs ys
Our fibs is:
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
For take 3 fibs, substituting the definition of zipWith and using xs = tail (x:xs), we get
0 : 1 : (0+1) : zipWith (+) (tail fibs) (tail (tail fibs))
For take 4 fibs, substituting once more, we get
0 : 1 : 1 : (1+1) : zipWith (+) (tail (tail fibs)) (tail (tail (tail fibs)))
and so on.
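The substitution steps above can be checked directly. This is a minimal sketch, assuming a local copy of zipWith called zipWith' (a name made up here to avoid clashing with the Prelude) with a base case added for empty lists; unfoldedOnce is likewise just an illustrative name for the first unfolding.

```haskell
-- Local copy of zipWith; the name zipWith' is only for illustration.
zipWith' :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith' f (x:xs) (y:ys) = f x y : zipWith' f xs ys
zipWith' _ _      _      = []   -- stop when either list runs out

-- The list refers to itself; lazy evaluation makes this well defined.
fibs :: [Integer]
fibs = 0 : 1 : zipWith' (+) fibs (tail fibs)

-- The first unfolding from the text, written out as a value.
unfoldedOnce :: [Integer]
unfoldedOnce = 0 : 1 : (0 + 1) : zipWith' (+) (tail fibs) (tail (tail fibs))
```

Loaded into GHCi, take 3 fibs gives [0,1,1], take 4 fibs gives [0,1,1,2], and take 10 unfoldedOnce agrees with take 10 fibs.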
I wrote an article on this a while back. You can find it here.
As I mentioned there, read chapter 14.2 in Paul Hudak's book "The Haskell School of Expression", where he talks about recursive streams, using the Fibonacci example.
Note: the tail of a sequence is the sequence without its first item.
|---+---+---+---+----+----+----+----+------------------------------------|
| 1 | 1 | 2 | 3 |  5 |  8 | 13 | 21 | Fibonacci sequence (fibs)          |
|---+---+---+---+----+----+----+----+------------------------------------|
| 1 | 2 | 3 | 5 |  8 | 13 | 21 | 34 | tail of Fib sequence (tail fibs)   |
|---+---+---+---+----+----+----+----+------------------------------------|
Add the two rows element by element, i.e. add fibs (tail fibs), to get the tail of the tail of the Fibonacci sequence:
|---+---+---+---+----+----+----+----+------------------------------------|
| 2 | 3 | 5 | 8 | 13 | 21 | 34 | 55 | tail of tail of Fibonacci sequence |
|---+---+---+---+----+----+----+----+------------------------------------|
add fibs (tail fibs) can be written as zipWith (+) fibs (tail fibs)
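To make the row addition concrete, here is a small sketch that uses the finite rows from the table. The names fibRow, tailRow and sumRow are made up for this example.

```haskell
-- The two rows of the table, as finite lists.
fibRow, tailRow :: [Integer]
fibRow  = [1, 1, 2, 3, 5, 8, 13, 21]
tailRow = [1, 2, 3, 5, 8, 13, 21, 34]

-- Pairwise sum of the two rows, which is what zipWith (+) does.
sumRow :: [Integer]
sumRow = zipWith (+) fibRow tailRow
-- sumRow == [2, 3, 5, 8, 13, 21, 34, 55]
```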
Now, all we need to do is prime the generation by starting with the first 2 Fibonacci numbers to get the complete Fibonacci sequence.
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)
Note: This recursive definition will not work in a typical language that uses eager evaluation. It works in Haskell because Haskell evaluates lazily. So, if you ask for the first 4 Fibonacci numbers with take 4 fibs, Haskell computes only as much of the sequence as required.
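As a rough illustration of that laziness, the sketch below defines the infinite list starting from 1, 1 and only ever consumes a finite prefix of it. The names firstFour and below100 are chosen just for this example.

```haskell
-- Infinite Fibonacci list, primed with the first two numbers.
fibs :: [Integer]
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)

-- Only the first four elements are ever computed for this value.
firstFour :: [Integer]
firstFour = take 4 fibs            -- [1,1,2,3]

-- Evaluation stops at the first element that fails the test.
below100 :: [Integer]
below100 = takeWhile (< 100) fibs  -- [1,1,2,3,5,8,13,21,34,55,89]
```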
I'll give a bit of an explanation of how it works internally. First, you must realise that Haskell uses a thing called a thunk for its values. A thunk is basically a value that has not been computed yet -- think of it as a function of 0 arguments. Whenever Haskell wants to, it can evaluate (or partly evaluate) the thunk, turning it into a real value. If it only partly evaluates a thunk, then the resulting value will have more thunks in it.
For example, consider the expression:
(2 + 3, 4)
In an ordinary language, this value would be stored in memory as (5, 4), but in Haskell, it is stored as (<thunk 2 + 3>, 4). If you ask for the second element of that tuple, it will tell you "4", without ever adding 2 and 3 together. Only if you ask for the first element of that tuple will it evaluate the thunk, and realise that it is 5.
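One way to see that the first component really is left alone is to put something there that would crash if evaluated. This is only a sketch with made-up names (pair, second, pair'), and an optimising compiler is free to compute a constant like 2 + 3 ahead of time, so treat the thunk picture as a model rather than a guarantee about memory layout.

```haskell
-- The first component would crash the program if it were ever evaluated.
pair :: (Int, Int)
pair = (error "first component was forced", 4)

-- snd never looks at the first component, so this is simply 4.
second :: Int
second = snd pair

-- The tuple from the text: the sum is only demanded when fst is used.
pair' :: (Int, Int)
pair' = (2 + 3, 4)
```

Evaluating second gives 4 without raising the error, and fst pair' gives 5.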
With fibs, it's a bit more complicated, because it's recursive, but we can use the same idea. Because fibs takes no arguments, Haskell will permanently store any list elements that have been discovered -- that is important.
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
It helps to visualise Haskell's current knowledge of three expressions: fibs, tail fibs and zipWith (+) fibs (tail fibs). We shall assume Haskell starts out knowing the following:
fibs = 0 : 1 : <thunk1>
tail fibs = 1 : <thunk1>
zipWith (+) fibs (tail fibs) = <thunk1>
Note that the 2nd row is just the first one shifted left, and the 3rd row is the first two rows summed.
Ask for take 2 fibs and you'll get [0, 1]. Haskell doesn't need to further evaluate the above to find this out.
Ask for take 3 fibs and Haskell will get the 0 and 1, and then realise that it needs to partly evaluate the thunk. In order to fully evaluate zipWith (+) fibs (tail fibs), it needs to sum the first two rows -- it can't fully do that, but it can begin to sum the first two rows:
fibs = 0 : 1 : 1 : <thunk2>
tail fibs = 1 : 1 : <thunk2>
zipWith (+) fibs (tail fibs) = 1 : <thunk2>
Note that I filled in the "1" in the 3rd row, and it automatically appeared in the first and second rows as well, since all three rows are sharing the same thunk (think of it like a pointer that got written to). And because it didn't finish evaluating, it created a new thunk containing the rest of the list, should that ever be needed.
It isn't needed, though, because take 3 fibs is done: [0, 1, 1]. But now, say you ask for take 50 fibs; Haskell already remembers the 0, 1 and 1. But it needs to keep going. So it continues summing the first two rows:
fibs = 0 : 1 : 1 : 2 : <thunk3>
tail fibs = 1 : 1 : 2 : <thunk3>
zipWith (+) fibs (tail fibs) = 1 : 2 : <thunk3>
...
fibs = 0 : 1 : 1 : 2 : 3 : <thunk4>
tail fibs = 1 : 1 : 2 : 3 : <thunk4>
zipWith (+) fibs (tail fibs) = 1 : 2 : 3 : <thunk4>
And so on, until it has filled in 48 columns of the 3rd row, and thus has worked out the first 50 numbers. Haskell evaluates just as much as it needs, and leaves the infinite "rest" of the sequence as a thunk object in case it ever needs any more.
Note that if you subsequently ask for take 25 fibs, Haskell will not evaluate it again -- it will just take the first 25 numbers from the list it has already calculated.
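The sharing described above can be observed, at least informally, by making each addition announce itself. The sketch below assumes GHC and uses Debug.Trace.trace, which writes to stderr when its result is evaluated; addTraced, tracedFibs and demo are names invented for this example.

```haskell
import Debug.Trace (trace)

-- Addition that reports on stderr each time it is actually performed.
addTraced :: Integer -> Integer -> Integer
addTraced a b = trace ("adding " ++ show a ++ " + " ++ show b) (a + b)

-- Same shape as fibs in the text, but with the traced addition.
tracedFibs :: [Integer]
tracedFibs = 0 : 1 : zipWith addTraced tracedFibs (tail tracedFibs)

-- Ask for 3 elements, then 5, then 3 again.
demo :: IO ()
demo = do
  print (take 3 tracedFibs)  -- forces one addition: 0 + 1
  print (take 5 tracedFibs)  -- forces two more: 1 + 1 and 1 + 2
  print (take 3 tracedFibs)  -- forces nothing new
```

Running demo (from GHCi or from a main that calls it) should show three "adding" messages in total; the elements that were already computed are reused rather than recalculated. How the stderr messages interleave with the printed lists may vary.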
Edit: Added a unique number to each thunk to avoid confusion.
A closely related worked example can be found here; I haven't gone over it completely, but it may be of some help.
I am not exactly sure of the implementation details, but I suspect they are along the lines of my argument below.
Please take this with a pinch of salt: it may be inaccurate as a description of the implementation, and is meant only as an aid to understanding.
Haskell will not evaluate anything unless it is forced to. This is known as lazy evaluation, which is a beautiful concept in itself.
So let's assume we've been asked only to do a take 3 fibs.
Haskell stores the fibs list as 0:1:another_list; since we've been asked to take 3, we may as well assume it is stored as fibs = 0:1:x:another_list and (tail fibs) = 1:x:another_list, and 0 : 1 : zipWith (+) fibs (tail fibs) will then be 0 : 1 : (0+1) : (1+x) : (x+head another_list) ...
By pattern matching Haskell knows that x = 0 + 1, leading us to 0:1:1.
I'd be really interested if someone knows the proper implementation details, though I can understand that lazy evaluation techniques can be fairly complicated.
Hope this helps in understanding.
Mandatory disclaimer again: please take this with a pinch of salt. It may be inaccurate as a description of the implementation and is meant only as an aid to understanding.