Why is the constant() solution more efficient than the easier one in "FP in Scala" exercise 5.8?
This is more a comment on @ElBaulP's answer than an answer in its own right, but it is too big for a comment.
I think you missed the root of the optimization even though it is explicitly stated in the comment above. The fact that the val tail in constant is lazy is an implementation detail, or rather a trick to make the main source of the optimization possible. The main source of the optimization is the fact that tail is self-referential.
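For reference, the solution in question (the same one that appears as constant_1 in the other answer below) is, as far as I remember, this:

def constant[A](a: A): Stream[A] = {
  // a single Cons whose tail thunk returns that very same Cons
  lazy val tail: Stream[A] = Cons(() => a, () => tail)
  tail
}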
For a moment, let's get rid of all the lazy stuff. Assume that Cons is defined as
case class Cons[+A](h: A, t: () => Stream[A]) extends Stream[A]
and let's define constant as
def constant[A](a: A): Stream[A] = {
  val tailFunc: () => Stream[A] = () => tail
  val tail: Stream[A] = Cons(a, tailFunc)
  tail
}
Yes, this is a syntactically invalid program, because tailFunc uses tail, which is defined only on the next line. But imagine Scala could compile that. I think it is now quite clear that such an implementation of constant would create only one instance of the class Cons per call, and that instance would be used no matter how long you try to iterate over the returned stream.
Defining tail as lazy merely allows code that is logically equivalent to the one above to compile without errors. From an implementation point of view, it is similar to something like this:
class Lazy[A](var value: A)

def constant[A](a: A): Stream[A] = {
  val lazyTail: Lazy[Stream[A]] = new Lazy[Stream[A]](null)
  // this tailFunc works because lazyTail itself is already fixed and can be
  // safely captured, even though lazyTail.value will be changed
  val tailFunc: () => Stream[A] = () => lazyTail.value
  lazyTail.value = Cons(a, tailFunc)
  lazyTail.value
}
The Lazy-based code above does not exactly match the actual lazy implementation in many details, because it is actually eager, but I think it shows that you don't really need lazy to make the optimization work (though at the cost of using a var, which the Scala compiler does anyway inside its real, more complicated lazy implementation).
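As a very rough, hypothetical illustration of that hidden var: LazyCell below is just my name for a sketch, and the compiler's real lowering is more involved (it uses a flag/bitmap and handles thread safety), but the shape of the idea is the same.

class LazyCell[A](compute: () => A) {
  private var cached: Option[A] = None   // the hidden var behind a lazy val
  def value: A = cached match {
    case Some(v) => v
    case None =>
      val v = compute()                  // evaluated at most once
      cached = Some(v)
      v
  }
}

val cell = new LazyCell(() => { println("evaluated"); 42 })
println(cell.value)   // prints "evaluated", then 42
println(cell.value)   // just 42; the computation is not repeated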
In your naive implementation
def constant[A](a: A): Stream[A] = Stream.cons(a, constant(a))
lazy evaluation of the tail allows you not to fail immediately with a stack overflow caused by constant recursively calling itself. But still, when you do constant(whatever).tail, the tail is evaluated, so the constant method is called once again and a new Cons object is created. And this happens every time tail is called on a new head.
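Here is a small, self-contained comparison of the two behaviours. It uses the book-style Cons with thunked fields (different from the simplified Cons earlier in this answer), and the eq checks and helper names (constantNaive, constantShared, step) are mine, added only for illustration:

sealed trait Stream[+A]
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]

def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
  lazy val head = hd
  lazy val tail = tl
  Cons(() => head, () => tail)
}

// the naive version: every forced tail calls constant again
def constantNaive[A](a: A): Stream[A] = cons(a, constantNaive(a))

// the self-referential version: one Cons whose tail thunk returns itself
def constantShared[A](a: A): Stream[A] = {
  lazy val tail: Stream[A] = Cons(() => a, () => tail)
  tail
}

def step[A](s: Stream[A]): Stream[A] = s match {
  case Cons(_, t) => t()
  case Empty      => s
}

val naive  = constantNaive(1)
val shared = constantShared(1)
println(step(naive) eq naive)    // false: a fresh Cons was allocated for the tail
println(step(shared) eq shared)  // true: still the one and only Cons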
To re-state it once again: lazy val tail is just a trick that allows tail to reference itself at creation time. The really important part is the fact that tail references itself, which makes it possible to use only one object for the whole iteration rather than one object per .tail call.
I think it is because with the lazy implementation you are creating the object only once and memoizing it, so when you call constant you are referring to the same object over and over again, something like this:
tail -----
^------'
With your implementation (that is the same one I came across while reading the book), you are creating new objects every time you call the function. So if you call your implementation multiple times, you get:
Stream.cons(a, Stream.cons(a, Stream.cons(a, constant(a))))
Let's see it with an example:
def constant[A](a: A): Stream[A] = { println("CALLED"); cons(a, constant(a)) }
// This is more efficient than `cons(a, constant(a))` since it's just
// one object referencing itself.
def constant_1[A](a: A): Stream[A] = {
  lazy val tail: Stream[A] = Cons(() => a, () => tail)
  println("CALLED_1")
  tail
}
println(s"Cons ${Stream.constant_1(0).take(10).toListFast}")
println(s"Cons ${Stream.constant(0).take(10).toListFast}")
The above code produces the following output:
CALLED_1
Cons List(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
CALLED
CALLED
CALLED
CALLED
CALLED
CALLED
CALLED
CALLED
CALLED
CALLED
Cons List(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
As you can see, the println statement of the lazy implementation is only called once.
See @SergGr's answer for a detailed explanation.