How is debugging achieved in a lazy functional programming language?
From experience with Clojure (which is lazy, functional, and encourages but does not enforce purity):
You can set breakpoints just as with any other language. However, becuase of lazy evaluation, these might not get called immediately, but will be hit as soon as evaluation the lazy structure is forced.
In lazy functional languages that allow side effects (Clojure included) you can insert printlns and other debug logging relatively easily. I personally find these very useful. You have to be careful about when these get called because of laziness, but if you don't see the output at all it can be a hint that your code isn't being evaluated because of laziness.....
Having said all the above, I have never so far needed to resort to the debugger. Often a few simple tests (perhaps on the REPL) are enough to verify that functional code is working correctly, and if these fail then it's usually quite obvious what is going wrong.
I don't think this topic can be dealt within a short space. Please read the papers available at the following links:
- A Theory of Tracing Pure Functional Programs.
- The Haskell Tracer publications.
- Haskell Debugging Technologies.
I've never delved into anything terribly complicated in Haskell, but the fact that side effects are virtually gone has eliminated most of the need for debugging. Pure functions are extremely simple to test and verify without a debugger.
On the other hand, I did experience a couple times I needed to debug something within a monad, in which case I already was able to print/log/whatever.
At least for smaller programs or systems, debugging kind of goes out the window. Strong typing and static type-checking really further eliminate the traditional bugs you find in procedural programming. Most bugs, if any, are logical bugs (called the wrong functions, mathematical error, etc) -- very easy to test interactively.
Nothing prevents you from using breakpoints in a lazily evaluated functional program. The difference to eager evaluation is when the program will stop at the breakpoint and what the trace will look like. The program will stop when the expression a breakpoint is set on is actually being reduced (obviously).
Instead of the stack trace you're used to you get the reductions that led up to the reduction of the expression with the breakpoint on it.
Small silly example. You have this Haskell program.
add_two x = 2 + x
times_two x = 2 * x
foo = times_two (add_two 42)
And you put a breakpoint on the first line (add_two
), then evaluate foo
. When the program stops on the breakpoint, in an eager language you'd expect to have a trace like
add_two
foo
and times_two
hasn't even begun to be evaluated, but in the GHCi debugger you get
-1 : foo (debug.hs:5:17-26)
-2 : times_two (debug.hs:3:14-18)
-3 : times_two (debug.hs:3:0-18)
-4 : foo (debug.hs:5:6-27)
<end of history>
which is the list of reductions that led up to the reduction of the expression you put the breakpoint on. Note that it looks like times_two
"called" foo
even though it does not do so explicitly. You can see from this that the evaluation of 2 * x
in times_two
(-2) did force the evaluation of (add_two 42)
(-1) from the foo
line. From there you can perform a step as in an imperative debugger (perform the next reduction).
Another difference to debugging in an eager language is that variables may be not yet evaluated thunks. For example, at step -2 in the above trace and inspect x
, you'll find it's still an unevaluated thunk (indicated by brackets in GHCi).
For far more detailed information and examples (how to step through the trace, inspect values, ...), see the GHCi Debugger section in the GHC manual. There's also the Leksah IDE which I haven't used yet as I'm a VIM and terminal user, but it has a graphical frontend to the GHCi debugger according to the manual.
You also asked for print statements. Only with pure functions, this is not so easily possible as a print statement would have to be within the IO monad. So, you have a pure function
foo :: Int -> Int
and wish to add a trace statement, the print would return an action in the IO monad and so you'd have to adjust the signature of the function you wish to put that trace statement in, and the signatures of the functions that call it, ...
This is not a good idea. So, you need some way to break purity to achieve trace statements. In Haskell, this can be done with unsafePerformIO
. There's the Debug.Trace
module that already has a function
trace :: String -> a -> a
which outputs the string and returns the second parameter. It would be impossible to write as a pure function (well, if you intend to really output the string, that is). It uses unsafePerformIO
under the hood. You can put that into a pure function to output a trace print.
Would you have to program a monad for every section of code you want to test?
I'd suggest rather the opposite, make as many functions pure as possible (I'm assuming here you mean the IO monad for printing, monads are not necessarily impure). Lazy evaluation allows you to separate IO code from processing code very cleanly.
Whether imperative debugging techniques are a good idea or not depends on the situation (as usual). I find testing with QuickCheck/SmallCheck much more useful than unit testing in imperative languages, so I'd go that route first to avoid as much debugging as possible. QuickCheck properties actually make nice concise function specifications (a lot of test code in imperative languages looks like just another blob of code to me).
One trick to avoid a lot of debugging is to decompose the function into many smaller subfunctions and test as many of them as possible. This may be a bit unusal when coming from imperative programming, but it's a good habit no matter what language you're using.
Then again, debugging != testing and if something goes wrong somewhere, breakpoints and traces may help you out.