SICP example: Counting change, cannot understand
If we think too hard about the whole recursion at once, we have already failed. Personally, I use two metaphors when thinking about recursion. One is from the small book "The Little Schemer": The Seventh Commandment - Recur on the subparts that are of the same nature. The other is the divide-conquer-combine paradigm for designing algorithms. Essentially, they are the same way of thinking recursively.
- Divide into subparts of the same nature
The problem has two variables: the amount of money (N) and the kinds of coins (K). Therefore any division needs to meet the following: 1. each subpart reduces at least one of the variables, N or K; 2. the subparts are of the same nature as the whole, so each subpart can be solved by the recursive process itself or solved directly; 3. all the subparts together equal the original problem, no more and no less.
The division in the solution splits the original problem into two subparts: the first subpart is all combinations that use the first kind of coin (equivalently, all combinations using at least one coin of the first kind); the second subpart is all combinations that use none of the first kind. N is reduced in the first subpart, K is reduced in the second. Both are of the same nature, so each can be solved recursively, and together they make up the original problem.
- Conquer
In this step, I think about the base cases: what are all the cases where the problem has been reduced to a minimum that can be answered directly? In this solution there are three base cases: 1st, N is reduced to 0; 2nd, N is reduced to a negative number; 3rd, the kinds of coins are reduced to 0 while N is still positive.
- Combine
How are the results combined once the subparts are solved? In this solution, they are simply added with +.
What's more, if we recurse on a list, the division is usually into the car of the list and the cdr of the list. Usually the car can be solved directly if it isn't itself a list, while the cdr part is solved recursively. The base case is reaching the end of the list.
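As a sketch of the divide-conquer-combine steps above, here is a hypothetical Python rendering of SICP's count-change procedure (the Scheme original is not shown in this thread; the denomination tuple is assumed to be the US coins SICP uses):

```python
# Divide: combinations using none of the first coin + combinations using
# at least one of it. Conquer: the three base cases. Combine: addition.
def count_change(amount, coins=(50, 25, 10, 5, 1)):
    if amount == 0:
        return 1            # 1st base case: the amount is reached exactly
    if amount < 0 or not coins:
        return 0            # 2nd and 3rd base cases: overshot, or no kinds left
    return (count_change(amount, coins[1:])          # none of the first coin
            + count_change(amount - coins[0], coins))  # at least one of it

print(count_change(100))    # -> 292, SICP's answer for changing a dollar
```

Note how the two recursive calls mirror the two subparts exactly: the first reduces K (drops a kind of coin), the second reduces N (spends one coin).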
By the way, I would highly recommend The Little Schemer for learning recursion. It is much better than any other book on this particular point, as far as I have read.
The first code box in Will Ness' answer above gave me enough insight to understand the algorithm. Once I understood it, I realized I'd probably have got there very quickly by actually seeing what the algorithm does step-by-step.
Below is the graph of how the algorithm proceeds for a simple case. The amount is 6 pence and we have two kinds of coin: five pence (index 2) and a penny (index 1).
Note that the leaf nodes all evaluate to 0 or 1. This is obvious when we look at the condition in the procedure (one of these values is returned, or else the function calls itself again.) Only two leaf nodes evaluate to 1, so there are 2 ways to make 6 pence from these two kinds of coin, i.e. 6 pennies, or a penny and a five pence.
I now understand the algorithm but I still don't see how I would have worked out the algorithm from the initial problem. Maybe, as I read more of the SICP book, this kind of solution will seem more obvious to me.
(cc 6 2)
|
-----------------------------------
| |
(cc 6 1) (cc 1 2)
| |
------------------ --------------
| | | |
(cc 6 0)=0 (cc 5 1) (cc 1 1) (cc -4 2)=0
| |
------------- -------------
| | | |
(cc 5 0)=0 (cc 4 1) (cc 1 0)=0 (cc 0 1)=1
|
--------------
| |
(cc 4 0)=0 (cc 3 1)
|
--------------
| |
(cc 3 0)=0 (cc 2 1)
|
--------------
| |
(cc 2 0)=0 (cc 1 1)
|
--------------
| |
(cc 1 0)=0 (cc 0 1)=1
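A trace like the one above can be generated mechanically. Here is a small hypothetical traced version in Python; the index-to-denomination mapping (1 -> penny, 2 -> five pence) is an assumption matching this example:

```python
# Assumed mapping for this two-coin example: index 2 is five pence, 1 a penny.
def first_denomination(kinds):
    return {1: 1, 2: 5}[kinds]

def cc(amount, kinds, depth=0):
    print("  " * depth + f"(cc {amount} {kinds})")   # show each node as visited
    if amount == 0:
        return 1
    if amount < 0 or kinds == 0:
        return 0
    # left branch: skip this kind of coin; right branch: use one of it
    return (cc(amount, kinds - 1, depth + 1)
            + cc(amount - first_denomination(kinds), kinds, depth + 1))

print(cc(6, 2))   # -> 2, matching the two leaf nodes that evaluate to 1
```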
In "number (N) of ways ... using N kinds", these two Ns are clearly not the same thing, so let's say K kinds of coins instead.
We have many coins, but each coin is either 1, 5, 10, 25 or 50 cents, in total 5 kinds of coins. We need to buy something for a dollar, 100 cents. Assume an unlimited supply of each kind of coin. How many ways are there for us to reach the total sum of 100?
We either use some coins (one or more) of 50 cents, or we don't. If not, we still have to get to 100 with only 4 kinds of coins. But if we do, then after using one 50-cent coin, the total sum becomes 100 - 50 = 50 cents, and we may still use all 5 kinds of coins to reach the new, smaller total sum:
ways{ 100, 5 } = ways{ 100, 5 - 1 } ; never use any 50-cent coins
+ ; OR
ways{ 100 - 50, 5 } ; may use 50-cent coins, so use one
Or in general,
ways( sum, k ) = ways( sum, k - 1 )
+
ways( sum - first_denomination(k), k )
That's all there is to it. See? Generalization comes naturally with abstraction (replacing concrete values with symbols and making them parameters in a function definition).
Then we need to take care of the base cases. If sum = 0, the result is 1: there's one way to reach a total sum of 0 (and it is: take no coins).
If k = 0, this means we are not allowed to use any kind of coins; in other words there's no way for us to reach a sum, any sum, without using at least some coins (unless the sum is 0, but we've already handled that case above). So the result must be 0.
Same if sum < 0, of course: impossible, i.e. 0 ways to sum up to it using any coins with any positive denomination.
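Putting the recurrence and the three base cases together gives a minimal sketch, here in Python (the ordering inside DENOMS is an assumption; the count doesn't depend on it):

```python
DENOMS = (1, 5, 10, 25, 50)   # assumed ordering of the 5 kinds of coins

def first_denomination(k):
    return DENOMS[k - 1]      # the k-th kind of coin, 1-based as in the text

def ways(total, k):
    if total == 0:
        return 1              # one way to reach 0: take no coins
    if total < 0 or k == 0:
        return 0              # overshot the sum, or no kinds of coins left
    return ways(total, k - 1) + ways(total - first_denomination(k), k)

print(ways(100, 5))   # -> 292
```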
Another way to look at this is from the other end of the time arrow, if you will.
Imagine someone has already done all that for you and has put in front of you all these piles of bills, each pile summing up to the target sum. Without loss of generality, let each pile be sorted so that bigger bills are on top.
Divide all the piles into two groups: one where each pile has a biggest-denomination bill on top, and the other without it. If the total number of piles is ways( denomsList, targetSum), then clearly the number of piles in the second group is ways( rest(denomsList), targetSum).
Then, we can take the top bill off each pile in the first group, and the number of piles in it clearly won't be changed by that. Having removed the top bill from each pile, we see that they all sum up to targetSum - first(denomsList), hence they number ways( denomsList, targetSum - first(denomsList)) in total.
The point to (structural) recursion is thinking in the small -- not trying to picture the whole sequence of operations at once, but rather standing still and trying to understand your current situation. It is a mental tool for approaching your problem, it is about solving it in the easiest most natural way, making as small a step as possible.
Calling (a copy of) yourself is a technicality. The main thing is the leap of faith, that you are allowed to call yourself: assuming you have already written down your definition, just use it where appropriate. And that's how it gets written down. You just describe what you have, how it's made of smaller parts (some of them similar to the full thing), and how the results for those parts can be combined back with the rest to get the full solution.
edit (from comments): The key to solving a problem recursively is to recognize that it can be broken down into a collection of smaller sub-problems to each of which that same general solving procedure that we are seeking applies, and the total solution is then found in some simple way from those sub-problems' solutions (which are found by that same general procedure as if it were available to us already). Each of thus created sub-problems being "smaller" guarantees the base case(s) will eventually be reached.
In other words, try to find the structure in the problem so that it has substructure(s) similar to the whole (like fractals; or e.g. a list's suffix is also a list; etc.); then, recursion is: assuming we already have the solution; taking the problem instance apart (according to the way in which we've structured our problem); transforming the "smaller" substructure(s) by the solution; and then combining it all back in some simple way (according to the way in which we structured our problem). The trick is to recognize the existing, inherent structure in your problem so that the solution comes naturally.
Or, in Prolog (of all the programming languages :) ) :
recursion( In, Out) :-
    is_base_case( In),
    base_relation( In, Out).
recursion( In, Out) :-
    not_base_case( In),
    constituents( In, SelfSimilarParts, LeftOvers),         % (* forth >>> *)
    maplist( recursion, SelfSimilarParts, InterimResults),
    constituents( Out, InterimResults, LeftOvers).          % (* and back <<< *)
Which is to say, in pseudocode,
(In <--> Out) are related by recursion when
either
In is indivisible, and Out its counterpart
or
In = Sub_1 <+> Sub_2 <+> ... <+> Sub_N <++> Shell
------ r e c u r s i o n ------
Out = Res_1 {+} Res_2 {+} ... {+} Res_N {++} Shell
where
(Sub_i <--> Res_i) , for each i = 1, ..., N
The combination operations <+> for In and {+} for Out might be different, because In and Out can be values of different types.