Project Euler #15
While dynamic programming is certainly a correct way to solve this kind of problem, this particular instance exhibits a regularity that can be exploited.
You can see the problem as arranging a number of "right"s and "down"s, being careful not to count identical arrangements multiple times.
For example, the solutions of the size-2 problem (shown in the images in the question) can be seen this way:
→→↓↓
→↓→↓
→↓↓→
↓→→↓
↓→↓→
↓↓→→
So, for any grid of side n, you can find the solution by means of combinatorics:
from math import factorial
n = 20
print(factorial(2*n) // (factorial(n) * factorial(n)))
(2n)! is the number of arrangements of the 20 → plus 20 ↓, while the two n! in the denominator account for the identical ways in which the →s and the ↓s can be arranged among themselves.
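As a quick sanity check, the same formula applied to the size-2 example from the question reproduces the six arrangements listed above (the helper name `lattice_paths` is just for illustration):

```python
from math import factorial

def lattice_paths(n):
    # Number of monotonic paths across an n-by-n grid:
    # (2n)! / (n! * n!)
    return factorial(2 * n) // (factorial(n) * factorial(n))

print(lattice_paths(2))   # the size-2 example: 6
print(lattice_paths(20))  # Project Euler #15: 137846528820
```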
As others have noted, there's a discrete math solution to this particular problem. But suppose you did want to solve it recursively. Your performance problem is that you're solving the same problems over and over again.
Let me show you a little higher-order programming trick that will pay big dividends. Let's take an easier recursive problem:
long Fib(long n)
{
    if (n < 2) return 1;
    return Fib(n-1) + Fib(n-2);
}
You ask this to compute Fib(5). That computes Fib(4) and Fib(3). Computing Fib(4) computes Fib(3) and Fib(2). Computing Fib(3) computes Fib(2) and Fib(1). Computing Fib(2) computes Fib(1) and Fib(0). Now we go back and compute Fib(2) again. Then we go back and compute Fib(3) again. Huge amounts of recomputation.
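To make the recomputation concrete, here is a small Python sketch (the counter is a hypothetical addition, not part of the original code) that counts how many times the naive recursion is entered:

```python
calls = 0

def fib(n):
    # Naive recursion, mirroring Fib above; counts every invocation.
    global calls
    calls += 1
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)

print(fib(5), calls)  # one result (8) costs 15 separate calls
```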
Suppose we cached the results of the computation. Then the second time the computation was requested, we'd just return the cached result. Now comes the higher-order trick. I want to represent this concept of "cache the results of a function" as a function that takes in a function, and returns me a function that has this nice property. I'll write it as an extension method on functions:
static Func<A, R> Memoize<A, R>(this Func<A, R> f)
{
    // Return a function which is f with caching.
    var dictionary = new Dictionary<A, R>();
    return (A a) =>
    {
        R r;
        if (!dictionary.TryGetValue(a, out r))
        {   // cache miss
            r = f(a);
            dictionary.Add(a, r);
        }
        return r;
    };
}
Now we do some minor rewriting on Fib:
Func<long, long> Fib = null;
Fib = (long n) =>
{
    if (n < 2) return 1;
    return Fib(n-1) + Fib(n-2);
};
OK, we have our non-memoized function. Now, magic:
Fib = Fib.Memoize();
And boom, when we call Fib(5), now we do a dictionary lookup. 5 is not in the dictionary, so we call the original function. That calls Fib(4), which does another dictionary lookup and misses. That calls Fib(3), and so on. When we get back to calling Fib(2) and Fib(3) the second time, the results are already in the dictionary, so we do not re-compute them.
Writing a two-argument version:
static Func<A1, A2, R> Memoize<A1, A2, R>(this Func<A1, A2, R> f) { ... }
is not too hard and is left as an exercise. If you do that, then you can just take your original beautiful recursive logic, do a simple rewriting into a lambda, and say:
progress = progress.Memoize();
and suddenly your performance will increase, with no loss of readability of the original algorithm.
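For what it's worth, the same higher-order trick translates directly to Python; the hand-rolled `memoize` below is a sketch of what the standard library already provides as `functools.lru_cache`:

```python
def memoize(f):
    # Wrap f so each distinct argument tuple is computed only once.
    cache = {}
    def wrapper(*args):
        if args not in cache:   # cache miss
            cache[args] = f(*args)
        return cache[args]
    return wrapper

fib = lambda n: 1 if n < 2 else fib(n - 1) + fib(n - 2)
fib = memoize(fib)   # same rebinding trick as Fib = Fib.Memoize()
print(fib(50))       # 20365011074, instantly
```

Because `fib` is rebound to the memoized wrapper, the recursive calls inside the lambda go through the cache too, exactly as in the C# version.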
Quick No Programming Solution (based on combinatorics)
I take it "no backtracking" means we always either increase x or increase y.
If so, we know that in total we will have 40 steps to reach the finish -- 20 increases in x, 20 increases in y.
The only question is which of the 40 are the 20 increases in x. The problem amounts to: how many different ways can you choose 20 elements out of a set of 40 elements. (The elements are: step 1, step 2, etc. and we're choosing, say, the ones that are increases in x).
There's a formula for this: it's the binomial coefficient with 40 on top and 20 on the bottom. The formula is 40!/((20!)(40-20)!), in other words 40!/(20!)^2. Here ! represents factorial (e.g., 5! = 5*4*3*2*1).
Canceling out one of the 20! terms and part of the 40!, this becomes: (40*39*38*37*36*35*34*33*32*31*30*29*28*27*26*25*24*23*22*21)/(20*19*18*17*16*15*14*13*12*11*10*9*8*7*6*5*4*3*2*1). The problem is thus reduced to simple arithmetic. The answer is 137,846,528,820.
For comparison, note that (4*3)/(2*1) gives the answer from their example, 6.
This can be done much faster if you use dynamic programming (storing the results of subproblems rather than recomputing them). Dynamic programming can be applied to problems that exhibit optimal substructure - this means that an optimal solution can be constructed from optimal solutions to subproblems (credit Wikipedia).
I'd rather not give away the answer, but consider how the number of paths to the lower right corner may be related to the number of paths to adjacent squares.
Also - if you were going to work this out by hand, how would you do it?
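Without spoiling the 20x20 answer, here is a minimal bottom-up sketch of that idea, checked only against the 2x2 example from the question (which has 6 paths); the table indexing is one possible layout, not the only one:

```python
def count_paths(n):
    # paths[i][j] = number of routes from corner (i, j) to the
    # lower-right corner of an n-by-n grid, filled in bottom-up.
    paths = [[0] * (n + 1) for _ in range(n + 1)]
    paths[n][n] = 1
    for i in range(n, -1, -1):
        for j in range(n, -1, -1):
            if i < n:
                paths[i][j] += paths[i + 1][j]  # step down
            if j < n:
                paths[i][j] += paths[i][j + 1]  # step right
    return paths[0][0]

print(count_paths(2))  # the 2x2 example: 6
```

Each cell is computed exactly once, so this runs in O(n^2) time instead of the exponential time of the naive recursion.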