Time complexity of Euclid's Algorithm
One trick for analyzing the time complexity of Euclid's algorithm is to follow what happens over two iterations:
a', b' := a % b, b % (a % b)
Now a and b will both decrease, instead of only one, which makes the analysis easier. You can divide it into cases:
- Tiny A: 2a <= b
- Tiny B: 2b <= a
- Small A: 2a > b but a < b
- Small B: 2b > a but b < a
- Equal: a == b
Now we'll show that every single case decreases the total a+b by at least a quarter (a brute-force check in C appears right after the list):
- Tiny A: b % (a % b) < a and 2a <= b, so b is decreased by at least half, so a+b decreased by at least 25%.
- Tiny B: a % b < b and 2b <= a, so a is decreased by at least half, so a+b decreased by at least 25%.
- Small A: b will become b-a, which is less than b/2, decreasing a+b by at least 25%.
- Small B: a will become a-b, which is less than a/2, decreasing a+b by at least 25%.
- Equal: a+b drops to 0, which is obviously decreasing a+b by at least 25%.
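Here is that brute-force check, a small sketch in plain C (the helper name, the guard for a % b == 0, and the search range of 2000 are my own choices), which verifies the 25% claim for every small pair:

    #include <assert.h>
    #include <stdio.h>

    /* One double-step of Euclid's algorithm: (a, b) -> (a % b, b % (a % b)).
       If a % b == 0 the algorithm has effectively terminated, so the pair is
       treated as (0, 0) to avoid a modulus by zero. */
    static void double_step(unsigned *a, unsigned *b)
    {
        unsigned a2 = *a % *b;
        unsigned b2 = (a2 == 0) ? 0 : *b % a2;
        *a = a2;
        *b = b2;
    }

    int main(void)
    {
        for (unsigned a = 1; a <= 2000; a++) {
            for (unsigned b = 1; b <= 2000; b++) {
                unsigned x = a, y = b;
                unsigned before = x + y;
                double_step(&x, &y);
                /* 4*(new total) <= 3*(old total)  <=>  a drop of at least 25% */
                assert(4 * (x + y) <= 3 * before);
            }
        }
        printf("25%% decrease holds for all pairs up to 2000\n");
        return 0;
    }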
Therefore, by case analysis, every double-step decreases a+b by at least 25%. There's a maximum number of times this can happen before a+b is forced to drop below 1. The total number of steps (S) until we hit 0 must satisfy (4/3)^S <= A+B.
. Now just work it:
(4/3)^S <= A+B
S <= lg[4/3](A+B)
S is O(lg[4/3](A+B))
S is O(lg(A+B))
S is O(lg(A*B)) //because A*B asymptotically greater than A+B
S is O(lg(A)+lg(B))
//Input size N is lg(A) + lg(B)
S is O(N)
So the number of iterations is linear in the number of input digits. For numbers that fit into CPU registers, it's reasonable to model the iterations as taking constant time and pretend that the total running time of the gcd is linear.
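As a concrete illustration (a sketch in C; the bit-length helper and the test pairs are arbitrary choices of mine), you can print the iteration count next to lg(A) + lg(B) and see that it stays on the order of the bit count:

    #include <stdio.h>

    /* Number of bits needed to represent n (assumes n >= 1). */
    static int bits(unsigned long long n)
    {
        int b = 0;
        while (n > 0) {
            n >>= 1;
            b++;
        }
        return b;
    }

    /* Iterative Euclid's algorithm, counting modulus steps. */
    static int gcd_iterations(unsigned long long a, unsigned long long b)
    {
        int steps = 0;
        while (b != 0) {
            unsigned long long r = a % b;
            a = b;
            b = r;
            steps++;
        }
        return steps;
    }

    int main(void)
    {
        unsigned long long pairs[][2] = {
            { 252ULL, 105ULL },
            { 123456ULL, 7890ULL },
            { 123456789012ULL, 98765432109ULL },
            { 1234567890123456789ULL, 987654321098765432ULL },
        };
        for (int i = 0; i < 4; i++) {
            unsigned long long a = pairs[i][0], b = pairs[i][1];
            printf("steps = %2d, lg(A)+lg(B) = %3d\n",
                   gcd_iterations(a, b), bits(a) + bits(b));
        }
        return 0;
    }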
Of course, if you're dealing with big integers, you must account for the fact that the modulus operations within each iteration don't have a constant cost. Roughly speaking, the total asymptotic runtime is going to be n^2 times a polylogarithmic factor, something like n^2 lg(n) 2^O(log* n). The polylogarithmic factor can be avoided by instead using a binary gcd.
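For reference, here is a sketch of a binary (Stein's) gcd in C for machine-word integers; this only illustrates the technique named above, not any particular big-integer implementation. The point is that it uses only shifts and subtraction, never division:

    #include <stdint.h>

    /* Binary (Stein's) gcd: shifts and subtraction only, no modulus. */
    static uint64_t binary_gcd(uint64_t a, uint64_t b)
    {
        if (a == 0) return b;
        if (b == 0) return a;

        /* Factor out the power of two common to both operands. */
        int shift = 0;
        while (((a | b) & 1) == 0) {
            a >>= 1;
            b >>= 1;
            shift++;
        }

        /* Strip remaining factors of two from a; they are not common. */
        while ((a & 1) == 0)
            a >>= 1;

        do {
            /* Same for b; at this point both a and b are odd. */
            while ((b & 1) == 0)
                b >>= 1;
            /* Keep a <= b, then subtract: b - a is even and strictly smaller. */
            if (a > b) {
                uint64_t t = a;
                a = b;
                b = t;
            }
            b -= a;
        } while (b != 0);

        return a << shift;
    }

The saving described above matters for multi-precision integers, where a shift or subtraction is much cheaper than a division.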
A suitable way to analyze an algorithm is by determining its worst-case scenarios.
The Euclidean GCD's worst case occurs when consecutive Fibonacci pairs are involved.
void EGCD(fib[i], fib[i - 1])
, where i > 0.
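The body of EGCD isn't shown above, so the following is only a plausible sketch in C (it returns the gcd rather than void, and the call counter is mine, added so the iteration counts discussed next can be reproduced):

    #include <stdio.h>

    static unsigned calls; /* counts the recursive calls that do real work */

    /* A minimal recursive Euclidean GCD in the spirit of EGCD above. */
    static unsigned long long EGCD(unsigned long long a, unsigned long long b)
    {
        if (b == 0)
            return a;          /* base case, not counted as an iteration */
        calls++;
        return EGCD(b, a % b);
    }

    int main(void)
    {
        calls = 0;
        EGCD(55, 34);          /* consecutive Fibonacci numbers */
        printf("EGCD(55, 34): %u iterations\n", calls);

        calls = 0;
        EGCD(121393, 75025);   /* larger consecutive Fibonacci numbers */
        printf("EGCD(121393, 75025): %u iterations\n", calls);
        return 0;
    }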
For instance, let's opt for the case where the dividend is 55 and the divisor is 34 (recall that we are still dealing with Fibonacci numbers).
As you may notice, this operation cost 8 iterations (or recursive calls).
Let's try larger Fibonacci numbers, namely 121393 and 75025. We can notice here as well that it took 24 iterations (or recursive calls).
You can also notice that each iteration yields a Fibonacci number. That's why we have so many operations; indeed, only Fibonacci pairs produce this many.
Hence, this time the complexity is going to be represented by little-oh (an upper bound). The lower bound is intuitively Omega(1): the case of 500 divided by 2, for instance.
Let's solve the recurrence relation: T(x, y) = T(y, x mod y) + 1, with T(x, 0) = 0. Since x mod y < x/2 whenever y <= x, the product of the two arguments at least halves on every call, so the recursion bottoms out after roughly log2(xy) calls.
We may say then that the Euclidean GCD can make log(xy) operations at most.
There's a great look at this on the Wikipedia article. It even has a nice plot of complexity for value pairs.
It is not O(a%b).
It is known (see the article) that it will never take more steps than five times the number of digits in the smaller number. So the maximum number of steps grows with the number of digits (ln b). The cost of each step also grows with the number of digits, so the complexity is bounded by O(ln^2 b), where b is the smaller number. That's an upper limit; the actual time is usually less.
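As a rough empirical check of that five-digits bound (a sketch in C; the digit-counting helper and the Fibonacci test pair are my own choices), you can compare the observed step count against five times the decimal digit count of the smaller operand:

    #include <stdio.h>

    /* Number of decimal digits of n (assumes n >= 1). */
    static int digits(unsigned long long n)
    {
        int d = 0;
        while (n > 0) {
            n /= 10;
            d++;
        }
        return d;
    }

    /* Euclid's algorithm, returning the number of modulus steps taken. */
    static int gcd_steps(unsigned long long a, unsigned long long b)
    {
        int steps = 0;
        while (b != 0) {
            unsigned long long r = a % b;
            a = b;
            b = r;
            steps++;
        }
        return steps;
    }

    int main(void)
    {
        /* Consecutive Fibonacci numbers: the worst case for their size. */
        unsigned long long a = 1836311903ULL, b = 1134903170ULL;
        printf("steps = %d, 5 * digits(smaller) = %d\n",
               gcd_steps(a, b), 5 * digits(b));
        return 0;
    }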