Induction vs. Strong Induction

The terms "weak induction" and "strong induction" are not commonly used in the study of logic. The terms are commonly used only in books aimed at teaching students how to write proofs.

Here are their prototypical symbolic forms:

  • weak induction: $(\Phi(0) \land (\forall n) [ \Phi(n) \to \Phi(n+1)]) \to (\forall n) \Phi(n)$

  • strong induction: $(\Psi(0) \land (\forall n) [ (\forall m \leq n) \Psi(m) \to (\forall m \leq n+1) \Psi(m)]) \to (\forall n) \Psi(n)$

The thing to notice is that "strong" induction is almost exactly weak induction with $\Phi(n)$ taken to be $(\forall m \leq n)\Psi(n)$. In particular, strong induction is not actually stronger, it's just a special case of weak induction modulo some trivialities like replacing $\Psi(0)$ with $(\forall m \leq 0 )\Psi(m)$. Of course you can write variations of the symbolic forms, but the same point applies to all of them: "strong" induction is essentially just weak induction whose induction hypothesis has a bounded universal quantifier.

So the question is not why we still have "weak" induction - it's why we still have "strong" induction when this is not actually any stronger.

My opinion is that the reason this distinction remains is that it serves a pedagogical purpose. The first proofs by induction that we teach are usually things like $\forall n \left [ \sum_{i=0}^n i = n(n+1)/2 \right ]$. The proofs of these naturally suggest "weak" induction, which students learn as a pattern to mimic.

Later, we teach more difficult proofs where that pattern no longer works. To give a name to the difference, we call the new pattern "strong induction" so that we can distinguish between the methods when presenting a proof in lecture. Then we can tell a student "try using strong induction", which is more helpful than just "try using induction".

In terms of logical strength in formal arithmetic, as you can see above, the two forms are equivalent over some weak base theory as long as you are looking at induction for a class of formulas that is closed under bounded universal number quantification. In particular, all the syntactic classes of the analytical and arithmetical hierarchies have that property, so weak induction for $\Sigma^0_k$ formulas is the same as strong induction for $\Sigma^0_k$ formulas, weak induction for $\Pi^1_k$ formulas is the same as strong induction for $\Pi^1_k$ formulas, and so on. This equivalence will hold under any reasonable formalization of "strong" induction - I chose mine above to make the issue particularly obvious.


Addendum I was asked in a comment why $$ (1)\colon (\forall t)[(\forall m < t)\Phi(m) \to \Phi(t)] $$ implies $$ (2)\colon (\forall n)[(\forall m \leq n)\Phi(m) \to (\forall m \leq n+1) \Phi(m)]. $$ I'm going to give a relatively formal proof to show how it goes. The proof is not by induction, instead it just uses universal generalization to prove the universally quantified statements.

For the proof, I will assume (1) and prove (2). Working towards that goal, I fix a value of $n$ and assume: $$ (3)\colon (\forall m \leq n)\Phi(m). $$

I first want to prove $(\forall m < n+1)\Phi(m)$, which is an abbreviation for $(\forall m)[m < n+1 \to \Phi(m)]$. Pick an $m$. If $m < n+1$, then $m \leq n$, so I know $\Phi(m)$ by assumption (3). So, by universal generalization, I obtain $(\forall m < n+1)\Phi(m)$.

Next, note that a substitution instance of (1) gives $(\forall m < n+1)\Phi(m) \to \Phi(n+1)$. I have proved $(\forall m < n+1)\Phi(m)$ so I can assert $\Phi(n+1)$.

So now I have assumed $(\forall m \leq n)\Phi(m)$ and I have also proved $\Phi(n+1)$. Another proof by cases establishes $(\forall m \leq n+1)\Phi(m)$.

By examining the proof, you can see which axioms I need in my weak base theory. I need at least the following two axioms:

  • $(m \leq t) \leftrightarrow (m < t) \lor (m=t)$

  • $(m < t+1) \to (m \leq t)$

I believe those are the only two axioms I used in the proof.


One way of looking at the question is to say that all induction is over some well-founded set, and strong induction is useful when the order type of that well-founded set is not the same as that of the natural numbers. Rather than elaborate on this general remark, let me simply give two examples of slightly more sophisticated induction that would appear in many undergraduate courses.

The first is the statement that every positive integer has a prime factorization. Here if we are trying to factorize n it is convenient to assume that all smaller numbers can be factorized, since if n is not prime then we will not use the factorization of n-1 but rather the factorizations of two factors of n about which we know nothing. In a sense we could say that the well-founded set that underlies this proof is N with the relation "is a proper factor of". (The simplest proof that this is well founded is to use the fact that if a is a proper factor of b then a is less than b, though once we know the fundamental theorem of arithmetic then this seems slightly unnatural. But we don't want to assume the fundamental theorem of arithmetic in order to prove the easy part of the fundamental theorem of arithmetic.)

The second is the statement that every tree with n vertices has n-1 edges. Here the proof is that every tree must have at least one vertex of degree 1 (or you could keep following a path until you hit a point you've hit before -- to make this completely rigorous will itself involve induction but let's forget that one) and that if you remove that vertex and its edge then you must still have a tree, which by induction has n-2 edges. Here we could either do straightforward induction on the number of vertices or we could do induction on the more complicated well founded set of all graphs under strict containment. The advantage of the first is that it is straightforward induction, but the disadvantage is that we have to create a slightly artificial inductive hypothesis -- that all trees with n vertices have n-1 edges. Graph theory beginners often trip up here and try to prove results like this by arguing that if you take a tree and add a new vertex to it, joining it to one of the existing vertices, then by induction the old tree has n-2 edges so the new one must have n-1 edges. In other words, they assume the result for n-1 and prove it for n, but unfortunately the result they assume for n-1 is not the right one and strictly speaking all they prove is that for each n there is a tree with n vertices and n-1 edges.

The second example isn't exactly strong induction, but I think it fits into the general discussion and helps to give some idea of what forms of induction are appropriate when.


Induction and strong induction, or what I would prefer to call the least number principle, are not equivalent formula by formula relative to, say, the standard algebraic axioms $\text{PA}^-$ underlying PA expressing that the structure is a discretely ordered semiring whose least element is $0$. However, induction for the class of all $\Sigma_n$ formulas is equivalent to the least number principle for the class of all $\Sigma_n$ formulas for each $n$.

Let me give a few details.

For a formula $\phi(x)$ define $$I(\phi) := (\phi(0) \& (\forall x)(\phi(x) \to \phi(x+1))) \to (\forall x)\phi(x)$$

while $$L(\phi) := (\exists x) \neg \phi(x) \to (\exists y)(\neg \phi(y) \& (\forall z < y) \phi(z))$$

It is easy to see that $\text{PA}^- \vdash L(\phi) \to I(\phi)$ while $\text{PA}^- \vdash I(\tilde{\phi}) \to B(\phi)$ where $\tilde{\phi}(x) := (\forall y \leq x) \phi(y)$.

If $\phi$ is of class $\Sigma_n$ (expressible as a formula with $n$ alternations of unbounded quantifiers starting with $\exists$ and then a string of bounded quantifiers followed by a quantifier free formula) then while $\tilde{\phi}$ not explicitly $\Sigma_n$, it is equivalent to a $\Sigma_n$ formula. Thus, relative to $\text{PA}^-$, we have a level by level equivalence of these principles.

It is fairly easy to see by considering structures that are not well-ordered that no formula by formula equivalence can be expected.