Why do we do mathematical induction only for positive whole numbers?

Let's forget about natural numbers for a while and take a look at what mathematical induction is proving. At the base step, it's proven that a property, say $P$, holds for some special object, say $a$. At the induction step, it's proven that if $P$ holds for some fixed object, say $x$, then it holds for some other object related to $x$ in some way, which can be translated mathematically to: for some function $S$, $P$ holds for $S(x)$. Now, since we know that $P$ holds for $a$, we conclude that $P$ holds for $S(a)$, and then for $S\big(S(a)\big)$, and so on: $$a\quad\to\quad S(a)\quad\to\quad S\big(S(a)\big)\quad\to\quad S\Big(S\big(S(a)\big)\Big)\quad\to\quad\dots$$ this leads to a sequence of objects defined like this: $$a_0=a$$ $$a_{n+1}=S(a_n)$$ $$a_0\quad\to\quad a_1\quad\to\quad a_2\quad\to\quad a_3\quad\to\quad\dots$$ and $P$ holds for every element of this sequence. If we look at the indices, we find natural numbers. So natural numbers appear naturally in inductive proofs! It may be useful to note that having only the basis and the inductive step, the objects mentioned above are the only objects that we have proven to have the property $P$.

Now, the standard starting object is $0$ and the standard successor function is $S(x)=x+1$. But putting any other objects in a sequence, you can use induction for giving a proof. Note that even in that case, you can define a sequence like we did before, and use the standard induction! So there is no real difference between them. For example, letting $a=0$ and $S(x)=x-1$, you can prove a property for every nonpositive integer. In this case we have $a_n=-n$. So you can use mathematical induction twice to prove a property for all integers; once with $a=0$ and $S(x)=x+1$, and once with $a=0$ and $S(x)=x-1$.

Finally, it may be useful to note that not every set of objects can be represented as a sequence. For example, real numbers can't be listed in a sequence and so you can't use this kind of induction for them.


Induction really only requires that we can always make some sort of path between our base case and the cases we want to prove. The normal form of induction makes a path that looks like: $$\rightarrow 0\rightarrow 1 \rightarrow 2 \rightarrow 3 \rightarrow 4 \rightarrow \ldots$$ where each arrow represents an implication - that the case for $n$ implies the one for $n+1$. I include an arrow to $0$ to remind us that that case must be proven. It's perfectly okay to replace this path by any other one - so $$\rightarrow 0\rightarrow \frac{1}2\rightarrow 1\rightarrow \frac{3}2\rightarrow 2\rightarrow \ldots$$ works just as well or starting from a negative $$\rightarrow-1\rightarrow 2\rightarrow 3 \rightarrow 4\rightarrow 5\rightarrow\ldots$$ or going in both directions $$\begin{align*} & \downarrow & \\\ldots\leftarrow-3\leftarrow -2\leftarrow -1 \leftarrow &\,\,0\rightarrow1\rightarrow 2 \rightarrow 3\rightarrow\ldots\end{align*}$$ so long as we can provide a proof corresponding to each of the arrows.

A notable thing about this is that we can use something called "structural induction" where we think of discrete mathematical structures (like graphs or groups) and prove some property of them by reducing the problem to a smaller case (in a way that, enough such reductions bring us to some base case), and these could have quite complex diagrams if plotted as I've been doing. The main point of these pictures is that induction requires making some sort of linking.

In the abstract sense, what induction requires is that we order all the cases we wish to prove it on in a special way called a "well-order" which means that every non-empty set has a minimal element (i.e. an element $x$ so that there is no $y<x$ in the set). Then, we need to prove, for any element $x$ a proof that, if our desired proposition holds for all $y<x$, it holds for $x$ as well. What we then find is that, if there were a counterexample, there would be a minimal counterexample - some $x$ such that for all $y<x$, the statement held. But this is impossible since the statement holding for all $y<x$ suffices to show the proposition.

Ordinary induction uses the ordinary order on the natural numbers - and we note that we are required to prove that the case of $0$ holds assuming it holds for all $x<0$ - of course, there are no such $x$ in our domain, which explains why the base case needs separate proof. Similarly, we can induct on the set $\frac{1}2\mathbb N$ using the normal order. And, the induction on $\mathbb Z$ I depicted comes from the (partial) order defined as normal on the positives and defined as $-n<-m$ on negatives exactly when $n<m$ for $n,m\in\mathbb N$ (where we do not say that $1<-1$ or $-1<1$ - we never need to compare them as neither is used in the proof of the other).

The ordinary order on $\mathbb R>0$ does not satisfy this. However, if one can prove that if the statement holds for $x$ then it holds for $x+k$ for all $k\in (0,\varepsilon]$ for a fixed $\varepsilon$, then one may order the set by saying that $x<y$ whenever $x+\varepsilon$ is less than $y$ in the usual order, and this is a well order, so we can use induction. This does, however, require us to prove infinitely many implications, and one really doesn't see it that much.


Induction works on a set $A$ that is equipped with a relation $R\subseteq A\times A$ such that every non-empty subset $B$ of $A$ contains an element $b\in B$ with $\{x\in A\mid x \mathrel{R} b\}\cap B=\varnothing$.

A relation that satisfies this is by definition "well-founded".

Let $\mathcal P$ be some property of elements of $A$ that is "inherited" from $R$-predecessors.

This in the sense that: if $\mathcal P(b)$ for all $b\in A$ with $b \mathrel{R} a$ then also $\mathcal P(a)$.

Then it can be shown that $\mathcal P(a)$ is true for each $a\in A$.

To prove this assume that the statement is not true. Then the set $B:=\{a\in A\mid\neg\mathcal P(a)\}$ is not empty. Let $b\in B$ with $\{x\in A\mid x \mathrel{R} b \}\cap B=\varnothing$. Then $\mathcal P(c)$ for $c\in A$ with $c \mathrel{R} b$ and consequently $\mathcal P(b)$. A contradiction is found, and we are ready.

So the essential question here is:

are we dealing with a well-founded relation on a set?

The answer is "yes" if you are working with e.g. $\langle\mathbb N,<\rangle$ or $\langle\{\frac12n\mid n\in\mathbb N\},<\rangle$ or $\langle-\mathbb N,>\rangle$, but is "no" if you are working with e.g. $\langle\mathbb R,<\rangle$, $\langle\mathbb Z,<\rangle$ or $\langle\mathbb N,>\rangle$ .