How to demystify the axioms of propositional logic?

There are good answers already, but one note:

Another way to understand the choice of the first three axioms you quote is that they are exactly what you need in order to be able to prove the Deduction Theorem by induction on the length of the inner proof.

  • $H\to H$ takes care of a step in the original proof that just applies the hypothesis.

  • $A \to (H \to A)$ allows you to translate a step in the original proof that introduces a logical axiom.

  • $(H \to (P\to Q)) \to ((H \to P) \to (H\to Q))$ is what you need to translate an application of Modus Ponens.

Actually it turns out that $H\to H$ can be derived from the two others (though that is not particularly intuitive), so in many presentations it will be left out.

There are many different ways to complete these three axioms such that you can prove exactly all of the propositional tautologies that can be written using $\neg$ and $\to$. The one you're quoting has the advantage of being reasonably simple and intuitively obvious, while still being sufficient to allow all tautologies to be proved.

Together, the Deduction Theorem and Modus Ponens tell us a lot about how the $\to$ connective works, but it turns out there are propositional tautologies written with $\to$ alone that cannot be proved by these two ingredients. It is quite remarkable that the simple-looking fourth axiom can manage in one breath to tell us everything else there is to know about $\to$, as well as (simultaneously) everything about $\neg$ too. I don't have any good intuitive explanation of how it does that. I can just prove that it does.


I assume that you understand why these axioms are tautologies (i.e., true under all truth assignments) and that your question is why these were chosen, from among the infinite number of all tautologies, to serve as axioms. Several factors were involved in this choice (I think --- the choice was actually made long before I was around).

First, there was the fairly arbitrary decision to use $\to$ and $\neg$ as the primitive connectives, so that all other connectives ($\land,\lor, \iff$) are viewed as abbreviations. Had other connectives been chosen as primitive, they would have been involved in the axioms.

Second, having included $\to$ among the primitive notions, one has a very simple and widely used (since ancient times) rule of inference, namely modus ponens: From $\phi\to\psi$ and $\phi$, infer $\psi$. The desire to use this rule may well have motivated the choice of $\to$ as one of the primitive notions; the other, $\neg$, may have been motivated as the simplest thing that, together with $\to$, lets you define all the other connectives.

Third, one wants enough axioms to support, with modus ponens, a completeness theorem. That is, every tautology ought to be provable from the axioms using modus ponens. (More generally, every semantic consequence of any set $S$ of formulas ought to be deducible from $S$ and the axioms using modus ponens.) One way to find such a system of axioms is to begin with no axioms, start trying to write a proof of completeness, and gradually add axioms as you see that they are needed for your completeness proof. With extreme good luck, this might lead you to the axioms in the question. With less luck, it will lead you to a messier set of axioms, which you could then try to "clean up" by removing any redundancies, replacing complicated axioms by simpler axioms that imply the complicated ones, etc. I expect that this is how the axioms were first found. Of course, once they're found, people like me don't go through the work of rediscovering them but merely copy them from textbooks.


Which set of axioms we choose for logic is as much an aesthetic choice as it is a mathematical choice. Things that we'd like in logic axioms are:

  1. Self-evidence - can a person interpreting each axiom see why they ought to be true?
  2. Simplicity - related to self-evidence, but simplicity has its own rewards
  3. Independence - We tend to want our axioms not to be too redundant. If Axiom C can be proved from Axiom A and Axiom B, why make C an axiom?

These are aesthetic decisions.

We often have an implicit technical goal. Essentially, when we are designing axioms, we have an idea of what the axioms represent, and we want "enough" of them to get all the sorts of results we want.

We can start with different axiom sets and get the same logic. The definition of when two axiom systems are equivalent can be a little complex, bucause one system might have a different set of symbols than another. (For example, we can do logic without $\to$ and $\land$ and get an equivalent logical system. Often, we keep redundant symbols, because the axioms become less simple and less self-evident without them.)

The most misunderstood element of standard propositional logic is the $\to$ symbol. It is often read as "implies," but that has human meaning that one statement follows from the other in some direct way. In particular, we tend to think of "implies" as having a universal quantifier on it: "($x$ is even) $\to$ ($x+2$ is even)" we read with an implicit "For all $x\dots$" at the beginning. That is not really what $\to$ means, which is one reason some axioms of logic don't use the $\to$ symbol at all, or explicitly define $p\to q$ as a shorthand for $(\lnot p)\lor q$.

So you can see that not only are the axioms chosen based on aesthetics, but the set of symbols we consider fundamental are chosen likewise.

At the most absurd, we can get by with only one logical operation, and axioms on that operation, to get all of propositional logic. For example, if we defined $p*q$ to mean "both $p$ and $q$ are false" then $p*p$ would mean "$p$ is false." $(p*p)*(q*q)$ would mean "$p$ and $q$ are both true." On the other hand $(p*q)*(p*q)$ means "At least one of $p$ and $q$ is true." This violates all of the aesthetics - there is nothing about $*$ that seems intuitively "fundamental," for example, and while $\land$ and $\lor$ have some nice intuitive axioms like associativity, translating those into axioms for $*$ makes them quite ugly. That said, mathematicians are often interested in such reductions because the number of symbols necessary to understand a system is one measure of the complexity of the system.