Why can we multiply by breaking up the factors as sums in different ways?

There are two ways I see to answer this question. One is from an axiomatic standpoint, where numbers are merely symbols on paper that are required to follow certain rules. The other uses the interpretation of multiplication as computing area. The former would take a good 5-10 pages to build up from the Peano axioms.

For the latter, you can draw a rectangle 13 units by 34 units. Break one side into 10 units and 3 units, and the other side into 30 units and 4 units. At this point, you should see that this decomposes the rectangle into four pieces, corresponding to the four terms you get from the distributive rule. The various ways to compute $13 \cdot 34$ are all just ways to decompose the rectangle, and at the end of the day they all compute the same number: the area of the rectangle.
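As a quick sanity check of the rectangle picture, here is a minimal Python sketch. The helper name `area_by_pieces` and the two extra decompositions are just illustrations I made up; only the 10 + 3 by 30 + 4 split comes from the paragraph above.

```python
from itertools import product

def area_by_pieces(side_a_parts, side_b_parts):
    """Sum the areas of the sub-rectangles obtained by cutting each side
    of a rectangle into the given pieces (hypothetical helper)."""
    return sum(w * h for w, h in product(side_a_parts, side_b_parts))

# Three different ways to cut up the same 13-by-34 rectangle.
decompositions = [
    ([10, 3], [30, 4]),     # the split described above: four pieces
    ([13], [17, 17]),       # cut only one side, into two halves
    ([6, 7], [20, 10, 4]),  # an arbitrary finer cut: six pieces
]

for a_parts, b_parts in decompositions:
    # Every decomposition adds up to the same area, 13 * 34.
    print(a_parts, b_parts, area_by_pieces(a_parts, b_parts))
```

Each line printed ends in the same total, because every decomposition is just a different way of tiling the same rectangle.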


Well, that is literally what the distributive law tells you. It tells you that $$(a+b)(c+d)=a(c+d)+b(c+d)=ac+ad+bc+bd$$ and so whenever you break up the two factors of a product as a sum, you can use the "pieces" to compute the product.

So what you are really asking for is a proof of the distributive law itself. What constitutes a "proof" of such a basic fact depends heavily on which definitions of "numbers" and of the operations on them you are using (under some definitions, the distributive law is simply an axiom that you assume). But here is an intuitive explanation that works for natural numbers (and this can be turned into a rigorous proof if you define arithmetic of natural numbers in terms of cardinalities of sets).

We want to prove that $(a+b)c=ac+bc$. What does a product $xy$ of natural numbers mean? Well, it means you draw a grid of dots with $x$ rows and $y$ columns, and count up the total number of dots. So, to compute $(a+b)c$, you draw a grid with $a+b$ rows and $c$ columns. Now we observe that we can split such a grid into two pieces: the top $a$ rows and the bottom $b$ rows. The top $a$ rows form a grid with $a$ rows and $c$ columns, so they have $ac$ dots. The bottom $b$ rows form a grid with $b$ rows and $c$ columns, so they have $bc$ dots. In total, then, we have $ac+bc$ dots, so $(a+b)c=ac+bc$.
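The dot-counting argument can be acted out directly. This is only an illustrative sketch: the helpers `dot_grid` and `count_dots` and the sample values of `a`, `b`, `c` are my own choices, and checking one example is of course not a proof.

```python
def dot_grid(rows, cols):
    """Build a grid of dots as a list of rows (hypothetical helper)."""
    return [["." for _ in range(cols)] for _ in range(rows)]

def count_dots(grid):
    """Count the dots by walking over every row."""
    return sum(len(row) for row in grid)

a, b, c = 3, 2, 4            # sample values chosen for illustration
grid = dot_grid(a + b, c)    # a + b rows, c columns

top = grid[:a]               # the top a rows
bottom = grid[a:]            # the bottom b rows

# Splitting the grid neither creates nor destroys dots.
print(count_dots(grid), count_dots(top), count_dots(bottom))
```

The whole grid is counted as $(a+b)c$, the two pieces as $ac$ and $bc$, and the first count is the sum of the other two.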

(In the computation of $(a+b)(c+d)$ above I used the distributive law in two different versions, one with the sum on the left side of the product and one with the sum on the right side of the product. You can prove the version with the sum on the right side of the product in the same way (you just split up the columns instead of the rows), or you can deduce it from the other version using commutativity of multiplication.)


Formally, this is a property of rings. Rings have a multiplication operation that distributes over addition, meaning that for any three elements $a$, $b$, and $c$: $$a\cdot(b + c) = (a\cdot b) + (a\cdot c)$$ $$(b + c)\cdot a = (b\cdot a) + (c\cdot a)$$ The real numbers are a ring (they're actually a field, which is a special kind of ring), so that's half of your answer.

The other half is Eric's answer, which addresses the natural numbers. The natural numbers are not a ring (they lack additive inverses), but they do have a distributive property.

From a philosophical perspective, we could define anything we wanted, but what's interesting about this particular pattern is that it's so useful. Nothing prevents me from making $x+\frac{3}{2}$ mean a triple gainer with a twist, but outside of diving, that particular pattern isn't all that useful. We tend to find that fields and rings show up rather often in physically meaningful scenarios.

Now from a philosophical perspective, it makes sense to point out that there are also lots of other really useful patterns that show up. For example, if you look at how we define rotations, such as using yaw, pitch, and roll to describe the orientation of an aircraft, those don't seem to add the way we want them to. The rotations form a pattern known as a group, which doesn't even have a concept of addition at all! They only have multiplication. And by that I mean mathematicians decided to call the one operation in this pattern "multiplication" because its rules are a generalization of matrix multiplication.
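One concrete way to see that rotations don't behave like ordinary addition is that the order of composition matters. The sketch below uses two quarter-turn rotation matrices about the x and z axes; the names `RX90`, `RZ90`, and `matmul` are mine, and I picked 90 degree turns only so the entries stay exact integers.

```python
def matmul(p, q):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Quarter turns (90 degrees) about the x axis and about the z axis.
RX90 = [[1, 0, 0],
        [0, 0, -1],
        [0, 1, 0]]
RZ90 = [[0, -1, 0],
        [1, 0, 0],
        [0, 0, 1]]

x_after_z = matmul(RX90, RZ90)   # rotate about z first, then about x
z_after_x = matmul(RZ90, RX90)   # rotate about x first, then about z

# The two orders give different orientations.
print(x_after_z)
print(z_after_x)
print(x_after_z == z_after_x)
```

Ordinary addition of angles would not care about the order, which is one reason a single group operation is the better description here.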

We also have all sorts of oddball cases which may or may not actually be philosophically relevant. For example, we can consider the ordinal numbers, which explore labeling objects as 1st, 2nd, 3rd, and so on. Ordinals grapple with the concept of infinity, which generally means they've got some quirks. One of the quirks of ordinals is that they are left distributive but not right distributive. That means I can use the distributive property in $a\cdot(b + c)$ but not $(b + c)\cdot a$! So that shows that we've come up with some really strange systems which look sane, but where the distributive law starts to get a little strange. (For what it's worth, part of the reason this law acts so strange is that multiplication isn't commutative in ordinals: $2\cdot\omega \neq \omega\cdot 2$)
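To make the failure concrete, here is a small worked example. It assumes the usual convention for ordinal arithmetic, in which $\alpha\cdot\beta$ means "$\beta$ copies of $\alpha$ laid end to end". Under that convention $2\cdot\omega$ is a sequence of pairs, which has the same order type as $\omega$, so $$(1+1)\cdot\omega = 2\cdot\omega = \omega,$$ whereas $$1\cdot\omega + 1\cdot\omega = \omega + \omega = \omega\cdot 2.$$ Since $\omega \neq \omega\cdot 2$, distributing from the right fails. Distributing from the left is fine, for instance $$\omega\cdot(1+1) = \omega\cdot 1 + \omega\cdot 1 = \omega + \omega.$$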

So in the end, what makes this distributive law so interesting philosophically is that real numbers and natural numbers seem to be terribly good at describing the world around us, and both of them have distributive properties. But that doesn't mean that everything interesting has a distributive law, or even that the distributive law will make intuitive sense to you! Now the question of why real numbers and natural numbers are so useful in reality is a really interesting philosophical question, which has led some people to argue that mathematics is the language upon which reality sits.

I say it sits on a turtle. But who am I to judge?