How are joins Commutative and Associative?
You've made an incorrect assumption.
If three's a relationship between A
and B
, and between B
and C
but no relationship between A
and C
, joining A
and C
doesn't give you no rows. It actually gives you m * n rows, where m is the number of rows in A
, and n is the number of rows in B
.
A JOIN B on (a.id = b.A_id)
JOIN C on (b.id = c.B_id)
becomes, if we join A
and C
first:
A JOIN C on (1 = 1)
JOIN B on (a.id = b.A_id AND b.id = c.B_id)
NOTE: College was over 30 years ago - hopefully you understand conceptually, and can rewrite into the correct syntax for actual relational algebra.
If you have 20 rows in A
, 30 in B
, and 40 in C
, and every row in B
matches one and only one row in A
, and every row in C
matches one and only one row in B
. When you Join A
to B
, you'll get 30 rows back; when you join those 30 rows to C
, you'll get 40 rows back.
If you first join A
to C
, with no relationship to define which pairs of rows are valid, you'll get back every possible pair of rows - 800 rows. When we join B
to the 800 rows, the connection between B
and A
means that a row in B
can only match the AC
rows that include one specific A
row - there'll be 40 of those (one for each C
row). Out of those 40 the only actual matches are the ones where the connection between B
and C
is also valid. And, we know that the C
rows that match the current B
row cannot match any other B
row. So, for one B
row we might have 2 AC
rows that match, for another 4, for yet another just 1.
That said, we know every C
row does match one (and only one) B
row, so the total number of matches will come out to 40 again - the same 40 matches as if we had first matches A
and B
, or B
and C
.
So - the condition on a join doesn't tell you which rows do match - it really tells you which rows don't match. Take it away, and you've got a CROSS JOIN
- the cross product of the rows in the two tables.