The logical order of execution plan of the SQL query with more than one join
One way to determine the logical order of joins is to replace the first inner join in your example with a left outer join:
SELECT *
FROM user_branch T1
LEFT JOIN dimcustomer2 T2
ON T1.BRANCH_CODE = T2.BRANCH_CODE
INNER JOIN customer_guarantee T3
ON T3.CUSTOMER_NUM = T2.CUSTOMER_NUM
Let us assume that some rows in T1
have no matches in T2
. More specifically, let us assume these are the three tables:
T1 T2 T3
BRANCH_CODE BRANCH_CODE CUSTOMER_NUM CUSTOMER_NUM
----------- ----------- ------------ ------------
11 11 230 120
12 12 235 170
13 15 260 230
14 235
15 245
250
260
270
There are two joins here and two possibilities in which order they are executed.
1. LEFT JOIN, then INNER JOIN
If the left join evaluates first, then its result will have nulls in the T2
columns where T1
rows had no match:
T1.BRANCH_CODE T2.BARNCH_CODE T2.CUSTOMER_NUM -------------- -------------- --------------- 11 11 230 12 12 235 13 (null) (null) 14 (null) (null) 15 15 260
Joining that result further with T3
using an inner join on a condition that uses a T2
column will eliminate the non-matches – and, therefore, corresponding T1
rows, – because a null cannot satisfy the join's equals condition:
T1.BRANCH_CODE T2.BARNCH_CODE T2.CUSTOMER_NUM T3.CUSTOMER_NUM -------------- -------------- --------------- --------------- 11 11 230 230 12 12 235 235 15 15 260 260
This way some of T1
rows will be excluded from the final result set.
2. INNER JOIN, then LEFT JOIN
Now if the inner join is executed first, then it will produce a result set containing the rows from T2
and T3
that match the inner join's condition:
T2.BARNCH_CODE T2.CUSTOMER_NUM T3.CUSTOMER_NUM -------------- --------------- --------------- 11 230 230 12 235 235 15 260 260
When this result set is then outer-joined to T1
, T1
being on the outer side, you will get a final result containing all the rows from T1
and those from the T2
-T3
inner join that match the outer join condition:
T1.BRANCH_CODE T2.BARNCH_CODE T2.CUSTOMER_NUM T3.CUSTOMER_NUM -------------- -------------- --------------- --------------- 11 11 230 230 12 12 235 235 13 (null) (null) (null) 14 (null) (null) (null) 15 15 260 260
Therefore, this second interpretation would mean that all T1
rows should be present in the result.
Since these two interpretations give such different results, it is clear that only one can be true. Executing the query, you will see that actually it is the first one. That means that logically, the joins are executed in the order they are specified in the FROM
clause.
Syntax variations
Note that the conclusion above applies to the most conventional join syntax, namely this:
FROM
T1
... JOIN T2 ON ...
... JOIN T3 ON ...
...
Your example matches that pattern, so the conclusion applies to it as well. However, there are variations worth mentioning where our conclusion does not apply, or at least not as straightforwardly.
1. Nested JOIN syntax
Syntactically, a join may be specified inside another join, like this:
FROM
T1
JOIN
T2
JOIN T3 ON ..
ON ...
In the above case, JOIN T2
is encountered before JOIN T3
. However the former join's declaration is not complete at that point: its ON
subclause is the one at the end and is logically evaluated only after the JOIN T3 ON ...
part. So in this case, T2
is joined to T3
first, then the result of the join is joined to T1
.
You could still argue that our conclusion stands here, although it is not as clear-cut in this situation. We concluded that the joins are evaluated in the order they are specified in the FROM
clause. In this example, the first join we encounter as we parse the FROM
clause, is not yet completely specified by the time the second one is.
2. Mixing comma joins and conventional joins
Prior to introduction of the explicit JOIN
syntax, joins were specified like this:
FROM
T1,
T2,
T3
WHERE
<joining conditions, filter conditions>
This type of join, sometimes referred to as comma join, is still supported by most, if not all, platforms, including SQL Server.
Without a joining condition, a comma join is essentially a cross join. A joining condition makes it an inner join. You can see, though, that the joining condition in this case comes in an entirely different clause, the WHERE
clause.
Now, SQL Server allows you to mix comma joins and conventional joins in the same FROM
clause. When you have a mix of joins like this:
FROM
T1,
T2
JOIN T3 ON ... ,
T4 ...
SQL Server will evaluate each individual comma-separated item independently of the others before cross-joining them all together. So in the above case, the T2 JOIN T3 ON ...
join will be evaluated before its result is cross-joined to T1
(to be further filtered down by any joining condition that might be found the WHERE
clause). Our conclusion does not apply at all here. However, you can see that a very different syntax comes into play in this case.
I am discussing the mixed syntax in slightly more detail in my answer on Stack Overflow: The multi-part identifier could not be bound.
I'm afraid the phrase "logical execution" does not make much sense; query execution by definition is physical materialization of a result set. I think what you mean by "logical execution" is the query compilation, the phase where the query syntax and semantic meaning is analyzed and the query plan is prepared to implement said semantic meaning.
Joined tables1 in a query are always evaluated left to right (or top to bottom):
select ... from t_a -- evaluated first
join t_b -- evaluated second
on t_a.c1 = t_b.c3
join t_x -- evaluated third
on t_b.c4 = t_x.c5
...
You can verify this yourself if you try to reference in the ON
clause a column that belongs to a table included later in the evaluation sequence. This will fail to compile:
select ... from t_a
join t_b
on t_a.c1 = t_c.c8 -- t_c is not known yet
join t_c
...
After the query is parsed, the execution plan can perform joins in any order that maintains the query semantics. As the manual says,
The order of table sources after the
FROM
keyword does not affect the result set that is returned.
This Q&A is somewhat related.
1 - A joined_table
clause consists of two table_source
s and their corresponding ON
clause.
All the JOINs actually belong to the FROM clause. Semantically, it does not make a difference in which order the JOINs are written, as long as you maintain the ON clauses and don't use LEFT/RIGHT OUTER JOIN clauses. Said differently, the output of the FROM and the JOIN clauses is a single large relation where it clearly does not really matter in which order the columns are. This is very important because this is an important optimization opportunity for the database optimizer: it makes sense to execute the JOINs among the smaller relations first if the WHERE conditions might prune it. This way it might be more feasible for the database to cache those columns while working on the rest of the result set.