The logical order of execution plan of the SQL query with more than one join

One way to determine the logical order of joins is to replace the first inner join in your example with a left outer join:

SELECT *
FROM user_branch T1
LEFT  JOIN dimcustomer2 T2
   ON T1.BRANCH_CODE = T2.BRANCH_CODE
INNER JOIN customer_guarantee T3
   ON T3.CUSTOMER_NUM = T2.CUSTOMER_NUM

Let us assume that some rows in T1 have no matches in T2. More specifically, let us assume these are the three tables:

    T1                         T2                           T3
                                                   
BRANCH_CODE          BRANCH_CODE  CUSTOMER_NUM          CUSTOMER_NUM
-----------          -----------  ------------          ------------
11                   11           230                   120
12                   12           235                   170
13                   15           260                   230
14                                                      235
15                                                      245
                                                        250
                                                        260
                                                        270

There are two joins here and two possibilities in which order they are executed.

1. LEFT JOIN, then INNER JOIN

If the left join evaluates first, then its result will have nulls in the T2 columns where T1 rows had no match:

T1.BRANCH_CODE  T2.BARNCH_CODE  T2.CUSTOMER_NUM
--------------  --------------  ---------------
11              11              230
12              12              235
13              (null)          (null)
14              (null)          (null)
15              15              260

Joining that result further with T3 using an inner join on a condition that uses a T2 column will eliminate the non-matches – and, therefore, corresponding T1 rows, – because a null cannot satisfy the join's equals condition:

T1.BRANCH_CODE  T2.BARNCH_CODE  T2.CUSTOMER_NUM  T3.CUSTOMER_NUM
--------------  --------------  ---------------  ---------------
11              11              230              230
12              12              235              235
15              15              260              260

This way some of T1 rows will be excluded from the final result set.

2. INNER JOIN, then LEFT JOIN

Now if the inner join is executed first, then it will produce a result set containing the rows from T2 and T3 that match the inner join's condition:

T2.BARNCH_CODE  T2.CUSTOMER_NUM  T3.CUSTOMER_NUM
--------------  ---------------  ---------------
11              230              230
12              235              235
15              260              260

When this result set is then outer-joined to T1, T1 being on the outer side, you will get a final result containing all the rows from T1 and those from the T2-T3 inner join that match the outer join condition:

T1.BRANCH_CODE  T2.BARNCH_CODE  T2.CUSTOMER_NUM  T3.CUSTOMER_NUM
--------------  --------------  ---------------  ---------------
11              11              230              230
12              12              235              235
13              (null)          (null)           (null)
14              (null)          (null)           (null)
15              15              260              260

Therefore, this second interpretation would mean that all T1 rows should be present in the result.

Since these two interpretations give such different results, it is clear that only one can be true. Executing the query, you will see that actually it is the first one. That means that logically, the joins are executed in the order they are specified in the FROM clause.

Syntax variations

Note that the conclusion above applies to the most conventional join syntax, namely this:

FROM
  T1
  ... JOIN T2 ON ...
  ... JOIN T3 ON ...
  ...

Your example matches that pattern, so the conclusion applies to it as well. However, there are variations worth mentioning where our conclusion does not apply, or at least not as straightforwardly.

1. Nested JOIN syntax

Syntactically, a join may be specified inside another join, like this:

FROM
  T1
  JOIN
    T2
    JOIN T3 ON ..
  ON ...

In the above case, JOIN T2 is encountered before JOIN T3. However the former join's declaration is not complete at that point: its ON subclause is the one at the end and is logically evaluated only after the JOIN T3 ON ... part. So in this case, T2 is joined to T3 first, then the result of the join is joined to T1.

You could still argue that our conclusion stands here, although it is not as clear-cut in this situation. We concluded that the joins are evaluated in the order they are specified in the FROM clause. In this example, the first join we encounter as we parse the FROM clause, is not yet completely specified by the time the second one is.

2. Mixing comma joins and conventional joins

Prior to introduction of the explicit JOIN syntax, joins were specified like this:

FROM
  T1,
  T2,
  T3
WHERE
  <joining conditions, filter conditions>

This type of join, sometimes referred to as comma join, is still supported by most, if not all, platforms, including SQL Server.

Without a joining condition, a comma join is essentially a cross join. A joining condition makes it an inner join. You can see, though, that the joining condition in this case comes in an entirely different clause, the WHERE clause.

Now, SQL Server allows you to mix comma joins and conventional joins in the same FROM clause. When you have a mix of joins like this:

FROM
  T1,
  T2
  JOIN T3 ON ... ,
  T4 ...

SQL Server will evaluate each individual comma-separated item independently of the others before cross-joining them all together. So in the above case, the T2 JOIN T3 ON ... join will be evaluated before its result is cross-joined to T1 (to be further filtered down by any joining condition that might be found the WHERE clause). Our conclusion does not apply at all here. However, you can see that a very different syntax comes into play in this case.

I am discussing the mixed syntax in slightly more detail in my answer on Stack Overflow: The multi-part identifier could not be bound.

I'm afraid the phrase "logical execution" does not make much sense; query execution by definition is physical materialization of a result set. I think what you mean by "logical execution" is the query compilation, the phase where the query syntax and semantic meaning is analyzed and the query plan is prepared to implement said semantic meaning.

Joined tables¹ in a query are always evaluated left to right (or top to bottom):

select ... from t_a                   -- evaluated first
           join t_b                   -- evaluated second
             on t_a.c1 = t_b.c3
           join t_x                   -- evaluated third
             on t_b.c4 = t_x.c5
           ...

You can verify this yourself if you try to reference in the ON clause a column that belongs to a table included later in the evaluation sequence. This will fail to compile:

select ... from t_a
           join t_b
             on t_a.c1 = t_c.c8       -- t_c is not known yet
           join t_c                   
           ...

After the query is parsed, the execution plan can perform joins in any order that maintains the query semantics. As the manual says,

The order of table sources after the FROM keyword does not affect the result set that is returned.

This Q&A is somewhat related.

1 - A joined_table clause consists of two table_sources and their corresponding ON clause.

All the JOINs actually belong to the FROM clause. Semantically, it does not make a difference in which order the JOINs are written, as long as you maintain the ON clauses and don't use LEFT/RIGHT OUTER JOIN clauses. Said differently, the output of the FROM and the JOIN clauses is a single large relation where it clearly does not really matter in which order the columns are. This is very important because this is an important optimization opportunity for the database optimizer: it makes sense to execute the JOINs among the smaller relations first if the WHERE conditions might prune it. This way it might be more feasible for the database to cache those columns while working on the rest of the result set.

The logical order of execution plan of the SQL query with more than one join

1. LEFT JOIN, then INNER JOIN

2. INNER JOIN, then LEFT JOIN

Syntax variations

1. Nested JOIN syntax

2. Mixing comma joins and conventional joins

Tags:

Sql Server 2016

T Sql

Query

Related

Recent Posts