Are the SQL concepts LEFT OUTER JOIN and WHERE NOT EXISTS basically the same?
As Michael already answered your question here is a quick sample to illustrate the difference:
Table A
Key Data
1 somedata1
2 somedata2
Table B
Key Data
1 data1
Left outer join:
SELECT *
FROM A
LEFT OUTER JOIN B
ON A.Key = B.Key
Result:
Key Data Key Data
1 somedata1 1
2 somedata2 null null
EXISTS use:
SELECT *
FROM A WHERE EXISTS ( SELECT B.Key FROM B WHERE A.Key = B.Key )
Not Exists In:
SELECT *
FROM A WHERE NOT EXISTS ( SELECT B.Key FROM B WHERE A.Key = B.Key )
Result:
Key Data
2 somedata2
No they are not the same thing, as they will not return the same rowset in the most simplistic use case.
The LEFT OUTER JOIN
will return all rows from the left table, both where rows exist in the related table and where they does not. The WHERE NOT EXISTS()
subquery will only return rows where the relationship is not met.
However, if you did a LEFT OUTER JOIN
and looked for IS NULL
on the foreign key column in the WHERE
clause, you can make equivalent behavior to the WHERE NOT EXISTS
.
For example this:
SELECT
t_main.*
FROM
t_main
LEFT OUTER JOIN t_related ON t_main.id = t_related.id
/* IS NULL in the WHERE clause */
WHERE t_related.id IS NULL
Is equivalent to this:
SELECT
t_main.*
FROM t_main
WHERE
NOT EXISTS (
SELECT t_related.id
FROM t_related
WHERE t_main.id = t_related.id
)
But this one is not equivalent:
It will return rows from t_main
both having and not having related rows in t_related
.
SELECT
t_main.*
FROM
t_main
LEFT OUTER JOIN t_related ON t_main.id = t_related.id
/* WHERE clause does not exclude NULL foreign keys */
Note This does not speak to how the queries are compiled and executed, which differs as well -- this only addresses a comparison of the rowsets they return.
Left outer join is more flexible than where not exists. You must use a left outer join if you want to return any of the columns from the child table. You can also use the left outer join to return records that match the parent table as well as all records in the parent table that have no match. Where not exists only lets you return the records with no match.
However in the case where they do return the equivalent rows and you do not need any of the columns in the right table, then where exists is likely to be the more performant choice (at least in SQL server, I don't know about other dbs).