Entity Framework (4.2) HasRequired results in unexpected LEFT OUTER JOIN
Your model configuration is correct and I think it is not a bug but it is behaviour by design, but I cannot tell exactly what design. I've also seen that SQL in such queries. Only a few remarks:
The query you are seeing is not specific to EF 4.2. It would also occur for EF 4.1 and EF 4.0. But not for EF 1 (.NET 3.5). In EF 1 every
Include
, also the first, has been mapped to aLEFT OUTER JOIN
, also for required relationships.I think one cannot say that using an
INNER JOIN
is "correct" for required navigation properties andLEFT OUTER JOIN
is wrong. From a mapping view point it doesn't matter what you use, given that the constraints in the database represent the relationships in the model correctly. For a required navigation property the FK column in the database must not be nullable and there must a constraint in the database which enforces that the FK refers to an existing row in the target table. If that is the case, everyJOIN
must return a row, no matter if you useINNER JOIN
orLEFT OUTER JOIN
.What happens if model and database is "out of sync" regarding the relationships? Basically nonsense happens in both cases: If you use a
LEFT OUTER JOIN
and the FK isNULL
in the DB or refers to a not existing row, you'd get an entity where the navigation property isnull
, violating the model definition that the property is required. Using anINNER JOIN
is not better: You'd get no entity at all, a query result which is at least as wrong as the result with theLEFT OUTER JOIN
, if not worse.So, I think the change in .NET 4 to use
INNER JOIN
s for someInclude
s has been made not because the SQL in EF 1 was wrong but to create better and more performant SQL. This change actually introduced a breaking change in that some queries returned other results now than they did in EF 1: http://thedatafarm.com/blog/data-access/ef4-breaking-change-ef4-inner-joins-affect-eager-loading-many-to-many/My understanding is that this has been fixed and that the reason was that
INNER JOIN
s in too many situations have been introduced for eager loading in EF 4. (Perhaps in this phase (beta/release candidate for EF 4) your query would have had twoINNER JOIN
s.) The reply to that problem from the EF team: http://connect.microsoft.com/VisualStudio/feedback/details/534675/ef4-include-method-returns-different-results-than-ef1-include (bold highlight from me):We are fixing the issue for .net 4 RTM. This was an unintended breaking change. We did not make an intended change where every left outer join produced by an Include became an inner join in .Net 4. But rather the optimization looked at the constraints in the EF metadata and tried to convert those left outer joins which could be safely converted to inner joins based on the constraints. We had a bug in the code where we were reasoning based on the constraints which resulted in more aggressive conversion than what the constraints implied. We have scaled back the optimization so that we convert left outer joins to inner joins only in the places where we are absolutely sure we can do it based on the constraints. We think we can improve this optimization a little more in the future. You will start seeing more left outer joins for some queries in RTM when compared to RC and Beta 2 but in most of these cases this is needed to return correct results.
So, the final release for EF 4 apparently reintroduced some more
LEFT OUTER JOIN
s (compared to beta/release candidate) to avoid a breaking change like that.
Sorry, this is more a historical story than a real explanation why you get an INNER JOIN
and then a LEFT OUTER JOIN
. As said, it is not wrong to write the query this way - as it wouldn't be wrong to use two INNER JOIN
s or two LEFT OUTER JOIN
s. I guess that only the EF team can explain exactly why your query produces that specific SQL.
I'd recommend - if you don't experience serious performance problems - not to worry about that SQL (since the result you get is correct after all) and proceed. Not liking the SQL which EF creates ends up in writing either a lot of feature and change requests or in writing a lot of raw SQL queries or in abandoning EF at all.