LINQ 'into' keyword confusion
"into" has two different meanings:
- In a join clause, it changes the translation from using Join to GroupJoin. This means that instead of getting one result per matching pair, you get one result for each element of the original sequence, and that result contains the key and all the results from the other sequence, as a group. See Enumerable.GroupJoin for more details.
- In a select or group...by it becomes a query continuation, effectively starting a new query with the results of the old one in a new range variable.
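As a rough illustration of the query-continuation meaning, here is a minimal sketch using group...by with into. The words array and the counts variable are made-up sample names, not part of the original answer, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;

// Hypothetical sample data, not from the original answer.
string[] words = { "apple", "avocado", "banana", "blueberry", "cherry" };

// "into g" ends the first query and starts a new one over the groups.
var counts =
    (from w in words
     group w by w[0] into g      // g is the new range variable; w is out of scope from here on
     where g.Count() > 1         // filters whole groups, not individual words
     select new { Letter = g.Key, Count = g.Count() }).ToList();

foreach (var c in counts)
    Console.WriteLine($"{c.Letter}: {c.Count}");
```

With this sample data the groups for 'a' and 'b' survive the filter and the single-word 'c' group is dropped. Without into, the query would have to end at the group...by clause.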
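When used with the select keyword, into ends the scope of the preceding range variables: only the variable introduced by into is available in the rest of the query.
When used with the join keyword, into adds a variable containing all of the matching items from the join. (This is called a group join.)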
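To make the scoping effect of select ... into concrete, here is a small sketch. The numbers array and the names sq and bigSquares are illustrative assumptions, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;

int[] numbers = { 1, 2, 3, 4, 5 };   // hypothetical sample data

var bigSquares =
    (from n in numbers
     select n * n into sq    // query continuation: sq replaces n
     where sq > 10           // referring to n here would be a compile error
     select sq).ToList();

Console.WriteLine(string.Join(", ", bigSquares));   // prints "16, 25"
```

After into sq, the only range variable in scope is sq, which is why the later clauses cannot mention n.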
To add to what has already been said, I'd like to demonstrate the difference in the object structure produced by into versus without:
var q =
from c in categories
join p in products on c equals p.Category into ps
select new { Category = c, Products = ps };
This creates an object graph:
Category 1, Products:
    Product 1
    Product 2
Category 2, Products:
    Product 3
    Product 4
In this case, q contains only 2 items: the two categories.
Without into, you get a more traditional join that flattens the relationship, producing one result for each matching category/product pair:
var q =
from c in categories
join p in products on c equals p.Category
select new { Category = c, Product = p };
This produces a flat list:
Category 1, Product 1
Category 1, Product 2
Category 2, Product 3
Category 2, Product 4
Note that now q contains 4 items.
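For comparison, the two query shapes can be sketched in method syntax, which is roughly what the compiler translates them to. The sample data below is an assumption chosen to mirror the example (categories are plain strings and products are anonymous objects), and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;

// Hypothetical sample data mirroring the categories/products example.
string[] categories = { "Category 1", "Category 2" };
var products = new[]
{
    new { Id = 1, Category = "Category 1" },
    new { Id = 2, Category = "Category 1" },
    new { Id = 3, Category = "Category 2" },
    new { Id = 4, Category = "Category 2" },
};

// join ... into ps  ->  GroupJoin: one result per category.
var grouped = categories.GroupJoin(
    products,
    c => c,                 // key from the outer sequence
    p => p.Category,        // key from the inner sequence
    (c, ps) => new { Category = c, Products = ps.ToList() }).ToList();

// join without into  ->  Join: one result per matching pair.
var flat = categories.Join(
    products,
    c => c,
    p => p.Category,
    (c, p) => new { Category = c, Product = p }).ToList();

Console.WriteLine(grouped.Count);  // 2
Console.WriteLine(flat.Count);     // 4
```

One practical consequence: a category with no products still appears in the GroupJoin result, with an empty group, whereas it drops out of the Join result entirely.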
In response to comment:
var q =
from c in categories
join p in products on c equals p.Category into ps
select new { Category = c, Products = ps.Select(x => x.Id) };