What is the "N+1 selects problem" in ORM (Object-Relational Mapping)?
Let's say you have a collection of Car
objects (database rows), and each Car
has a collection of Wheel
objects (also rows). In other words, Car
→ Wheel
is a 1-to-many relationship.
Now, let's say you need to iterate through all the cars, and for each one, print out a list of the wheels. The naive O/R implementation would do the following:
SELECT * FROM Cars;
And then for each Car
:
SELECT * FROM Wheel WHERE CarId = ?
In other words, you have one select for the Cars, and then N additional selects, where N is the total number of cars.
Alternatively, one could get all wheels and perform the lookups in memory:
SELECT * FROM Wheel
This reduces the number of round-trips to the database from N+1 to 2. Most ORM tools give you several ways to prevent N+1 selects.
Reference: Java Persistence with Hibernate, chapter 13.
SELECT
table1.*
, table2.*
INNER JOIN table2 ON table2.SomeFkId = table1.SomeId
That gets you a result set where child rows in table2 cause duplication by returning the table1 results for each child row in table2. O/R mappers should differentiate table1 instances based on a unique key field, then use all the table2 columns to populate child instances.
SELECT table1.*
SELECT table2.* WHERE SomeFkId = #
The N+1 is where the first query populates the primary object and the second query populates all the child objects for each of the unique primary objects returned.
Consider:
class House
{
int Id { get; set; }
string Address { get; set; }
Person[] Inhabitants { get; set; }
}
class Person
{
string Name { get; set; }
int HouseId { get; set; }
}
and tables with a similar structure. A single query for the address "22 Valley St" may return:
Id Address Name HouseId
1 22 Valley St Dave 1
1 22 Valley St John 1
1 22 Valley St Mike 1
The O/RM should fill an instance of Home with ID=1, Address="22 Valley St" and then populate the Inhabitants array with People instances for Dave, John, and Mike with just one query.
A N+1 query for the same address used above would result in:
Id Address
1 22 Valley St
with a separate query like
SELECT * FROM Person WHERE HouseId = 1
and resulting in a separate data set like
Name HouseId
Dave 1
John 1
Mike 1
and the final result being the same as above with the single query.
The advantages to single select is that you get all the data up front which may be what you ultimately desire. The advantages to N+1 is query complexity is reduced and you can use lazy loading where the child result sets are only loaded upon first request.
Supplier with a one-to-many relationship with Product. One Supplier has (supplies) many Products.
***** Table: Supplier *****
+-----+-------------------+
| ID | NAME |
+-----+-------------------+
| 1 | Supplier Name 1 |
| 2 | Supplier Name 2 |
| 3 | Supplier Name 3 |
| 4 | Supplier Name 4 |
+-----+-------------------+
***** Table: Product *****
+-----+-----------+--------------------+-------+------------+
| ID | NAME | DESCRIPTION | PRICE | SUPPLIERID |
+-----+-----------+--------------------+-------+------------+
|1 | Product 1 | Name for Product 1 | 2.0 | 1 |
|2 | Product 2 | Name for Product 2 | 22.0 | 1 |
|3 | Product 3 | Name for Product 3 | 30.0 | 2 |
|4 | Product 4 | Name for Product 4 | 7.0 | 3 |
+-----+-----------+--------------------+-------+------------+
Factors:
Lazy mode for Supplier set to “true” (default)
Fetch mode used for querying on Product is Select
Fetch mode (default): Supplier information is accessed
Caching does not play a role for the first time the
Supplier is accessed
Fetch mode is Select Fetch (default)
// It takes Select fetch mode as a default
Query query = session.createQuery( "from Product p");
List list = query.list();
// Supplier is being accessed
displayProductsListWithSupplierName(results);
select ... various field names ... from PRODUCT
select ... various field names ... from SUPPLIER where SUPPLIER.id=?
select ... various field names ... from SUPPLIER where SUPPLIER.id=?
select ... various field names ... from SUPPLIER where SUPPLIER.id=?
Result:
- 1 select statement for Product
- N select statements for Supplier
This is N+1 select problem!