Select group of rows that match all items in a list

You want to perform relational division, an operation that is not implemented in SQL. Here is an example where we have a product-supplier table and a required-products table:

CREATE TABLE product_supplier (
    product_id int NOT NULL,
    supplier_id int NOT NULL,
    UNIQUE (product_id, supplier_id)
);
INSERT INTO product_supplier (product_id, supplier_id) VALUES
(1, 1),
(2, 1),
(3, 1),
(1, 2),
(2, 2),
(3, 2),
(4, 2),
(2, 3),
(3, 3),
(4, 3);

CREATE TABLE reqd (
    product_id int NOT NULL,
    UNIQUE (product_id)
);
INSERT INTO reqd (product_id) VALUES
(1),
(2),
(3);

... and we want to find all suppliers that supply ALL required products and perhaps others. The result in the above example would be supplier 1 and 2.

The most straight forward solution is this:

SELECT product_supplier.supplier_id
FROM product_supplier
LEFT JOIN reqd ON product_supplier.product_id = reqd.product_id
GROUP BY product_supplier.supplier_id
HAVING COUNT(reqd.product_id) = (SELECT COUNT(*) FROM reqd);

+-------------+
| supplier_id |
+-------------+
|           1 |
|           2 |
+-------------+

And if we want to find all suppliers that supply ALL required products and no others (exact division/no remainder) then add one more condition to the above:

SELECT product_supplier.supplier_id
FROM product_supplier
LEFT JOIN reqd ON product_supplier.product_id = reqd.product_id
GROUP BY product_supplier.supplier_id
HAVING COUNT(reqd.product_id) = (SELECT COUNT(*) FROM reqd)
AND COUNT(product_supplier.product_id) = (SELECT COUNT(*) FROM reqd);

+-------------+
| supplier_id |
+-------------+
|           1 |
+-------------+

An alternate solution is to rephrase the problem: select suppliers where a required product does not exist that does not exist in the products supplied by the supplier. Hmmm:

SELECT DISTINCT supplier_id
FROM product_supplier AS ps1
WHERE NOT EXISTS (
    SELECT *
    FROM reqd
    WHERE NOT EXISTS (
        SELECT *
        FROM product_supplier AS ps2
        WHERE ps1.supplier_id = ps2.supplier_id AND ps2.product_id = reqd.product_id
    )
);

+-------------+
| supplier_id |
+-------------+
|           1 |
|           2 |
+-------------+

This type of problem is called Relational Division.

SELECT  a.*
FROM    Cars a
        INNER JOIN
        (
            SELECT  CarName
            FROM    PassedTest 
            WHERE   testType IN ('A', 'B', 'C', 'D')
            GROUP   BY CarName
            HAVING  COUNT(*) = 4
        ) b ON a.CarName = b.CarName

if a UNIQUE constraint was not enforce on TestType for every CarName on table PassedTest a DISTINCT keyword is required on COUNT() so it will only count unique values.

SELECT  a.*
FROM    Cars a
        INNER JOIN
        (
            SELECT  CarName
            FROM    PassedTest 
            WHERE   testType IN ('A', 'B', 'C', 'D')
            GROUP   BY CarName
            HAVING  COUNT(DISTINCT TestType) = 4
        ) b ON a.CarName = b.CarName

SQL of Relational Division

but if you are only interested on the CARNAME then you don't need to join the tables. Querying on table PassedTest will suit your needs.

SELECT  CarName
FROM    PassedTest 
WHERE   testType IN ('A', 'B', 'C', 'D')
GROUP   BY CarName
HAVING  COUNT(*) = 4

How about this?

SELECT carname
FROM PassedTest
GROUP BY carname
HAVING COUNT(DISTINCT testtype) = 4

You can also use it as an inner statement for taking info from the cars table:

SELECT *
FROM cars
WHERE carname IN (
    SELECT carname
    FROM PassedTest
    GROUP BY carname
    HAVING COUNT(DISTINCT testtype) = 4
)

Select group of rows that match all items in a list

Tags:

Sql

Sql Server

Tsql

Relational Division

Related

Recent Posts