How group with min(date) and select an another column in the same table

I infer that your data looks like this:

Person Table

╔══════════╦═══════╦════════╗
║ PersonID ║ Name  ║ Gender ║
╠══════════╬═══════╬════════╣
║        1 ║ John  ║ M      ║
║        2 ║ Vicky ║ F      ║
║        3 ║ Bob   ║ M      ║
╚══════════╩═══════╩════════╝

Job Table

Click to copy

╔══════════╦═════════════╦════════════╗
║ PersonID ║   JobName   ║  HireDate  ║
╠══════════╬═════════════╬════════════╣
║        1 ║ Electrician ║ 2010-01-01 ║
║        1 ║ Mechanic    ║ 2012-05-08 ║
║        2 ║ Scientific  ║ 2012-11-11 ║
╚══════════╩═════════════╩════════════╝

The first task is to find the first job (by hire date) for each person. One neat way to do that is by using a correlated subquery:

Click to copy

SELECT j.*
FROM dbo.Job AS j
WHERE 
    j.HireDate =
    (
        SELECT MIN(j2.HireDate)
        FROM dbo.Job AS j2
        WHERE j2.PersonID = j.PersonID
    );

Notice the correlation WHERE j2.PersonID = j.PersonID between the inner and outer queries there. The output of that query is:

Click to copy

╔══════════╦═════════════╦════════════╗
║ PersonID ║   JobName   ║  HireDate  ║
╠══════════╬═════════════╬════════════╣
║        1 ║ Electrician ║ 2010-01-01 ║
║        2 ║ Scientific  ║ 2012-11-11 ║
╚══════════╩═════════════╩════════════╝

The execution plan (given a clustered PRIMARY KEY on PersonID, HireDate) is:

First job plan

The interesting thing about that plan is the Job table is only scanned once, despite there being two references to it in the original query. The plan uses an optimization that I call Segment Top. Essentially the execution engine takes advantage of the index order to detect the start of a new group (segment) and take just the first row from each group (top).

Now that we have that result, all we need do is join it back to the Person table:

Click to copy

SELECT
    p.PersonName,
    p.Gender, 
    j.JobName
FROM dbo.Person AS p
LEFT JOIN
(
    -- Previous query
    SELECT j.*
    FROM dbo.Job AS j
    WHERE 
        j.HireDate =
        (
            SELECT MIN(j2.HireDate)
            FROM dbo.Job AS j2
            WHERE j2.PersonID = j.PersonID
        )
) AS j ON 
    j.PersonID = p.PersonID
OPTION (MERGE JOIN);

The execution plan is:

Final plan

The OPTION (MERGE JOIN) is not required; I just added it to show the plan you are likely to get when the tables contain a larger number of rows than in this small example.

Table definitions and sample data:

Click to copy

CREATE TABLE dbo.Person
(
    PersonID    integer NOT NULL,
    PersonName  varchar(30) NOT NULL,
    Gender      char(1) NOT NULL,

    PRIMARY KEY (PersonID)
);

CREATE TABLE dbo.Job
(
    PersonID    integer NOT NULL,
    JobName     varchar(30) NOT NULL,
    HireDate    datetime NOT NULL,

    PRIMARY KEY (PersonID, HireDate)
);

INSERT dbo.Person
    (PersonID, PersonName, Gender)
SELECT 1, 'John', 'M' UNION ALL
SELECT 2, 'Vicky', 'F' UNION ALL
SELECT 3, 'Bob', 'M';

INSERT dbo.Job
    (PersonID, JobName, HireDate)
SELECT 1, 'Mechanic', '20120508' UNION ALL
SELECT 1, 'Electrician', '20100101' UNION ALL
SELECT 2, 'Scientific', '20121111';

How group with min(date) and select an another column in the same table

Tags:

Group By

Sql Server 2000

Greatest N Per Group

Related

Recent Posts