Select which has max date or latest date

Your current query does not give the desired result because it groups by PERSON_ID, and the two rows have different PERSON_ID values. Each row ends up in its own group, so both rows are returned.
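
To see the grouping effect in isolation, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Oracle. The original query is not shown here, so the first SELECT is a hypothetical reconstruction of it, and the dates are stored as ISO text (an assumption made so that text order matches date order in SQLite).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SCHOOL_STAFF (
        LAST_UPDATE_DATE_TIME TEXT,  -- ISO text so string order equals date order
        SCHOOL_CODE TEXT,
        PERSON_ID TEXT,
        STAFF_TYPE_NAME TEXT
    );
    INSERT INTO SCHOOL_STAFF VALUES ('2013-01-24', 'ABE', '111222', 'Principal');
    INSERT INTO SCHOOL_STAFF VALUES ('2012-02-09', 'ABE', '222111', 'Principal');
""")

# Hypothetical version of the problem query: PERSON_ID is part of the grouping,
# so every person forms a separate group and MAX() is taken per person.
by_person = conn.execute("""
    SELECT MAX(LAST_UPDATE_DATE_TIME), SCHOOL_CODE, PERSON_ID
    FROM SCHOOL_STAFF
    GROUP BY SCHOOL_CODE, PERSON_ID
    ORDER BY PERSON_ID
""").fetchall()

# Grouping by SCHOOL_CODE alone gives one group per school.
by_school = conn.execute("""
    SELECT MAX(LAST_UPDATE_DATE_TIME), SCHOOL_CODE
    FROM SCHOOL_STAFF
    GROUP BY SCHOOL_CODE
""").fetchall()

print(len(by_person))  # 2: one row per PERSON_ID
print(by_school)       # a single row for school ABE
```

Grouping by SCHOOL_CODE alone finds the right date but loses PERSON_ID, which is why the approaches below bring the rest of the row back.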

There are a few ways to solve this. One is a subquery that applies the aggregate function to return max(LAST_UPDATE_DATE_TIME) for each SCHOOL_CODE, joined back to the table to pick up the matching row:

select s1.LAST_UPDATE_DATE_TIME,
  s1.SCHOOL_CODE,
  s1.PERSON_ID
from SCHOOL_STAFF s1
inner join
(
  select max(LAST_UPDATE_DATE_TIME) LAST_UPDATE_DATE_TIME,
    SCHOOL_CODE
  from SCHOOL_STAFF
  group by SCHOOL_CODE
) s2
  on s1.SCHOOL_CODE = s2.SCHOOL_CODE
  and s1.LAST_UPDATE_DATE_TIME = s2.LAST_UPDATE_DATE_TIME;

Or you can use a windowing function to return, for each school, the row with the most recent LAST_UPDATE_DATE_TIME:

select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME
from
(
  select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME,
    row_number() over(partition by SCHOOL_CODE 
                        order by LAST_UPDATE_DATE_TIME desc) seq
  from SCHOOL_STAFF
  where STAFF_TYPE_NAME='Principal'
) d
where seq = 1;

This query uses row_number(), which assigns a unique sequential number to each row within its SCHOOL_CODE partition, ordered by LAST_UPDATE_DATE_TIME descending. The most recent row in each partition gets 1, which is why the outer query filters on seq = 1.

As a side note, the JOIN with the aggregate function is not exactly the same as the row_number() version. If two rows for a school share the same latest LAST_UPDATE_DATE_TIME, the JOIN will return both rows, while row_number() will return only one of them (which one is not deterministic unless you add a tie-breaker to the order by). If you want to return both with a windowing function, consider using rank() instead, as it gives tied rows the same number:

select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME
from
(
  select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME,
    rank() over(partition by SCHOOL_CODE 
                        order by LAST_UPDATE_DATE_TIME desc) seq
  from SCHOOL_STAFF
  where STAFF_TYPE_NAME='Principal'
) d
where seq = 1;
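
The difference on ties is easy to check with a small experiment. This is a sketch only: it runs the three queries through Python's sqlite3 module rather than Oracle (window functions need SQLite 3.25 or later), the tied row for PERSON_ID 333444 is made-up data, and latest_with is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SCHOOL_STAFF (
        LAST_UPDATE_DATE_TIME TEXT,  -- ISO text so string order equals date order
        SCHOOL_CODE TEXT,
        PERSON_ID TEXT,
        STAFF_TYPE_NAME TEXT
    );
    -- Two principals share the latest date for school ABE (invented tie).
    INSERT INTO SCHOOL_STAFF VALUES ('2013-01-24', 'ABE', '111222', 'Principal');
    INSERT INTO SCHOOL_STAFF VALUES ('2013-01-24', 'ABE', '333444', 'Principal');
    INSERT INTO SCHOOL_STAFF VALUES ('2012-02-09', 'ABE', '222111', 'Principal');
""")

def latest_with(window_function):
    """Return the PERSON_IDs numbered 1 per school by the given window function."""
    sql = f"""
        SELECT PERSON_ID
        FROM (
            SELECT PERSON_ID,
                   {window_function}() OVER (PARTITION BY SCHOOL_CODE
                                             ORDER BY LAST_UPDATE_DATE_TIME DESC) AS seq
            FROM SCHOOL_STAFF
            WHERE STAFF_TYPE_NAME = 'Principal'
        ) d
        WHERE seq = 1
        ORDER BY PERSON_ID
    """
    return [row[0] for row in conn.execute(sql)]

# The JOIN against the per-school MAX keeps every row that matches the max date.
join_ids = [row[0] for row in conn.execute("""
    SELECT s1.PERSON_ID
    FROM SCHOOL_STAFF s1
    INNER JOIN (
        SELECT MAX(LAST_UPDATE_DATE_TIME) AS LAST_UPDATE_DATE_TIME, SCHOOL_CODE
        FROM SCHOOL_STAFF
        GROUP BY SCHOOL_CODE
    ) s2
      ON s1.SCHOOL_CODE = s2.SCHOOL_CODE
     AND s1.LAST_UPDATE_DATE_TIME = s2.LAST_UPDATE_DATE_TIME
    ORDER BY s1.PERSON_ID
""")]

row_number_ids = latest_with("row_number")
rank_ids = latest_with("rank")

print(join_ids)             # both tied PERSON_IDs
print(rank_ids)             # both tied PERSON_IDs
print(len(row_number_ids))  # 1: only one of the tied rows survives
```

If a single row per school is required even when dates tie, adding a second column such as PERSON_ID to the order by makes the row_number() choice repeatable.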

I'm surprised nobody has taken advantage of window functions beyond row_number()

Here's some data to play with:

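CREATE TABLE SCHOOL_STAFF
(
-- DATE rather than VARCHAR, so ORDER BY sorts chronologically instead of alphabetically
LAST_UPDATE_DATE_TIME DATE,
SCHOOL_CODE VARCHAR(20),
PERSON_ID VARCHAR(20),
STAFF_TYPE_NAME VARCHAR(20)
);
-- With the default NLS_DATE_FORMAT these display as 24-JAN-13 and 09-FEB-12
INSERT INTO SCHOOL_STAFF VALUES (DATE '2013-01-24', 'ABE', '111222', 'Principal');
INSERT INTO SCHOOL_STAFF VALUES (DATE '2012-02-09', 'ABE', '222111', 'Principal');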

The OVER() clause defines the window of rows that each function call operates on. Here I am partitioning only on SCHOOL_CODE and ordering by LAST_UPDATE_DATE_TIME descending, so FIRST_VALUE returns the value from the most recent row in each school's partition. That value is repeated on every row for the SCHOOL_CODE.

It is important to pay close attention to your partitioning and ordering in the over() clause.

SELECT DISTINCT
 FIRST_VALUE(LAST_UPDATE_DATE_TIME) OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC) AS LAST_UPDATE
,FIRST_VALUE(SCHOOL_CODE)           OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC) AS SCHOOL_CODE
,FIRST_VALUE(PERSON_ID)             OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC) AS PERSON_ID
FROM SCHOOL_STAFF
WHERE STAFF_TYPE_NAME = 'Principal'
ORDER BY SCHOOL_CODE

Returns:

24-JAN-13   ABE 111222

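This should eliminate the need for GROUP BY and subqueries for the most part. Make sure to include DISTINCT, though: the window functions return a value for every input row, so without it you get one identical output row per row in the partition.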
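
The effect of DISTINCT can be checked by running the same FIRST_VALUE query with and without it. This is a sketch under assumptions: SQLite (3.25 or later) via Python's sqlite3 stands in for Oracle, dates are stored as ISO text, and first_values and OVER are hypothetical names used only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SCHOOL_STAFF (
        LAST_UPDATE_DATE_TIME TEXT,  -- ISO text so string order equals date order
        SCHOOL_CODE TEXT,
        PERSON_ID TEXT,
        STAFF_TYPE_NAME TEXT
    );
    INSERT INTO SCHOOL_STAFF VALUES ('2013-01-24', 'ABE', '111222', 'Principal');
    INSERT INTO SCHOOL_STAFF VALUES ('2012-02-09', 'ABE', '222111', 'Principal');
""")

# Same window for all three columns: per school, newest row first.
OVER = "OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC)"

def first_values(distinct):
    """Run the FIRST_VALUE query, optionally with DISTINCT, and return all rows."""
    keyword = "DISTINCT" if distinct else ""
    sql = f"""
        SELECT {keyword}
            FIRST_VALUE(LAST_UPDATE_DATE_TIME) {OVER} AS LAST_UPDATE,
            FIRST_VALUE(SCHOOL_CODE)           {OVER} AS SCHOOL_CODE,
            FIRST_VALUE(PERSON_ID)             {OVER} AS PERSON_ID
        FROM SCHOOL_STAFF
        WHERE STAFF_TYPE_NAME = 'Principal'
    """
    return conn.execute(sql).fetchall()

without_distinct = first_values(distinct=False)
with_distinct = first_values(distinct=True)

print(len(without_distinct))  # 2: the same values repeated for each input row
print(with_distinct)          # one row holding the newest principal for ABE
```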

Tags: Oracle, Greatest N Per Group
