Select row with most recent date by group
Whichever solution you use, first convert your date variable to class Date, as shown by @akrun:
df$date <- as.Date(df$date, '%m/%d/%Y')
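To see why the conversion matters, here is a minimal sketch using the three dates for ID 2 from the sample data in this thread. The variable names `raw` and `parsed` are only for illustration. While the column is still character, the strings are compared as text, so the "latest" value is decided by the month digits rather than the year.

```r
# Dates for ID 2 from the sample data, still stored as character strings
raw <- c('04/20/2002', '02/04/2005', '02/01/2008')

# Text comparison: '04/...' sorts after '02/...', so 2002 looks "latest"
max(raw)

# After conversion the comparison is chronological
parsed <- as.Date(raw, '%m/%d/%Y')
max(parsed)
which.max(parsed)  # position of the most recent date within the group
```

With the Date class in place, `which.max()` and `order()` in the solutions below work on real dates.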
Base R
df[
tapply(1:nrow(df),df$ID,function(ii) ii[which.max(df$date[ii])])
,]
This uses a selection of row numbers to subset the data. You can see the selection by running the middle line (between the []s) on its own.
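As a sketch of that suggestion, the snippet below runs the middle line on its own using the sample data from this thread. The names `idx` and `latest` are introduced here for illustration and are not part of the original answer.

```r
# Sample data from this thread, with the date column converted to Date
df <- data.frame(ID = rep(1:3, each = 3),
                 date = c('02/20/1989', '03/14/2001', '02/25/1990',
                          '04/20/2002', '02/04/2005', '02/01/2008',
                          '08/22/2011', '08/20/2009', '08/25/2010'),
                 stringsAsFactors = FALSE)
df$date <- as.Date(df$date, '%m/%d/%Y')

# For each ID, tapply passes that group's row numbers (ii) to the function;
# which.max finds the position of the latest date within the group,
# and ii[...] maps that position back to a row number of df.
idx <- tapply(1:nrow(df), df$ID, function(ii) ii[which.max(df$date[ii])])
idx  # one row number per ID (2, 6 and 7 for this data)

# Subsetting with those row numbers gives the most recent row per group
latest <- df[idx, ]
latest
```

Storing the row numbers first makes it easy to check which rows will be kept before subsetting.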
data.table
Similar to @rawr's:
library(data.table)
DT <- data.table(df)
unique(DT[order(date)], by="ID", fromLast=TRUE)
# or
unique(DT[order(-date)], by="ID")
You can try dplyr:
library(dplyr)
df %>%
group_by(ID) %>%
slice(which.max(as.Date(date, '%m/%d/%Y')))
data
df <- data.frame(ID= rep(1:3, each=3), date=c('02/20/1989',
'03/14/2001', '02/25/1990', '04/20/2002', '02/04/2005', '02/01/2008',
'08/22/2011','08/20/2009', '08/25/2010' ), stringsAsFactors=FALSE)