Earliest Date for each id in R
Another answer that uses dplyr's filter command:
dta %>%
group_by(id) %>%
filter(date == min(date))
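For readers who want to run the filter approach end to end, here is a minimal sketch. The data frame `dta` and its column names `id` and `date` are assumptions taken from the snippet above, and the sample values are purely illustrative. The dates are converted with `as.Date` so that `min()` compares them chronologically rather than as text.

```r
library(dplyr)

# Hypothetical sample data; the column names id/date are assumed from the snippet above
dta <- data.frame(
  id   = c("789", "123", "456", "123", "123", "456", "789"),
  date = as.Date(c("2016-05-01", "2016-07-02", "2016-08-25", "2015-12-11",
                   "2014-03-01", "2015-07-08", "2015-12-11"))
)

earliest <- dta %>%
  group_by(id) %>%
  filter(date == min(date)) %>%  # keeps every row tied for the earliest date in a group
  ungroup()

print(earliest)
```

Note that `filter(date == min(date))` returns more than one row for an id when several rows share the earliest date.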
You may use library(sqldf) to get the minimum date as follows:
data1<-data.frame(id=c("789","123","456","123","123","456","789"),
e_date=c("2016-05-01","2016-07-02","2016-08-25","2015-12-11","2014-03-01","2015-07-08","2015-12-11"))
library(sqldf)
data2 = sqldf("SELECT id,
min(e_date) as 'earliest_date'
FROM data1 GROUP BY 1", method = "name__class")
head(data2)
id earliest_date
123 2014-03-01
456 2015-07-08
789 2015-12-11
We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(data_full)), order the rows by 'e_date', and, grouped by 'id', take the first row (head(.SD, 1L)).
library(data.table)
setDT(data_full)[order(e_date), head(.SD, 1L), by = id]
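Since `data_full` is not defined in the answer, the following is a self-contained sketch of the same call. The sample values and the extra `visit` column are hypothetical; the extra column is there only to show that the whole row is returned, not just the date.

```r
library(data.table)

# Hypothetical sample data; 'visit' is an illustrative extra column
data_full <- data.frame(
  id     = c("789", "123", "456", "123", "123", "456", "789"),
  e_date = as.Date(c("2016-05-01", "2016-07-02", "2016-08-25", "2015-12-11",
                     "2014-03-01", "2015-07-08", "2015-12-11")),
  visit  = 1:7
)

# Sort all rows by e_date first, then take the first row within each id
res <- setDT(data_full)[order(e_date), head(.SD, 1L), by = id]

print(res)
```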
Or using dplyr: after grouping by 'id', arrange by 'e_date' (assuming it is of Date class) and get the first row of each group with slice.
library(dplyr)
data_full %>%
group_by(id) %>%
arrange(e_date) %>%
slice(1L)
If we need a base R option, ave can be used:
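# ave() returns a numeric 0/1 vector here, so compare with 1 to get a logical row index
data_full[with(data_full, ave(as.numeric(e_date), id, FUN = function(x) rank(x, ties.method = "first") == 1)) == 1, ]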
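Another base R possibility, offered as a sketch and not as part of the original answer, is to sort by id and date and then keep the first row seen for each id using `duplicated`. The sample data below is hypothetical, and `e_date` is assumed to be of Date class.

```r
# Hypothetical sample data with e_date stored as Date
data_full <- data.frame(
  id     = c("789", "123", "456", "123", "123", "456", "789"),
  e_date = as.Date(c("2016-05-01", "2016-07-02", "2016-08-25", "2015-12-11",
                     "2014-03-01", "2015-07-08", "2015-12-11"))
)

# Sort so that the earliest date comes first within each id
sorted <- data_full[order(data_full$id, data_full$e_date), ]

# duplicated() flags every repeat of an id, so negating it keeps the first row per id
earliest <- sorted[!duplicated(sorted$id), ]

print(earliest)
```

This always returns exactly one row per id, even when several rows share the earliest date.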