How to duplicate last row by group (ID)?
Another alternative using slice
:
library(dplyr)
DF %>%
group_by(ActivityID) %>%
slice(c(1:n(),n()))
which gives:
# A tibble: 9 x 9 # Groups: ActivityID [3] ActivityID CareType HCWType Orientation Surface Date Time Dev.Date.Time SurfaceCategori~ <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> 1 1 IV RN01 leftFacing AlcOutside 2019-08-~ 11:08:01 2019-08-03 11:08~ HygieneArea 2 1 IV RN01 leftFacing In 2019-08-~ 11:08:12 2019-08-03 11:08~ In 3 1 IV RN01 leftFacing Door 2019-08-~ 11:08:12 2019-08-03 11:08~ FarPatient 4 1 IV RN01 leftFacing Door 2019-08-~ 11:08:12 2019-08-03 11:08~ FarPatient 5 2 IV RN01 leftFacing Door 2019-08-~ 11:08:18 2019-08-03 11:08~ FarPatient 6 2 IV RN01 leftFacing Other 2019-08-~ 11:08:22 2019-08-03 11:08~ FarPatient 7 2 IV RN01 leftFacing Other 2019-08-~ 11:08:22 2019-08-03 11:08~ FarPatient 8 3 IV RN01 leftFacing Table 2019-08-~ 11:10:26 2019-08-03 11:10~ NearPatient 9 3 IV RN01 leftFacing Table 2019-08-~ 11:10:26 2019-08-03 11:10~ NearPatient
Two base R alternatives:
# one
lastrows <- cumsum(aggregate(CareType ~ ActivityID, DF, length)[[2]])
DF[sort(c(seq(nrow(DF)), lastrows)),]
# two
idx <- unlist(tapply(1:nrow(DF), DF$ActivityID, FUN = function(x) c(x, tail(x, 1))))
DF[idx,]
which both give the same result.
Two data.table alternatives:
library(data.table)
setDT(DF) # convert 'DF' to a data.table
# one
DF[DF[, .I[c(1:.N,.N)], by = ActivityID]$V1]
# two
DF[, .SD[c(1:.N,.N)], by = ActivityID]
Used data:
DF <- structure(list(ActivityID = c(1L, 1L, 1L, 2L, 2L, 3L),
CareType = c("IV", "IV", "IV", "IV", "IV", "IV"),
HCWType = c("RN01", "RN01", "RN01", "RN01", "RN01", "RN01"),
Orientation = c("leftFacing", "leftFacing", "leftFacing", "leftFacing", "leftFacing", "leftFacing"),
Surface = c("AlcOutside", "In", "Door", "Door", "Other", "Table"),
Date = c("2019-08-03", "2019-08-03", "2019-08-03", "2019-08-03", "2019-08-03", "2019-08-03"),
Time = c("11:08:01", "11:08:12", "11:08:12", "11:08:18", "11:08:22", "11:10:26"),
Dev.Date.Time = c("2019-08-03 11:08:01", "2019-08-03 11:08:12", "2019-08-03 11:08:12", "2019-08-03 11:08:18", "2019-08-03 11:08:22", "2019-08-03 11:10:26"),
SurfaceCategories = c("HygieneArea", "In", "FarPatient", "FarPatient", "FarPatient", "NearPatient")),
class = "data.frame", row.names = c(NA, -6L))
One dplyr
and tidyr
possibility could be (using sample data from @Jaap):
DF %>%
group_by(ActivityID) %>%
uncount((row_number() == max(row_number())) + 1)
ActivityID CareType HCWType Orientation Surface Date Time Dev.Date.Time SurfaceCategori…
<int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 IV RN01 leftFacing AlcOutsi… 2019-08… 11:08… 2019-08-03 11:… HygieneArea
2 1 IV RN01 leftFacing In 2019-08… 11:08… 2019-08-03 11:… In
3 1 IV RN01 leftFacing Door 2019-08… 11:08… 2019-08-03 11:… FarPatient
4 1 IV RN01 leftFacing Door 2019-08… 11:08… 2019-08-03 11:… FarPatient
5 2 IV RN01 leftFacing Door 2019-08… 11:08… 2019-08-03 11:… FarPatient
6 2 IV RN01 leftFacing Other 2019-08… 11:08… 2019-08-03 11:… FarPatient
7 2 IV RN01 leftFacing Other 2019-08… 11:08… 2019-08-03 11:… FarPatient
8 3 IV RN01 leftFacing Table 2019-08… 11:10… 2019-08-03 11:… NearPatient
9 3 IV RN01 leftFacing Table 2019-08… 11:10… 2019-08-03 11:… NearPatient
Or:
DF %>%
group_by(ActivityID) %>%
uncount((row_number() == n()) + 1)
If we want to repeat only last row for each group it is enough to know last row number for each group. We can duplicated
with fromLast
argument as TRUE
to get those row numbers and then add them with current rows. Using @Jaap's data
DF[sort(c(seq_len(nrow(DF)), which(!duplicated(DF$ActivityID, fromLast = TRUE)))),]
# ActivityID CareType HCWType Orientation Surface Date Time Dev.Date.Time SurfaceCategories
#1 1 IV RN01 leftFacing AlcOutside 2019-08-03 11:08:01 2019-08-03 11:08:01 HygieneArea
#2 1 IV RN01 leftFacing In 2019-08-03 11:08:12 2019-08-03 11:08:12 In
#3 1 IV RN01 leftFacing Door 2019-08-03 11:08:12 2019-08-03 11:08:12 FarPatient
#3.1 1 IV RN01 leftFacing Door 2019-08-03 11:08:12 2019-08-03 11:08:12 FarPatient
#4 2 IV RN01 leftFacing Door 2019-08-03 11:08:18 2019-08-03 11:08:18 FarPatient
#5 2 IV RN01 leftFacing Other 2019-08-03 11:08:22 2019-08-03 11:08:22 FarPatient
#5.1 2 IV RN01 leftFacing Other 2019-08-03 11:08:22 2019-08-03 11:08:22 FarPatient
#6 3 IV RN01 leftFacing Table 2019-08-03 11:10:26 2019-08-03 11:10:26 NearPatient
#6.1 3 IV RN01 leftFacing Table 2019-08-03 11:10:26 2019-08-03 11:10:26 NearPatient