Update rows of data frame in R
dplyr 1.0.0
introduced a family of SQL-inspired functions for modifying rows. In this case you can now use rows_update()
:
library(dplyr)
df1 %>%
rows_update(df2, by = "ID")
ID Measurement1 Measurement2
1 1 45 104
2 2 10 11
3 3 23 99
4 4 21 22
We can use match
to get the row index. Using that index to subset the rows, we replace the 2nd and 3rd columns of the first dataset with the corresponding columns of second dataset.
ind <- match(df2$ID, df1$ID)
df1[ind, 2:3] <- df2[2:3]
df1
# ID Measurement1 Measurement2
#1 1 45 104
#2 2 10 11
#3 3 23 99
#4 4 21 22
Or we can use data.table
to join the dataset on
the 'ID' column (after converting the first dataset to 'data.table' i.e. setDT(df1)
), and assign the 'Cols' with the 'iCols' from the second dataset.
library(data.table)#v1.9.6+
Cols <- names(df1)[-1]
iCols <- paste0('i.', Cols)
setDT(df1)[df2, (Cols) := mget(iCols), on= 'ID'][]
# ID Measurement1 Measurement2
#1: 1 45 104
#2: 2 10 11
#3: 3 23 99
#4: 4 21 22
data
df1 <- structure(list(ID = 1:4, Measurement1 = c(45L, 34L, 23L, 56L),
Measurement2 = c(104L, 87L, 99L, 67L)), .Names = c("ID",
"Measurement1", "Measurement2"), class = "data.frame",
row.names = c(NA, -4L))
df2 <- structure(list(ID = c(2L, 4L), Measurement1 = c(10L, 21L),
Measurement2 = c(11L,
22L)), .Names = c("ID", "Measurement1", "Measurement2"),
class = "data.frame", row.names = c(NA, -2L))
library(dplyr)
df1 %>%
anti_join(df2, by = "ID") %>%
bind_rows(df2) %>%
arrange(ID)