Check if each row of a data frame is contained in another data frame
One way is to paste the rows together and compare them with %in%. The result is a logical vector the length of nrow(df1), as requested.
do.call(paste0, df1) %in% do.call(paste0, df2)
# [1] TRUE TRUE TRUE
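One caveat worth illustrating: paste0 joins the columns with no separator, so two different rows can collapse to the same string. The sketch below uses made-up data frames (this df1 and df2 are illustrative, not the ones from the question) and passes a separator to paste, assuming the separator character never occurs in the data.

```r
# Hypothetical data, chosen so that two different rows paste to the same string
df1 <- data.frame(a = c(1, 12), b = c(23, 3))
df2 <- data.frame(a = c(1, 5),  b = c(23, 6))

# Without a separator, rows (1, 23) and (12, 3) both become "123",
# so the second row of df1 is reported as a match even though it is not in df2
do.call(paste0, df1) %in% do.call(paste0, df2)
# [1] TRUE TRUE

# With a separator the columns stay distinguishable
# (assumption: "|" does not appear in any value)
key1 <- do.call(paste, c(df1, sep = "|"))
key2 <- do.call(paste, c(df2, sep = "|"))
key1 %in% key2
# [1]  TRUE FALSE
```

The separator variant costs little and avoids this kind of false positive when values have varying widths.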
Try:
Filter(function(x) x > 0, which(duplicated(rbind(df2, df1))) - nrow(df2))
It will tell you which row numbers in df1 occur in df2. If you want an atomic vector of logicals like in Richard Scriven's answer, try:
duplicated(rbind(df2, df1))[-seq_len(nrow(df2))]
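To make the indexing concrete, here is a small worked example on made-up data (these data frames are illustrative, not the ones from the question). It also shows a limitation that seems worth knowing: because duplicated looks at all earlier rows of the stacked data frame, a row repeated inside df1 is flagged even when it is absent from df2.

```r
# Hypothetical data: rows 1 and 3 of df1 also appear in df2
df1 <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"), stringsAsFactors = FALSE)
df2 <- data.frame(x = c(3, 1, 9), y = c("c", "a", "z"), stringsAsFactors = FALSE)

# Stack df2 on top of df1; a df1 row is "duplicated" if it was already seen above
dup <- duplicated(rbind(df2, df1))

# Row numbers of df1 that occur in df2
rows_in_df2 <- Filter(function(x) x > 0, which(dup) - nrow(df2))
rows_in_df2
# [1] 1 3

# The same information as a logical vector of length nrow(df1)
in_df2 <- dup[-seq_len(nrow(df2))]
in_df2
# [1]  TRUE FALSE  TRUE

# Limitation: a row repeated within df1 is flagged even though df2 lacks it
df1_rep <- data.frame(x = c(7, 7), y = c("q", "q"), stringsAsFactors = FALSE)
duplicated(rbind(df2, df1_rep))[-seq_len(nrow(df2))]
# [1] FALSE  TRUE
```

If df1 may contain repeated rows, the result is only reliable for the first occurrence of each row.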
It is also faster, since it uses an internal C function, duplicated (mine is rowcheck2):
> microbenchmark(rowcheck(df1, df2), rowcheck2(df1, df2))
Unit: milliseconds
                expr      min       lq   median       uq       max neval
  rowcheck(df1, df2) 2.045210 2.169182 2.328296 3.539328 13.971517   100
 rowcheck2(df1, df2) 1.046207 1.112395 1.243390 1.727921  7.442499   100
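The post does not show how rowcheck and rowcheck2 were defined or what data was timed. A plausible setup, assuming rowcheck wraps the paste approach and rowcheck2 wraps the duplicated approach, might look like the sketch below. The function bodies and the test data are assumptions, and timings will differ by machine and data size.

```r
# Assumed definitions; the original post does not include them
rowcheck  <- function(df1, df2) do.call(paste0, df1) %in% do.call(paste0, df2)
rowcheck2 <- function(df1, df2) duplicated(rbind(df2, df1))[-seq_len(nrow(df2))]

# Hypothetical test data: two single-letter columns
set.seed(1)
make_df <- function(n) {
  data.frame(a = sample(letters, n, replace = TRUE),
             b = sample(letters, n, replace = TRUE),
             stringsAsFactors = FALSE)
}
df2 <- make_df(200)
df1 <- unique(make_df(200))  # no repeated rows, so both functions should agree
rownames(df1) <- NULL

# Sanity check before timing: both approaches give the same answer here
stopifnot(identical(rowcheck(df1, df2), rowcheck2(df1, df2)))

# Time both if the microbenchmark package is installed
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(rowcheck(df1, df2),
                                       rowcheck2(df1, df2),
                                       times = 100))
}
```

Checking that the two functions agree before timing them guards against benchmarking a faster but incorrect variant.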