How to subtract two comma-separated columns in R?
Another option uses vecsets::vsetdiff, which, unlike base setdiff,
doesn't remove duplicates:
library(dplyr)
library(tidyr)
library(purrr)
library(vecsets)
dt %>%
  mutate(x = strsplit(ColumnA, ","),
         y = strsplit(ColumnB, ",")) %>%
  mutate(z = map2(x, y, vecsets::vsetdiff))
      ColumnA ColumnB                x       y       z
1 A,B,C,A,A,A   A,C,A A, B, C, A, A, A A, C, A B, A, A
2       A,B,C       C          A, B, C       C    A, B
Note that x, y and z are list columns (which I created on purpose, so that map2 can work on each row's vectors), but the data might be easier to work with that way anyway.
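If a plain comma-separated string is preferred over a list column, the result can be collapsed back with paste. This is only a sketch: the sample dt below is reconstructed from the output shown above, and the column name Result is a made-up name for illustration.

```r
library(dplyr)
library(purrr)
library(vecsets)

# Sample data reconstructed from the output above (an assumption about dt)
dt <- data.frame(ColumnA = c("A,B,C,A,A,A", "A,B,C"),
                 ColumnB = c("A,C,A", "C"),
                 stringsAsFactors = FALSE)

res <- dt %>%
  mutate(Result = map2_chr(strsplit(ColumnA, ","),
                           strsplit(ColumnB, ","),
                           # duplicate-preserving difference, then re-join with commas
                           ~ paste(vecsets::vsetdiff(.x, .y), collapse = ",")))
res
```

map2_chr returns an ordinary character vector, so Result is a regular column rather than a list column.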
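A base R alternative that removes the items from the string with sub(), one at a time:
sapply(seq_len(nrow(dt)), function(i){
  a = dt$ColumnA[i]
  b = unlist(strsplit(dt$ColumnB[i], ","))
  # remove one occurrence from a for each item of b
  for (x in b){
    a = sub(paste0(x, ",?"), "", a)
  }
  # drop the trailing comma left behind when the last item was removed
  sub(",$", "", a)
})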
#[1] "B,A,A" "A,B"
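One caveat with the sub() approach is that each item is treated as a regular expression and can match inside a longer item (for example "A" inside "AB"). A possible variant, sketched below, splits both columns first and removes whole items by position. The helper names multiset_diff and subtract_csv are hypothetical, and the sample dt is again reconstructed from the example above.

```r
# Hypothetical helper: drop one occurrence from `a` for each element of `b`
multiset_diff <- function(a, b) {
  for (x in b) {
    i <- match(x, a)              # index of the first remaining match, NA if none
    if (!is.na(i)) a <- a[-i]
  }
  a
}

# Hypothetical wrapper: apply the helper row by row to two comma-separated vectors
subtract_csv <- function(a, b) {
  mapply(function(s, t) paste(multiset_diff(s, t), collapse = ","),
         strsplit(as.character(a), ",", fixed = TRUE),
         strsplit(as.character(b), ",", fixed = TRUE),
         USE.NAMES = FALSE)
}

# Sample data reconstructed from the example above (an assumption about dt)
dt <- data.frame(ColumnA = c("A,B,C,A,A,A", "A,B,C"),
                 ColumnB = c("A,C,A", "C"),
                 stringsAsFactors = FALSE)

subtract_csv(dt$ColumnA, dt$ColumnB)
#[1] "B,A,A" "A,B"
```

Because items are compared as whole strings with match(), multi-character items and items containing regex metacharacters are handled the same way as single letters.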