Creating one variable from a list of variables in R?
Using base R, we can use sapply and grepl to check every column for the pattern, then assign 1 to rows that have at least one match.
df$indicator <- as.integer(rowSums(sapply(df, grepl, pattern = code_regex)) > 0)
df
# c1 c2 c3 indicator
#1 T1 R4 C5 1
#2 X1 C6 C2 0
#3 T6 C7 X4 0
#4 R5 X3 T2 1
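To run the snippet above end to end, here is a minimal sketch. The data frame is copied from the printed output; the code list and the resulting code_regex are not shown in this answer, so the values below are assumptions chosen only because they reproduce that output.

```r
# Data copied from the printed output above.
df <- data.frame(
  c1 = c("T1", "X1", "T6", "R5"),
  c2 = c("R4", "C6", "C7", "X3"),
  c3 = c("C5", "C2", "X4", "T2"),
  stringsAsFactors = FALSE
)

# Hypothetical code list (an assumption, not taken from the answer).
codes <- c("T1", "T2")
code_regex <- paste(codes, collapse = "|")  # "T1|T2"

# sapply() returns one logical column per input column;
# rowSums() > 0 flags rows where any column matched.
matches <- sapply(df, grepl, pattern = code_regex)
df$indicator <- as.integer(rowSums(matches) > 0)
df
```

Because the pattern is unanchored, "T1" would also match a value such as "T10"; wrapping the alternatives as "^(T1|T2)$" is one way to require exact matches.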
If there are other columns and we want to apply this only to columns whose names start with "c", we can use grep to select them.
cols <- grep("^c", names(df))
as.integer(rowSums(sapply(df[cols], grepl, pattern = code_regex)) > 0)
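A small sketch of why the column filter can matter. The id column and the code_regex value are hypothetical, added only to show an unrelated column producing a false match.

```r
df <- data.frame(
  id = c("T1", "T2", "T3", "T4"),  # hypothetical ID column that happens to look like a code
  c1 = c("T1", "X1", "T6", "R5"),
  c2 = c("R4", "C6", "C7", "X3"),
  c3 = c("C5", "C2", "X4", "T2"),
  stringsAsFactors = FALSE
)
code_regex <- "T1|T2"  # assumed pattern, not taken from the answer

# Searching every column also picks up matches in id.
all_cols <- as.integer(rowSums(sapply(df, grepl, pattern = code_regex)) > 0)

# Restricting to columns whose names start with "c" avoids that.
cols <- grep("^c", names(df))
c_only <- as.integer(rowSums(sapply(df[cols], grepl, pattern = code_regex)) > 0)

all_cols  # 1 1 0 1
c_only    # 1 0 0 1
```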
Using dplyr, we can do:
library(dplyr)
df$indicator <- as.integer(df %>%
  mutate_at(vars(c1:c3), ~grepl(code_regex, .)) %>%
  rowSums() > 0)
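mutate_at() has been superseded by across() since dplyr 1.0.0. As a possible alternative, here is a sketch using if_any(), assuming dplyr 1.0.4 or later is installed. The code_regex value is again a hypothetical stand-in.

```r
library(dplyr)

df <- data.frame(
  c1 = c("T1", "X1", "T6", "R5"),
  c2 = c("R4", "C6", "C7", "X3"),
  c3 = c("C5", "C2", "X4", "T2"),
  stringsAsFactors = FALSE
)
code_regex <- "T1|T2"  # assumed pattern, not taken from the answer

# if_any() is TRUE for a row when the predicate holds in at least one
# of the selected columns, so no rowSums() step is needed.
result <- df %>%
  mutate(indicator = as.integer(if_any(c1:c3, ~ grepl(code_regex, .x))))

result$indicator
```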
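We can use tidyverse. Here reduce(`+`) adds up the matches per row, so the result is a count rather than a strict 0/1 flag; it equals the indicator in this example because no row has more than one match.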
library(tidyverse)
df %>%
  mutate_all(str_detect, pattern = code_regex) %>%
  reduce(`+`) %>%
  mutate(df, indicator = .)
# c1 c2 c3 indicator
#1 T1 R4 C5 1
#2 X1 C6 C2 0
#3 T6 C7 X4 0
#4 R5 X3 T2 1
Or using base R, where Reduce returns the number of matching columns per row:
Reduce(`+`, lapply(df, grepl, pattern = code_regex))
#[1] 1 0 0 1
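If a row can match in more than one column, summing gives a count above 1. A minimal sketch of the difference, using a hypothetical two-column data frame and an assumed code_regex: reducing with `|` instead of `+` keeps the result a 0/1 indicator.

```r
# Hypothetical data: the first row matches in both columns.
df <- data.frame(
  c1 = c("T1", "X1"),
  c2 = c("T2", "C6"),
  stringsAsFactors = FALSE
)
code_regex <- "T1|T2"  # assumed pattern, not taken from the answer

hits <- lapply(df, grepl, pattern = code_regex)

counts <- Reduce(`+`, hits)              # matches per row: 2 0
flags  <- as.integer(Reduce(`|`, hits))  # any match per row: 1 0
```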