Subset a table by columns and rows using a named vector in R
I am not sure whether this is what you are after, but you could try something like the following:
u <- split(myVector,names(myVector))
eval(str2expression(sprintf("diamonds %%>%% filter(%s)",paste0(sapply(names(u),function(x) paste0(x," %in% u$",x)),collapse = " & "))))
which gives
> eval(str2expression(sprintf("diamonds %%>%% filter(%s)",paste0(sapply(names(u),function(x) paste0(x," %in% u$",x)),collapse = " & "))))
# A tibble: 6,039 x 10
carat cut color clarity depth table price x y z
<dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43
2 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31
3 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75
4 0.3 Good J SI1 64 55 339 4.25 4.28 2.73
5 0.23 Ideal J VS1 62.8 56 340 3.93 3.9 2.46
6 0.31 Ideal J SI2 62.2 54 344 4.35 4.37 2.71
7 0.3 Good J SI1 63.4 54 351 4.23 4.29 2.7
8 0.3 Good J SI1 63.8 56 351 4.23 4.26 2.71
9 0.23 Good E VS1 64.1 59 402 3.83 3.85 2.46
10 0.33 Ideal J SI1 61.1 56 403 4.49 4.55 2.76
# ... with 6,029 more rows
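To make it easier to see what the one-liner evaluates, here is a minimal sketch that builds the same condition string step by step and applies it to a small data frame in base R. The names my_vector and toy are hypothetical stand-ins (the question's myVector is not reproduced in this answer), and the values are only assumed to resemble it.

```r
# Hypothetical named vector; assumed to resemble the question's myVector
my_vector <- c(cut = "Ideal", cut = "Good", color = "E", color = "J")

# Hypothetical toy data standing in for diamonds
toy <- data.frame(cut   = c("Ideal", "Fair", "Good", "Ideal"),
                  color = c("E", "E", "D", "J"))

# One list element per distinct name
u <- split(my_vector, names(my_vector))

# One "<column> %in% u$<column>" clause per name, joined with " & "
clauses   <- sapply(names(u), function(x) paste0(x, " %in% u$", x))
condition <- paste0(clauses, collapse = " & ")
print(condition)

# Parse the string and evaluate it with the columns of toy in scope;
# u is looked up in the calling environment
keep <- eval(str2expression(condition), envir = toy)
print(toy[keep, ])
```

Because the filter is assembled as text and then parsed, column names that are not syntactically valid R names would need backticks for this to work.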
Starting with the split idea of ThomasIsCoding, slightly changed, here is a base R solution that uses Reduce/Map to create a logical index.
v <- split(unname(myVector), names(myVector))
i <- Reduce('&', Map(function(x, y){x %in% y}, diamonds[names(v)], v))
diamonds[i, ]
## A tibble: 6,039 x 10
# carat cut color clarity depth table price x y z
# <dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
# 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43
# 2 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31
# 3 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75
# 4 0.3 Good J SI1 64 55 339 4.25 4.28 2.73
# 5 0.23 Ideal J VS1 62.8 56 340 3.93 3.9 2.46
# 6 0.31 Ideal J SI2 62.2 54 344 4.35 4.37 2.71
# 7 0.3 Good J SI1 63.4 54 351 4.23 4.29 2.7
# 8 0.3 Good J SI1 63.8 56 351 4.23 4.26 2.71
# 9 0.23 Good E VS1 64.1 59 402 3.83 3.85 2.46
#10 0.33 Ideal J SI1 61.1 56 403 4.49 4.55 2.76
## ... with 6,029 more rows
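To show what Map and Reduce each contribute, here is the same idea on a small data frame, with the intermediate per-column result kept. The names toy and my_vector are hypothetical and the values are assumptions for illustration only.

```r
# Hypothetical toy data standing in for diamonds
toy <- data.frame(cut   = c("Ideal", "Fair", "Good", "Ideal"),
                  color = c("E", "E", "D", "J"),
                  price = c(326, 337, 327, 340))

# Hypothetical named vector; assumed to resemble the question's myVector
my_vector <- c(cut = "Ideal", cut = "Good", color = "E", color = "J")

# One list element per column name, holding the values to keep
v <- split(unname(my_vector), names(my_vector))

# Map pairs each selected column with its allowed values and
# returns one logical vector per column
per_column <- Map(function(x, y) x %in% y, toy[names(v)], v)
print(per_column)

# Reduce combines the per-column vectors with an element-wise AND
i <- Reduce(`&`, per_column)
print(toy[i, ])
```

A row is kept only if every column named in the vector matches one of its allowed values.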
Package dplyr
The code above can be written as a function and used in dplyr::filter.
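# Input:
# X - a data set to be filtered
# values - a named vector; the names are columns of X, the elements are the values to keep
# Output:
# a logical vector with one element per row of X
values_in <- function(X, values){
  v <- split(unname(values), names(values))
  i <- Reduce('&', Map(function(x, y){x %in% y}, X[names(v)], v))
  i
}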
diamonds %>% filter( values_in(., myVector) )
The output is the same as above and is therefore omitted.
Combining the approaches proposed by @Roman (generating all combinations of the vector elements and joining) and @ThomasIsCoding (splitting the vector) seems to do the trick:
data.frame(split(myVector, names(myVector))) %>%
expand.grid() %>%
inner_join(diamonds[,unique(names(myVector))])
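For comparison, here is a rough base R sketch of the same combine-and-join idea on a small data frame. It swaps in merge for inner_join and passes the split list straight to expand.grid, which should also cope with names that carry different numbers of values. The names toy and my_vector are hypothetical and the values are assumptions.

```r
# Hypothetical toy data standing in for diamonds
toy <- data.frame(cut   = c("Ideal", "Fair", "Good", "Ideal"),
                  color = c("E", "E", "D", "J"),
                  price = c(326, 337, 327, 340),
                  stringsAsFactors = FALSE)

# Hypothetical named vector; assumed to resemble the question's myVector
my_vector <- c(cut = "Ideal", cut = "Good", color = "E", color = "J")

# All combinations of the allowed values, one column per distinct name
combos <- expand.grid(split(unname(my_vector), names(my_vector)),
                      stringsAsFactors = FALSE)
print(combos)

# merge() joins on the shared column names and keeps only matching rows,
# i.e. it behaves like an inner join here
result <- merge(combos, toy[, unique(names(my_vector))])
print(result)
```

As in the pipeline above, only the columns named in the vector are returned, since the table is subset by columns before the join.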