Passing strings as arguments in dplyr verbs
In the next version of dplyr, it will probably work like this:
condition <- quote(dist > 50)
# dist is a column of cars, not mtcars.
# Note: filter_() has since been deprecated in dplyr.
cars %>%
  filter_(condition)
Since these 2014 answers, two new ways are possible using rlang's quasiquotation.
Conventional hard-coded filter statement. For the sake of comparison, the statement dist > 50 is included directly in dplyr::filter().
library(magrittr)
# The filter statement is hard-coded inside the function.
cars_subset_0 <- function() {
  cars %>%
    dplyr::filter(dist > 50)
}
cars_subset_0()
results:
speed dist
1 14 60
2 14 80
3 15 54
4 18 56
...
17 25 85
rlang approach with NSE (nonstandard evaluation). As described in the Programming with dplyr vignette, the statement dist > 50 is processed by rlang::enquo(), which "uses some dark magic to look at the argument, see what the user typed, and return that value as a quosure". Then rlang's !! unquotes the input "so that it’s evaluated immediately in the surrounding context".
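# The filter statement is evaluated with NSE.
cars_subset_1 <- function(filter_statement) {
  # Capture the caller's expression and its environment as a quosure.
  filter_statement_en <- rlang::enquo(filter_statement)
  # as_label() turns the quosure into a readable string for the message.
  message("filter statement: `", rlang::as_label(filter_statement_en), "`.")
  # Print the quosure itself to show the expression and its environment.
  print(filter_statement_en)

  cars %>%
    dplyr::filter(!!filter_statement_en)
}
cars_subset_1(dist > 50)
results:
filter statement: `dist > 50`.
<quosure>
expr: ^dist > 50
env: global
speed dist
1 14 60
2 14 80
3 15 54
4 18 56
...
17 25 85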
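As a shorter variant of the enquo() and !! pair above, rlang 0.4.0 and later provide the embrace operator {{ }}, which dplyr verbs understand. The following is a minimal sketch, assuming a reasonably recent dplyr is installed; the name cars_subset_1b is hypothetical and not part of the original answer.

```r
library(dplyr)

# Hypothetical variant of cars_subset_1(): {{ }} captures and unquotes
# the argument in one step, replacing enquo() followed by !!.
cars_subset_1b <- function(filter_statement) {
  cars %>%
    dplyr::filter({{ filter_statement }})
}

result_1b <- cars_subset_1b(dist > 50)
nrow(result_1b)  # 17 rows, the same subset as the results above
```

The explicit enquo() form is still useful when the captured expression needs to be inspected or labelled before it is used, as in the message above.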
rlang approach passing a string. The statement "dist > 50" is passed to the function as an explicit string, parsed as an expression by rlang::parse_expr(), then unquoted by !!.
# The filter statement is passed as a string.
cars_subset_2 <- function(filter_statement) {
  # Parse the string into an unevaluated expression.
  filter_statement_expr <- rlang::parse_expr(filter_statement)
  # Report the original string; pasting the parsed call garbles it (`>dist50`).
  message("filter statement: `", filter_statement, "`.")

  cars %>%
    dplyr::filter(!!filter_statement_expr)
}
cars_subset_2("dist > 50")
results:
filter statement: `dist > 50`.
speed dist
1 14 60
2 14 80
3 15 54
4 18 56
...
17 25 85
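To make the parsing step concrete, the sketch below checks that rlang::parse_expr() turns the string into an unevaluated call, the same kind of object that quote() produces, which is why !! can splice it into filter(). The variable names are illustrative only. Because the string is parsed as R code, this approach is probably best reserved for trusted input.

```r
library(dplyr)

# Illustrative names; not part of the original answer.
condition_string <- "dist > 50"
condition_expr   <- rlang::parse_expr(condition_string)

class(condition_expr)    # a "call" object, not a character string
deparse(condition_expr)  # converts the call back to text

# Splice the parsed call and compare with the hard-coded filter.
from_string <- cars %>% dplyr::filter(!!condition_expr)
hard_coded  <- cars %>% dplyr::filter(dist > 50)
all.equal(from_string, hard_coded)
```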
Things are simpler with dplyr::select(). Explicit strings need only !!.
# The select statement is passed as a string.
cars_subset_2b <- function(select_statement) {
  cars %>%
    dplyr::select(!!select_statement)
}
cars_subset_2b("dist")
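For column names held in a character vector, dplyr 1.0.0 and later (via tidyselect) also offer all_of(), which states explicitly that the strings are column names. This is a sketch under that version assumption; cars_subset_2c is a hypothetical name, not part of the original answer.

```r
library(dplyr)

# Hypothetical variant of cars_subset_2b() using all_of() instead of !!.
cars_subset_2c <- function(select_statement) {
  cars %>%
    dplyr::select(dplyr::all_of(select_statement))
}

result_2c <- cars_subset_2c("dist")
names(result_2c)  # "dist"
```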