case_when in mutate pipe
In my case, quasiquotation helped a lot. You can create in advance a set of quoted formulas that define the mutation rules (either using known column names, as in the first formula, or using !! to build rules dynamically, as in the second formula) and then use that set within a mutate + case_when combination, like here:
library(dplyr)
library(rlang)
pattern <- quos(gear == 3L ~ "three", !!sym("gear") == 4L ~ "four", gear == 5L ~ "five")
# Or
# pattern <- list(
# quo(gear == 3L ~ "three"),
# quo(!!sym("gear") == 4L ~ "four"),
# quo(gear == 5L ~ "five"))
#
mtcars %>% mutate(test = case_when(!!!pattern)) %>% head(10L)
#> mpg cyl disp hp drat wt qsec vs am gear carb test
#> 1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 four
#> 2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 four
#> 3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 four
#> 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 three
#> 5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 three
#> 6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 three
#> 7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 three
#> 8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 four
#> 9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 four
#> 10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 four
I prefer this solution because it allows creating complex rules programmatically, e.g. using map2 to pair the LHS comparison values with the RHS labels and generate the quoted formulas:
library(rlang)
library(purrr)
map2(c(3, 4, 5), c("three", "four", "five"), ~quo(gear == !!.x ~ !!.y))
#> [[1]]
#> <quosure>
#> expr: ^gear == 3 ~ "three"
#> env: 0000000014286520
#>
#> [[2]]
#> <quosure>
#> expr: ^gear == 4 ~ "four"
#> env: 000000001273D0E0
#>
#> [[3]]
#> <quosure>
#> expr: ^gear == 5 ~ "five"
#> env: 00000000125870E0
The generated list of rules can then be reused in different places and applied to different data sets, without the need to manually type in all the rules every time you need a complex mutation.
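To illustrate the reuse described above, here is a minimal sketch. The helper name make_rules and its arguments are hypothetical (they are not part of dplyr, purrr or rlang); the sketch assumes the column is given as a string and that each value maps to exactly one label.

```r
library(dplyr)
library(purrr)
library(rlang)

# Hypothetical helper: build one quoted formula per (value, label) pair
# for a column whose name is supplied as a string.
make_rules <- function(column, values, labels) {
  col <- sym(column)
  map2(values, labels, ~ quo(!!col == !!.x ~ !!.y))
}

# The same helper produces rule sets for two different columns.
gear_rules <- make_rules("gear", c(3, 4, 5), c("three", "four", "five"))
cyl_rules  <- make_rules("cyl",  c(4, 6, 8), c("four", "six", "eight"))

# Splice each rule set into case_when() with !!!.
result <- mtcars %>%
  mutate(
    gear_label = case_when(!!!gear_rules),
    cyl_label  = case_when(!!!cyl_rules)
  )

head(result[, c("gear", "gear_label", "cyl", "cyl_label")], 3)
```

Because the rules are plain R objects, they can be stored in a list, passed to functions, or applied to any other data frame that has the named column.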
As a final answer to the original problem, seven additional characters (!!!quos) and a pair of parentheses solve it:
library(rlang)
library(dplyr)
mtcars %>%
mutate(test = case_when(!!!quos(gear == 3L ~ "three", gear != 3L ~ "not three"))) %>%
head(10L)
#> mpg cyl disp hp drat wt qsec vs am gear carb test
#> 1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 not three
#> 2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 not three
#> 3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 not three
#> 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 three
#> 5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 three
#> 6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 three
#> 7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 three
#> 8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 not three
#> 9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 not three
#> 10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 not three
Created on 2019-01-16 by the reprex package (v0.2.1.9000)
We can use .$ to refer to the column of the piped data frame explicitly:
mtcars %>%
mutate(cg = case_when(.$carb <= 2 ~ "low", .$carb > 2 ~ "high")) %>%
.$cg %>%
table()
# high low
# 15 17
As of version 0.7.0 of dplyr, case_when works within mutate as follows:
library(dplyr) # >= 0.7.0
mtcars %>%
mutate(cg = case_when(carb <= 2 ~ "low",
carb > 2 ~ "high"))
For more information: http://dplyr.tidyverse.org/reference/case_when.html
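As a small extension of the snippet above, the following sketch adds a third branch and a catch-all. The "medium" label and the threshold of 4 are assumptions made for illustration only. It relies on case_when evaluating its conditions in order, so each row gets the value of the first condition that is TRUE.

```r
library(dplyr)

labelled <- mtcars %>%
  mutate(cg = case_when(
    carb <= 2 ~ "low",
    carb <= 4 ~ "medium",  # only reached when carb > 2
    TRUE      ~ "high"     # catch-all for every remaining row
  ))

table(labelled$cg)
```

Without a catch-all branch, rows that match no condition receive NA.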
With thanks to @sumedh: @hadley has explained that this is a known shortcoming of case_when:
case_when() is still somewhat experimental and does not currently work inside mutate(). That will be fixed in a future version.