Create categories by comparing a numeric column with a fixed value
In the interest of updating a possible canonical question, the dplyr package provides the function mutate, which lets you create a new column in a data.frame in a vectorized fashion:
library(dplyr)
iris_new <- iris %>%
  mutate(Regulation = if_else(Sepal.Length >= 5, 'UP', 'DOWN'))
This makes a new column called Regulation, which consists of either 'UP' or 'DOWN', based on applying the condition to the Sepal.Length column.
The case_when function (also from dplyr) provides an easy-to-read way to chain together multiple conditions:
iris %>%
  mutate(Regulation = case_when(Sepal.Length >= 5   ~ 'High',
                                Sepal.Length >= 4.5 ~ 'Mid',
                                TRUE                ~ 'Low'))
This works just like if_else, except that instead of one condition with return values for TRUE and FALSE, each line has a condition (left side of ~) and a return value (right side of ~) that is returned if the condition is TRUE. If it is FALSE, case_when moves on to the next condition.
In this case, rows where Sepal.Length >= 5 will return 'High', rows where Sepal.Length < 5 (since the first condition had to fail) and Sepal.Length >= 4.5 will return 'Mid', and all other rows will return 'Low'. Since TRUE is always TRUE, it is used to provide a default value.
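To see the top-to-bottom evaluation at the cut-offs, here is a minimal sketch on a small made-up data frame (toy and its values are hypothetical and not part of iris; they were picked only to sit on and around 5 and 4.5):

```r
library(dplyr)

# Hypothetical toy data: values on and around the two cut-offs
toy <- data.frame(Sepal.Length = c(5.1, 5.0, 4.9, 4.5, 4.4))

toy_new <- toy %>%
  mutate(Regulation = case_when(Sepal.Length >= 5   ~ 'High',  # checked first
                                Sepal.Length >= 4.5 ~ 'Mid',   # only reached if the first test failed
                                TRUE                ~ 'Low'))  # default for everything else

toy_new$Regulation
# "High" "High" "Mid" "Mid" "Low"
```

Because the conditions are tried in order, swapping the first two lines would label every value of 4.5 or more as 'Mid', so the most restrictive condition should come first.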
Try:
iris$Regulation <- ifelse(iris$Sepal.Length >= 5, "UP", "DOWN")
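As a quick sanity check, the sketch below compares base R's ifelse with dplyr's if_else on the same condition. The names base_result and dplyr_result are made up for illustration, and the results are kept in separate vectors so that iris itself is not modified:

```r
library(dplyr)

# Base R version: returns a character vector of "UP"/"DOWN"
base_result <- ifelse(iris$Sepal.Length >= 5, "UP", "DOWN")

# dplyr version: same condition, stricter about the types of its two branches
dplyr_result <- if_else(iris$Sepal.Length >= 5, "UP", "DOWN")

# TRUE when both approaches label every row the same way
identical(base_result, dplyr_result)
```

For a simple two-way split like this one, either function should do; the dplyr version mainly pays off inside a longer mutate pipeline.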