Categorize a numeric variable into groups/bins/breaks
We can use dplyr:
library(dplyr)
data <- data %>%
  mutate(agegroup = case_when(age >= 40 & age <= 49 ~ '3',
                              age >= 30 & age <= 39 ~ '2',
                              age >= 20 & age <= 29 ~ '1'))
Compared to other approaches, the dplyr version is easier to write and interpret, because each range is spelled out explicitly.
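One thing worth knowing about case_when() is that rows matching none of the conditions get NA. The sketch below makes that explicit with a catch-all branch. The sample data frame is made up for illustration, and between() is used here only as a shorthand for the paired comparisons; it assumes whole-number ages.

```r
library(dplyr)

# Made-up sample data; 52 falls outside every range
data <- data.frame(age = c(23, 35, 49, 52))

data <- data %>%
  mutate(agegroup = case_when(
    between(age, 20, 29) ~ "1",
    between(age, 30, 39) ~ "2",
    between(age, 40, 49) ~ "3",
    TRUE ~ NA_character_   # explicit fallback for ages outside 20-49
  ))
```

Writing the fallback out is optional, since NA is what unmatched rows get anyway, but it signals to the reader that out-of-range ages were considered.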
I would use findInterval() here:
First, make up some sample data:
set.seed(1)
ages <- floor(runif(20, min = 20, max = 50))
ages
# [1] 27 31 37 47 26 46 48 39 38 21 26 25 40 31 43 34 41 49 31 43
Use findInterval() to categorize your "ages" vector:
findInterval(ages, c(20, 30, 40))
# [1] 1 2 2 3 1 3 3 2 2 1 1 1 3 2 3 2 3 3 2 3
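The integers returned by findInterval() are bin positions, so they can be used to look up readable labels. The following is a minimal sketch under some assumptions: the label strings and the small ages vector are invented for illustration. Values below the first break come back as 0, and indexing with 0 silently drops elements, so those are turned into NA first.

```r
breaks <- c(20, 30, 40)
labels <- c("20-29", "30-39", "40-49")   # hypothetical labels, one per bin

ages <- c(27, 31, 47, 40, 20, 15)        # 15 lies below the first break

idx <- findInterval(ages, breaks)        # 0 means "below the first break"
idx[idx == 0] <- NA                      # avoid labels[0], which drops the element

agegroup <- factor(labels[idx], levels = labels)
```

Note that anything at or above 40 lands in the last bin, including values of 50 or more, so add an upper break if the data can exceed 49.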
Alternatively, as recommended in the comments, cut() is also useful here:
cut(ages, breaks=c(20, 30, 40, 50), right = FALSE)                  # factor with levels [20,30) [30,40) [40,50)
cut(ages, breaks=c(20, 30, 40, 50), right = FALSE, labels = FALSE)  # integer codes 1, 2, 3 instead of a factor
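cut() also accepts a character vector for labels, which gives a factor with the same '1', '2', '3' codes used in the dplyr answer. A small sketch, with a made-up ages vector:

```r
ages <- c(27, 31, 47, 40, 20)

# right = FALSE makes the intervals closed on the left: [20,30), [30,40), [40,50)
agegroup <- cut(ages,
                breaks = c(20, 30, 40, 50),
                right = FALSE,
                labels = c("1", "2", "3"))

table(agegroup)   # counts per group
```

Ages outside 20 to 49 become NA here, because they fall in none of the intervals.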