How to reorder factor levels in a tidy way?
Using forcats:
iris.tr %>%
  mutate(Species = fct_reorder(Species, mSW)) %>%
  ggplot() +
  aes(Species, mSW, color = Species) +
  geom_point()
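For a self-contained version of the same idea, here is a minimal sketch. It assumes that mSW is the per-species mean of Sepal.Width computed with summarise(); the name iris_means is made up for illustration and does not come from the question.

```r
library(dplyr)
library(forcats)

# Assumption: mSW is the mean Sepal.Width per species
iris_means <- iris %>%
  group_by(Species) %>%
  summarise(mSW = mean(Sepal.Width)) %>%
  # Reorder the levels of Species by increasing mSW
  mutate(Species = fct_reorder(Species, mSW))

levels(iris_means$Species)
```

With the built-in iris data, the levels should come out as versicolor, virginica, setosa, which is the order ggplot will then use on the axis and in the legend.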
If you happen to have a character vector to order, for example:
library(dplyr)
iris2 <- iris %>%
  mutate(Species = as.character(Species)) %>%
  group_by(Species) %>%
  mutate(mean_sepal_width = mean(Sepal.Width)) %>%
  ungroup()
You can also set the order of the factor levels by relying on the behavior of the forcats::as_factor function:
"Compared to base R, this function creates levels in the order in which they appear"
library(dplyr)
library(ggplot2)
library(forcats)
iris2 %>%
  # Change the order of the rows
  arrange(mean_sepal_width) %>%
  # Create factor levels in the order in which they appear
  mutate(Species = as_factor(Species)) %>%
  ggplot() +
  aes(Species, Sepal.Width, color = Species) +
  geom_point()
Notice how the species names on the x axis are ordered not alphabetically but by increasing value of their mean_sepal_width. Remove the line containing as_factor to see the difference.
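To see the difference in isolation, here is a small sketch comparing base factor() with as_factor() on a character vector. The vector x is a made-up example.

```r
library(forcats)

# A character vector that is deliberately not in alphabetical order
x <- c("versicolor", "virginica", "setosa")

# Base R sorts the levels
base_levels <- levels(factor(x))

# forcats keeps the order of first appearance
forcats_levels <- levels(as_factor(x))

base_levels
forcats_levels
```

Here base_levels is alphabetical (setosa, versicolor, virginica), while forcats_levels follows the order of x. That is why the arrange step before as_factor controls the level order.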
Reordering the factor using base:
iris.ba = iris
iris.ba$Species = with(iris.ba, reorder(Species, Sepal.Width, mean))
Translating to dplyr:
iris.tr = iris %>% mutate(Species = reorder(Species, Sepal.Width, mean))
After that, you can continue on to summarize and plot as in your question.
A couple of comments: reordering a factor modifies a data column, and the dplyr command for modifying a data column is mutate. All arrange does is reorder rows; this has no effect on the levels of the factor and hence no effect on the order of a legend or axis in ggplot.
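A quick way to check this for yourself is to compare the levels after each operation. This is only a sketch; the names arranged and reordered are made up for illustration.

```r
library(dplyr)

# arrange() only changes the row order; the factor levels stay as they were
arranged <- iris %>% arrange(desc(Sepal.Width))
levels(arranged$Species)

# mutate() with reorder() rewrites the column, so the levels change
reordered <- iris %>%
  mutate(Species = reorder(Species, Sepal.Width, mean))
levels(reordered$Species)
```

The first call still reports the original levels (setosa, versicolor, virginica), while the second reports them sorted by mean Sepal.Width (versicolor, virginica, setosa).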
All factors have an order for their levels. The difference between an ordered = TRUE factor and a regular factor is how the contrasts are set up in a model. ordered = TRUE should only be used if your factor levels have a meaningful rank order, like "Low", "Medium", "High". Even then, it only matters if you are building a model and don't want the default contrasts comparing everything to a reference level.
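As a rough illustration, the sketch below builds the same three levels both ways and inspects the contrasts. It assumes R's default contrast options (contr.treatment for regular factors, contr.poly for ordered ones); the names sizes, plain and ranked are made up.

```r
sizes <- c("Low", "Medium", "High")

# Same levels in the same order, with and without ordered = TRUE
plain  <- factor(sizes, levels = sizes)
ranked <- factor(sizes, levels = sizes, ordered = TRUE)

# Both factors carry the same level order
identical(levels(plain), levels(ranked))

# Regular factor: treatment contrasts against the first level
contrasts(plain)

# Ordered factor: polynomial contrasts (linear, quadratic)
contrasts(ranked)
```

Under the default options, the columns of contrasts(plain) are named after the non-reference levels, while those of contrasts(ranked) are named .L and .Q. For plotting, the two factors behave the same.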