Arrange a grouped_df by group variable not working
Try switching the order of your group_by() statement:
df %>%
  group_by(year, client) %>%
  summarise(tot = sum(rev)) %>%
  arrange(year, desc(tot))
I think arrange() is ordering within groups. After summarise(), the last grouping variable is dropped, so in your first example the result is still grouped by client and the rows are arranged within each client group. Switching the order to group_by(year, client) seems to fix it because the client group is the one that gets dropped after summarise(), leaving the data grouped by year only.
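To see which grouping survives summarise(), you can inspect the result with group_vars(). The following is a minimal sketch on made-up data; the values in df are hypothetical, and only the column names (client, year, rev) come from the question.

```r
library(dplyr)

# Hypothetical toy data; only the column names follow the question.
df <- tibble(
  client = c("a", "a", "b", "b"),
  year   = c(2014, 2015, 2014, 2015),
  rev    = c(10, 20, 30, 5)
)

# Grouping by client first: summarise() drops year, the last grouping variable.
by_client_year <- df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev))

# Grouping by year first: summarise() drops client instead.
by_year_client <- df %>%
  group_by(year, client) %>%
  summarise(tot = sum(rev))

# Report the grouping variable(s) that remain on each result.
group_vars(by_client_year)
group_vars(by_year_client)
```

The first call should report client and the second year, which is why the order inside group_by() matters for any later verb that respects grouping. Recent dplyr versions also print a message about the remaining grouping when summarise() is called this way.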
Alternatively, there is the ungroup() function:
df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev)) %>%
  ungroup() %>%
  arrange(year, desc(tot))
Edit, @lucacerone: since dplyr 0.5 this explanation no longer holds, because arrange() ignores grouping again:
Breaking changes: arrange() once again ignores grouping, reverting back to the behaviour of dplyr 0.3 and earlier. This makes arrange() inconsistent with other dplyr verbs, but I think this behaviour is generally more useful. Regardless, it’s not going to change again, as more changes will just cause more confusion.
Recent versions of dplyr (at least since dplyr 0.7.4) let you arrange within groups. You just have to set .by_group = TRUE in the arrange() call. More information is available in the documentation for arrange().
In your example, try:
library(dplyr)
df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev)) %>%
  arrange(desc(tot), .by_group = TRUE)
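To make the effect of .by_group visible, here is a small sketch comparing the two behaviours on made-up data. The values in df are hypothetical and only the column names come from the question. Note that after summarise() the result is grouped by client only, so .by_group = TRUE sorts by client first and then by tot within each client.

```r
library(dplyr)

# Hypothetical toy data; only the column names follow the question.
df <- tibble(
  client = c("a", "a", "b", "b"),
  year   = c(2014, 2015, 2014, 2015),
  rev    = c(10, 20, 30, 5)
)

# After summarise() the result is still grouped by client.
totals <- df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev))

# Default: grouping is ignored, rows are sorted by tot across all clients.
ignoring_groups <- totals %>%
  arrange(desc(tot))

# .by_group = TRUE: rows are sorted by client first, then by tot within client.
within_groups <- totals %>%
  arrange(desc(tot), .by_group = TRUE)

ignoring_groups
within_groups
```

If you want the rows ordered by year and then by total, as in the question, group by year first or pass year explicitly to arrange(), since .by_group only uses the grouping variables that are still attached to the data.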