Arrange a grouped_df by group variable not working

Try switching the order of your group_by statement:

df %>% 
  group_by(year, client) %>%
  summarise(tot = sum(rev)) %>%
  arrange(year, desc(tot))

I think arrange is ordering within groups; after summarize, the last group is dropped, so this means in your first example it's arranging rows within the client group. Switching the order to group_by(year, client) seems to fix it because the client group gets dropped after summarize.

Alternatively, there is the ungroup() function

df %>% 
  group_by(client, year) %>%
  summarise(tot = sum(rev)) %>%
  ungroup() %>%
  arrange(year, desc(tot))

Edit, @lucacerone: since dplyr 0.5 this does not work anymore:

Breaking changes arrange() once again ignores grouping, reverting back to the behaviour of dplyr 0.3 and earlier. This makes arrange() inconsistent with other dplyr verbs, but I think this behaviour is generally more useful. Regardless, it’s not going to change again, as more changes will just cause more confusion.

Latest versions of dplyr (at least from dplyr_0.7.4) allow to arrange within groups. You just have so set into the arrange() call .by_group = TRUE. More information is available here In your example, try:

library(dplyr)
df %>% 
        group_by(client, year) %>%
        summarise(tot = sum(rev)) %>%
        arrange(desc(tot), .by_group = TRUE)

Arrange a grouped_df by group variable not working

Tags:

R

Dplyr

Grouped Table

Related

Recent Posts