dplyr group_by and mutate, how to access the data frame?
Since dplyr 0.8 you can use group_map(); the . in the group_map() call represents the sub-data.frame for the current group. Its behavior has changed a bit over time. With dplyr 1.0 we can do:
df <- data.frame(x=runif(10),let=rep(letters[1:5],each=2))
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df %>%
group_by(let) %>%
group_map(~ mutate(., mean.by.letter = mean(x)), .keep = TRUE) %>%
bind_rows()
#> # A tibble: 10 x 3
#> x let mean.by.letter
#> <dbl> <chr> <dbl>
#> 1 0.442 a 0.271
#> 2 0.0999 a 0.271
#> 3 0.669 b 0.343
#> 4 0.0167 b 0.343
#> 5 0.908 c 0.575
#> 6 0.242 c 0.575
#> 7 0.685 d 0.378
#> 8 0.0716 d 0.378
#> 9 0.883 e 0.843
#> 10 0.804 e 0.843
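If the goal is only to get a data frame back, group_modify() (available since dplyr 0.8.1) may save the bind_rows() step, since it reassembles the per-group results and puts the grouping column back. A minimal sketch, assuming the same df layout as above; the set.seed() call and the names res and check are only there for illustration:

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the example is repeatable
df <- data.frame(x = runif(10), let = rep(letters[1:5], each = 2))

# .x is the sub-data frame for the current group (without the grouping
# column by default); group_modify() adds the grouping column back.
res <- df %>%
  group_by(let) %>%
  group_modify(~ mutate(.x, mean.by.letter = mean(x))) %>%
  ungroup()

# Cross-check against a plain grouped mutate().
check <- df %>%
  group_by(let) %>%
  mutate(mean.by.letter = mean(x)) %>%
  ungroup()

all.equal(res$mean.by.letter, check$mean.by.letter)
```

For a simple per-group summary column like this one, the plain group_by() plus mutate() used for the cross-check is usually enough; the sub-data.frame forms are mainly useful when the function needs the whole group at once.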
group_map() was introduced in dplyr 0.8.0 (with behavior that is now outdated!); see the release announcements:
https://www.tidyverse.org/articles/2019/02/dplyr-0-8-0/
https://www.tidyverse.org/articles/2018/12/dplyr-0-8-0-release-candidate/
We can also call mutate() within do(), where . is the sub-data.frame for the current group:
df %>%
group_by(let) %>%
do(mutate(., mean.by.letter = mean(.$x)))