Mutating column in `dplyr` using `rowSums`
The examples do not work because you are nesting `select` in `mutate` and using bare variable names. In this case, `select` is trying to do something like
> -df$ids
Error in -df$ids : invalid argument to unary operator
which fails because you can't negate a character string (i.e. `-"i1"` or `-"i2"` makes no sense). Either of the formulations below works:
df %>% mutate(blubb = rowSums(select_(., "X1", "X2")))
df %>% mutate(blubb = rowSums(select(., -3)))
or
df %>% mutate(blubb = rowSums(select_(., "-ids")))
as suggested by @Haboryme.
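As a rough illustration of the point above, the sketch below reproduces the failing negation and then the fix of passing the whole data frame to `select()` through `.`. The data frame `toy` and its values are made up for this example, and the `.` placeholder assumes the magrittr pipe `%>%` that dplyr re-exports.

```r
library(dplyr)

# Hypothetical toy data: two numeric columns and a character id column.
toy <- data.frame(X1 = c(1, 2, 3), X2 = c(10, 20, 30),
                  ids = c("i1", "i2", "i3"), stringsAsFactors = FALSE)

# Negating a character vector is the step that fails; capture the message
# instead of stopping the script.
err <- tryCatch(-toy$ids, error = function(e) conditionMessage(e))

# With `.` as the first argument, select() receives the data frame and
# resolves `-ids` as "every column except ids".
fixed <- toy %>% mutate(blubb = rowSums(select(., -ids)))
fixed$blubb
```

Here `err` holds the error message as a string, and `fixed$blubb` holds the per-row sums of `X1` and `X2`.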
`select_` is deprecated. You can use:
library(dplyr)
# Note: the 10-row matrix is recycled to match the 20 ids.
df <- data.frame(matrix(rnorm(20), 10, 2),
                 ids = paste("i", 1:20, sep = ""),
                 stringsAsFactors = FALSE)
df %>%
  mutate(blubb = rowSums(select(., all_of(c("X1", "X2")))))
# Or more generally:
desired_columns <- c("X1", "X2")
df %>%
  mutate(blubb = rowSums(select(., all_of(desired_columns))))
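For what it's worth, newer dplyr versions also offer `pick()`, which selects columns from inside `mutate()` without needing the `.` placeholder. The following is only a sketch under the assumption that dplyr 1.1.0 or later is installed; the data frame `toy` is invented for the example.

```r
library(dplyr)  # pick() assumes dplyr >= 1.1.0

# Hypothetical toy data: two numeric columns and a character id column.
toy <- data.frame(X1 = c(1, 2, 3), X2 = c(10, 20, 30),
                  ids = c("i1", "i2", "i3"), stringsAsFactors = FALSE)

desired_columns <- c("X1", "X2")

# pick() returns the selected columns as a data frame, which rowSums() accepts.
result <- toy %>%
  mutate(blubb = rowSums(pick(all_of(desired_columns))))
result$blubb
```

Because `pick()` does not rely on `.`, this form should also work with the native `|>` pipe.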
Adding to this old thread because I searched on this question and then realized I was asking the wrong question. I also detect some yearning, in this and related questions, for a proper pipe-friendly way to do this.
The answers here are somewhat non-intuitive because they try to use the dplyr vernacular with non-"tidy" data. If you want to do it the dplyr way, make the data tidy first using `gather()`, and then use `summarise()`:
library(tidyverse)
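# Note: the 10-row matrix is recycled to match the 20 ids,
# so i1 and i11, i2 and i12, etc. share the same values.
df <- data.frame(matrix(rnorm(20), 10, 2),
                 ids = paste("i", 1:20, sep = ""),
                 stringsAsFactors = FALSE)
df %>%
  gather(key = Xn, value = "value", -ids) %>%
  group_by(ids) %>%
  summarise(rowsum = sum(value))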
#> # A tibble: 20 x 2
#> ids rowsum
#> <chr> <dbl>
#> 1 i1 0.942
#> 2 i10 -0.330
#> 3 i11 0.942
#> 4 i12 -0.721
#> 5 i13 2.50
#> 6 i14 -0.611
#> 7 i15 -0.799
#> 8 i16 1.84
#> 9 i17 -0.629
#> 10 i18 -1.39
#> 11 i19 1.44
#> 12 i2 -0.721
#> 13 i20 -0.330
#> 14 i3 2.50
#> 15 i4 -0.611
#> 16 i5 -0.799
#> 17 i6 1.84
#> 18 i7 -0.629
#> 19 i8 -1.39
#> 20 i9 1.44
If you care about the order of the ids and they are not sortable using `arrange()`, make that column a factor first:
df %>%
  mutate(ids = as_factor(ids)) %>%
  gather(key = Xn, value = "value", -ids) %>%
  group_by(ids) %>%
  summarise(rowsum = sum(value))
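Since `gather()` has been superseded in tidyr by `pivot_longer()`, the same reshape-then-summarise idea could be sketched as below. This is an illustration rather than the original answer's code: the data frame `toy` is made up, and base `factor()` with `levels = unique(ids)` stands in for `as_factor()` to keep the ids in order of appearance.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; note that "i10" would sort before "i2" as a string.
toy <- data.frame(X1 = c(1, 2, 3), X2 = c(10, 20, 30),
                  ids = c("i1", "i2", "i10"), stringsAsFactors = FALSE)

row_sums <- toy %>%
  # Fix the id order to order of appearance before grouping.
  mutate(ids = factor(ids, levels = unique(ids))) %>%
  # Reshape to one row per (id, column) pair.
  pivot_longer(cols = -ids, names_to = "Xn", values_to = "value") %>%
  group_by(ids) %>%
  summarise(rowsum = sum(value))
row_sums
```

The result has one row per id, in the original order, with the sum of that row's numeric columns in `rowsum`.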