dplyr rowwise by some columns

In data.table, you can do

library(data.table)
setDT(x)

# Divide each V column by the row total; note the table is x, not DT
x[, grep("^V", names(x)) := .SD/Reduce(`+`, .SD), .SDcols = V1:V5]

   A         V1        V2        V3         V4         V5
1: A 0.28571429 0.0000000 0.2857143 0.07142857 0.35714286
2: B 0.23076923 0.2307692 0.3076923 0.15384615 0.07692308
3: C 0.44444444 0.0000000 0.4444444 0.00000000 0.11111111
4: D 0.07142857 0.3571429 0.1428571 0.07142857 0.35714286
5: E 0.00000000 0.2222222 0.3333333 0.44444444 0.00000000

To compute the denominator with NA values ignored, rowSums(.SD, na.rm = TRUE) is an option, though it coerces .SD to a matrix as an intermediate step.
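As a minimal sketch of the rowSums variant, assuming a small made-up table with one missing value (the data and the helper name cols are illustrative, not from the question):

```r
library(data.table)

# Hypothetical toy data with one NA in V2
x <- data.table(A = c("A", "B"), V1 = c(1, 2), V2 = c(NA, 2), V3 = c(3, 4))
cols <- grep("^V", names(x), value = TRUE)

# Row totals ignore NA; rowSums() converts .SD to a matrix internally
x[, (cols) := .SD / rowSums(.SD, na.rm = TRUE), .SDcols = cols]
```

The NA cell itself stays NA, but the other cells in that row are divided by the sum of the non-missing values.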

You can combine tidyr's spread and gather with dplyr to get the following single pipeline:

library(dplyr)
library(tidyr)

x <- data.frame(A = LETTERS[1:5], as.data.frame(matrix(sample(0:5, 25, TRUE), ncol = 5)))

y <- x %>% 
        gather(V, val, -A) %>% 
        group_by(A) %>% 
        mutate(perc = val / sum(val)) %>% 
        select(-val) %>%
        spread(V, perc)

With tidy (long-format) data it is easy to compute any group-wise sum (by row, by column, or by any nested index level) and derive percentages from it. gather converts your input to the long format, and spread converts the result back to the original wide format.
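Since tidyr 1.0.0, gather and spread are superseded by pivot_longer and pivot_wider. A sketch of the same pipeline with the newer functions might look like this (the seed is arbitrary and only there for reproducibility; a row whose values are all zero would yield NaN):

```r
library(dplyr)
library(tidyr)

set.seed(1)  # arbitrary seed, chosen only for reproducibility
x <- data.frame(A = LETTERS[1:5],
                as.data.frame(matrix(sample(0:5, 25, TRUE), ncol = 5)))

y <- x %>%
  pivot_longer(-A, names_to = "V", values_to = "val") %>%  # wide -> long
  group_by(A) %>%
  mutate(perc = val / sum(val)) %>%                        # share within each row of x
  ungroup() %>%
  select(-val) %>%
  pivot_wider(names_from = V, values_from = perc)          # long -> wide
```

The result y has the same shape as x, with each V column holding that cell's share of its row total.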

