How to use custom functions in mutate (dplyr)?
In many cases it's sufficient to create a vectorized version of the function:
your_function_V <- Vectorize(your_function)
The vectorized function can then be used inside dplyr's mutate. See also this blog post.
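As a minimal sketch of why this helps (the function `half_if_even` is hypothetical, not from the question): a function that uses `if` on its argument only handles one value at a time. `Vectorize` wraps it so that it is applied element by element, which is what `mutate` needs when it passes in a whole column.

```r
# Hypothetical scalar-only function: `if` expects a single TRUE/FALSE,
# so calling this directly on a whole column does not work as intended.
half_if_even <- function(n) {
  if (n %% 2 == 0) n / 2 else n
}

# Vectorize() wraps the function in mapply(), applying it element-wise.
half_if_even_V <- Vectorize(half_if_even)

result <- half_if_even_V(1:4)
print(result)
```

The wrapped function still runs once per element, so this is a convenience rather than a speed-up.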
The function posted in the question, however, takes a single two-element vector assembled from two different columns. We therefore need to change it to accept the two values as separate arguments before vectorizing it.
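binom.test.p <- function(x, y) {
  # combine the two scalar inputs into the count vector binom.test expects
  x <- c(x, y)
  # return NA when either count is missing or the total is below 10
  if (is.na(x[1]) || is.na(x[2]) || (x[1] + x[2]) < 10) {
    return(NA)
  } else {
    return(binom.test(x, alternative = "two.sided")$p.value)
  }
}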
# vectorized function
binom.test.p_V <- Vectorize(binom.test.p)
library(dplyr)  # provides %>% and mutate()
table %>%
  mutate(Ratio = binom.test.p_V(ref_SG1_E2_1_R1_Sum, alt_SG1_E2_1_R1_Sum))
# works!
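Your problem seems to be with binom.test rather than dplyr: binom.test is not vectorized, so you cannot expect it to work on whole columns. You can use mapply on the two columns inside mutate. Here binom.test.p is the original one-argument function from the question, which is why the two values are combined with c(x, y):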
table %>%
  mutate(Ratio = mapply(function(x, y) binom.test.p(c(x, y)),
                        ref_SG1_E2_1_R1_Sum,
                        alt_SG1_E2_1_R1_Sum))
#   geneId ref_SG1_E2_1_R1_Sum alt_SG1_E2_1_R1_Sum Ratio
# 1      a                  10                  10     1
# 2      b                  20                  20     1
# 3      c                  10                  10     1
# 4      d                  15                  15     1
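To try both approaches end to end, here is a self-contained sketch. The data frame `counts` is an assumption: it is rebuilt from the output printed above and stands in for the question's `table`. With a two-argument version of `binom.test.p`, `mapply` can call the function directly, without an anonymous wrapper.

```r
library(dplyr)

# Stand-in for the question's `table`, rebuilt from the printed output (assumption).
counts <- data.frame(
  geneId = c("a", "b", "c", "d"),
  ref_SG1_E2_1_R1_Sum = c(10, 20, 10, 15),
  alt_SG1_E2_1_R1_Sum = c(10, 20, 10, 15)
)

# Two-argument version: one value from each column per call.
binom.test.p <- function(x, y) {
  if (is.na(x) || is.na(y) || (x + y) < 10) {
    return(NA_real_)
  }
  binom.test(c(x, y), alternative = "two.sided")$p.value
}

result <- counts %>%
  mutate(
    # Approach 1: wrap the scalar function with Vectorize()
    Ratio_vectorize = Vectorize(binom.test.p)(ref_SG1_E2_1_R1_Sum,
                                              alt_SG1_E2_1_R1_Sum),
    # Approach 2: iterate over both columns in parallel with mapply()
    Ratio_mapply = mapply(binom.test.p,
                          ref_SG1_E2_1_R1_Sum,
                          alt_SG1_E2_1_R1_Sum)
  )

print(result)
```

Both columns should agree, since `Vectorize` is itself built on `mapply`.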
As for the last one, you need mutate_at instead of mutate:
table %>%
  mutate_at(.vars = c(2:3), .funs = funs(sum = sum(.)))
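In current dplyr releases, `funs()` is deprecated (since 0.8.0) and `mutate_at()` is superseded by `across()` (since 1.0.0). A rough equivalent, assuming dplyr 1.0.0 or later and a stand-in data frame `counts` rebuilt from the output shown earlier, might look like this:

```r
library(dplyr)

# Stand-in for the question's `table`, rebuilt from the printed output (assumption).
counts <- data.frame(
  geneId = c("a", "b", "c", "d"),
  ref_SG1_E2_1_R1_Sum = c(10, 20, 10, 15),
  alt_SG1_E2_1_R1_Sum = c(10, 20, 10, 15)
)

# across() applies sum() to columns 2 and 3; .names appends "_sum"
# so the original columns are kept alongside the new ones.
result <- counts %>%
  mutate(across(2:3, sum, .names = "{.col}_sum"))

print(result)
```

Selecting by position mirrors the original `.vars = c(2:3)`; selecting by name, for example with `ends_with("_Sum")`, is usually more robust if the column order can change.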