Is there an R function that applies a function to each pair of columns?
92% of the time is being spent in cor.test.default and the routines it calls, so it's hopeless trying to get faster results by simply rewriting Papply (other than the savings from computing only the pairs above or below the diagonal, assuming that your function is symmetric in x and y).
> M <- matrix(rnorm(100*300),300,100)
> Rprof(); junk <- Papply(M,function(x,y) cor.test( x, y)$p.value); Rprof(NULL)
> summaryRprof()
$by.self
self.time self.pct total.time total.pct
cor.test.default 4.36 29.54 13.56 91.87
# ... snip ...
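The one saving mentioned above, evaluating only the pairs above the diagonal, can be sketched as follows. This is a minimal illustration, not a rewrite of Papply: pairwise_upper is a hypothetical helper name, and it assumes the supplied function is symmetric in its two arguments and returns a single number. The diagonal is left as NA.

```r
# Hypothetical helper: apply a symmetric function f to each pair of columns,
# evaluating only the pairs above the diagonal and mirroring the result.
pairwise_upper <- function(M, f) {
  p <- ncol(M)
  out <- matrix(NA_real_, p, p)
  if (p < 2) return(out)
  idx <- combn(p, 2)                        # 2 x choose(p, 2) matrix of (i, j) with i < j
  vals <- apply(idx, 2, function(ij) f(M[, ij[1]], M[, ij[2]]))
  out[t(idx)] <- vals                       # fill the upper triangle
  out[t(idx[2:1, , drop = FALSE])] <- vals  # mirror into the lower triangle
  out
}

set.seed(1)
M <- matrix(rnorm(50 * 6), 50, 6)
pv <- pairwise_upper(M, function(x, y) cor.test(x, y)$p.value)
```

This roughly halves the number of cor.test calls, but each call costs the same as before, so the profile would still be dominated by cor.test.default.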
It wouldn't be faster, but you can use outer to simplify the code. It does require a vectorized function, so here I've used Vectorize to make a vectorized version of the function to get the correlation between two columns.
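df <- data.frame(x=rnorm(100),y=rnorm(100),z=rnorm(100))
n <- ncol(df)
# p-value of the correlation test between columns i and j of data
corpij <- function(i,j,data) {cor.test(data[,i],data[,j])$p.value}
# vectorize over the column indices only; data is passed through unchanged
corp <- Vectorize(corpij, vectorize.args=c("i","j"))
outer(1:n,1:n,corp,data=df)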
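To see why outer needs a vectorized function, it may help to record how it actually calls the function it is given. The sketch below uses a toy function (add_idx, a made-up name) in place of the correlation test: outer calls it once with the fully expanded index vectors, whereas the Vectorize wrapper calls it once per pair.

```r
# Record the length of the index vector each time the function is called.
calls <- integer(0)
add_idx <- function(i, j) {
  calls <<- c(calls, length(i))
  i + j
}

plain <- outer(1:3, 1:3, add_idx)
calls_plain <- calls        # a single call that receives all 9 index pairs at once

calls <- integer(0)
vec <- outer(1:3, 1:3, Vectorize(add_idx))
calls_vec <- calls          # nine calls, each receiving one index pair
```

A function such as corpij, which returns one number however many indices it is handed, only works with outer in the second form.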
I'm not sure if this addresses your problem in a proper manner, but take a look at William Revelle's psych package. corr.test returns a list of matrices with the correlation coefficients, number of observations, t-test statistics, and p-values. I use it all the time (and AFAICS you're also a psychologist, so it may suit your needs as well). Writing loops is not the most elegant way of doing this.
> library(psych)
> ( k <- corr.test(mtcars[1:5]) )
Call:corr.test(x = mtcars[1:5])
Correlation matrix
mpg cyl disp hp drat
mpg 1.00 -0.85 -0.85 -0.78 0.68
cyl -0.85 1.00 0.90 0.83 -0.70
disp -0.85 0.90 1.00 0.79 -0.71
hp -0.78 0.83 0.79 1.00 -0.45
drat 0.68 -0.70 -0.71 -0.45 1.00
Sample Size
mpg cyl disp hp drat
mpg 32 32 32 32 32
cyl 32 32 32 32 32
disp 32 32 32 32 32
hp 32 32 32 32 32
drat 32 32 32 32 32
Probability value
mpg cyl disp hp drat
mpg 0 0 0 0.00 0.00
cyl 0 0 0 0.00 0.00
disp 0 0 0 0.00 0.00
hp 0 0 0 0.00 0.01
drat 0 0 0 0.01 0.00
> str(k)
List of 5
$ r : num [1:5, 1:5] 1 -0.852 -0.848 -0.776 0.681 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:5] "mpg" "cyl" "disp" "hp" ...
.. ..$ : chr [1:5] "mpg" "cyl" "disp" "hp" ...
$ n : num [1:5, 1:5] 32 32 32 32 32 32 32 32 32 32 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:5] "mpg" "cyl" "disp" "hp" ...
.. ..$ : chr [1:5] "mpg" "cyl" "disp" "hp" ...
$ t : num [1:5, 1:5] Inf -8.92 -8.75 -6.74 5.1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:5] "mpg" "cyl" "disp" "hp" ...
.. ..$ : chr [1:5] "mpg" "cyl" "disp" "hp" ...
$ p : num [1:5, 1:5] 0.00 6.11e-10 9.38e-10 1.79e-07 1.78e-05 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:5] "mpg" "cyl" "disp" "hp" ...
.. ..$ : chr [1:5] "mpg" "cyl" "disp" "hp" ...
$ Call: language corr.test(x = mtcars[1:5])
- attr(*, "class")= chr [1:2] "psych" "corr.test"
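If installing psych is not an option, the r, t and p matrices can be approximated in base R by working on the whole correlation matrix at once. This is only a sketch under stated assumptions: pairwise_cor_p is a hypothetical name, the data are numeric with no missing values, only Pearson correlations are handled, and no multiple-testing adjustment is applied.

```r
# Sketch: Pearson correlations, t statistics and two-sided p-values for all
# column pairs, computed from the correlation matrix in one pass.
# Assumes complete numeric data (no NAs).
pairwise_cor_p <- function(data) {
  x <- as.matrix(data)
  n <- nrow(x)
  r <- cor(x)
  tstat <- r * sqrt((n - 2) / (1 - r^2))  # t statistic with n - 2 degrees of freedom
  diag(tstat) <- Inf                      # a column is perfectly correlated with itself
  p <- 2 * pt(-abs(tstat), df = n - 2)    # two-sided p-value
  list(r = r, n = n, t = tstat, p = p)
}

res <- pairwise_cor_p(mtcars[1:5])
round(res$r, 2)
round(res$p, 2)
```

Because this avoids calling cor.test once per pair, it also sidesteps the per-call overhead that dominates the profile in the first answer.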