Moving variance in R

rollapply in the zoo package takes an arbitrary function. It's different from filter in that it excludes the NAs by default.

That being said, though, there's not much sense in loading a package for a function that's so simple to roll yourself (pun intended).

Here's one that's right aligned:

my.rollapply <- function(vec, width, FUN) 
    sapply(seq_along(vec), 
           function(i) if (i < width) NA else FUN(vec[i:(i-width+1)]))

set.seed(1)
vec <- sample(1:50, 50)
my.rollapply(vec, 3, sd)
 [1]        NA        NA  7.094599 12.124356 16.522712 18.502252 18.193405  7.234178  8.144528
[10] 14.468356 12.489996  3.055050 20.808652 19.467922 18.009257 18.248288 15.695010  7.505553
[19] 10.066446 11.846237 17.156146  6.557439  5.291503 23.629078 22.590558 21.197484 22.810816
[28] 24.433583 19.502137 16.165808 11.503623 12.288206  9.539392 13.051181 13.527749 19.974984
[37] 19.756855 17.616280 19.347696 18.248288 15.176737  6.082763 10.000000 10.016653  4.509250
[46]  2.645751  1.527525  5.291503 10.598742  6.557439

# rollapply output for comparison
rollapply(vec, width=3, sd, fill=NA, align='right')
 [1]        NA        NA  7.094599 12.124356 16.522712 18.502252 18.193405  7.234178  8.144528
[10] 14.468356 12.489996  3.055050 20.808652 19.467922 18.009257 18.248288 15.695010  7.505553
[19] 10.066446 11.846237 17.156146  6.557439  5.291503 23.629078 22.590558 21.197484 22.810816
[28] 24.433583 19.502137 16.165808 11.503623 12.288206  9.539392 13.051181 13.527749 19.974984
[37] 19.756855 17.616280 19.347696 18.248288 15.176737  6.082763 10.000000 10.016653  4.509250
[46]  2.645751  1.527525  5.291503 10.598742  6.557439

Consider the zoo package. For example filter() gives:

> filter(1:100, rep(1/3,3))
Time Series:
Start = 1 
End = 100 
Frequency = 1 
  [1] NA  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
 [51] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
 [76] 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 NA

whereas rollmean() in zoo gives:

> rollmean(1:100, k = 3, na.pad = TRUE)
  [1] NA  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
 [51] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
 [76] 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 NA

which is the same (for a 3 point moving average in this example).

Whilst zoo doesn't have a rollsd() or rollvar() it does have rollapply(), which works like the apply() functions to apply any R function to the specified window.

> rollapply(1:100, width = 3, FUN = sd, na.pad = TRUE)
  [1] NA  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
 [26]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
 [51]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
 [76]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 NA
Warning message:
In rollapply.zoo(zoo(data), ...) : na.pad argument is deprecated

or on something more interesting:

> rollapply(vec, width = 3, FUN = sd, na.pad = TRUE)
  [1]        NA 0.3655067 0.8472871 0.5660495 0.3491970 0.4732417 0.9236859
  [8] 0.8075226 1.8725851 1.1930784 0.6329325 1.1412416 0.8430772 0.5808005
 [15] 0.3838545 1.1738170 1.1655400 1.3241700 0.6876834 0.1534157 0.4858477
 [22] 0.9843506 0.6002713 0.6897541 2.0619563 2.5675788 6.3522039 6.0066864
 [29] 6.2618432 5.1704866 2.1360853 2.5602557 1.0408528 1.0316396 4.9441628
 [36] 5.0319314 5.7589716 3.2425000 4.8788158 2.0847286 4.5199291 2.5323486
 [43] 2.1987149 1.8393000 1.2278639 1.5998965 1.5341485 4.4287108 4.4159166
 [50] 4.3224546 3.6959067 4.9826264 5.3134044 8.4084322 9.1249234 7.5506725
 [57] 3.8499136 3.9680487 5.6362296 4.9124095 4.3452706 4.0227141 4.5867559
 [64] 4.7350394 4.3203807 4.4506799 7.2759499 7.6536424 7.8487654 2.0905576
 [71] 4.0056880 5.6209853 1.5551659 1.3615268 2.8469458 2.8323588 1.9848578
 [78] 1.1201124 1.4248380 1.7802571 1.4281773 2.5481935 1.8554451 1.0925410
 [85] 2.1823722 2.2788755 2.4205378 2.0733741 0.7462248 1.3873578 1.4265948
 [92] 0.7212619 0.7425993 1.0696432 2.4520585 3.0555819 3.1000885 1.0945292
 [99] 0.3726928        NA
Warning message:
In rollapply.zoo(zoo(data), ...) : na.pad argument is deprecated

You can get rid of the warning by using the fill = NA argument, as in

> rollapply(vec, width = 3, FUN = sd, fill = NA)

The TTR package has runSD among others:

> library(TTR)
> ls("package:TTR", pattern="run*")
 [1] "runCor"    "runCov"    "runMAD"    "runMax"    "runMean"  
 [6] "runMedian" "runMin"    "runSD"     "runSum"    "runVar"

runSD will be much faster than rollapply because it avoids making many R function calls. For example:

rzoo <- function(x,n) rollapplyr(x, n, sd, fill=NA)
rttr <- function(x,n) runSD(x, n)
library(rbenchmark)
set.seed(21)
x <- rnorm(1e4)
all.equal(rzoo(x,250), rttr(x,250))
# [1] TRUE
benchmark(rzoo(x,250), rttr(x,250))[,1:6]
#           test replications elapsed relative user.self sys.self
# 2 rttr(x, 250)          100    0.58    1.000      0.58     0.00
# 1 rzoo(x, 250)          100   54.53   94.017     53.85     0.06

Tags:

R