Basic lag in R vector/dataframe
Using just standard R functions this can be achieved in a much simpler way:
x <- sample(1:9, 10, replace = TRUE)
y <- c(NA, head(x, -1))  # drop the last value and pad the front with NA
ds <- cbind(x, y)
ds
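Since the question also mentions data frames, the same idiom can be applied to a column. This is only a small sketch: the data frame df and the column name x_lag1 are made up for illustration, and the values are fixed rather than sampled so the result is easy to check.

```r
# Hypothetical data frame with a single numeric column
df <- data.frame(x = c(3, 8, 4, 8, 9))

# Lag by one row: drop the last value, pad the front with NA
df$x_lag1 <- c(NA, head(df$x, -1))

df
```

The lagged column has the same length as the original, so it can be assigned straight back into the data frame.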
Another way to deal with this is using the zoo package, which has a lag method that will pad the result with NA:
> library(zoo)
> set.seed(123)
> x <- zoo(sample(c(1:9), 10, replace = T))
> y <- lag(x, -1, na.pad = TRUE)
> cbind(x, y)
   x  y
1  3 NA
2  8  3
3  4  8
4  8  4
5  9  8
6  1  9
7  5  1
8  9  5
9  5  9
10 5  5
The result is a multivariate zoo object (which is an enhanced matrix), but it is easily converted to a data.frame via:
> data.frame(cbind(x, y))
I had the same problem, but I didn't want to use zoo or xts, so I wrote a simple lag function for data frames:
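lagpad <- function(x, k) {
  if (k > 0) {
    # positive k: pad the front with NA and trim to the original length
    return(c(rep(NA, k), x)[1:length(x)])
  } else {
    # zero or negative k: drop the first -k values and pad the end with NA
    return(c(x[(-k + 1):length(x)], rep(NA, -k)))
  }
}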
This can lag forward or backwards:
x <- 1:3
cbind(x, lagpad(x, 1), lagpad(x, -1))
     x
[1,] 1 NA  2
[2,] 2  1  3
[3,] 3  2 NA
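One caveat with lagpad: when k is negative and its magnitude is at least length(x), the index (-k+1):length(x) counts downwards and the result comes back longer than x. Below is a possible variant that guards against this. The name lagpad_safe is my own, and returning all NA for an out-of-range k is an assumption about the behaviour you would want, not something the original function specifies.

```r
# Sketch of a lag/lead helper that always returns a vector as long as x.
# lagpad_safe is a hypothetical name, not part of any package.
lagpad_safe <- function(x, k) {
  n <- length(x)
  k <- as.integer(k)
  # Shift is as large as the vector: nothing survives, return typed NAs
  if (abs(k) >= n) {
    return(rep(x[NA_integer_], n))
  }
  if (k > 0) {
    # lag: pad the front, keep the first n - k values
    c(rep(NA, k), head(x, n - k))
  } else if (k < 0) {
    # lead: keep the last n + k values, pad the end
    c(tail(x, n + k), rep(NA, -k))
  } else {
    x
  }
}

x <- 1:3
cbind(x, lagpad_safe(x, 1), lagpad_safe(x, -1), lagpad_safe(x, -3))
```

For shifts smaller than length(x) this should give the same values as lagpad; the only intended difference is the handling of oversized shifts.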
lag does not shift the data; it only shifts the "time base". x has no time base, so cbind does not work as you expected. Try cbind(as.ts(x), lag(x)) and notice that a "lag" of 1 shifts the periods forward.
I would suggest using zoo / xts for time series. The zoo vignettes are particularly helpful.
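To see the "time base" point concretely, here is a small sketch using a made-up three-point series. It calls stats::lag explicitly in case another package has masked lag. The values are untouched; only the tsp attribute (start, end, frequency) changes, and cbind on ts objects then aligns the two series by time.

```r
# Hypothetical three-point series with the default time base 1, 2, 3
x <- ts(c(10, 20, 30))
lagged <- stats::lag(x, 1)

tsp(x)       # start, end, frequency: 1 3 1
tsp(lagged)  # 0 2 1, the time base moved back by one

# The values themselves are unchanged
identical(as.numeric(x), as.numeric(lagged))

# cbind() on ts objects aligns by time, which is where the shift shows up
both <- cbind(x, lagged)
both
```

In the combined object each row of the lagged column holds the value that x takes one period later, which is why plain lag can look like a lead if you were expecting a row shift.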