Interpolate missing values in a time series with a seasonal cycle

In this case, I think you want a seasonality correction in the ARIMA model. There's not enough date here to fit the seasonal model, but this should get you started.

library(zoo)
start <- as.POSIXct("2010-01-01") 
freq <- as.difftime(6, units = "hours") 
dayvals <- (1:4)*10 
timevals <- c(3, 1, 2, 4) 
index <- seq(from = start, by = freq, length.out = 16)
obs <- (rep(dayvals, each = 4) + rep(timevals, times = 4))
myzoo <- myzoo.orig <- zoo(obs, index)
myzoo[10] <- NA

myzoo.fixed <- na.locf(myzoo)

myarima.resid <- arima(myzoo.fixed, order = c(3, 0, 3), seasonal = list(order = c(0, 0, 0), period = 4))$residuals
myzoo.reallyfixed <- myzoo.fixed
myzoo.reallyfixed[10] <- myzoo.fixed[10] + myarima.resid[10]

plot(myzoo.reallyfixed)
points(myzoo.orig)

In my tests the ARMA(3, 3) is really close, but that's just luck. With a longer time series you should be able to calibrate the seasonal correction to give you good predictions. It would be helpful to have a good prior on what the underlying mechanisms for both the signal and the seasonal correction to get better out of sample performance.


Try this:

x <- ts(myzoo,f=4)
fit <- ts(rowSums(tsSmooth(StructTS(x))[,-2]))
tsp(fit) <- tsp(x)
plot(x)
lines(fit,col=2)

The idea is to use a basic structural model for the time series, which handles the missing value fine using a Kalman filter. Then a Kalman smooth is used to estimate each point in the time series, including any omitted.

I had to convert your zoo object to a ts object with frequency 4 in order to use StructTS. You may want to change the fitted values back to zoo again.