R: Interpolation of NAs by group
Use data.frame
, rather than cbind
to create your data. cbind
returns a matrix, but you need a data frame for dplyr
. Then use na.approx
inside mutate
. I've commented out group_by
, as you haven't provided the grouping variable in your data, but the approach should work once you've added the grouping variable to the data frame.
df <- data.frame(time=c(1,2,3,4,5,6,7,1,2,3),
Individuals=c(1,1,1,1,1,1,1,2,2,2),
Value=c(NA, 2, 3, NA, 5, NA, 7, 8, NA, 10))
library(dplyr)
library(zoo)
df %>%
group_by(Individuals) %>%
mutate(ValueInterp = na.approx(Value, na.rm=FALSE))
time Individuals Value ValueInterp 1 1 1 NA NA 2 2 1 2 2 3 3 1 3 3 4 4 1 NA 4 5 5 1 5 5 6 6 1 NA 6 7 7 1 7 7 8 1 2 8 8 9 2 2 NA 9 10 3 2 10 10
Update: To interpolate multiple columns, we can use mutate_at
. Here's an example with two value columns. We use mutate_at
to run na.approx
on all columns that include "Value"
in the column name. list(interp=na.approx)
tells mutate_at
to generate new column names by running na.approx
and adding interp
as a suffix to generate the new column names:
df <- data.frame(time=c(1,2,3,4,5,6,7,1,2,3),
Individuals=c(1,1,1,1,1,1,1,2,2,2),
Value1=c(NA, 2, 3, NA, 5, NA, 7, 8, NA, 10),
Value2=c(NA, 2, 3, NA, 5, NA, 7, 8, NA, 10)*2)
df %>%
group_by(Individuals) %>%
mutate_at(vars(matches("Value")), list(interp=na.approx), na.rm=FALSE)
time Individuals Value1 Value2 Value1_interp Value2_interp <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 1 1 1 NA NA NA NA 2 2 1 2 4 2 4 3 3 1 3 6 3 6 4 4 1 NA NA 4 8 5 5 1 5 10 5 10 6 6 1 NA NA 6 12 7 7 1 7 14 7 14 8 1 2 8 16 8 16 9 2 2 NA NA 9 18 10 3 2 10 20 10 20
If you don't want to preserve the original, uninterpolated columns, you can do:
df %>%
group_by(Individuals) %>%
mutate_at(vars(matches("Value")), na.approx, na.rm=FALSE)
We can use data.table
library(data.table)
library(zoo)
setDT(df1)[, ValueInterp:= na.approx(Value, na.rm=TRUE), by = Individual]