Aggregate Daily Data to Month/Year intervals

A bit late to the game, but another option would be using data.table:

library(data.table)
setDT(temp)[, .(mn_amt = mean(amount)), by = .(yr = year(date), mon = months(date))]

# or if you want to apply the 'mean' function to several columns:
# setDT(temp)[, lapply(.SD, mean), by=.(year(date), month(date))]

this gives:

     yr      mon mn_amt
1: 2011 februari 42.610
2: 2011    maart 23.195
3: 2011    april 61.891

If you want names instead of numbers for the months, you can use:

setDT(temp)[, date := as.IDate(date)
            ][, .(mn_amt = mean(amount)), by = .(yr = year(date), mon = months(date))]

this gives:

     yr      mon mn_amt
1: 2011 februari 42.610
2: 2011    maart 23.195
3: 2011    april 61.891

As you see this will give the month names in your system language (which is Dutch in my case).


Or using a combination of lubridate and dplyr:

temp %>% 
  group_by(yr = year(date), mon = month(date)) %>% 
  summarise(mn_amt = mean(amount))

Used data:

# example data (modified the OP's data a bit)
temp <- structure(list(date = structure(1:6, .Label = c("2011-02-01", "2011-02-02", "2011-03-03", "2011-03-04", "2011-04-05", "2011-04-06"), class = "factor"), 
                       amount = c(1.67, 83.55, 24.4, 21.99, 98.882, 24.9)), 
                  .Names = c("date", "amount"), class = c("data.frame"), row.names = c(NA, -6L))

There is probably a more elegant solution, but splitting into months and years with strftime() and then aggregate()ing should do it. Then reassemble the date for plotting.

x <- as.POSIXct(c("2011-02-01", "2011-02-01", "2011-02-01"))
mo <- strftime(x, "%m")
yr <- strftime(x, "%Y")
amt <- runif(3)
dd <- data.frame(mo, yr, amt)

dd.agg <- aggregate(amt ~ mo + yr, dd, FUN = sum)
dd.agg$date <- as.POSIXct(paste(dd.agg$yr, dd.agg$mo, "01", sep = "-"))

I'd do it with lubridate and plyr, rounding dates down to the nearest month to make them easier to plot:

library(lubridate)
df <- data.frame(
  date = today() + days(1:300),
  x = runif(300)
)
df$my <- floor_date(df$date, "month")

library(plyr)
ddply(df, "my", summarise, x = mean(x))

Tags:

Datetime

R