Plotting by group in data.table
Base R solution using matplot and dcast:
dt_agg <- dt[ , .(mean = mean(outcome)), by=.(grp,time)]
dt_cast <- dcast(dt_agg, time~grp, value.var="mean")
dt_cast[ , matplot(time, .SD[ , !"time"], type="l", ylab="mean", xlab="")]
# alternative:
dt_cast[ , matplot(time, .SD, type="l", ylab="mean", xlab=""), .SDcols = !"time"]
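The snippet above assumes a data.table named dt with columns grp, time and outcome. As a minimal sketch on made-up data (the values, the helper column rep and the legend are illustrative assumptions, not part of the original answer), the whole pipeline could be run like this:

```r
library(data.table)

# Hypothetical toy data: 4 groups x 5 time points x 10 observations per cell
set.seed(1)
dt <- CJ(grp = letters[1:4], time = 1:5, rep = 1:10)
dt[ , outcome := rnorm(.N)]

# Mean outcome per group and time point (long format)
dt_agg <- dt[ , .(mean = mean(outcome)), by = .(grp, time)]

# Wide format: one row per time point, one column per group
dt_cast <- dcast(dt_agg, time ~ grp, value.var = "mean")

# One line per group column; matplot cycles through col = 1:6 and lty = 1:5 by default
grp_names <- setdiff(names(dt_cast), "time")
matplot(dt_cast$time, as.matrix(dt_cast[ , ..grp_names]),
        type = "l", ylab = "mean", xlab = "")
legend("topright", legend = grp_names,
       col = seq_along(grp_names), lty = seq_along(grp_names))
```

Because dcast puts each group in its own column, matplot can draw all groups in a single call; the legend simply mirrors matplot's default color and line-type sequence.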
Result:
There is a way to do this with data.table's by argument, as follows:
DT[ , mean(outcome), by = .(grp, time)
   ][ , {plot(NULL, xlim = range(time),
              ylim = range(V1)); .SD}
      ][ , lines(time, V1, col = .GRP), by = grp]
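Note that the intermediate {...; .SD} part is necessary to continue chaining: plot() returns NULL, so without .SD as the last expression in j there would be no data.table to pass on to the next [ call. If DT[ , mean(outcome), by = .(grp, time)] were already stored as another data.table, DT_m, then we could just do: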
DT_m[ , plot(NULL, xlim = range(time), ylim = range(V1))]
DT_m[ , lines(time, V1, col = .GRP), by = grp]
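To see why the chained version needs .SD, it may help to compare what j returns in each case. This is a small sketch on a made-up DT_m (the values and the names res_null and res_sd are hypothetical):

```r
library(data.table)

# Hypothetical aggregated table standing in for DT_m
DT_m <- data.table(grp  = c("a", "a", "b", "b"),
                   time = c(1, 2, 1, 2),
                   V1   = c(0.1, 0.4, 0.3, 0.2))

# j evaluates to the value of plot(), which is NULL
res_null <- DT_m[ , plot(NULL, xlim = range(time), ylim = range(V1))]

# j evaluates to its last expression, .SD, so a data.table comes back
res_sd <- DT_m[ , {plot(NULL, xlim = range(time), ylim = range(V1)); .SD}]

is.null(res_null)      # nothing left to chain on
is.data.table(res_sd)  # can be followed by another [ ... ] call
```

The braces only group two expressions; the plotting call is executed for its side effect and .SD is what the next link in the chain receives.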
With output
Much fancier results are possible; for example, to assign a specific color to each group:
grp_col <- c(a = "blue", b = "black",
             c = "darkgreen", d = "red")
DT[ , mean(outcome), by = .(grp, time)
   ][ , {plot(NULL, xlim = range(time),
              ylim = range(V1)); .SD}
      ][ , lines(time, V1, col = grp_col[.BY$grp]), by = grp]
NOTE
There is a bug in RStudio which causes this code to fail if the output is sent to the RStudio graphics device. As such, this approach only works when running R from the command line or when sending the output to an external device (I sent it to png to produce the above).
See data.table issue #1524, this RStudio support ticket, and these SO Qs (1 and 2).
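For reference, sending the plot to an external png device might look roughly like the following. This is a sketch under assumptions: the toy data, the output file name, the device size, the axis labels and the legend are all made up for illustration.

```r
library(data.table)

# Hypothetical toy data standing in for DT
set.seed(1)
DT <- CJ(grp = letters[1:4], time = 1:5, rep = 1:10)
DT[ , outcome := rnorm(.N)]

grp_col <- c(a = "blue", b = "black",
             c = "darkgreen", d = "red")
DT_m <- DT[ , mean(outcome), by = .(grp, time)]

# Hypothetical output location; any writable path would do
out_file <- file.path(tempdir(), "group_means.png")

png(out_file, width = 600, height = 400)  # open the external device
DT_m[ , plot(NULL, xlim = range(time), ylim = range(V1),
             xlab = "time", ylab = "mean")]
DT_m[ , lines(time, V1, col = grp_col[.BY$grp]), by = grp]
legend("topright", legend = names(grp_col), col = grp_col, lty = 1)
dev.off()                                 # close the device and write the file
```

Everything drawn between png() and dev.off() goes to the file rather than to the RStudio graphics pane.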
You are very much on the right track. Use ggplot to do this as follows:
(dt_agg <- dt[,.(mean = mean(outcome)),by=list(grp,time)]) # Aggregated data.table
   grp time       mean
1:   a    1 0.75865672
2:   a    2 0.07244879
---
Now plot this aggregated data.table with ggplot:
library(ggplot2)
ggplot(dt_agg, aes(x = time, y = mean, col = grp)) + geom_line()
Result: