How to calculate a growth rate in a long-format data frame?
Using the base R function ave():
> df$Growth <- with(df, ave(Value, Category,
FUN=function(x) c(NA, diff(x)/x[-length(x)]) ))
> df
Category Year Value Growth
1 A 2010 1 NA
2 A 2011 2 1.00000000
3 A 2012 3 0.50000000
4 A 2013 4 0.33333333
5 A 2014 5 0.25000000
6 A 2015 6 0.20000000
7 B 2010 7 NA
8 B 2011 8 0.14285714
9 B 2012 9 0.12500000
10 B 2013 10 0.11111111
11 B 2014 11 0.10000000
12 B 2015 12 0.09090909
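One caveat worth illustrating: diff() works on row order, so the ave() approach above implicitly assumes that rows are already sorted by Year within each Category. The sketch below is a hedged example of sorting first; the shuffled data frame and the names shuffled and sorted are made up for illustration and are not part of the original question.

```r
# Same example data as in the question
df <- data.frame(Category = rep(c("A", "B"), each = 6),
                 Year = rep(2010:2015, 2), Value = 1:12)

# Hypothetical unsorted input: rows permuted within each category
shuffled <- df[c(3, 1, 2, 6, 5, 4, 12, 10, 11, 7, 9, 8), ]

# Sort by Category, then Year, before computing period-over-period growth
sorted <- shuffled[order(shuffled$Category, shuffled$Year), ]
sorted$Growth <- with(sorted, ave(Value, Category,
  FUN = function(x) c(NA, diff(x) / x[-length(x)])))
rownames(sorted) <- NULL
```

After sorting, the first row of each category gets NA and the remaining rows match the output shown above.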
@Ben Bolker's answer is easily adapted to ave():
transform(df, Growth=ave(Value, Category,
FUN=function(x) c(NA,exp(diff(log(x)))-1)))
You can simply use the dplyr package:
> library(dplyr)
> df %>% group_by(Category) %>% mutate(Growth = (Value - lag(Value))/lag(Value))
which will produce the following result:
# A tibble: 12 x 4
# Groups: Category [2]
Category Year Value Growth
<fct> <int> <int> <dbl>
1 A 2010 1 NA
2 A 2011 2 1
3 A 2012 3 0.5
4 A 2013 4 0.333
5 A 2014 5 0.25
6 A 2015 6 0.2
7 B 2010 7 NA
8 B 2011 8 0.143
9 B 2012 9 0.125
10 B 2013 10 0.111
11 B 2014 11 0.1
12 B 2015 12 0.0909
For these sorts of questions ("how do I compute XXX by category YYY?") there are always solutions based on by(), the data.table package, and plyr. I generally prefer plyr, which is often slower, but (to me) more transparent/elegant.
df <- data.frame(Category=c(rep("A",6),rep("B",6)),
Year=rep(2010:2015,2),Value=1:12)
library(plyr)
ddply(df,"Category",transform,
Growth=c(NA,exp(diff(log(Value)))-1))
The main difference between this answer and @krlmr's is that I am using a geometric-mean trick (taking differences of logs and then exponentiating) while @krlmr computes an explicit ratio.
Mathematically, diff(log(Value)) takes the differences of the logs, i.e. log(x[t+1])-log(x[t]) for all t. When we exponentiate that we get the ratio x[t+1]/x[t] (because exp(log(x[t+1])-log(x[t])) = exp(log(x[t+1]))/exp(log(x[t])) = x[t+1]/x[t]). The OP wanted the fractional change rather than the multiplicative growth rate (i.e. x[t+1]==x[t] corresponds to a fractional change of zero rather than a multiplicative growth rate of 1.0), so we subtract 1.
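As a quick sanity check of the identity above, the following sketch compares the explicit-ratio form with the log-difference form on the Category A values from the example. The variable names ratio_form and log_form are only illustrative.

```r
x <- c(1, 2, 3, 4, 5, 6)  # Category A values from the example

# Explicit ratio: (x[t+1] - x[t]) / x[t]
ratio_form <- diff(x) / x[-length(x)]

# Log-difference form: exp(log(x[t+1]) - log(x[t])) - 1
log_form <- exp(diff(log(x))) - 1

# Compare the two up to floating-point tolerance
all.equal(ratio_form, log_form)
```

Note that the log form only makes sense for strictly positive values, since log() of zero or a negative number is not finite; the explicit ratio has no such restriction apart from division by zero.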
I am also using transform() for a little bit of extra "syntactic sugar", to avoid creating a new anonymous function.
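For comparison, the by() route mentioned earlier might look roughly like the sketch below. It uses only base R; the names pieces and res_by are made up here, and this is one possible way to write it rather than a canonical one.

```r
df <- data.frame(Category = rep(c("A", "B"), each = 6),
                 Year = rep(2010:2015, 2), Value = 1:12)

# by() splits df by Category and applies the function to each sub-data-frame
pieces <- by(df, df$Category, function(d) {
  d$Growth <- c(NA, diff(d$Value) / head(d$Value, -1))
  d
})

# Stack the per-category pieces back into a single data frame
res_by <- do.call(rbind, pieces)
rownames(res_by) <- NULL
```

This needs an explicit recombination step (do.call(rbind, ...)), which is part of why the plyr version reads more compactly.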