Expand ranges defined by "from" and "to" columns
You can use the plyr
package:
library(plyr)
ddply(presidents, "name", summarise, year = seq(from, to))
# name year
# 1 Barack Obama 2009
# 2 Barack Obama 2010
# 3 Barack Obama 2011
# 4 Barack Obama 2012
# 5 Bill Clinton 1993
# 6 Bill Clinton 1994
# [...]
and if it is important that the data be sorted by year, you can use the arrange
function:
df <- ddply(presidents, "name", summarise, year = seq(from, to))
arrange(df, df$year)
# name year
# 1 Bill Clinton 1993
# 2 Bill Clinton 1994
# 3 Bill Clinton 1995
# [...]
# 21 Barack Obama 2011
# 22 Barack Obama 2012
Edit 1: Following's @edgester's "Update 1", a more appropriate approach is to use adply
to account for presidents with non-consecutive terms:
adply(foo, 1, summarise, year = seq(from, to))[c("name", "year")]
An alternate tidyverse
approach using unnest
and map2
.
library(tidyverse)
presidents %>%
unnest(year = map2(from, to, seq)) %>%
select(-from, -to)
# name year
# 1 Bill Clinton 1993
# 2 Bill Clinton 1994
...
# 21 Barack Obama 2011
# 22 Barack Obama 2012
Edit: From tidyr v1.0.0
new variables can no longer be created as part of unnest()
.
presidents %>%
mutate(year = map2(from, to, seq)) %>%
unnest(year) %>%
select(-from, -to)
Here's a data.table
solution. It has the nice (if minor) feature of leaving the presidents in their supplied order:
library(data.table)
dt <- data.table(presidents)
dt[, list(year = seq(from, to)), by = name]
# name year
# 1: Bill Clinton 1993
# 2: Bill Clinton 1994
# ...
# ...
# 21: Barack Obama 2011
# 22: Barack Obama 2012
Edit: To handle presidents with non-consecutive terms, use this instead:
dt[, list(year = seq(from, to)), by = c("name", "from")]