Expand ranges defined by "from" and "to" columns

You can use the plyr package:

library(plyr)
ddply(presidents, "name", summarise, year = seq(from, to))
#              name year
# 1    Barack Obama 2009
# 2    Barack Obama 2010
# 3    Barack Obama 2011
# 4    Barack Obama 2012
# 5    Bill Clinton 1993
# 6    Bill Clinton 1994
# [...]

If it is important that the data be sorted by year, you can use the arrange function:

df <- ddply(presidents, "name", summarise, year = seq(from, to))
arrange(df, year)
#              name year
# 1    Bill Clinton 1993
# 2    Bill Clinton 1994
# 3    Bill Clinton 1995
# [...]
# 21   Barack Obama 2011
# 22   Barack Obama 2012
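
The snippets in these answers assume a presidents data frame with name, from and to columns, which is never shown. As a rough sketch, here is a hypothetical sample (the values, including the "George W. Bush" row, are assumptions chosen to match the 22-row output above) along with a base R version of the same expand-then-sort steps, in case plyr is not available:

```r
# Hypothetical sample data: one row per president, terms given as years.
presidents <- data.frame(
  name = c("Bill Clinton", "George W. Bush", "Barack Obama"),
  from = c(1993, 2001, 2009),
  to   = c(2001, 2009, 2012),
  stringsAsFactors = FALSE
)

# One sequence of years per row, e.g. 1993..2001 for the first row.
years <- Map(seq, presidents$from, presidents$to)

# Repeat each name once per year in its range, then flatten the years.
expanded <- data.frame(
  name = rep(presidents$name, lengths(years)),
  year = unlist(years),
  stringsAsFactors = FALSE
)

# Sort by year; order() leaves tied years in their original order.
expanded <- expanded[order(expanded$year), ]
rownames(expanded) <- NULL
```

Because the boundary years (2001 and 2009) belong to two ranges in this sample, they appear twice in the result, once for each president.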

Edit 1: Following @edgester's "Update 1", a more appropriate approach is to use adply, which works row by row and so accounts for presidents with non-consecutive terms:

adply(presidents, 1, summarise, year = seq(from, to))[c("name", "year")]
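
To illustrate why grouping by name alone breaks down here: when one name has two terms, from and to each hold two values inside that group, and seq() accepts only a single start and a single end. The base R sketch below uses hypothetical data (Grover Cleveland's two terms, picked purely as an example) to show the failure and the row-by-row alternative:

```r
# Hypothetical data: one president with two non-consecutive terms.
terms <- data.frame(
  name = c("Grover Cleveland", "Benjamin Harrison", "Grover Cleveland"),
  from = c(1885, 1889, 1893),
  to   = c(1889, 1893, 1897),
  stringsAsFactors = FALSE
)

# Within the "Grover Cleveland" group, from and to have length 2,
# which seq() rejects; capture the error message instead of stopping.
cleveland <- terms[terms$name == "Grover Cleveland", ]
grouped_attempt <- tryCatch(
  seq(cleveland$from, cleveland$to),
  error = function(e) conditionMessage(e)
)

# Working row by row gives each term its own sequence of years.
per_row <- Map(seq, terms$from, terms$to)
lengths(per_row)
```

Here grouped_attempt ends up holding an error message rather than a vector of years, while per_row contains one sequence per term.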

An alternative tidyverse approach uses unnest and map2:

library(tidyverse)

presidents %>%
  unnest(year = map2(from, to, seq)) %>%
  select(-from, -to)

#              name  year
# 1    Bill Clinton  1993
# 2    Bill Clinton  1994
# [...]
# 21   Barack Obama  2011
# 22   Barack Obama  2012

Edit: As of tidyr v1.0.0, new variables can no longer be created as part of unnest(). Create the list column with mutate() first, then unnest it:

presidents %>%
  mutate(year = map2(from, to, seq)) %>%
  unnest(year) %>%
  select(-from, -to)

Here's a data.table solution. It has the nice (if minor) feature of leaving the presidents in their supplied order:

library(data.table)
dt <- data.table(presidents)
dt[, list(year = seq(from, to)), by = name]
#               name year
#  1:   Bill Clinton 1993
#  2:   Bill Clinton 1994
#  ...
# 21:   Barack Obama 2011
# 22:   Barack Obama 2012

Edit: To handle presidents with non-consecutive terms, use this instead:

dt[, list(year = seq(from, to)), by = c("name", "from")]
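
As a minimal sketch of why adding from to the grouping helps: each (name, from) pair then identifies a single row, so seq() receives one start and one end per group. The data below is hypothetical (Grover Cleveland's two terms, used only for illustration), and dropping the helper column afterwards is an optional extra step, not part of the original answer:

```r
library(data.table)

# Hypothetical data: one president with two non-consecutive terms.
terms <- data.table(
  name = c("Grover Cleveland", "Benjamin Harrison", "Grover Cleveland"),
  from = c(1885L, 1889L, 1893L),
  to   = c(1889L, 1893L, 1897L)
)

# Grouping by name and from keeps the two Cleveland terms separate.
expanded <- terms[, list(year = seq(from, to)), by = c("name", "from")]

# The grouping column from is carried into the result; drop it if unwanted.
expanded[, from := NULL]
```

This assumes that no president has two terms starting in the same year; if that could happen, grouping by a row identifier would be a safer choice.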
