Reshape multiple value columns to wide format
data.table can cast on multiple value.var
variables. This is quite direct (and efficient).
Therefore:
library(data.table) # v1.9.5+
dcast(setDT(expensesByMonth), expense_type ~ month, value.var = c("value", "percent"))
Your best option is to reshape your data to long format, using melt
, and then to dcast
:
library(reshape2)
meltExpensesByMonth <- melt(expensesByMonth, id.vars=1:2)
dcast(meltExpensesByMonth, expense_type ~ month + variable, fun.aggregate = sum)
The first few lines of output:
expense_type 2012-02-01_value 2012-02-01_percent 2012-03-01_value 2012-03-01_percent
1 Adjustment 442.37 0.124025031 2.00 0.0005064625
2 Bank Service Charge 200.00 0.056072985 200.00 0.0506462461
3 Cable 21.33 0.005980184 36.33 0.0091998906
4 Charity 0.00 0.000000000 0.00 0.0000000000
Reshaping from long to wide format with multiple value/measure columns is now possible with the new function pivot_wider()
introduced in tidyr 1.0.0.
This is superior to the previous tidyr strategy of gather()
than spread()
, because the attributes are no longer dropped (e.g., dates remain dates, strings remain strings).
pivot_wider()
(counterpart: pivot_longer()
) works similar to spread()
.
However, it offers additional functionality such as multiple value columns.
To this end, the argument values_from
—that indicates from which column(s) the values are taken—may take more than one column name.
NA
s may be filled using the argument values_fill
.
library("tidyr")
library("magrittr")
pivot_wider(expensesByMonth,
id_cols = expense_type,
names_from = month,
values_from = c(value, percent))
#> # A tibble: 23 x 13
#> expense_type `value_2012-02-~ `value_2012-03-~ `value_2012-04-~
#> <chr> <dbl> <dbl> <dbl>
#> 1 Adjustment 442. 2 16.4
#> 2 Bank Servic~ 200 200 200
#> 3 Cable 21.3 36.3 NA
#> 4 Clubbing 75 207. 325.
#> 5 Dining 22.5 74.5 80.5
#> 6 Education 1800 NA NA
#> 7 Gifts 10 89 100
#> 8 Groceries 233. 373. 398.
#> 9 Lunch 155. 384. 326.
#> 10 Personal Ca~ 30 30 90
#> # ... with 13 more rows, and 9 more variables: `value_2012-05-01` <dbl>,
#> # `value_2012-06-01` <dbl>, `value_2012-07-01` <dbl>,
#> # `percent_2012-02-01` <dbl>, `percent_2012-03-01` <dbl>,
#> # `percent_2012-04-01` <dbl>, `percent_2012-05-01` <dbl>,
#> # `percent_2012-06-01` <dbl>, `percent_2012-07-01` <dbl>
Alternatively, the reshape may be done using a pivot spec that offers finer control (see link below):
# see also ?build_wider_spec
spec <- expensesByMonth %>%
expand(month, .value = c("percent", "value")) %>%
dplyr::mutate(.name = paste(.$month, .$.value, sep = "_"))
pivot_wider_spec(expensesByMonth, spec = spec)
Created on 2019-03-26 by the reprex package (v0.2.1)
See also: https://tidyr.tidyverse.org/dev/articles/pivot.html