Reshaping wide to long with multiple values columns

This seems to do what you want, except that the f prefix is dropped from the values in time.

reshape(dw, idvar = "sbj", varying = list(c(2,4),c(3,5)), v.names = c("ave", "sd"), direction = "long")

    sbj blabla time ave sd
A.1   A     bA    1  10  6
B.1   B     bB    1  12  5
C.1   C     bC    1  20  7
D.1   D     bD    1  22  8
A.2   A     bA    2  50 10
B.2   B     bB    2  70 11
C.2   C     bC    2  20  8
D.2   D     bD    2  22  9

Another option using Hadley's new tidyr package.

library(tidyr)
library(dplyr)

dw <- read.table(header=TRUE, text='
 sbj f1.avg f1.sd f2.avg f2.sd  blabla
   A   10    6     50     10      bA
   B   12    5     70     11      bB
   C   20    7     20     8       bC
   D   22    8     22     9       bD
 ')

dw %>% 
  gather(v, value, f1.avg:f2.sd) %>% 
  separate(v, c("var", "col")) %>% 
  arrange(sbj) %>% 
  spread(col, value)
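
In tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider. The following is a minimal sketch of the same reshape, assuming tidyr >= 1.0.0 is installed. The data frame is rebuilt with data.frame() only to keep the snippet self-contained, and the name long is just illustrative.

```r
library(tidyr)

# Same example data as above, rebuilt so the snippet runs on its own
dw <- data.frame(
  sbj    = c("A", "B", "C", "D"),
  f1.avg = c(10, 12, 20, 22),
  f1.sd  = c(6, 5, 7, 8),
  f2.avg = c(50, 70, 20, 22),
  f2.sd  = c(10, 11, 8, 9),
  blabla = c("bA", "bB", "bC", "bD")
)

# Split each column name on the dot: the part before it goes into "var",
# and the special ".value" entry turns the part after it (avg, sd)
# into separate output columns.
long <- pivot_longer(
  dw,
  cols      = f1.avg:f2.sd,
  names_to  = c("var", ".value"),
  names_sep = "\\."
)

long
```

The ".value" entry in names_to is what replaces the separate() and spread() steps: it asks for one output column per distinct suffix instead of a single value column.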

reshape does this with the appropriate arguments.

varying lists the columns that exist in the wide format but are split across multiple rows in the long format. v.names gives the names of the corresponding columns in the long format. Together, the two define the mapping from wide to long.

From ?reshape:

Also, guessing is not attempted if v.names is given explicitly. Notice that the order of variables in varying is like x.1,y.1,x.2,y.2.

Because v.names is given explicitly, reshape does not try to guess anything from the column names. It relies on the order of varying, which must list every measure for the first time point, then every measure for the second (f1.avg, f1.sd, f2.avg, f2.sd). The original data has its columns in this order, so varying=2:5 would work for this example, but relying on column positions is not safe in general.

The sep argument (default .) only matters when reshape has to guess v.names and times from the column names. Both are supplied here, so the output columns come directly from those arguments.

times specifies the values to use in the created var column, one per time point, in the same order as the groups in varying.

Finally, idvar is specified to be the sbj column, which identifies individual records in the wide format (thanks @thelatemail).

reshape(dw, direction='long', 
        varying=c('f1.avg', 'f1.sd', 'f2.avg', 'f2.sd'), 
        timevar='var',
        times=c('f1', 'f2'),
        v.names=c('avg', 'sd'),
        idvar='sbj')

##      sbj blabla var avg sd
## A.f1   A     bA  f1  10  6
## B.f1   B     bB  f1  12  5
## C.f1   C     bC  f1  20  7
## D.f1   D     bD  f1  22  8
## A.f2   A     bA  f2  50 10
## B.f2   B     bB  f2  70 11
## C.f2   C     bC  f2  20  8
## D.f2   D     bD  f2  22  9
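
If you would rather not depend on the order of a flat varying vector, one option is to pass varying as a list with one character vector per measure. This is a sketch under that assumption; to_long is a hypothetical helper name, and the data frame is rebuilt here only so the snippet runs on its own.

```r
# Same example data as above, rebuilt so the snippet runs on its own
dw <- data.frame(
  sbj    = c("A", "B", "C", "D"),
  f1.avg = c(10, 12, 20, 22),
  f1.sd  = c(6, 5, 7, 8),
  f2.avg = c(50, 70, 20, 22),
  f2.sd  = c(10, 11, 8, 9),
  blabla = c("bA", "bB", "bC", "bD")
)

# Hypothetical helper: each list element names the wide columns for one
# measure, in the same order as `times`.
to_long <- function(d) {
  reshape(
    d,
    direction = "long",
    varying   = list(c("f1.avg", "f2.avg"), c("f1.sd", "f2.sd")),
    v.names   = c("avg", "sd"),
    timevar   = "var",
    times     = c("f1", "f2"),
    idvar     = "sbj"
  )
}

long <- to_long(dw)

# Shuffling the wide columns does not change the result, because the
# columns are selected by name rather than by position.
shuffled      <- dw[, c("sbj", "f2.sd", "f1.avg", "blabla", "f2.avg", "f1.sd")]
long_shuffled <- to_long(shuffled)

long
```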

melt from data.table (version 1.9.6 or later) does this when the column indices are passed to measure.vars as a list, with one element per output column.

library(data.table)

# setDT converts dw to a data.table by reference
melt(setDT(dw), measure.vars = list(c(2,4), c(3,5)), 
     variable.name = 'var', value.name = c('avg', 'sd'))[, 
     var := paste0('f', var)][order(sbj)]
#   sbj blabla var avg sd
#1:   A     bA  f1  10  6
#2:   A     bA  f2  50 10
#3:   B     bB  f1  12  5
#4:   B     bB  f2  70 11
#5:   C     bC  f1  20  7
#6:   C     bC  f2  20  8
#7:   D     bD  f1  22  8
#8:   D     bD  f2  22  9

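Or you could use the patterns function, which selects the measure columns by regular expression. Note that var then holds the indices 1 and 2 rather than f1 and f2: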

melt(setDT(dw), 
     measure = patterns("avg", "sd"),
     variable.name = 'var', value.name = c('avg', 'sd'))
#    sbj blabla var avg sd
# 1:   A     bA   1  10  6
# 2:   B     bB   1  12  5
# 3:   C     bC   1  20  7
# 4:   D     bD   1  22  8
# 5:   A     bA   2  50 10
# 6:   B     bB   2  70 11
# 7:   C     bC   2  20  8
# 8:   D     bD   2  22  9
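
To get f1 and f2 back in var after using patterns, the relabelling step from the first melt call can be reused. This is a minimal sketch, assuming data.table >= 1.9.6 and that the matched columns appear in the same f1, f2 order for each pattern. The names dt and long are illustrative, and the data is rebuilt so the snippet stands alone.

```r
library(data.table)

# Same example data as above, rebuilt so the snippet runs on its own
dt <- data.table(
  sbj    = c("A", "B", "C", "D"),
  f1.avg = c(10, 12, 20, 22),
  f1.sd  = c(6, 5, 7, 8),
  f2.avg = c(50, 70, 20, 22),
  f2.sd  = c(10, 11, 8, 9),
  blabla = c("bA", "bB", "bC", "bD")
)

# Anchoring each regex with $ matches on the suffix only
long <- melt(dt,
             measure.vars  = patterns("avg$", "sd$"),
             variable.name = "var",
             value.name    = c("avg", "sd"))

# var comes back as a factor with levels 1 and 2; prepend the prefix
long[, var := paste0("f", var)]

long
```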
