Convert named vector to list in R
Like I said in the comment, you can use split to create a list.
a.list <- split(a, names(a))
a.list <- lapply(a.list, unname)
A one-liner would be
a.list <- lapply(split(a, names(a)), unname)
#$I
#[1] 1 2 3 4
#
#$II
#[1] 5 6 7 8
EDIT.
Then thelatemail posted a simplification of this in a comment. I timed it using Devin King's approach, and it is not only simpler but also 25% faster.
a.list <- split(unname(a),names(a))
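To check that the two variants really give the same result, here is a minimal sketch on a small input. The vector `a` below is an assumption, reconstructed from the printed output above rather than taken from the question, and the names `two_step` and `one_step` are just illustrative.

```r
# Hypothetical small input chosen to match the printed output above
a <- setNames(1:8, rep(c("I", "II"), each = 4))

# Split first, then strip the names from each piece
two_step <- lapply(split(a, names(a)), unname)

# Strip the names first, then split (thelatemail's simplification)
one_step <- split(unname(a), names(a))

# Both should be a list with elements I and II holding unnamed integers
identical(two_step, one_step)
```

Dropping the names before splitting avoids the extra pass over the list that lapply makes, which is probably where the speed-up comes from.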
I'd suggest looking at packages that excel at aggregating large amounts of data, like the data.table package. With data.table, you could do:
a <- 1:5e7
names(a) <- c(rep('I',1e7), rep('II',1e7), rep('III',1e7),
rep('IV',1e7), rep('V',1e7))
library(data.table)
temp <- data.table(names(a), a)[, list(V2 = list(a)), V1]
a.list <- setNames(temp[["V2"]], temp[["V1"]])
Here are some functions to test the various options out with:
myFun <- function(invec) {
x <- data.table(names(invec), invec)[, list(V2 = list(invec)), V1]
setNames(x[["V2"]], x[["V1"]])
}
rui1 <- function(invec) {
a.list <- split(invec, names(invec))
lapply(a.list, unname)
}
rui2 <- function(invec) {
split(unname(invec), names(invec))
}
op <- function(invec) {
names.uniq <- unique(names(invec))
a.list <- setNames(vector('list', length(names.uniq)), names.uniq)
for (i in seq_along(names.uniq)) {
names.i <- names.uniq[i]
a.i <- invec[names(invec) == names.i]  # subset the argument, not the global `a`
a.list[[names.i]] <- unname(a.i)
}
a.list
}
And the results of microbenchmark on 10 replications:
library(microbenchmark)
microbenchmark(myFun(a), rui1(a), rui2(a), op(a), times = 10)
# Unit: milliseconds
#     expr       min        lq      mean    median       uq      max neval
# myFun(a)  698.1553  768.6802  932.6525  934.6666 1056.558 1168.889    10
#  rui1(a) 2967.4927 3097.6168 3199.9378 3185.1826 3319.453 3413.185    10
#  rui2(a) 2152.0307 2285.4515 2372.9896 2362.7783 2426.821 2643.033    10
#    op(a) 2672.4703 2872.5585 2896.7779 2901.7979 2971.782 3039.663    10
Also, note that when testing the different solutions, you might want to consider other scenarios, for instance, cases where you expect to have lots of different names. In that case, your for loop slows down significantly, because it rescans the whole vector once per unique name. Try, for example, the above functions with the following data:
set.seed(1)
b <- sample(100, 5e7, TRUE)
names(b) <- sample(c(letters, LETTERS, 1:100), 5e7, TRUE)
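One caveat if you also compare the outputs on data like this: split returns the groups in sorted order of the names, while the for loop (and data.table grouping with by) keeps the order of first appearance. Below is a minimal sketch of a check that aligns the names before comparing. The vector `small` is a hypothetical, scaled-down stand-in for `b` so it runs quickly, and the loop is a compact rewrite of the op function, not the original code.

```r
# Scaled-down, hypothetical stand-in for `b` (1e4 values instead of 5e7)
set.seed(1)
small <- sample(100, 1e4, TRUE)
names(small) <- sample(c(letters, LETTERS, 1:100), 1e4, TRUE)

# split-based grouping: groups come back in sorted name order
by_split <- split(unname(small), names(small))

# Loop-based grouping: groups come back in order of first appearance
nm <- unique(names(small))
by_loop <- setNames(vector("list", length(nm)), nm)
for (n in nm) {
  by_loop[[n]] <- unname(small[names(small) == n])
}

# Compare contents after putting both lists in the same name order
ord <- sort(names(by_split))
same_content <- identical(by_split[ord], by_loop[ord])
same_content
```

If `same_content` is TRUE, the two approaches differ only in the order of the list elements, so any comparison with identical needs the reordering step first.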