Select multiple columns in data.table by their numeric indices
It's a bit verbose, but i've gotten used to using the hidden .SD
variable.
b<-data.table(a=1,b=2,c=3,d=4)
b[,.SD,.SDcols=c(1:2)]
It's a bit of a hassle, but you don't lose out on other data.table features (I don't think), so you should still be able to use other important functions like join tables etc.
For versions of data.table >= 1.9.8
, the following all just work:
library(data.table)
dt <- data.table(a = 1, b = 2, c = 3)
# select single column by index
dt[, 2]
# b
# 1: 2
# select multiple columns by index
dt[, 2:3]
# b c
# 1: 2 3
# select single column by name
dt[, "a"]
# a
# 1: 1
# select multiple columns by name
dt[, c("a", "b")]
# a b
# 1: 1 2
For versions of data.table < 1.9.8
(for which numerical column selection required the use of with = FALSE
), see this previous version of this answer. See also NEWS on v1.9.8, POTENTIALLY BREAKING CHANGES, point 3.
If you want to use column names to select the columns, simply use .()
, which is an alias for list()
:
library(data.table)
dt <- data.table(a = 1:2, b = 2:3, c = 3:4)
dt[ , .(b, c)] # select the columns b and c
# Result:
# b c
# 1: 2 3
# 2: 3 4