how to skip reading certain columns in readr

There is an answer out there, I just didn't search hard enough: https://github.com/hadley/readr/issues/132

Apparently this was a documentation issue that has since been corrected. The functionality may eventually get added, but Hadley thought it was more useful to be able to update just one column type without dropping the others.

Update: The functionality has since been added.

The following code is from the readr documentation:

read_csv("iris.csv", col_types = cols_only( Species = col_factor(c("setosa", "versicolor", "virginica"))))

This reads only the Species column of the iris data set. To read only specific columns, you must name each one and also pass its column specification, e.g. col_factor(), col_double(), and so on.
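To make the cols_only() behaviour concrete, here is a minimal sketch that keeps two columns instead of one. It assumes readr is installed; the temporary file and the name iris_subset are made up for illustration and are not part of the readr documentation example.

```r
library(readr)

# Write the built-in iris data to a temporary CSV so the example is self-contained.
path <- tempfile(fileext = ".csv")
write_csv(iris, path)

# Every column NOT named inside cols_only() is dropped while reading.
iris_subset <- read_csv(
  path,
  col_types = cols_only(
    Sepal.Length = col_double(),
    Species = col_factor(c("setosa", "versicolor", "virginica"))
  )
)

names(iris_subset)
```

Each kept column still needs its own col_*() specification, which is why both a name and a type appear for every entry.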

"According to the read_csv documentation, one way to accomplish this is to pass a named list for col_types and only name the columns you want to keep"

WRONG: read_csv('test.csv', col_types=list(colB='c', colC='c'))

No, the doc is misleading. You have to either specify that unnamed columns get dropped (type '_', i.e. col_skip()), or else explicitly mark each unwanted column as skipped by name:

read_csv('test.csv', col_types=list('*'='_', colB='c', colC='c'))

read_csv('test.csv', col_types=list('colA'='_', colB='c', colC='c'))
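For comparison, here is a sketch of two other ways to express "skip everything except colB and colC" using calls that exist in current readr: cols() with a .default of col_skip(), and the compact string form of col_types. The three-column file below is a hypothetical stand-in for test.csv, and the result names are made up for illustration.

```r
library(readr)

# Hypothetical three-column file standing in for test.csv.
path <- tempfile(fileext = ".csv")
writeLines(c("colA,colB,colC", "1,x,y", "2,z,w"), path)

# Option 1: skip every column by default, then name the ones to keep.
kept_default <- read_csv(
  path,
  col_types = cols(
    .default = col_skip(),
    colB = col_character(),
    colC = col_character()
  )
)

# Option 2: compact string, one character per column in file order
# ("_" skips the column, "c" reads it as character).
kept_compact <- read_csv(path, col_types = "_cc")

names(kept_default)
names(kept_compact)
```

The compact string is short but depends on column order, so the .default form is probably the safer choice when the file layout may change.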
