Select columns based on multiple attribute conditions

For the first example:

starwars %>%
  select_if(function(col) {is.numeric(col) | is.character(col)})

This is taken directly from the RDocumentation page.
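As a quick sanity check, the same predicate can be passed to base R's Filter(), which keeps the columns for which it returns TRUE. This is only a sketch: the helper name is_num_or_chr is made up for illustration, and it uses || because each type test returns a single TRUE/FALSE.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: TRUE when a column is numeric or character
is_num_or_chr <- function(col) is.numeric(col) || is.character(col)

# dplyr version, equivalent to the anonymous function above
with_dplyr <- starwars %>% select_if(is_num_or_chr)

# Base R version: Filter() keeps the columns where the predicate is TRUE
with_base <- Filter(is_num_or_chr, starwars)

# Both approaches should keep the same columns, in the same order
identical(names(with_dplyr), names(with_base))
```

The list columns (films, vehicles, starships) are dropped in both cases.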

For the second:

toKeep <- sapply(starwars, is.numeric)
starwars %>%
  select("name", names(toKeep)[toKeep])

I can't come up with anything prettier at the moment, but I'm sure there is a better way :)
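One possible variation, sketched here in base R, is to build one logical vector per condition and combine them with | before subsetting. The variable names is_name and is_num are just illustrative.

```r
library(dplyr, warn.conflicts = FALSE)  # only needed for the starwars dataset

# TRUE for the column called "name", FALSE elsewhere
is_name <- names(starwars) == "name"

# TRUE for every numeric column; vapply() guarantees one logical per column
is_num <- vapply(starwars, is.numeric, logical(1))

# Keep a column if either condition holds
name_and_numeric <- starwars[, is_name | is_num]

# The name column followed by the numeric columns
names(name_and_numeric)
```

Because the conditions are plain logical vectors, any number of them can be combined with |, & and ! before subsetting.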


As of dplyr 1.0.0, as mentioned in the NEWS:

select() and rename() use the latest version of the tidyselect interface. Practically, this means that you can now combine selections using Boolean logic (i.e. !, & and |), and use predicate functions (e.g. is.character) to select variables by type (#4680).

### Install development version on GitHub first until CRAN version is available
# install.packages("devtools")
# devtools::install_github("tidyverse/dplyr")
library(dplyr, warn.conflicts = FALSE)

starwars %>% 
  as_tibble() %>% 
  glimpse()
#> Rows: 87
#> Columns: 14
#> $ name       <chr> "Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia...
#> $ height     <int> 172, 167, 96, 202, 150, 178, 165, 97, 183, 182, 188, 180...
#> $ mass       <dbl> 77.0, 75.0, 32.0, 136.0, 49.0, 120.0, 75.0, 32.0, 84.0, ...
#> $ hair_color <chr> "blond", NA, NA, "none", "brown", "brown, grey", "brown"...
#> $ skin_color <chr> "fair", "gold", "white, blue", "white", "light", "light"...
#> $ eye_color  <chr> "blue", "yellow", "red", "yellow", "brown", "blue", "blu...
#> $ birth_year <dbl> 19.0, 112.0, 33.0, 41.9, 19.0, 52.0, 47.0, NA, 24.0, 57....
#> $ sex        <chr> "male", "none", "none", "male", "female", "male", "femal...
#> $ gender     <chr> "masculine", "masculine", "masculine", "masculine", "fem...
#> $ homeworld  <chr> "Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan",...
#> $ species    <chr> "Human", "Droid", "Droid", "Human", "Human", "Human", "H...
#> $ films      <list> [<"The Empire Strikes Back", "Revenge of the Sith", "Re...
#> $ vehicles   <list> [<"Snowspeeder", "Imperial Speeder Bike">, <>, <>, <>, ...
#> $ starships  <list> [<"X-wing", "Imperial shuttle">, <>, <>, "TIE Advanced ...

To select either numeric or character columns:

starwars %>%
  select(is.numeric | is.character) %>% 
  glimpse()
#> Rows: 87
#> Columns: 11
#> $ height     <int> 172, 167, 96, 202, 150, 178, 165, 97, 183, 182, 188, 180...
#> $ mass       <dbl> 77.0, 75.0, 32.0, 136.0, 49.0, 120.0, 75.0, 32.0, 84.0, ...
#> $ birth_year <dbl> 19.0, 112.0, 33.0, 41.9, 19.0, 52.0, 47.0, NA, 24.0, 57....
#> $ name       <chr> "Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia...
#> $ hair_color <chr> "blond", NA, NA, "none", "brown", "brown, grey", "brown"...
#> $ skin_color <chr> "fair", "gold", "white, blue", "white", "light", "light"...
#> $ eye_color  <chr> "blue", "yellow", "red", "yellow", "brown", "blue", "blu...
#> $ sex        <chr> "male", "none", "none", "male", "female", "male", "femal...
#> $ gender     <chr> "masculine", "masculine", "masculine", "masculine", "fem...
#> $ homeworld  <chr> "Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan",...
#> $ species    <chr> "Human", "Droid", "Droid", "Human", "Human", "Human", "H...

Or to select non-list columns:

starwars %>%
  select(!is.list) %>% 
  glimpse()
#> Rows: 87
#> Columns: 11
#> $ name       <chr> "Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia...
#> $ height     <int> 172, 167, 96, 202, 150, 178, 165, 97, 183, 182, 188, 180...
#> $ mass       <dbl> 77.0, 75.0, 32.0, 136.0, 49.0, 120.0, 75.0, 32.0, 84.0, ...
#> $ hair_color <chr> "blond", NA, NA, "none", "brown", "brown, grey", "brown"...
#> $ skin_color <chr> "fair", "gold", "white, blue", "white", "light", "light"...
#> $ eye_color  <chr> "blue", "yellow", "red", "yellow", "brown", "blue", "blu...
#> $ birth_year <dbl> 19.0, 112.0, 33.0, 41.9, 19.0, 52.0, 47.0, NA, 24.0, 57....
#> $ sex        <chr> "male", "none", "none", "male", "female", "male", "femal...
#> $ gender     <chr> "masculine", "masculine", "masculine", "masculine", "fem...
#> $ homeworld  <chr> "Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan",...
#> $ species    <chr> "Human", "Droid", "Droid", "Human", "Human", "Human", "H...

To select the name column plus all character columns:

starwars %>%
  select(name | is.character) %>% 
  glimpse()
#> Rows: 87
#> Columns: 8
#> $ name       <chr> "Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia...
#> $ hair_color <chr> "blond", NA, NA, "none", "brown", "brown, grey", "brown"...
#> $ skin_color <chr> "fair", "gold", "white, blue", "white", "light", "light"...
#> $ eye_color  <chr> "blue", "yellow", "red", "yellow", "brown", "blue", "blu...
#> $ sex        <chr> "male", "none", "none", "male", "female", "male", "femal...
#> $ gender     <chr> "masculine", "masculine", "masculine", "masculine", "fem...
#> $ homeworld  <chr> "Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan",...
#> $ species    <chr> "Human", "Droid", "Droid", "Human", "Human", "Human", "H...

Created on 2020-02-17 by the reprex package (v0.3.0)
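Note that the reprex above was run against a development version. As far as I know, later tidyselect releases expect predicate functions to be wrapped in where() and warn when they are passed bare, so on a current install the same selections would look roughly like this (the result names are only for illustration):

```r
library(dplyr, warn.conflicts = FALSE)

# Numeric or character columns
num_or_chr <- starwars %>%
  select(where(is.numeric) | where(is.character))

# Everything except list columns
no_lists <- starwars %>%
  select(!where(is.list))

# The name column plus all numeric columns
name_and_num <- starwars %>%
  select(name | where(is.numeric))

names(name_and_num)
```

The Boolean operators behave as in the reprex; only the predicates are wrapped.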


You can either write your own function:

to_keep <- function(x) is.numeric(x) | is.character(x)
starwars %>% select_if(to_keep)

or you can use "quosure-style lambda functions":

starwars %>% select_if(~ is.numeric(.) | is.character(.))
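To check that the two styles are interchangeable, here is a small sketch comparing a named predicate with a formula lambda written with ~ (the form select_if() documents as a quosure-style lambda). The result names by_function and by_lambda are made up for the comparison.

```r
library(dplyr, warn.conflicts = FALSE)

# Named predicate function
to_keep <- function(x) is.numeric(x) | is.character(x)
by_function <- starwars %>% select_if(to_keep)

# Formula lambda: `.` stands for the column being tested
by_lambda <- starwars %>% select_if(~ is.numeric(.) | is.character(.))

# Both should select exactly the same columns
identical(names(by_function), names(by_lambda))
```

Which one to use is mostly a matter of taste: the named function is easier to reuse, the lambda keeps everything on one line.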

I don't know of a good way of combining different logic for column selection, so I'd use a hybrid approach (not very elegant, since you have to repeat the initial dataset):

starwars %>%
  select("name") %>%
  # name is itself a character column, so bind only the numeric columns to avoid duplicating it
  bind_cols(select_if(starwars, is.numeric))
