Get column names with zero variance using dplyr

Using a reproducible example, I think the following is what you are aiming for. Note that, as Colin pointed out, I have not dealt with selecting variables by a character vector of names; see his answer for details on that.

# reproducible data
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$qsec <- 7

library(dplyr)

mtcars2 %>% 
  summarise_all(var) %>% 
  select_if(function(.) . == 0) %>% 
  names()
# [1] "mpg"  "qsec"

Personally, I think that obfuscates what you are doing. My preference would be one of the following, using the purrr package (if you wish to remain in the tidyverse), together with a well-written comment.

library(purrr)

# Return a character vector of variable names which have 0 variance
names(mtcars2)[which(map_dbl(mtcars2, var) == 0)]
names(mtcars2)[map_lgl(mtcars2, function(x) var(x) == 0)]

If you'd like to optimize for speed, stick with base R:

# Return a character vector of variable names which have 0 variance
names(mtcars2)[vapply(mtcars2, function(x) var(x) == 0, logical(1))]
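One caveat with the base R approach: `var()` returns `NA` for a column that contains missing values, and indexing `names()` with an `NA` yields `NA` entries. A minimal sketch of a more defensive version follows. The helper name `zero_var_names` and the toy data frame are hypothetical, introduced only for illustration, and skipping non-numeric columns is an assumption about what you want.

```r
# Hypothetical helper: names of numeric columns whose variance is 0,
# ignoring missing values. Non-numeric columns are skipped.
zero_var_names <- function(df) {
  is_zero_var <- vapply(
    df,
    # isTRUE() turns an NA result (e.g. an all-NA column) into FALSE
    function(x) is.numeric(x) && isTRUE(var(x, na.rm = TRUE) == 0),
    logical(1)
  )
  names(df)[is_zero_var]
}

# Toy data (assumption, not from the question)
df <- data.frame(
  a = c(1, 1, NA, 1),          # constant apart from a missing value
  b = c(1, 2, 3, 4),           # varies
  c = c("x", "x", "x", "x"),   # non-numeric, skipped
  d = c(5, 5, 5, 5)            # constant
)

zero_var_names(df)
# [1] "a" "d"
```

Whether a column that is constant apart from `NA`s should count as zero variance depends on your use case; drop `na.rm = TRUE` if it should not.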

You have two problems.

1. Passing names of columns as a variable to `select()`

This is covered in the "Programming with dplyr" vignette. The solution here is to use `select_at()`, the scoped variant of `select()`.

2. Variance equals 0

noVar <- june %>% 
    select_at(.vars=numericVars) %>% 
    summarise_all(.funs=var) %>%
    filter_all(any_vars(. == 0))
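The two steps can also be combined to return the column names directly. This is a rough sketch, assuming dplyr 1.0.0 or later, where `all_of()` and `where()` are available (`select_at()` still works but is superseded). The `mtcars2` data from the other answer stands in for `june`, and the contents of `numericVars` are made up for illustration.

```r
library(dplyr)

# Stand-in data (assumption): mtcars with two constant columns, replacing `june`
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$qsec <- 7

# Hypothetical character vector of column names to check
numericVars <- c("mpg", "cyl", "qsec")

zero_var_cols <- mtcars2 %>%
  select(all_of(numericVars)) %>%              # select by a character vector of names
  select(where(function(x) var(x) == 0)) %>%   # keep columns whose variance is exactly 0
  names()

zero_var_cols
# [1] "mpg"  "qsec"
```

`all_of()` raises an error if any name in `numericVars` is missing from the data, which is usually the safer behaviour when the names come from a variable.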

Select the columns that have only one distinct value, then get their names, using @Benjamin's example data mtcars2:

mtcars2 %>% 
  select_if(function(.) n_distinct(.) == 1) %>% 
  names()
# [1] "mpg"  "qsec"
