Get column names with zero variance using dplyr
Using a reproducible example, I think the code below is what you are aiming for. Note that, as Colin pointed out, I have not dealt with the issue of selecting variables through a character vector of column names. See his answer for details on that.
# reproducible data
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$qsec <- 7
library(dplyr)
mtcars2 %>%
  summarise_all(var) %>%
  select_if(function(.) . == 0) %>%
  names()
# [1] "mpg" "qsec"
Personally, I think that obfuscates what you are doing. My preference would be one of the following, using the purrr package (if you wish to remain in the tidyverse), along with a well-written comment.
library(purrr)
# Return a character vector of variable names which have 0 variance
names(mtcars2)[which(map_dbl(mtcars2, var) == 0)]
names(mtcars2)[map_lgl(mtcars2, function(x) var(x) == 0)]
If you'd like to optimize for speed, stick with base R:
# Return a character vector of variable names which have 0 variance
names(mtcars2)[vapply(mtcars2, function(x) var(x) == 0, logical(1))]
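The one-liners above assume every column is numeric; var() is not defined for factors, so a mixed data frame needs a guard. A minimal base R sketch of one way to add that guard follows. The helper name zero_var_cols and the extra grp column are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: names of numeric columns whose variance is exactly 0.
zero_var_cols <- function(df) {
  # Only numeric columns are passed to var(); other types are skipped.
  is_num <- vapply(df, is.numeric, logical(1))
  # isTRUE() turns an NA variance (e.g. an all-NA column) into FALSE.
  is_const <- vapply(
    df[is_num],
    function(x) isTRUE(var(x, na.rm = TRUE) == 0),
    logical(1)
  )
  names(df)[is_num][is_const]
}

# Same reproducible data as above, plus a non-numeric column (an assumption).
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$qsec <- 7
mtcars2$grp <- factor("a")

result <- zero_var_cols(mtcars2)
print(result)
```

The factor column is ignored rather than reported, because this sketch only tests variance on numeric columns.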
You have two problems.
1. Passing names of columns as a variable to select()
This is covered in the "Programming with dplyr" vignette. The solution here is to use select_at(), the scoped variant of select().
2. Variance equals 0
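# numericVars holds the column names as a character vector
noVar <- june %>%
  select_at(.vars = numericVars) %>%
  summarise_all(.funs = var) %>%
  # keeps the one-row summary only if at least one variance is 0
  filter_all(any_vars(. == 0))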
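Since june and numericVars are not available here, the same idea can be tried on the mtcars2 data from the other answer. This is only a sketch: the vector numeric_vars is a hypothetical stand-in for numericVars, and the last two steps are borrowed from that answer to return column names instead of a filtered summary row.

```r
library(dplyr)

# Reproducible data: two columns made constant.
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$qsec <- 7

# Hypothetical stand-in for numericVars: column names stored as strings.
numeric_vars <- c("mpg", "cyl", "qsec")

no_var <- mtcars2 %>%
  select_at(.vars = numeric_vars) %>%   # select columns by character vector
  summarise_all(.funs = var) %>%        # one row holding each column's variance
  select_if(function(.) . == 0) %>%     # keep columns whose variance is 0
  names()

print(no_var)
```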
Select the columns whose unique count is 1, then get their names, using @Benjamin's example data mtcars2:
mtcars2 %>%
  select_if(function(.) n_distinct(.) == 1) %>%
  names()
# [1] "mpg" "qsec"
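Counting unique values also works for non-numeric columns, which a variance test cannot handle. For anyone who prefers to avoid the dplyr dependency, a rough base R equivalent might look like the sketch below; the data frame mtcars3 and its grp column are assumptions added for illustration.

```r
# Reproducible data with two constant numeric columns and one constant factor.
mtcars3 <- mtcars
mtcars3$mpg <- mtcars3$qsec <- 7
mtcars3$grp <- factor("a")

# A column is constant when it has exactly one unique value.
is_constant <- vapply(
  mtcars3,
  function(x) length(unique(x)) == 1,
  logical(1)
)

const_cols <- names(mtcars3)[is_constant]
print(const_cols)
```

Note that unique() treats NA as a value of its own, so a column mixing one value with NA counts as having two unique values here.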