How to remove rows where all columns are zero using dplyr pipe

Here's a dplyr option:

library(dplyr)
filter_all(dat, any_vars(. != 0))

#       A-XXX  fBM-XXX    P-XXX  vBM-XXX
#1 1.51653276 2.228752 1.733567 3.003979
#2 0.07703724 0.000000 0.000000 0.000000

Here we make use of the logic that a row is kept if any of its variables is not equal to zero. That is the same as removing the rows where all variables are equal to zero.
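A small reproducible check of that equivalence, as a sketch only: the data frame `toy` and its values are made up for illustration and are not the asker's data. It compares the dplyr call with a plain base R row filter.

```r
library(dplyr)

# Hypothetical toy data (not from the question): row 3 is all zeros
toy <- data.frame(a = c(1.5, 0.08, 0), b = c(2.2, 0, 0), c = c(1.7, 0, 0))

# Keep rows where at least one column is non-zero
kept_dplyr <- filter_all(toy, any_vars(. != 0))

# Same logic in base R: count the non-zero cells per row
kept_base <- toy[rowSums(toy != 0) > 0, ]

nrow(kept_dplyr)  # 2
```

Both results contain the first two rows; only the all-zero third row is dropped.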

Regarding row.names: filter drops them, so move them into a column first:

library(tidyverse)
dat %>% rownames_to_column() %>% filter_at(vars(-rowname), any_vars(. != 0))
#         rowname      A-XXX  fBM-XXX    P-XXX  vBM-XXX
#1  BATF::JUN_AHR 1.51653276 2.228752 1.733567 3.003979
#2 BATF::JUN_CCR9 0.07703724 0.000000 0.000000 0.000000
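If the result should carry real row names again rather than a rowname column, one possible follow-up is tibble's column_to_rownames(). The sketch below uses an invented data frame `toy` with hypothetical row names, not the asker's data.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data with row names (values invented for illustration)
toy <- data.frame(x = c(1.5, 0, 0.08), y = c(2.2, 0, 0),
                  row.names = c("gene_a", "gene_b", "gene_c"))

res <- toy %>%
  rownames_to_column() %>%                          # row names become column "rowname"
  filter_at(vars(-rowname), any_vars(. != 0)) %>%   # test only the numeric columns
  column_to_rownames()                              # move "rowname" back to row names

rownames(res)  # "gene_a" "gene_c"
```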

Adding to the answer by @mgrund, a shorter alternative with dplyr 1.0.0 is:

# Option A:
data %>% filter(rowSums(across(everything()) != 0) > 0)

# Option B (needs dplyr >= 1.0.4 for if_all()):
data %>% filter(!if_all(everything(), ~ . == 0))

Explanation:
across() selects columns with tidyselect; everything() stands for every column. In Option A, across(everything()) != 0 compares every cell with zero, rowSums() counts the non-zero cells in each row, and only rows with at least one non-zero cell are kept. In Option B, the formula (~) is applied to each column and checks whether the current column is zero; if_all() is TRUE for rows where that holds in every column, and ! keeps the remaining rows.
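A quick way to check that "not all zero" and "any non-zero" select the same rows is to run both on a small data frame. The sketch below uses made-up data (`toy` is hypothetical) and assumes dplyr 1.0.4 or later for if_any() and if_all().

```r
library(dplyr)

# Hypothetical toy data: row 2 is all zeros
toy <- tibble(a = c(1, 0, 0), b = c(0, 0, 3))

# "Not all zero" and "any non-zero" are the same row test
not_all_zero <- toy %>% filter(!if_all(everything(), ~ . == 0))
any_non_zero <- toy %>% filter(if_any(everything(), ~ . != 0))

# The same test written with rowSums() over the selected columns
via_row_sums <- toy %>% filter(rowSums(across(everything()) != 0) > 0)

nrow(not_all_zero)  # 2
```

All three calls keep rows 1 and 3 of the toy data.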

EDIT:
As filter() already evaluates its condition for every row, you don't need rowwise(). This is different for mutate(), where a per-row summary such as sum(c_across(everything())) does need rowwise().

IMPORTANT:
In Option A, it is crucial to write across(everything()) != 0,
and NOT across(everything() != 0)!

Reason:
across() requires a tidyselect expression (here everything()), not a logical value (which is what everything() != 0 would be).


