How to remove rows where all columns are zero using dplyr pipe
Here's a dplyr option:
library(dplyr)
filter_all(dat, any_vars(. != 0))
# A-XXX fBM-XXX P-XXX vBM-XXX
#1 1.51653276 2.228752 1.733567 3.003979
#2 0.07703724 0.000000 0.000000 0.000000
This keeps a row if any of its columns is not equal to zero, which is the same as removing the rows where all columns are equal to zero.
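For a reproducible check, the sketch below builds a small stand-in for dat (the names toy, kept_old and kept_new and all values are made up for illustration) and applies the same filter. It also shows if_any(), which dplyr 1.0.4 and later offer in place of the superseded filter_all()/any_vars() pair; both calls should keep the same rows.

```r
library(dplyr)

# Hypothetical stand-in for `dat`: the last row is all zeros
toy <- data.frame(
  x = c(1.5, 0.08, 0),
  y = c(2.2, 0,    0),
  z = c(1.7, 0,    0)
)

# Approach from the answer: keep rows where any column is nonzero
kept_old <- filter_all(toy, any_vars(. != 0))

# Equivalent with if_any() (assumes dplyr >= 1.0.4)
kept_new <- toy %>% filter(if_any(everything(), ~ .x != 0))

nrow(kept_old)  # 2
```

The second row survives because one of its columns is nonzero; only the all-zero row is dropped.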
Regarding row.names: to keep them, move them into a column first and exclude that column from the test:
library(tidyverse)
dat %>% rownames_to_column() %>% filter_at(vars(-rowname), any_vars(. != 0))
# rowname A-XXX fBM-XXX P-XXX vBM-XXX
#1 BATF::JUN_AHR 1.51653276 2.228752 1.733567 3.003979
#2 BATF::JUN_CCR9 0.07703724 0.000000 0.000000 0.000000
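If the result should carry real row names again rather than a rowname column, tibble's column_to_rownames() reverses the first step. A minimal sketch on made-up data (toy, result and the values are hypothetical; the first two row labels are borrowed from the output above, the third is invented):

```r
library(dplyr)
library(tibble)

# Hypothetical data with row names; the last row is all zeros
toy <- data.frame(
  x = c(1.5, 0.08, 0),
  y = c(2.2, 0,    0),
  row.names = c("BATF::JUN_AHR", "BATF::JUN_CCR9", "all_zero_row")
)

result <- toy %>%
  rownames_to_column() %>%                         # row names -> column "rowname"
  filter_at(vars(-rowname), any_vars(. != 0)) %>%  # test every column except "rowname"
  column_to_rownames()                             # column "rowname" -> row names

rownames(result)
```

The round trip is needed because dplyr verbs generally do not preserve row names on their own.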
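Adding to the answer by @mgrund, an alternative based on across() from dplyr 1.0.0 is:
# Option A:
data %>% filter(rowSums(across(everything()) != 0) > 0)
# Option B (if_all() needs dplyr 1.0.4 or later):
data %>% filter(!if_all(everything(), ~ . == 0))
Explanation: across() selects the columns named by a tidyselect helper, here everything(), which stands for every column. In Option A, every cell is compared with zero and rowSums() counts the nonzero cells per row; a row is kept if that count is above zero, so rows made up entirely of zeros are dropped. In Option B, the formula (~) is applied to every column and checks whether the current column is zero; if_all() is TRUE for rows where that holds in every column, and the leading ! removes exactly those rows.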
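The row-wise logic can be inspected step by step on a small made-up data frame (toy and its values are hypothetical). Comparing a data frame with 0 gives a logical matrix, and rowSums() counts the TRUE cells per row; a row is worth keeping when that count is above zero.

```r
library(dplyr)

# Hypothetical data: the last row is all zeros
toy <- data.frame(x = c(1.5, 0.08, 0), y = c(2.2, 0, 0))

# Cell-wise comparison: a logical matrix, TRUE where a value is nonzero
nonzero <- toy != 0

# Number of nonzero cells per row
n_nonzero <- rowSums(nonzero)  # 2 1 0

# The same computation inside filter(), with across() supplying the columns
kept <- toy %>% filter(rowSums(across(everything()) != 0) > 0)
```

Here kept holds the first two rows, since only the third row has a nonzero count of 0.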
EDIT:
As filter() already checks by row, you don't need rowwise(). This is different for select or mutate.
IMPORTANT:
In Option A, it is crucial to write across(everything()) != 0, and NOT across(everything() != 0)!
Reason: across() requires a tidyselect specification (here everything()), not a boolean (which is what everything() != 0 would be).