Remove NA values from a vector
The na.omit function is what many of the regression routines use internally:
vec <- 1:1000
vec[runif(200, 1, 1000)] <- NA
max(vec)
#[1] NA
max( na.omit(vec) )
#[1] 1000
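One thing worth knowing about na.omit on a vector: the result carries an extra "na.action" attribute that records which positions were dropped. A small sketch with a made-up vector (the variable names here are only for illustration):

```r
vec <- c(3, NA, 7, NA, 10)
cleaned <- na.omit(vec)

# Positions of the removed elements are kept in the "na.action" attribute
removed <- as.integer(attr(cleaned, "na.action"))

# as.vector() strips the attributes, leaving a plain numeric vector
plain <- as.vector(cleaned)
```

If the attribute gets in the way (for example in identical() comparisons), as.vector() or the d[!is.na(d)] idiom avoids it.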
Use discard from purrr (works with lists and vectors):
library(purrr)
v <- c(1, NA, 3)  # example input
discard(v, is.na)
The benefit is that it is easy to use with pipes; alternatively, use the built-in subsetting function [:
v %>% discard(is.na)
v %>% `[`(!is.na(.))
Note that na.omit does not work on lists:
> x <- list(a=1, b=2, c=NA)
> na.omit(x)
$a
[1] 1
$b
[1] 2
$c
[1] NA
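For a list like this one, base R can do the job without purrr. This is a minimal sketch that assumes every element is a length-one value, since is.na only flags list elements that are length-one atomic NAs:

```r
x <- list(a = 1, b = 2, c = NA)

# is.na() on a list returns one logical per element here
x_clean <- x[!is.na(x)]

# Equivalent functional form: keep elements for which is.na() is FALSE
x_clean2 <- Filter(Negate(is.na), x)
```

Both forms keep the names of the remaining elements.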
If you look at ?max, you'll see that it has an na.rm argument, which defaults to FALSE. (That's the common default for many other R functions, including sum(), mean(), etc.)
Setting na.rm=TRUE does just what you're asking for:
d <- c(1, 100, NA, 10)
max(d, na.rm=TRUE)
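The same argument works for sum() and mean(). A quick sketch using the same vector (the result names are just for illustration):

```r
d <- c(1, 100, NA, 10)

total_default <- sum(d)           # NA, because na.rm defaults to FALSE
total <- sum(d, na.rm = TRUE)     # 111
average <- mean(d, na.rm = TRUE)  # 37, computed over the 3 non-missing values
```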
If you do want to remove all of the NAs, use this idiom instead:
d <- d[!is.na(d)]
A final note: other functions (e.g. table(), lm(), and sort()) have NA-related arguments that use different names (and offer different options). So if NAs cause you problems in a function call, it's worth checking for a built-in solution among the function's arguments. I've found there's usually one already there.
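To make that concrete, here is a rough sketch of the NA-related arguments of those three functions, using made-up data. The argument names (na.last, useNA, na.action) come from the functions' help pages; the variables are only for illustration:

```r
d <- c(3, NA, 1)
sorted_default <- sort(d)                  # NAs are dropped by default (na.last = NA)
sorted_keep <- sort(d, na.last = TRUE)     # NAs are kept and placed at the end

g <- c("a", NA, "a")
counts_default <- table(g)                 # NA is not counted by default
counts_na <- table(g, useNA = "ifany")     # adds a count for NA when any are present

df <- data.frame(x = c(1, 2, 3, 4, NA), y = c(2, 4, 6, 8, 10))
# na.exclude drops incomplete rows for fitting but pads residuals with NA
fit <- lm(y ~ x, data = df, na.action = na.exclude)
res <- residuals(fit)
```

With na.exclude, res has one entry per row of df, which makes it easy to attach the residuals back onto the original data frame.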