How to remove rows with inf from a dataframe in R

To remove the rows with +/-Inf I'd suggest the following:

df <- df[!is.infinite(rowSums(df)),]

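or, alternatively,

df <- df[is.finite(rowSums(df)),]

The two are not quite equivalent. The second option (with is.finite() and no negation) also removes rows containing NA or NaN values, in case that has not already been done. Note that rowSums() requires all columns to be numeric, and that a row containing both Inf and -Inf sums to NaN, so only the second option removes such a row.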
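A small sketch of how the two rowSums() variants differ. The data frame demo and its values are made up for illustration and are not from the question:

```r
# Hypothetical data: every column is numeric, as rowSums() requires.
demo <- data.frame(
  a = c(1, Inf, NA, Inf, 5),
  b = c(2, 3, 4, -Inf, 6)
)

s <- rowSums(demo)
# Row 2 sums to Inf; rows 3 and 4 sum to a missing value
# (row 4 because Inf + (-Inf) is NaN).

# Drops only row 2: the NA row and the Inf/-Inf row survive.
kept_not_infinite <- demo[!is.infinite(s), ]

# Keeps only rows whose sum is an ordinary number: rows 1 and 5.
kept_finite <- demo[is.finite(s), ]
```

So if a row can hold Inf and -Inf at the same time, the is.finite() form is probably the safer choice.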


To keep the rows without Inf we can do:

df[apply(df, 1, function(x) all(is.finite(x))), ]

NAs are handled by this as well: is.finite(NA) is FALSE, so all() returns FALSE for any row containing an NA and that row is dropped.

Rows with NaN are dropped for the same reason.

set.seed(24)
df <- as.data.frame(matrix(sample(c(0:9, NA, -Inf, Inf, NaN),  20*5, replace=TRUE), ncol=5))
df2 <- df[apply(df, 1, function(x) all(is.finite(x))), ]

Here are the results of the different is.*() functions:

x <- c(42, NA, NaN, Inf)
is.finite(x)
# [1]  TRUE FALSE FALSE FALSE
is.na(x)
# [1] FALSE  TRUE  TRUE FALSE
is.nan(x)
# [1] FALSE FALSE  TRUE FALSE

is.finite() works on vectors, not on data.frame objects. So we can loop over the columns of the data.frame with lapply and keep only the finite values.

lapply(df, function(x) x[is.finite(x)])

If the number of Inf/-Inf values differs between columns, the code above returns a list whose elements have unequal lengths, so it may be better to leave the result as a list. A data.frame needs columns of equal length.
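To illustrate the unequal lengths, here is a minimal sketch with a made-up two-column data frame (demo is hypothetical):

```r
# Hypothetical data: column a has one non-finite value, column b has two.
demo <- data.frame(a = c(1, Inf, 3), b = c(-Inf, NA, 6))

# Filter each column on its own, as in the lapply() call above.
finite_only <- lapply(demo, function(x) x[is.finite(x)])

# Column a keeps 2 values, column b keeps 1, so the result
# cannot be turned back into a data.frame directly.
lengths(finite_only)
```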


If we want to remove rows that contain any NA or Inf/-Inf values:

df[Reduce(`&`, lapply(df, function(x) !is.na(x)  & is.finite(x))),]

Or a compact option by @nicola

df[Reduce(`&`, lapply(df, is.finite)),]
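One caveat: is.finite() returns FALSE for every element of a character column, so the Reduce() approach would drop all rows if the data frame has non-numeric columns. A possible workaround, sketched here with a made-up data frame (demo, num_cols and keep are illustrative names), is to test only the numeric columns:

```r
# Hypothetical data with one character column.
demo <- data.frame(
  id = c("a", "b", "c", "d"),
  x = c(1, Inf, 3, NA),
  y = c(5, 6, -Inf, 8),
  stringsAsFactors = FALSE
)

# Logical vector marking which columns are numeric.
num_cols <- vapply(demo, is.numeric, logical(1))

# A row is kept only if all of its numeric columns are finite.
keep <- Reduce(`&`, lapply(demo[num_cols], is.finite))

# All columns, including id, are returned for the kept rows.
clean <- demo[keep, ]
```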

If we are willing to use a package, a compact option is NaRV.omit from IDPmisc:

library(IDPmisc)
NaRV.omit(df)

data

set.seed(24)
df <- as.data.frame(matrix(sample(c(1:5, NA, -Inf, Inf), 
                      20*5, replace=TRUE), ncol=5))

Depending on the data, there are a couple of options using scoped variants of dplyr::filter() together with is.finite() or is.infinite() that might be useful:

library(dplyr)

# sample data
df <- tibble(a = c(1, 2, 3, NA), b = c(5, Inf, 8, 8), c = c(9, 10, Inf, 11), d = c('a', 'b', 'c', 'd'))

# across all columns:
df %>% 
  filter_all(all_vars(!is.infinite(.)))

# note that is.finite() returns FALSE for NA and for character columns,
# so this drops every row here (column d is character):
df %>% 
  filter_all(all_vars(is.finite(.)))

# checking only numeric columns:
df %>% 
  filter_if(~is.numeric(.), all_vars(!is.infinite(.)))

# checking only select columns, in this case a through c:
df %>% 
  filter_at(vars(a:c), all_vars(!is.infinite(.)))
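The scoped variants (filter_all(), filter_if(), filter_at()) are superseded in recent dplyr releases. Assuming dplyr 1.0.4 or later, the same filters can probably be written with if_all(); the sketch below reuses the sample data from above, and the names no_inf and all_finite are just illustrative:

```r
library(dplyr)

# same sample data as above
df <- tibble(a = c(1, 2, 3, NA), b = c(5, Inf, 8, 8), c = c(9, 10, Inf, 11), d = c('a', 'b', 'c', 'd'))

# numeric columns only, dropping rows with Inf/-Inf but keeping NA:
no_inf <- df %>%
  filter(if_all(where(is.numeric), ~ !is.infinite(.x)))

# numeric columns only, dropping rows with Inf/-Inf, NA or NaN:
all_finite <- df %>%
  filter(if_all(where(is.numeric), is.finite))
```

With this data, no_inf keeps the first and last rows (the last one still has an NA in a), while all_finite keeps only the first row.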
