How to eliminate "NA/NaN/Inf in foreign function call (arg 7)" running predict with randomForest
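Your code is not entirely reproducible (it never actually runs the randomForest algorithm), but the problem is that you are not replacing Inf values with the means of the column vectors. This is because the na.rm = TRUE argument in the call to mean() within your impute.mean function does exactly what it says: it removes NA values (and not Inf ones). The mean of a column that contains Inf is therefore itself Inf (or NaN if the column holds both Inf and -Inf), so each Inf just gets replaced by another non-finite value.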
You can see this, for example, by:
impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x), mean(x, na.rm = TRUE))
losses <- apply(losses, 2, impute.mean)
sum(apply(losses, 2, function(.) sum(is.infinite(.))))
# [1] 696
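The same behaviour can be reproduced on a small made-up vector (the values below are illustrative, not taken from your losses data): na.rm = TRUE drops the NA, but the Inf survives and drags the mean to Inf.

```r
# Toy vector (hypothetical values) with one NA and one Inf
x <- c(1, 2, NA, Inf)

# na.rm = TRUE drops NA (and NaN) only; Inf is still averaged in
m_with_inf <- mean(x, na.rm = TRUE)

# Restricting to finite values gives the mean of 1 and 2
m_finite <- mean(x[is.finite(x)])

print(m_with_inf)  # Inf
print(m_finite)    # 1.5
```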
To get rid of infinite values, use:
impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x), mean(x[!is.na(x) & !is.nan(x) & !is.infinite(x)]))
losses <- apply(losses, 2, impute.mean)
sum(apply(losses, 2, function(.) sum(is.infinite(.))))
# [1] 0
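A more compact variant is possible with is.finite(), which is FALSE for NA, NaN, Inf and -Inf alike. The sketch below is only an illustration: the helper name impute.finite.mean and the toy matrix are my own inventions, standing in for your impute.mean and losses.

```r
# Hypothetical helper: replace every non-finite entry with the mean
# of the finite entries in the same vector
impute.finite.mean <- function(x) {
  ok <- is.finite(x)  # FALSE for NA, NaN, Inf and -Inf
  replace(x, !ok, mean(x[ok]))
}

# Toy matrix (made-up values) standing in for `losses`
toy <- cbind(
  u = c(1, 3, NA, Inf),
  v = c(-Inf, 2, 4, NaN)
)

toy <- apply(toy, 2, impute.finite.mean)
print(toy)
print(sum(!is.finite(toy)))  # 0
```

Note that a column with no finite values at all would still come back as NaN, since the mean of an empty vector is NaN, so such columns probably need to be dropped or handled separately.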
Another cause of the error message:
NA/NaN/Inf in foreign function call (arg X)
when training a randomForest is having character-class variables in your data.frame. If the error comes with the warning:
NAs introduced by coercion
check to make sure that all of your character variables have been converted to factors.
Example
set.seed(1)
dat <- data.frame(
  a = runif(100),
  b = rpois(100, 10),
  c = rep(c("a", "b"), 50),  # 50 repeats of 2 values = 100 rows, matching a and b
  stringsAsFactors = FALSE
)
library(randomForest)
randomForest(a ~ ., data = dat)
Yields:
Error in randomForest.default(m, y, ...) :
  NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In data.matrix(x) : NAs introduced by coercion
But switch it to stringsAsFactors = TRUE and it runs.
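If the data frame already exists (for example, it was read in elsewhere), the character columns can also be converted after the fact. Here is a minimal base-R sketch, assuming a small made-up data frame called df rather than the dat from the example:

```r
# Made-up data frame with one character column
df <- data.frame(
  a = c(0.1, 0.5, 0.9, 0.3),
  c = c("a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Find the character columns and convert only those to factors
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], factor)

print(sapply(df, class))  # a is "numeric", c is "factor"
```

Remember to apply the same conversion to any new data you pass to predict, so the column types match those used in training.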