Replace all particular values in a data frame
Here are a couple dplyr
options:
library(dplyr)
# all columns:
df %>%
mutate_all(~na_if(., ''))
# specific column types:
df %>%
mutate_if(is.factor, ~na_if(., ''))
# specific columns:
df %>%
mutate_at(vars(A, B), ~na_if(., ''))
# or:
df %>%
mutate(A = replace(A, A == '', NA))
# replace can be used if you want something other than NA:
df %>%
mutate(A = as.character(A)) %>%
mutate(A = replace(A, A == '', 'used to be empty'))
Since PikkuKatja and glallen asked for a more general solution and I cannot comment yet, I'll write an answer. You can combine statements as in:
> df[df=="" | df==12] <- NA
> df
A B
1 <NA> <NA>
2 xyz <NA>
3 jkl 100
For factors, zxzak's code already yields factors:
> df <- data.frame(list(A=c("","xyz","jkl"), B=c(12,"",100)))
> str(df)
'data.frame': 3 obs. of 2 variables:
$ A: Factor w/ 3 levels "","jkl","xyz": 1 3 2
$ B: Factor w/ 3 levels "","100","12": 3 1 2
If in trouble, I'd suggest to temporarily drop the factors.
df[] <- lapply(df, as.character)
Like this:
> df[df==""]<-NA
> df
A B
1 <NA> 12
2 xyz <NA>
3 jkl 100