Replace contents of factor column in R dataframe

You can use the function revalue from the package plyr to replace values in a factor vector.

In your example to replace the factor virginica by setosa:

 data(iris)
 library(plyr)
 revalue(iris$Species, c("virginica" = "setosa")) -> iris$Species

For the things that you are suggesting you can just change the levels using the levels:

levels(iris$Species)[3] <- 'new'

I bet the problem is when you are trying to replace values with a new one, one that is not currently part of the existing factor's levels:

levels(iris$Species)
# [1] "setosa"     "versicolor" "virginica" 

Your example was bad, this works:

iris$Species[iris$Species == 'virginica'] <- 'setosa'

This is what more likely creates the problem you were seeing with your own data:

iris$Species[iris$Species == 'virginica'] <- 'new.species'
# Warning message:
# In `[<-.factor`(`*tmp*`, iris$Species == "virginica", value = c(1L,  :
#   invalid factor level, NAs generated

It will work if you first increase your factor levels:

levels(iris$Species) <- c(levels(iris$Species), "new.species")
iris$Species[iris$Species == 'virginica'] <- 'new.species'

If you want to replace "species A" with "species B" you'd be better off with

levels(iris$Species)[match("oldspecies",levels(iris$Species))] <- "newspecies"

Tags:

R