R error "Can't join on ... because of incompatible types"
This question is viewed frequently, so many people evidently run into this error, and it deserves a more complete answer.
The simplest way to correct this join error is to change the class of the column(s) causing the problem so that both data frames agree. This can be done as follows:
- glimpse the column classes in the data frames to be joined
- mutate the column class to match, using as.numeric, as.logical or as.character. For example:
df2 <- df2 %>% mutate(column1 = as.numeric(column1))
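As a minimal sketch of those two steps end to end, assuming dplyr is installed: the data frames, the extra columns a and b, and the sample values below are hypothetical and only serve to reproduce the mismatch.

```r
library(dplyr)

# Hypothetical data: column1 is numeric in df1 but character in df2
df1 <- data.frame(column1 = c(1, 2, 3), a = c("x", "y", "z"))
df2 <- data.frame(column1 = c("1", "2", "4"), b = c(TRUE, FALSE, TRUE))

# Step 1: inspect the column classes of both data frames
glimpse(df1)
glimpse(df2)

# Joining at this point would fail with the incompatible types error,
# because column1 is <dbl> in df1 and <chr> in df2.

# Step 2: convert the offending column so the classes match, then join
df2 <- df2 %>% mutate(column1 = as.numeric(column1))
joined <- inner_join(df1, df2, by = "column1")
print(joined)
```

Note that as.numeric returns NA (with a warning) for strings that are not valid numbers, so it is worth checking the converted column before joining.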
A solution for production environments is the matchColClasses function shown below, which does the following:
- Identify columns that share the same name (sharedColNames)
- Use the master data frame (df1) to identify the classes of the shared columns
- Reassign column classes in df2 to match df1
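matchColClasses <- function(df1, df2) {
  # Columns present in both data frames
  sharedColNames <- names(df1)[names(df1) %in% names(df2)]
  # Classes of those columns in the master data frame (df1)
  sharedColTypes <- sapply(df1[sharedColNames], class)
  # Reassign each shared column in df2 to the class used in df1;
  # [[ ]] indexing works for both data frames and tibbles
  for (n in sharedColNames) {
    class(df2[[n]]) <- sharedColTypes[[n]]
  }
  return(df2)
}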
This function works well in our production environment with heterogeneous data types: character, numeric and logical.
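A short usage sketch for the function, in base R only. The sample data frames and their columns (id, flag, name) are hypothetical. The function is repeated so the snippet runs on its own, and this copy uses [[ ]] indexing so that it also works on tibbles.

```r
# Copy of matchColClasses, repeated so the snippet is self-contained
matchColClasses <- function(df1, df2) {
  sharedColNames <- names(df1)[names(df1) %in% names(df2)]
  sharedColTypes <- sapply(df1[sharedColNames], class)
  for (n in sharedColNames) {
    class(df2[[n]]) <- sharedColTypes[[n]]
  }
  return(df2)
}

# Hypothetical master data frame with numeric, logical and character columns
df1 <- data.frame(id = c(1, 2), flag = c(TRUE, FALSE), name = c("a", "b"),
                  stringsAsFactors = FALSE)

# Hypothetical second data frame where every column was read in as character
df2 <- data.frame(id = c("1", "2"), flag = c("TRUE", "FALSE"), name = c("a", "b"),
                  stringsAsFactors = FALSE)

print(sapply(df2, class))   # classes before matching
df2 <- matchColClasses(df1, df2)
print(sapply(df2, class))   # classes after matching df1
```

After this call the shared columns of df2 carry the same classes as df1, so a join on id no longer hits the type mismatch. The sketch covers only the three types mentioned above; factor or date columns would likely need explicit conversion functions instead.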