Error in tolower() invalid multibyte string

Here's how I solved my problem:

First, I opened the raw data in a text editor (Geany, in this case), clicked Properties, and identified the file's encoding.

Then I used the iconv() function to convert the text from that encoding to UTF-8:

x <- iconv(x,"WINDOWS-1252","UTF-8")
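To make the effect of that call concrete, here is a minimal sketch on a single made-up string. The sample bytes are my own illustration, not data from the original problem, and it assumes your platform's iconv knows the name "WINDOWS-1252" (most do) and that you have R 3.3.0 or later for validUTF8().

```r
# "café" as Windows-1252 bytes: the last byte, 0xE9, is "é" in that encoding
# but is not a complete UTF-8 sequence on its own.
x <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))

validUTF8(x)   # FALSE

# Re-encode from the encoding identified in the editor to UTF-8.
y <- iconv(x, from = "WINDOWS-1252", to = "UTF-8")

validUTF8(y)   # TRUE

# The converted string should no longer trigger the multibyte error.
tolower(y)
```

Checking validUTF8() before and after is a quick way to confirm that the conversion did something, without depending on how your console displays the characters.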

To be more specific, I did this for every character column of the data.frame from the imported CSV. Note that I set stringsAsFactors=FALSE in my read.csv() call, so the text columns are character vectors rather than factors.

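# Find the character columns once, then convert each one.
# lapply() keeps one vector per column; sapply() can collapse the result
# to a matrix or a plain vector, depending on the shape of the data.
chr_cols <- sapply(dat, is.character)
dat[chr_cols] <- lapply(dat[chr_cols], iconv, from = "WINDOWS-1252", to = "UTF-8")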
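For anyone who wants to try the column-wise conversion without a real file, the following is a small self-contained sketch. The data frame and its column names (name, qty) are hypothetical stand-ins for the imported CSV, and it uses lapply() rather than sapply() so that each column stays a separate vector.

```r
# Hypothetical stand-in for the imported CSV: one text column containing
# Windows-1252 bytes and one numeric column that must stay untouched.
dat <- data.frame(
  name = c(rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9))), "tea"),
  qty  = c(1, 2),
  stringsAsFactors = FALSE
)

# Logical index of the character columns.
chr_cols <- vapply(dat, is.character, logical(1))

# Convert only those columns; numeric columns are not passed to iconv().
dat[chr_cols] <- lapply(dat[chr_cols], iconv,
                        from = "WINDOWS-1252", to = "UTF-8")

all(validUTF8(dat$name))   # TRUE
```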

I was getting the same error. In my case, however, it did not appear while reading the file but a bit later, while processing it. I realised that I was getting the error because the file had not been read with the correct encoding in the first place.

I found a much simpler solution (at least for my case) and wanted to share it. I simply added the encoding argument as below, and it worked.

read.csv(<path>, encoding = "UTF-8")
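Here is a runnable sketch of that approach, assuming the file really is UTF-8. The temporary file and its contents are made up for illustration; in practice you would pass your own path.

```r
# Write a tiny UTF-8 CSV to a temporary file: a header "name" and one row "café".
path <- tempfile(fileext = ".csv")
writeBin(
  c(charToRaw("name\ncaf"), as.raw(c(0xc3, 0xa9)), charToRaw("\n")),
  path
)

# encoding = "UTF-8" tells R to treat the strings it reads as UTF-8.
dat <- read.csv(path, encoding = "UTF-8", stringsAsFactors = FALSE)

validUTF8(dat$name)   # TRUE
tolower(dat$name)

unlink(path)  # clean up the temporary file
```

As far as I understand it, encoding only declares how the bytes should be interpreted and does not convert them. If the file is actually in another encoding, such as Windows-1252, read.csv() also accepts a fileEncoding argument (for example fileEncoding = "WINDOWS-1252"), which re-encodes the text while reading.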

Tags:

R