Changing values when converting column type to numeric
I had the same issue, but the root cause turned out to be different, so I am posting this as an answer rather than a comment.
df <- read.table("doc.csv", header = TRUE, sep = ",", dec = ".")
df$value
# Results in
[1] 2254 1873 2201 2147 2456 1785
# So..
as.numeric(df$value)
[1] 26 14 22 20 32 11
In my case, the reason was that the values in the original CSV file were padded with spaces. Removing the spaces fixed the issue.
From the output of dput(df):
" 1178 ", " 1222 ", " 1223 ", " 1314 ", " 1462 ",
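If editing the CSV by hand is not practical, the padding can also be removed after import. This is only a minimal sketch: the vector below is made up to resemble the dput output above, and it assumes the column was read in as a factor (a character column works the same way).

```r
# Hypothetical column with padded values, similar to the dput output above
value <- factor(c(" 1222 ", " 1178 ", " 1223 "))

# Converting the factor directly returns the level codes, not the numbers
codes <- as.numeric(value)

# Go through character, trim the padding, then convert
clean <- as.numeric(trimws(as.character(value)))
clean
```

trimws() is available in base R from version 3.2.0 onward.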
It looks like your second column is a factor. You need to use as.character before as.numeric. This is because factors are stored internally as integers with a table to give the factor level labels. Just using as.numeric will only give the internal integer codes. There is no need to use sapply since these functions are vectorized.
data[,2] <- as.numeric(as.character(data[,2]))
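To see the internal codes and the label table side by side, here is a small sketch. It reuses six values from the output shown earlier, so the codes differ from a factor built on a full column with more levels. The names f, via_character and via_levels are only for illustration.

```r
# Six values taken from the output shown earlier, stored as a factor
f <- factor(c("2254", "1873", "2201", "2147", "2456", "1785"))

levels(f)       # the label table, sorted as strings
as.integer(f)   # the internal codes that index into that table

# Two equivalent ways to recover the original numbers
via_character <- as.numeric(as.character(f))
via_levels    <- as.numeric(levels(f))[f]
```

The second form converts each level only once and then indexes by the codes; it is the variant mentioned in ?factor.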
It is likely that the column is a factor because there are some non-numeric characters in some of the entries. Any such entries will be converted to NA with the appropriate warning, but you may want to investigate this in your raw data.
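One way to track down the offending entries is to convert quietly and then look at which values turned into NA without having been missing to begin with. The vector raw below is hypothetical; substitute as.character() of your own column.

```r
# Hypothetical raw entries; "7a" and "n/a" are not valid numbers
raw <- c("12", "7a", "3.5", "n/a", NA)

# Convert without the warning, then flag entries that were present
# in the input but could not be parsed as numbers
converted <- suppressWarnings(as.numeric(raw))
bad <- is.na(converted) & !is.na(raw)

raw[bad]     # the offending entries
which(bad)   # their positions in the column
```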
As a side note, data is a poor (though not invalid) choice for a variable name since base R already ships a function of the same name.