Principal Components Analysis: Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric

You can convert a character vector to numeric values by going via a factor: each unique value then gets a unique integer code. In this example there are four values, so the codes are 1 to 4, assigned in the sorted (alphabetical) order of the values:

> d = data.frame(country=c("foo","bar","baz","qux"),x=runif(4),y=runif(4))
> d
  country          x         y
1     foo 0.84435112 0.7022875
2     bar 0.01343424 0.5019794
3     baz 0.09815888 0.5832612
4     qux 0.18397525 0.8049514
> d$country = as.numeric(as.factor(d$country))
> d
  country          x         y
1       3 0.84435112 0.7022875
2       1 0.01343424 0.5019794
3       2 0.09815888 0.5832612
4       4 0.18397525 0.8049514
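
If you want to confirm which code each value received, a minimal sketch like the following can help. The variable names `country`, `f` and `codes` are just illustrative; the values are the ones from the data frame above.

```r
# Same example values as in the data frame above
country <- c("foo", "bar", "baz", "qux")

f <- factor(country)

# Factor levels default to the sorted unique values
levels(f)

# Each element's integer code is its position in levels(f)
codes <- as.integer(f)

# Show label and code side by side
data.frame(country = country, code = codes)
```

Keeping this lookup around is handy if you later need to translate the codes back to labels.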

You can then run prcomp:

> prcomp(d)
Standard deviations:
[1] 1.308665216 0.339983614 0.009141194

Rotation:
               PC1          PC2          PC3
country -0.9858920  0.132948161 -0.101694168
x       -0.1331795 -0.991081523 -0.004541179
y       -0.1013910  0.009066471  0.994805345

Whether this makes sense for your application is up to you: the integer codes are arbitrary labels, but prcomp will treat them as a real numeric scale. Maybe you just want to drop the first column with prcomp(d[,-1]) and work with the numeric data, which seems to be what the other answers are trying to achieve.
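
If you would rather not hard-code the column position, one option is to select the numeric columns by type. This is only a sketch: the data frame mirrors the example above, and `num_cols` and `pca` are names I made up. The seed is arbitrary and only there to make the random numbers reproducible.

```r
set.seed(1)  # arbitrary seed so the example is reproducible
d <- data.frame(country = c("foo", "bar", "baz", "qux"),
                x = runif(4), y = runif(4))

# Logical vector marking which columns are numeric
num_cols <- vapply(d, is.numeric, logical(1))

# Run PCA on the numeric columns only; drop = FALSE keeps a data frame
# even if only one column is selected
pca <- prcomp(d[, num_cols, drop = FALSE])

summary(pca)
```

This avoids the 'x' must be numeric error without inventing numbers for the character column.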


The first column of the data frame is character, so you can move it into the row names:

library(tidyverse)
# Assign the result back; otherwise data2 still contains the character column
data2 <- data2 %>% remove_rownames() %>% column_to_rownames(var = "country")
princ <- prcomp(data2)

Alternatively, in base R:

# Set the row names from the first column before dropping it
rownames(data2) <- data2[, 1]
data2 <- data2[, -1]
princ <- prcomp(data2)
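
To see what the row-name approach gives you, here is a small self-contained sketch. The contents of `data2` are made up for illustration (the question's actual data is not shown here), and it assumes the values in `country` are unique, since row names cannot contain duplicates.

```r
set.seed(1)  # arbitrary seed so the example is reproducible
# Hypothetical stand-in for the question's data2
data2 <- data.frame(country = c("foo", "bar", "baz", "qux"),
                    x = runif(4), y = runif(4))

# Move the labels into the row names, then remove the character column
rownames(data2) <- data2$country
data2$country <- NULL

princ <- prcomp(data2)

# The scores matrix keeps the country labels as row names
rownames(princ$x)
```

The labels are no longer part of the computation, but they still identify each row in the scores, which is useful when plotting.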
