cor shows only NA or 1 for correlations - Why?

The 1s are because everything is perfectly correlated with itself, and the NAs are because there are NAs in your variables.

You will have to specify how you want R to compute the correlation when there are missing values, because the default is to only compute a coefficient with complete information.

You can change this behavior with the use argument to cor, see ?cor for details.


Tell the correlation to ignore the NAs with use argument, e.g.:

cor(data$price, data$exprice, use = "complete.obs")

very simple and correct answer

Tell the correlation to ignore the NAs with use argument, e.g.:

cor(data$price, data$exprice, use = "complete.obs")

NAs also appear if there are attributes with zero variance (with all elements equal); see for instance:

cor(cbind(a=runif(10),b=rep(1,10)))

which returns:

   a  b
a  1 NA
b NA  1
Warning message:
In cor(cbind(a = runif(10), b = rep(1, 10))) :
  the standard deviation is zero

Tags:

R

Correlation