cor shows only NA or 1 for correlations - Why?
The 1s are because everything is perfectly correlated with itself, and the NAs are because there are NAs in your variables.
You will have to specify how you want R to compute the correlation when there are missing values, because by default (use = "everything") cor returns NA for any pair of variables that contains a missing value.
You can change this behavior with the use argument to cor; see ?cor for details.
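As a minimal sketch of how the use argument changes the result, assuming two made-up vectors x and y (hypothetical toy data, not from the question) where x has one missing value:

```r
# Hypothetical toy data: y is exactly 2 * x, and x has one missing value
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Default, use = "everything": the NA propagates into the result
r_default <- cor(x, y)

# Drop every observation that has an NA in either vector
r_complete <- cor(x, y, use = "complete.obs")

# Use all complete pairs for each pair of variables
# (identical to "complete.obs" when there are only two vectors)
r_pairwise <- cor(x, y, use = "pairwise.complete.obs")

print(c(default = r_default, complete = r_complete, pairwise = r_pairwise))
```

With a matrix or data frame of several columns, "complete.obs" and "pairwise.complete.obs" can give different results, because the first drops a row everywhere while the second drops it only for the pairs where a value is missing.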
Tell cor to ignore the NAs with the use argument, e.g.:
cor(data$price, data$exprice, use = "complete.obs")
very simple and correct answer
NAs also appear if there are attributes with zero variance (with all elements equal); see for instance:
cor(cbind(a=runif(10),b=rep(1,10)))
which returns:
   a  b
a  1 NA
b NA  1
Warning message:
In cor(cbind(a = runif(10), b = rep(1, 10))) :
the standard deviation is zero
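One possible way to deal with this, sketched here under the assumption that dropping constant columns is acceptable for your analysis, is to check each column's standard deviation before calling cor. The matrix m and the names sds and constant_cols are hypothetical, chosen only for illustration:

```r
set.seed(1)  # only so the toy data is reproducible

# Hypothetical toy matrix: column b is constant, so its variance is zero
m <- cbind(a = runif(10), b = rep(1, 10), c = runif(10))

# Standard deviation of each column; a constant column has sd equal to 0
sds <- apply(m, 2, sd)
constant_cols <- names(sds)[sds == 0]
print(constant_cols)

# Correlate only the columns that actually vary
r <- cor(m[, sds > 0, drop = FALSE])
print(r)
```

If the data also contains missing values, sd would need na.rm = TRUE and cor would need a suitable use argument as well.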