Getting Warning: " 'newdata' had 1 row but variables found have 32 rows" on predict.lm
This is a problem of using different names in your data and your newdata, not a problem of using vectors versus data frames.
When you fit a model with the lm function and then use predict to make predictions, predict looks for the same variable names in your newdata. In your first case the name x conflicts with mtcars$wt, and hence you get the warning.
Here is an illustration of what I mean.
This is what you did when you didn't get a warning:
a <- mtcars$mpg
x <- mtcars$wt
#here you use x as a name
fitCar <- lm(a ~ x)
#here you use x again as a name in newdata.
predict(fitCar, data.frame(x = mean(x)), interval = "confidence")
fit lwr upr
1 20.09062 18.99098 21.19027
Note that in this case you fit your model using the name x and also predict using the name x in your newdata. This way you get no warnings and the result is what you expect.
Let's see what happens when I change the name to something else when I fit the model:
a <- mtcars$mpg
#name it b this time
b <- mtcars$wt
fitCar <- lm(a ~ b)
#here I am using name x as previously
predict(fitCar, data.frame(x = mean(x)), interval = "confidence")
fit lwr upr
1 23.282611 21.988668 24.57655
2 21.919770 20.752751 23.08679
3 24.885952 23.383008 26.38890
4 20.102650 19.003004 21.20230
5 18.900144 17.771469 20.02882
Warning message:
'newdata' had 1 row but variables found have 32 rows
The only thing I did now was to change the name x to b when fitting the model, and then predict using the name x in the newdata. Because predict finds no b in newdata, it falls back to the original b with its 32 rows. As you can see, I got the same warning as in your question.
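If you are unsure which names predict will look for, you can check them before calling it. The following is a minimal sketch, assuming the same a and b vectors as above; the helper variables needed and newdata are only illustrative names.

```r
a <- mtcars$mpg
b <- mtcars$wt
fitCar <- lm(a ~ b)

# Names of the predictor variables that predict() will look for in newdata
needed <- all.vars(delete.response(terms(fitCar)))
needed

# A newdata frame that uses the wrong name lacks the required column
newdata <- data.frame(x = mean(b))
setdiff(needed, names(newdata))

# Using the expected name gives a single prediction and no warning
pred <- predict(fitCar, data.frame(b = mean(b)), interval = "confidence")
pred
```

Any name returned by setdiff is missing from newdata, which is the situation that triggers the warning.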
Hope this is clear now!
In the formula passed to the lm function, do not refer to the variables using the datasetname$variablename pattern. Instead use variablename + variablename ... and supply the data frame through the data argument. This will not throw the warning: 'newdata' had nrow(test) row but variables found have nrow(train) rows.
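As a rough sketch of the difference, here are both patterns side by side. The train/test split of mtcars and the names fitBad and fitGood are hypothetical and chosen only for illustration.

```r
# Hypothetical split of mtcars into train and test sets
train <- mtcars[1:24, ]
test  <- mtcars[25:32, ]

# datasetname$variablename pattern: the predictor is literally named "train$wt",
# so predict() cannot find it in test and reuses the training values (with a warning)
fitBad  <- lm(train$mpg ~ train$wt)
predBad <- predict(fitBad, newdata = test)
length(predBad)   # one value per training row, not per test row

# Bare variable names plus a data argument: predict() finds wt in test
fitGood  <- lm(mpg ~ wt, data = train)
predGood <- predict(fitGood, newdata = test)
length(predGood)  # one value per test row
```

Both models have the same coefficients; only the second one can be used with newdata.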
A way around this without creating new variable names is to use the following:
fitCar <- lm(mpg ~ wt, data = mtcars) # here the name wt comes from mtcars
predict(fitCar, data.frame(wt = mean(mtcars$wt)), interval = "confidence")