How does glmnet compute the maximal lambda value?
To get the same result you need to standardize the variables using a standard deviation with n
instead of n-1
denominator.
mysd <- function(y) sqrt(sum((y-mean(y))^2)/length(y))
sx <- scale(x,scale=apply(x, 2, mysd))
sx <- as.matrix(sx, ncol=20, nrow=100)
sy <- as.vector(scale(y, scale=mysd(y)))
max(abs(colSums(sx*sy)))/100
## [1] 0.1758808
fitGLM <- glmnet(sx,sy)
max(fitGLM$lambda)
## [1] 0.1758808
For the unscaled (original) x
and y
, the maximum lambda should be
mysd <- function(y) sqrt(sum((y-mean(y))^2)/length(y))
sx <- scale(x,scale=apply(x, 2, mysd))
norm(t(sx) %*% y, 'i') / nrow(x)
## [1] 0.1975946
# norm of infinity is also equal to
max(abs(colSums(sx*y)))/100
## [1] 0.1975946
max(fitGLM$lambda) - norm(t(sx) %*% y, 'i') / nrow(x)
## [1] 2.775558e-17
It seems lambda_max
for a logistic regression is calculated similarly as for linear regression, but with weights based on class proportions:
set.seed(1)
library("glmnet")
x <- matrix(rnorm(100*20),100,20)
y <- rnorm(100)
mysd <- function(y) sqrt(sum((y-mean(y))^2)/length(y))
sx <- scale(x, scale=apply(x, 2, mysd))
sx <- as.matrix(sx, ncol=20, nrow=100)
y_bin <- factor(ifelse(y<0, -1, 1))
prop.table(table(y_bin))
# y_bin
# -1 1
# 0.62 0.38
fitGLM_log <- glmnet(sx, y_bin, family = "binomial")
max(fitGLM_log$lambda)
# [1] 0.1214006
max(abs(colSums(sx*ifelse(y<0, -.38, .62))))/100
# [1] 0.1214006