Why is logistic regression called regression?
There is a close link between linear regression and logistic regression.
With linear regression you're looking for the kᵢ parameters:
h = k₀ + Σᵢ kᵢ·xᵢ = Kᵀ·X
With logistic regression you have the same aim, but the equation is:
h = g(Kᵀ·X)
where g is the sigmoid function:
g(w) = 1 / (1 + e^(-w))
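As a minimal sketch, the sigmoid is a one-liner (the name `sigmoid` is just a convenient choice here):

```python
import math

def sigmoid(w):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-w))
```

Note that g(0) = 0.5 exactly, large positive w pushes the output toward 1, and large negative w pushes it toward 0.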
So:
h = 1 / (1 + e^(-Kᵀ·X))
and you need to fit K to your data.
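In practice K is fitted by minimizing the log-loss, typically with some form of gradient descent. Here is a toy sketch in pure Python (the helper name `fit_logistic`, the learning rate, and the tiny 1-D dataset are all illustrative assumptions, not part of the original answer):

```python
import math

def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))

def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Fit K by stochastic gradient descent on the log-loss.
    A constant 1 is prepended to each example so K[0] plays the role of k0."""
    X = [[1.0] + list(x) for x in X]
    K = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            h = sigmoid(sum(k * v for k, v in zip(K, xi)))
            err = h - yi  # gradient of the log-loss with respect to K^T·x
            K = [k - lr * err * v for k, v in zip(K, xi)]
    return K

# Toy 1-D data: examples below 2.5 are class 0, above are class 1
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
K = fit_logistic(X, y)
```

After fitting, `sigmoid(K[0] + K[1] * x)` gives the estimated probability for a new x. (Real implementations, e.g. scikit-learn, add regularization and use faster solvers; this is only to show the mechanics.)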
Assuming a binary classification problem, the output h is the estimated probability that the example X is a positive match in the classification task:
P(Y = 1) = 1 / (1 + e^(-Kᵀ·X))
When the probability is at least 0.5 we predict "a match".
The probability is at least 0.5 when:
g(w) ≥ 0.5
and, since g is monotonically increasing with g(0) = 0.5, this is true exactly when:
w = Kᵀ·X ≥ 0
The hyperplane:
Kᵀ·X = 0
is the decision boundary.
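This equivalence means you can classify by the sign of Kᵀ·X alone, without ever computing the sigmoid. A small sketch (the parameter values in `K` and the sample points are made up for illustration):

```python
import math

def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))

# Hypothetical fitted parameters K = (k0, k1, k2) for two features
K = [-3.0, 1.0, 2.0]

def predict_by_probability(x):
    """Threshold the estimated probability at 0.5."""
    w = K[0] + K[1] * x[0] + K[2] * x[1]
    return 1 if sigmoid(w) >= 0.5 else 0

def predict_by_boundary(x):
    """Same decision, using only which side of the hyperplane K^T·X = 0 we are on."""
    w = K[0] + K[1] * x[0] + K[2] * x[1]
    return 1 if w >= 0 else 0

points = [(0.0, 0.0), (1.0, 1.0), (3.0, 2.0), (0.5, 1.2)]
```

Both predictors agree on every point, because the sigmoid is monotone and crosses 0.5 exactly at w = 0.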
In summary:
- logistic regression is a generalized linear model: it uses the same linear combination Kᵀ·X as linear regression, but it is regressing for the probability of a categorical outcome rather than for a continuous value. That is why "regression" is still in the name.
This is a very abridged version. You can find a simple explanation in these videos (third week of Machine Learning by Andrew Ng).
You can also take a look at http://www.holehouse.org/mlclass/06_Logistic_Regression.html for some notes on the lessons.