Why is logistic regression called regression?

There is a strict link between linear regression and logistic regression.

With linear regression you're looking for the parameters kᵢ:

h = k₀ + Σ kᵢ·xᵢ = Kᵀ·X

With logistic regression you have the same aim, but the equation is:

h = g(Kᵀ·X)

Where g is the sigmoid function:

g(w) = 1 / (1 + e^(-w))
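For concreteness, the sigmoid is a one-liner in Python (the function name `sigmoid` is my own choice, not something from the formulas above):

```python
import math

def sigmoid(w):
    # g(w) = 1 / (1 + e^(-w)); squashes any real w into (0, 1)
    return 1.0 / (1.0 + math.exp(-w))

print(sigmoid(0.0))   # 0.5, the midpoint of the curve
```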

So:

h = 1 / (1 + e^(-Kᵀ·X))

and you need to fit K to your data.
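One common way to fit K is gradient descent on the log-loss; here is a minimal NumPy sketch (the helper names, learning rate, and toy data are my own assumptions, not part of the answer above):

```python
import numpy as np

def sigmoid(w):
    return 1.0 / (1.0 + np.exp(-w))

def fit_logistic(X, y, lr=0.1, steps=2000):
    # X: (n, d) matrix whose first column is all 1s, so K[0] plays the role of k0
    K = np.zeros(X.shape[1])
    for _ in range(steps):
        h = sigmoid(X @ K)                  # current probability estimates
        K -= lr * X.T @ (h - y) / len(y)    # gradient of the log-loss
    return K

# toy 1-D problem: negatives below 0, positives above
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
K = fit_logistic(X, y)
```

On this separable toy set the fitted probabilities sigmoid(X @ K) end up above 0.5 for the positive examples and below 0.5 for the negative ones.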

Assuming a binary classification problem, the output h is the estimated probability that the example x is a positive match in the classification task:

P(Y = 1) = 1 / (1 + e^(-Kᵀ·X))

When the probability is greater than 0.5, we predict "a match".

The probability is greater than 0.5 when:

g(w) > 0.5

and this is true when:

w = Kᵀ·X > 0

The hyperplane:

Kᵀ·X = 0

is the decision boundary.
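Put differently, checking g(Kᵀ·X) > 0.5 and checking Kᵀ·X > 0 give the same answer, so the classifier only needs the sign of the linear score. A small sketch in plain Python (the names are mine):

```python
def predict(K, x):
    # "a match" (class 1) exactly when the linear score Kᵀ·x is positive,
    # which is exactly when the sigmoid of that score exceeds 0.5
    score = sum(k * xi for k, xi in zip(K, x))
    return 1 if score > 0 else 0

# x = (1, x1), so the first component of K acts as the intercept k0
print(predict([0.0, 1.0], [1.0, 2.0]))   # 1: score 2 > 0
print(predict([0.0, 1.0], [1.0, -2.0]))  # 0: score -2 < 0
```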

In summary:

  • logistic regression is a generalized linear model: it uses the same linear combination Kᵀ·X as linear regression, but it regresses the probability of a categorical outcome rather than a continuous value.

This is a very abridged version. You can find a simple explanation in the videos for the third week of Machine Learning by Andrew Ng.

You can also take a look at http://www.holehouse.org/mlclass/06_Logistic_Regression.html for some notes on the lessons.