Difference between logistic regression and softmax regression

You can think of logistic regression as a binary classifier, while softmax regression is one way (there are others) to implement a multi-class classifier. The number of outputs of a softmax regression model is equal to the number of classes you want to predict.

Example: in digit recognition you have 10 classes to predict [0-9], so the model outputs 10 probabilities, one per class, and in practice we choose the class with the highest probability as the prediction.

From the example above you can see that the number of outputs of the softmax function is equal to the number of classes. These outputs are the probabilities of each class, so they sum to one. For the algebraic details, the Stanford UFLDL tutorial has a good and short explanation of this topic.
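A minimal sketch of this in NumPy (the logit values below are made up for illustration): the softmax outputs sum to one, and the predicted class is the index of the largest probability.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical logits for a 10-class digit classifier
logits = np.array([1.2, 0.3, 2.5, -0.7, 0.1, 0.0, 1.9, -1.2, 0.8, 0.4])
probs = softmax(logits)

print(probs.sum())        # probabilities sum to 1
print(np.argmax(probs))   # predicted class: index of the highest probability
```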

link:http://ufldl.stanford.edu/tutorial/supervised/SoftmaxRegression/

Difference: the link above shows in detail that a softmax over only 2 classes is the same as logistic regression. So the major difference is essentially the naming convention: we call it logistic regression when dealing with a 2-class problem and softmax regression when dealing with a multinomial (more than 2 classes) problem.

Note: it is worth remembering that a softmax output can also be used in other models, such as neural networks.

Hope this helps.


There are minor differences between fitting multiple logistic regression models and using a single softmax output.

Essentially you can map an input of size d to a single output k times, or map an input of size d to k outputs in a single pass. However, maintaining multiple separate logistic regression models is cumbersome, and they perform worse in practice. Most libraries (TensorFlow, Caffe, Theano) are implemented in low-level compiled languages and are highly optimized, whereas managing multiple logistic regression models yourself happens at a higher level and misses those optimizations, so it should be avoided.
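The two mappings produce the same scores; the difference is only in how the computation is organized. A minimal NumPy sketch (sizes and weights are made up): k separate d-to-1 dot products versus one d-to-k matrix multiply.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 3  # hypothetical input size and number of classes
x = rng.normal(size=d)

# Option 1: k separate logistic models, each mapping d -> 1
ws = [rng.normal(size=d) for _ in range(k)]   # k separate weight vectors
scores_separate = np.array([w @ x for w in ws])

# Option 2: one softmax model mapping d -> k in a single matrix multiply
W = np.stack(ws)                              # shape (k, d)
scores_joint = W @ x

print(np.allclose(scores_separate, scores_joint))  # True: same scores either way
```

The single matrix multiply is what optimized libraries execute in compiled code, which is why the joint formulation is preferred.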


You can relate logistic regression to binary softmax regression by transforming the pair of latent model outputs (z1, z2) to z = z1 - z2 and applying the logistic function:

softmax(z1, z2) = exp(z1)/(exp(z1) + exp(z2)) = exp(z1 - z2)/(exp(z1-z2) + exp(0)) = exp(z)/(exp(z) + 1) 
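This identity can be checked numerically (a small sketch; the z1, z2 values are arbitrary): the first softmax probability equals the sigmoid of the difference.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax2(z1, z2):
    # Probability of the first class under a 2-class softmax
    e1, e2 = math.exp(z1), math.exp(z2)
    return e1 / (e1 + e2)

z1, z2 = 1.7, -0.4  # arbitrary latent outputs
print(abs(softmax2(z1, z2) - sigmoid(z1 - z2)) < 1e-12)  # True
```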

Echoing what others have already said:

  1. Softmax Regression is a generalization of Logistic Regression that maps a 'k' dimensional vector of arbitrary real values to a 'k' dimensional vector of values in the range (0, 1) that sum to one.
  2. In Logistic Regression we assume that the labels are binary (0 or 1). Softmax Regression allows one to handle 'k' classes.
  3. Hypothesis function:
    • LR: h_θ(x) = 1 / (1 + exp(-θᵀx))
    • Softmax Regression: P(y = j | x) = exp(θ_jᵀx) / Σ_i exp(θ_iᵀx)
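The two hypothesis functions can be sketched numerically (a minimal NumPy sketch; the weight values are hypothetical): logistic regression squashes a single score through the sigmoid, while softmax regression normalizes k scores into probabilities.

```python
import numpy as np

theta = np.array([0.5, -1.0, 0.25])        # hypothetical LR weight vector
Theta = np.array([[0.5, -1.0, 0.25],       # hypothetical softmax weights,
                  [0.1,  0.2, -0.3],       # one row per class
                  [-0.4, 0.0,  0.6]])
x = np.array([1.0, 2.0, -1.0])

# LR hypothesis: sigmoid of a single linear score
h_lr = 1.0 / (1.0 + np.exp(-theta @ x))

# Softmax hypothesis: normalized exponentials of k linear scores
scores = Theta @ x
h_softmax = np.exp(scores) / np.exp(scores).sum()

print(h_lr)              # a single probability in (0, 1)
print(h_softmax.sum())   # the k probabilities sum to 1
```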

Reference: http://ufldl.stanford.edu/tutorial/supervised/SoftmaxRegression/