Image preprocessing in deep learning

This is certainly late reply for this post, but hopefully help who stumble upon this post.

Here's an article I found online Image Data Pre-Processing for Neural Networks, I though this certainly was a good in article into how the network should be trained.

Main gist of the article says

1) As data(Images) few into the NN should be scaled according the image size that the NN is designed to take, usually a square i.e 100x100,250x250

2) Consider the MEAN(Left Image) and STANDARD DEVIATION(Right Image) value of all the input images in your collection of a particular set of images

enter image description here

3) Normalizing image inputs done by subtracting the mean from each pixel and then dividing the result by the standard deviation, which makes convergence faster while training the network. This would resemble a Gaussian curve centred at zero enter image description here

4)Dimensionality reduction RGB to Grayscale image, neural network performance is allowed to be invariant to that dimension, or to make the training problem more tractable enter image description here


For pre-processing of images before feeding them into the Neural Networks. It is better to make the data Zero Centred. Then try out normalization technique. It certainly will increase the accuracy as the data is scaled in a range than arbitrarily large values or too small values.

An example image will be: -

enter image description here

Here is a explanation of it from Stanford CS231n 2016 Lectures.

*

Normalization refers to normalizing the data dimensions so that they are of approximately the same scale. For Image data There are two common ways of achieving this normalization. One is to divide each dimension by its standard deviation, once it has been zero-centered:
(X /= np.std(X, axis = 0)). Another form of this preprocessing normalizes each dimension so that the min and max along the dimension is -1 and 1 respectively. It only makes sense to apply this preprocessing if you have a reason to believe that different input features have different scales (or units), but they should be of approximately equal importance to the learning algorithm. In case of images, the relative scales of pixels are already approximately equal (and in range from 0 to 255), so it is not strictly necessary to perform this additional preprocessing step.

*

Link for the above extract:- http://cs231n.github.io/neural-networks-2/