Are modern CNNs (convolutional neural networks) such as DetectNet rotation invariant?
No
In classification problems, CNNs are not rotation invariant. You need to include images with every relevant rotation in your training set.
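The usual way to do this is data augmentation: generating rotated copies of each training image. A minimal sketch in NumPy, restricted to 90-degree rotations (arbitrary angles would need an interpolating rotation such as `scipy.ndimage.rotate`; the function name here is just illustrative):

```python
import numpy as np

def augment_with_rotations(image):
    """Return the four 90-degree rotations of `image`.

    A cheap augmentation; arbitrary angles would need an
    interpolating rotation (e.g. scipy.ndimage.rotate).
    """
    return [np.rot90(image, k) for k in range(4)]

img = np.arange(4).reshape(2, 2)          # tiny stand-in for a training image
augmented = augment_with_rotations(img)   # four views to add to the training set
```

Each rotated copy keeps the original label, so the classifier sees the same object in several orientations.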
You can train a CNN to classify images into predefined categories (if you want to detect several objects in an image, as in your example, you would need to scan every location of the image with your classifier).
However, this is an object detection problem, not just a classification problem.
In object detection problems you can use a sliding-window approach, but it is extremely inefficient. Instead of a simple CNN, other architectures are the state of the art. For example:
- Faster RCNN: https://arxiv.org/pdf/1506.01497.pdf
- YOLO NET: https://pjreddie.com/darknet/yolo/
- SSD: https://arxiv.org/pdf/1512.02325.pdf
These architectures can detect objects anywhere in the image, but you must still include samples with different rotations in the training set (and the training set must be labelled with bounding boxes, which is very time-consuming).
Adding on to Rob's answer: in general, a CNN itself is translation invariant, but not rotation or scale invariant. However, it is not compulsory to include every possible rotation in your training data. A max pooling layer can introduce some rotation invariance.
This image posted by Franck Dernoncourt here might be what you're looking for.
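To see the mechanism, here is a minimal 1-D sketch (a hypothetical example, not from the linked answer): non-overlapping max pooling maps an activation and its slightly shifted copy to the same output whenever the peak stays inside the same pooling window, which is the kind of tolerance that also absorbs the small pixel displacements caused by slight rotations.

```python
import numpy as np

def max_pool_1d(x, size=2):
    """Non-overlapping 1-D max pooling (stride == window size)."""
    return x.reshape(-1, size).max(axis=1)

a = np.array([5., 0., 0., 0.])
b = np.array([0., 5., 0., 0.])  # same activation, shifted one pixel right

# Both inputs pool to the same output: the shift is absorbed
# because the peak stays inside the same pooling window.
pooled_a = max_pool_1d(a)
pooled_b = max_pool_1d(b)
```

Shifts that cross a window boundary are not absorbed, which is why the invariance is only to *small* displacements.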
Secondly, regarding Kershaw's comment on Rob's answer, which says:
A CNN is invariant to small horizontal or vertical movements in your training data mainly because of max pooling.
The main reason CNNs are translation invariant is the convolution itself. The filter extracts a feature regardless of where it appears, since the filter slides across the entire image. It is when the image is rotated or scaled that the filter fails, because the pixel representation of the feature changes.
Source: Aditya Kumar Praharaj's answer from this link.
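The quoted argument can be checked numerically. The sketch below uses a naive NumPy cross-correlation as a stand-in for a real convolutional layer (an assumption for illustration): sliding a vertical-edge filter over an image, the peak response follows the feature when it is translated, but disappears when the image is rotated 90 degrees.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Naive 'valid' cross-correlation: slide the kernel over the image."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge detector and an image containing a vertical line.
kernel = np.array([[1., -1.],
                   [1., -1.]])
img = np.zeros((5, 5))
img[:, 2] = 1.0                       # vertical line at column 2

resp = correlate2d_valid(img, kernel)
shifted = np.roll(img, 1, axis=1)     # same line, shifted one column right
resp_shifted = correlate2d_valid(shifted, kernel)
# Translation equivariance: same peak response, at a shifted location.

rotated = np.rot90(img)               # the line is now horizontal
resp_rot = correlate2d_valid(rotated, kernel)
# Zero response everywhere: the vertical-edge filter no longer matches.
```

This is why translation is handled "for free" by the sliding filter, while rotations must be covered by the training data (or by augmentation).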