Simple and fast method to compare images for similarity
Can the screenshot or icon be transformed (scaled, rotated, skewed ...)? There are quite a few methods on top of my head that could possibly help you:
- Simple euclidean distance as mentioned by @carlosdc (doesn't work with transformed images and you need a threshold).
- (Normalized) Cross Correlation - a simple metrics which you can use for comparison of image areas. It's more robust than the simple euclidean distance but doesn't work on transformed images and you will again need a threshold.
- Histogram comparison - if you use normalized histograms, this method works well and is not affected by affine transforms. The problem is determining the correct threshold. It is also very sensitive to color changes (brightness, contrast etc.). You can combine it with the previous two.
- Detectors of salient points/areas - such as MSER (Maximally Stable Extremal Regions), SURF or SIFT. These are very robust algorithms and they might be too complicated for your simple task. Good thing is that you do not have to have an exact area with only one icon, these detectors are powerful enough to find the right match. A nice evaluation of these methods is in this paper: Local invariant feature detectors: a survey.
Most of these are already implemented in OpenCV - see for example the cvMatchTemplate method (uses histogram matching): http://dasl.mem.drexel.edu/~noahKuntz/openCVTut6.html. The salient point/area detectors are also available - see OpenCV Feature Detection.
I face the same issues recently, to solve this problem(simple and fast algorithm to compare two images) once and for all, I contribute an img_hash module to opencv_contrib, you can find the details from this link.
img_hash module provide six image hash algorithms, quite easy to use.
Codes example
origin lena
blur lena
resize lena
shift lena
#include <opencv2/core.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/img_hash.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>
void compute(cv::Ptr<cv::img_hash::ImgHashBase> algo)
{
auto input = cv::imread("lena.png");
cv::Mat similar_img;
//detect similiar image after blur attack
cv::GaussianBlur(input, similar_img, {7,7}, 2, 2);
cv::imwrite("lena_blur.png", similar_img);
cv::Mat hash_input, hash_similar;
algo->compute(input, hash_input);
algo->compute(similar_img, hash_similar);
std::cout<<"gaussian blur attack : "<<
algo->compare(hash_input, hash_similar)<<std::endl;
//detect similar image after shift attack
similar_img.setTo(0);
input(cv::Rect(0,10, input.cols,input.rows-10)).
copyTo(similar_img(cv::Rect(0,0,input.cols,input.rows-10)));
cv::imwrite("lena_shift.png", similar_img);
algo->compute(similar_img, hash_similar);
std::cout<<"shift attack : "<<
algo->compare(hash_input, hash_similar)<<std::endl;
//detect similar image after resize
cv::resize(input, similar_img, {120, 40});
cv::imwrite("lena_resize.png", similar_img);
algo->compute(similar_img, hash_similar);
std::cout<<"resize attack : "<<
algo->compare(hash_input, hash_similar)<<std::endl;
}
int main()
{
using namespace cv::img_hash;
//disable opencl acceleration may(or may not) boost up speed of img_hash
cv::ocl::setUseOpenCL(false);
//if the value after compare <= 8, that means the images
//very similar to each other
compute(ColorMomentHash::create());
//there are other algorithms you can try out
//every algorithms have their pros and cons
compute(AverageHash::create());
compute(PHash::create());
compute(MarrHildrethHash::create());
compute(RadialVarianceHash::create());
//BlockMeanHash support mode 0 and mode 1, they associate to
//mode 1 and mode 2 of PHash library
compute(BlockMeanHash::create(0));
compute(BlockMeanHash::create(1));
}
In this case, ColorMomentHash give us best result
- gaussian blur attack : 0.567521
- shift attack : 0.229728
- resize attack : 0.229358
Pros and cons of each algorithm
The performance of img_hash is good too
Speed comparison with PHash library(100 images from ukbench)
If you want to know the recommend thresholds for these algorithms, please check this post(http://qtandopencv.blogspot.my/2016/06/introduction-to-image-hash-module-of.html). If you are interesting about how do I measure the performance of img_hash modules(include speed and different attacks), please check this link(http://qtandopencv.blogspot.my/2016/06/speed-up-image-hashing-of-opencvimghash.html).
Does the screenshot contain only the icon? If so, the L2 distance of the two images might suffice. If the L2 distance doesn't work, the next step is to try something simple and well established, like: Lucas-Kanade. Which I'm sure is available in OpenCV.