Two images, one on top of the other
Here is a starting point. You now only need to connect the remaining anchors and decide on the right position for the squares.
\documentclass{report}
\usepackage{tikz}
\usepackage{graphicx}
\usepackage{lipsum}
\begin{document}
\lipsum[1]
\begin{figure}[h]\centering
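\begin{tikzpicture}
  % Main image
  \node[draw] (one) at (0,0) {\includegraphics[width=6cm]{example-image-a}};
  % Small image, centred 1cm to the left of the main image's top-left corner
  \node[anchor=north,draw,inner sep=0pt] (two) at ([xshift=-1cm]one.north west)
    {\includegraphics[width=1cm]{example-image-b}};
  % Empty node marking the region of interest on the main image
  \node[draw] (rect) at (-1,0) {};
  % First connector; the other anchors are joined the same way
  \draw (two.south east) -- (rect.south west);
\end{tikzpicture}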
\caption{text}
\end{figure}
\lipsum[2]
\end{document}
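As a possible next step, the second connector can be added in the same way as the first. The following is only a sketch of one way to do it: the minimum size=4mm on the square and the choice of joining the north anchors are my own assumptions, not part of the answer above, so adjust them to your picture.

```latex
\documentclass{report}
\usepackage{tikz}
\usepackage{graphicx}
\begin{document}
\begin{tikzpicture}
  \node[draw] (one) at (0,0) {\includegraphics[width=6cm]{example-image-a}};
  \node[anchor=north,draw,inner sep=0pt] (two) at ([xshift=-1cm]one.north west)
    {\includegraphics[width=1cm]{example-image-b}};
  % Assumption: give the square a visible size (the original node is empty)
  \node[draw,minimum size=4mm] (rect) at (-1,0) {};
  % Join matching corners of the small image and the square
  \draw (two.north east) -- (rect.north west);
  \draw (two.south east) -- (rect.south west);
\end{tikzpicture}
\end{document}
```

Moving the square only requires changing the coordinate (-1,0); the connectors follow because they are attached to the node anchors.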
Here is a solution that uses the path picture key to clip the zoomed image to a circle.
As @Sigur said, this is a starting point: you now only need to connect the other anchors and decide on the right position for the circles.
\documentclass[a4paper, 12pt]{article}
\usepackage{mwe}
\usepackage{tikz}
\begin{document}
\begin{figure}[h!]
\centering
\includegraphics[width=.7\linewidth]{example-image}
\caption[Text for the list of figures]{Text under the figure}
\label{fig:theReference0}
\end{figure}
\begin{figure}[h!]
\centering
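\begin{tikzpicture}[
    % Clip an image to the current path; #1 is the image file name
    path image/.style={path picture={\node at (path picture bounding box.center) {\includegraphics[height=3cm]{#1}};}}]
  % Full image
  \node (img) {\includegraphics[width=.7\linewidth]{example-image}};
  % Small circle marking the region to zoom
  \node (c1) [draw, circle, red, text width=.7cm] at (img.center) {};
  % Connector from the small circle to the centre of the zoom circle
  \draw [red] (c1.east) -- (img.east);
  % Zoom circle; the image is clipped to it by the path image style
  \draw [path image=example-image-a,draw=red,thick] (img.east) circle (2cm);
\end{tikzpicture}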
\caption[Text for the list of figures]{Text under the figure}
\label{fig:theReference}
\end{figure}
\end{document}
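One way to connect the other anchors, sketched under my own assumptions: the connectors run from the top and bottom of the small circle to the top and bottom of the zoom circle (they are not true tangents), and the zoom circle gets fill=white so that the connector ends do not show through the parts of the circle that the 3cm-high image does not cover. Neither choice comes from the answer above.

```latex
\documentclass[a4paper, 12pt]{article}
\usepackage{mwe}
\usepackage{tikz}
\begin{document}
\begin{figure}[h!]
  \centering
  \begin{tikzpicture}[
      path image/.style={path picture={
        \node at (path picture bounding box.center)
          {\includegraphics[height=3cm]{#1}};}}]
    \node (img) {\includegraphics[width=.7\linewidth]{example-image}};
    \node (c1) [draw, circle, red, text width=.7cm] at (img.center) {};
    % Connectors are drawn first so the zoom circle is painted over their ends.
    % The zoom circle has radius 2cm, so its top and bottom lie 2cm above
    % and below img.east.
    \draw [red] (c1.north) -- ([yshift=2cm]img.east);
    \draw [red] (c1.south) -- ([yshift=-2cm]img.east);
    % Assumption: fill=white hides the connectors inside the circle;
    % the path picture is placed after the fill and before the red outline.
    \draw [path image=example-image-a, fill=white, draw=red, thick]
      (img.east) circle (2cm);
  \end{tikzpicture}
  \caption{Text under the figure}
\end{figure}
\end{document}
```

If the zoom circle is moved or resized, the two yshift values have to be changed to match the new radius.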