Understanding “the mean minimizes the mean squared error”
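If you have $(y_i)_{i=1}^n$, consider the sum of squared differences from the $y_i$ to a value $a$. Dividing this sum by $n$ gives the mean squared error, and since $n$ is a positive constant, both are minimized by the same $a$.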
This is $s(a) =\sum_{i=1}^n (y_i-a)^2 $.
Manipulating this,
$\begin{aligned} s(a) &=\sum_{i=1}^n (y_i-a)^2\\ &=\sum_{i=1}^n (y_i^2-2ay_i+a^2)\\ &=\sum_{i=1}^n y_i^2-\sum_{i=1}^n2ay_i+\sum_{i=1}^na^2\\ &=\sum_{i=1}^n y_i^2-2a\sum_{i=1}^ny_i+na^2 \end{aligned} $
There are a number of ways to minimize this expression. Perhaps the easiest is to differentiate with respect to $a$. This gives $s'(a) =-2\sum_{i=1}^ny_i+2na $ and this is zero when $a =\dfrac{\sum_{i=1}^ny_i}{n} $, the mean of the values.
Note that, since $s''(a) =2n > 0 $, this value of $a$ gives a minimum.
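As a rough numerical sanity check of the derivative argument, here is a minimal Python sketch. The data (the five values 5, 3, 2, 7, 4 from the numerical check in this thread), the helper names `s` and `s_prime`, and the grid of candidates are all choices made for illustration only.

```python
def s(a, ys):
    """Sum of squared differences between each y_i and the candidate value a."""
    return sum((y - a) ** 2 for y in ys)


def s_prime(a, ys):
    """Derivative of s with respect to a: -2 * sum(y_i) + 2 * n * a."""
    return -2 * sum(ys) + 2 * len(ys) * a


ys = [5, 3, 2, 7, 4]        # illustrative data (assumption)
mean = sum(ys) / len(ys)    # arithmetic mean of the data

# Scan candidates 0.00, 0.01, ..., 10.00 and keep the one with the smallest s(a).
grid = [k / 100 for k in range(0, 1001)]
best = min(grid, key=lambda a: s(a, ys))

print("mean:", mean)
print("grid minimizer:", best)
print("s'(mean):", s_prime(mean, ys))
```

The grid search is only a brute-force stand-in for the calculus; it should land on the same point where `s_prime` vanishes (up to floating-point rounding).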
(added later)
An even easier way is to write $\bar{y} =\dfrac{\sum_{i=1}^ny_i}{n} $ and $\bar{y^2} =\dfrac{\sum_{i=1}^ny_i^2}{n} $.
Then
$\begin{aligned} \frac1{n}s(a) &=\frac1{n}\sum_{i=1}^n y_i^2-2a\frac1{n}\sum_{i=1}^ny_i+a^2\\ &=a^2-2a\bar{y}+\bar{y^2}\\ &=a^2-2a\bar{y}+\bar{y}^2-\bar{y}^2+\bar{y^2}\\ &=(a-\bar{y})^2+\bar{y^2}-\bar{y}^2 \end{aligned} $
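Since $\bar{y^2}-\bar{y}^2$ is independent of $a$ and $(a-\bar{y})^2 \ge 0$ with equality only when $a = \bar{y}$, the expression $\frac1{n}s(a)$ is minimized at $a = \bar{y}$, and the value at the minimum is $\bar{y^2}-\bar{y}^2$, which is the (population) variance of the $y_i$.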
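The completed-square identity can also be checked numerically. The sketch below is only an illustration: the data, the candidate values of $a$, and the variable names are assumptions, and it relies on the standard-library functions `statistics.fmean` and `statistics.pvariance`.

```python
import math
import statistics

ys = [5, 3, 2, 7, 4]                            # illustrative data (assumption)
n = len(ys)

y_bar = statistics.fmean(ys)                    # mean of the y_i
y2_bar = statistics.fmean([y * y for y in ys])  # mean of the y_i squared
offset = y2_bar - y_bar ** 2                    # the part that does not depend on a

# Compare s(a)/n with (a - y_bar)^2 + offset for a few arbitrary candidates.
checks = []
for a in (0.0, 3.5, y_bar, 10.0):
    lhs = sum((y - a) ** 2 for y in ys) / n
    rhs = (a - y_bar) ** 2 + offset
    checks.append((a, lhs, rhs))
    print(a, lhs, rhs)

# The constant term should agree with the population variance.
print(offset, statistics.pvariance(ys))
```

Both columns should agree up to rounding for every $a$, and the constant term should match the population variance reported by `statistics.pvariance`.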
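Check your calculation. With $a = 4$ we have $$ \frac{(4-5)^2 + (4-3)^2+ (4-2)^2 + (4-7)^2 + (4-4)^2}{5} = \frac{15}{5} = 3 > 2.96, $$ where $2.96$ is the value obtained at the mean $\bar{y} = 4.2$ of these five numbers. So the mean does indeed minimize the MSE, and there is no contradiction here.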
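For anyone who wants to redo this arithmetic by machine, a short Python sketch follows. The function name `mse` is a hypothetical helper introduced here, not something from the original question.

```python
def mse(ys, a):
    """Mean of the squared differences between each y_i and the value a."""
    return sum((y - a) ** 2 for y in ys) / len(ys)


ys = [5, 3, 2, 7, 4]        # the five values from the check above
mean = sum(ys) / len(ys)    # 21 / 5

mse_at_4 = mse(ys, 4)       # 15 / 5 = 3.0
mse_at_mean = mse(ys, mean) # about 2.96, up to floating-point rounding

print(mse_at_4, mse_at_mean, mse_at_4 > mse_at_mean)
```

Trying other candidates in place of `4` is a quick way to see that none of them gives a smaller value than the mean does.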