Approximating a convex disk by an ellipse
Not an answer, just an illustration to accompany the question. $K$ is an isosceles triangle with base $2$ and altitude $3$ (and so area $3$). First, I mistakenly computed the ellipse $E$, with no constraint on its area, whose symmetric difference with $K$ has the smallest area. It has area about $2.4$:
After Gerhard's comment, I recomputed constraining $E$ to have area $3$. Then its center is $(0,1)$ and its semi-axes are roughly $0.74$ and $1.29$ (so that $\pi\cdot 0.74\cdot 1.29\approx 3$):
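The area of the symmetric difference can be estimated by integrating horizontal cross-sections. The following is a minimal sketch, not the computation used for the figures: it assumes the triangle has vertices $(-1,0)$, $(1,0)$, $(0,3)$, that the ellipse is axis-parallel and centered on the line $x=0$, and that $0.74$ is the horizontal semi-axis. The function names are my own.

```python
import math

K_AREA = 3.0  # triangle with vertices (-1, 0), (1, 0), (0, 3) (assumed placement)

def triangle_half_width(y):
    """Half-length of the horizontal cross-section of K at height y (0 outside K)."""
    return 1.0 - y / 3.0 if 0.0 <= y <= 3.0 else 0.0

def ellipse_half_width(y, cy, a, b):
    """Half-length of the cross-section of the ellipse centered at (0, cy),
    with horizontal semi-axis a and vertical semi-axis b."""
    s = 1.0 - ((y - cy) / b) ** 2
    return a * math.sqrt(s) if s > 0.0 else 0.0

def sym_diff_area(cy, a, b, n=20000):
    """Midpoint-rule estimate of the area of the symmetric difference of E and K."""
    lo, hi = cy - b, cy + b          # the intersection lives inside this y-range
    dy = (hi - lo) / n
    inter = 0.0
    for i in range(n):
        y = lo + (i + 0.5) * dy
        # Both cross-sections are centered at x = 0, so they overlap in the shorter one.
        inter += 2.0 * min(triangle_half_width(y), ellipse_half_width(y, cy, a, b)) * dy
    return K_AREA + math.pi * a * b - 2.0 * inter

r = math.sqrt(3.0 / math.pi)                  # radius of the disk of area 3
sd_ellipse = sym_diff_area(1.0, 0.74, 1.29)   # the ellipse reported above
sd_circle = sym_diff_area(1.0, r, r)          # disk of the same area and center
print(sd_ellipse, sd_circle)
```

Under these assumptions the reported ellipse does noticeably better than the disk of area $3$ with the same center, which is at least consistent with the stated optimum.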
I don't have a proof, but I have an idea which suggests the answer is no, the minimal ellipse may not be unique. Right or wrong, I hope someone will generate a picture to illustrate the idea, and see if in addition there is a near-octagon which exhibits the pair of minimal ellipses.
I use eight-fold symmetry and restrict attention to the positive quadrant. Draw a quarter circle of unit radius, a nearly circular quarter ellipse with semi-major axis $1+ \epsilon$ and semi-minor axis $1/(1+\epsilon)$, and the reflection of this quarter ellipse about the line $x=y$. (The ellipses' axes lie on the axes bordering the quadrant.) Consider the point of intersection $P$ between the ellipses and the line $x=y$. One vertex of the proposed octagon will be $P$, and another will be the point $Q=(0, 1+ \epsilon)$. Between the two there will be a curve which bisects that portion of the symmetric difference between the two ellipses.
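For anyone wishing to draw the picture, here is a small sketch that computes $P$ and $Q$ for a given $\epsilon$. It assumes that $1+\epsilon$ and $1/(1+\epsilon)$ are the semi-axes (the reading under which $Q$ lies on the reflected ellipse); the function name is hypothetical.

```python
import math

def construction_points(eps):
    """Return P (on the line x = y) and Q = (0, 1 + eps) for the two ellipses."""
    a = 1.0 + eps           # semi-axis along one coordinate axis
    b = 1.0 / (1.0 + eps)   # semi-axis along the other, so that a * b = 1
    # On x = y the ellipse equation x^2/a^2 + y^2/b^2 = 1 becomes
    # x^2 * (1/a^2 + 1/b^2) = 1; the reflected ellipse gives the same point.
    x = 1.0 / math.sqrt(1.0 / a**2 + 1.0 / b**2)
    return (x, x), (0.0, a)

P, Q = construction_points(0.1)
dist_P = math.hypot(P[0], P[1])   # distance of P from the origin
print(P, Q, dist_P)
```

Since $1/a^2+1/b^2\ge 2/(ab)=2$, the point $P$ always lies inside the closed unit disk, and strictly inside when $\epsilon>0$, which is what observation 1) in the next paragraph relies on.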
I challenge the illustrator to find a curve which does such an area bisection, and induces a near-octagon which does not have a smaller symmetric difference with the unit circle. I believe this is possible because of 1) the fact that the point $P$ is far enough in the interior of the circle that much of the circle "sticks out", and 2) the freedom one has in bisecting the portion of the ellipse symmetric difference. This may not prove that the given ellipses are minimal, but it may be possible to draw the curve to show that any minimal ellipse must have a reflection which is also minimal.
Gerhard "Easily Writes One Thousand Words..." Paseman, 2016.12.23
Figure added by J.O'Rourke. $\epsilon=\frac{1}{10}$.
Figure added by Jairo Bochi.
Now the answer is almost complete: modulo some extra work on the strictness of relevant inequalities, we do have the uniqueness. The additional ideas used to come to this conclusion are these: (i) reducing, in a sense, the approximation of a convex set by an ellipse to the converse problem of approximating an ellipse (and even a round disk) by shear-shifted versions of the convex set; (ii) shear-symmetrization; and (iii) the observation that minimizing the symmetric difference given the same areas is equivalent to maximizing the area of the intersection (the latter observation borrowed from Matt F.).
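Idea (iii) is just inclusion-exclusion. Writing $E\oplus K$ for the symmetric difference and $|\cdot|$ for area,

$$|E\oplus K| = |E\setminus K|+|K\setminus E| = |E|+|K|-2\,|E\cap K|,$$

so, once $|E|$ and $|K|$ are fixed, making $|E\oplus K|$ small is the same as making $|E\cap K|$ large.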
How to show that the best possible ellipse is unique?
Suppose that $E_1$ and $E_2$ are distinct best possible ellipses. By an appropriate rescaling of everything in the directions of principal axes of $E_1$, without loss of generality (wlog) $E_1$ is a round disk, say $D$, of radius $1/\sqrt\pi$.
Then the width of $E_2$ in some direction is the same as that of the round disk $E_1$ (that is, $2/\sqrt\pi$), where the width of a set in a given direction is defined as the width of the narrowest infinite band that contains the set and goes in the direction perpendicular to the given one. This follows because the product of the widths of the ellipse $E_2$ in the directions of its principal axes is $(2/\sqrt\pi)^2$: one of these two widths is therefore at least $2/\sqrt\pi$ and the other at most $2/\sqrt\pi$, and the width depends continuously on the direction.
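As a numerical sanity check of this step, the sketch below finds such a direction by bisection for an axis-parallel ellipse of area $1$. The semi-axis value $0.9$ and the function names are illustrative assumptions, not part of the argument.

```python
import math

def width(a, b, theta):
    """Width of the ellipse x^2/a^2 + y^2/b^2 <= 1 in the direction
    (cos theta, sin theta), i.e. twice its support function there."""
    return 2.0 * math.sqrt((a * math.cos(theta)) ** 2 + (b * math.sin(theta)) ** 2)

def matching_direction(a):
    """For the area-1 ellipse with semi-axis a, return (theta, b) such that
    the width in direction theta equals that of the disk of area 1."""
    b = 1.0 / (math.pi * a)                 # area pi * a * b = 1
    target = 2.0 / math.sqrt(math.pi)       # width of the disk of radius 1/sqrt(pi)
    lo, hi = 0.0, math.pi / 2               # widths 2a and 2b at the endpoints
    f_lo = width(a, b, lo) - target
    for _ in range(200):                    # plain bisection on the sign change
        mid = 0.5 * (lo + hi)
        f_mid = width(a, b, mid) - target
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi), b

theta, b = matching_direction(0.9)
print(theta, width(0.9, b, theta), 2.0 / math.sqrt(math.pi))
```

In this axis-parallel setting the direction can also be written in closed form: $a^2\cos^2\theta+b^2\sin^2\theta=ab$ gives $\cos^2\theta=b/(a+b)$.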
So, there exist the following: (i) a real $t$; (ii) a vector $b\in\mathbb{R}^2$; and (iii) an orthonormal basis $(e_1,e_2)$ of $\mathbb{R}^2$ such that, for the (shear-shift) affine operator $A=A_t=A_{b,t}$ on $\mathbb{R}^2$ defined by the conditions $A\mathbf0=b$, $Ae_1=b+e_1$, and $Ae_2=b+e_2+te_1$, we have $AE_1=AD=E_2$.
Let then $E_0:=\dfrac{I+A}2\,D$, where $I$ is the identity operator. Then $E_0$ is an ellipse of area $1$.
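To spell out why $E_0$ has area $1$: in the basis $(e_1,e_2)$ the linear part of $A$ has matrix $\begin{pmatrix}1&t\\0&1\end{pmatrix}$, so

$$\frac{I+A}2\,x = Mx+\frac b2,\qquad M=\begin{pmatrix}1&t/2\\0&1\end{pmatrix},\qquad \det M=1 .$$

Thus $\frac{I+A}2$ is again a shear-shift (with parameters $t/2$ and $b/2$), and $|E_0|=|\det M|\,|D|=|D|=1$.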
For real $y$, let $[u,v]=[u(y),v(y)]=K_y:=K(y):=\{x\in\mathbb R\colon xe_1+ye_2\in K\}$ be the $y$-"cross-section" of $K$. Similarly define the $y$-"cross-sections" $E_1(y)=[r_1,s_1]$ and $E_2(y)=[r_2,s_2]$ of $E_1$ and $E_2$. Then the $y$-"cross-section" $E_0(y)$ of $E_0$ is $[r_0,s_0]=[\frac{r_1+r_2}2,\frac{s_1+s_2}2]$. Let $|\cdot|$ denote the Lebesgue measure on $\mathbb{R}$ or $\mathbb{R}^2$. Then $|E_j\cap K|=\int_{\mathbb{R}}\delta_j(y)\,dy$ for $j=0,1,2$, where $\delta_j(y):=|E_j(y)\cap K(y)|$. So, if we could show that \begin{equation} \delta_0(y)\ge\tfrac12\,\delta_1(y)+\tfrac12\,\delta_2(y) \tag{*} \end{equation} for all $y$ and that this inequality is strict for some $y$ given that $E_1$ and $E_2$ are distinct, then we would obtain the desired contradiction: $|E_0\cap K|>\tfrac12\,|E_1\cap K|+\tfrac12\,|E_2\cap K|=|E_1\cap K|=|E_2\cap K|$. Here we have used the mentioned additional idea (iii).
Unfortunately, inequality $(*)$ does not hold in general, without any assumptions on the convex sets $E_j$. Here we need the mentioned additional ideas (i) and (ii).
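A one-dimensional toy example shows how $(*)$ can fail for unrelated intervals. The intervals below are chosen by me purely for illustration and do not come from actual ellipses.

```python
def overlap(p, q):
    """Length of the intersection of the closed intervals p and q."""
    return max(0.0, min(p[1], q[1]) - max(p[0], q[0]))

def midpoint_interval(e1, e2):
    """Interval whose endpoints are the averages of the endpoints of e1 and e2."""
    return ((e1[0] + e2[0]) / 2.0, (e1[1] + e2[1]) / 2.0)

K = (0.0, 1.0)     # cross-section of K
E1 = (0.0, 1.0)    # coincides with K
E2 = (10.0, 11.0)  # far away from K
E0 = midpoint_interval(E1, E2)   # (5.0, 6.0), also disjoint from K

d0, d1, d2 = overlap(E0, K), overlap(E1, K), overlap(E2, K)
print(d0, 0.5 * d1 + 0.5 * d2)   # prints: 0.0 0.5
```

Here $\delta_0=0<\tfrac12=\tfrac12\,\delta_1+\tfrac12\,\delta_2$, so some relation between the positions of the cross-sections is needed.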
Consider first the round disk $E_1=D$, which is an optimal approximation of $K$. I then claim that for all the $y$-"cross-sections" of $E_1$ and $K$ we have $\frac{r_1+s_1}2=\frac{u+v}2$. Indeed, since $D$ is an optimal approximation of $K$, it is also an optimal approximation of $K^-$, where $K^-$ is obtained from $K$ by the reflection about the diameter of $D$ parallel to $e_2$. Wlog $D$ is centered at the origin. So, for all $y$ we have $r_1=-s_1$, and the $y$-"cross-section" of $K^-$ is $K^-_y=[-v,-u]=-K_y$. Then for $K^0:=\frac12\,(K+K^-)$ we can check that
$|D_y\cap K^0_y|\ge\tfrac12\,|D_y\cap K_y| +\tfrac12\,|D_y\cap K^-_y|$ for all $y$. So, the disk $D$ approximates $K^0$ no worse than it does $K$ (or $K^-$). Extra work is needed here to show that $D$ approximates $K^0$ (strictly) better than it does $K$ (or $K^-$) unless $K^-=K$.
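The cross-section inequality can be checked directly. This sketch assumes that the $y$-cross-section of $K^0$ is $\tfrac12\,(K_y+K^-_y)$ and that $K_y=[u,v]$ is nonempty, and it writes $D_y=[-s,s]$:

$$K^0_y=\Big[-\frac{v-u}2,\frac{v-u}2\Big],\qquad |D_y\cap K^0_y|=\min(2s,\,v-u),$$

$$|D_y\cap K_y|=|D_y\cap K^-_y|\le\min\big(|D_y|,|K_y|\big)=\min(2s,\,v-u).$$

The equality $|D_y\cap K_y|=|D_y\cap K^-_y|$ holds because the reflection $x\mapsto -x$ preserves $D_y$ and maps $K_y$ onto $K^-_y$.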
But $K^0$ is the image of $K$ under an (area-preserving) shear-shift transformation (say $B$). So, modulo the mentioned extra work, we will conclude that $B^{-1}E_1=B^{-1}D$ is a better (than $D$) approximation of $K$ -- unless $K^-=K$. So, wlog $K^-=K$, which verifies the claim $\frac{r_1+s_1}2=\frac{u+v}2$. Similarly, by shear-shifting, $\frac{r_2+s_2}2=\frac{u+v}2$. Given these two conditions, one can verify that $(*)$ holds (in that verification, wlog $\frac{r_1+s_1}2=\frac{r_2+s_2}2=\frac{u+v}2=0$). It will remain to show that inequality $(*)$ will be strict for some $y$ unless $E_1=E_2$.
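For the record, here is one way the verification of $(*)$ can go in the centered case, assuming that all three cross-sections are nonempty. With $E_j(y)=[-s_j,s_j]$ for $j=1,2$, $E_0(y)=[-s_0,s_0]$ with $s_0=\frac{s_1+s_2}2$, and $K(y)=[-v,v]$, we have $\delta_j(y)=2\min(s_j,v)$, and the concavity of $s\mapsto\min(s,v)$ gives

$$\delta_0(y)=2\min\Big(\frac{s_1+s_2}2,\,v\Big)\ \ge\ \min(s_1,v)+\min(s_2,v)=\tfrac12\,\delta_1(y)+\tfrac12\,\delta_2(y).$$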
For readers' convenience(?), I am retaining below, under the horizontal line, the previous version of the answer.

---
This is not a complete answer, but I think it has a chance to lead to one.
I think the best possible ellipse is unique. Suppose that $E_1$ and $E_2$ are distinct best possible ellipses. By an appropriate rescaling of everything in the directions of principal axes of $E_1$, without loss of generality $E_1$ is a round disk, say $D$, of radius $1/\sqrt\pi$.
Then the width of $E_2$ in some direction is the same as that of the round disk $E_1$ (that is, $2/\sqrt\pi$), where the width of a set in a given direction is defined as the width of the narrowest infinite band that contains the set and goes in the direction perpendicular to the given one. This follows because the product of the widths of ellipse $E_2$ in the directions of its principal axes is $(2/\sqrt\pi)^2$.
So, there exist the following: (i) a real $t$; (ii) a vector $b\in\mathbb{R}^2$; (iii) an orthonormal basis $(e_1,e_2)$ of $\mathbb{R}^2$; and (iv) a (shearing) affine operator $A$ on $\mathbb{R}^2$ such that $Ae_1=b+e_1$, $Ae_2=b+e_2+te_1$, and $AE_1=AD=E_2$.
Let then $E_0:=\dfrac{I+A}2\,D$, where $I$ is the identity operator. Then $E_0$ is an ellipse of area $1$.
For real $y$, let $[u,v]=[u(y),v(y)]=K(y):=\{x\in\mathbb R\colon xe_1+ye_2\in K\}$ be the $y$-"cross-section" of $K$. Similarly define the $y$-"cross-sections" $E_1(y)=[r_1,s_1]$ and $E_2(y)=[r_2,s_2]$ of $E_1$ and $E_2$. Then the $y$-"cross-section" $E_0(y)$ of $E_0$ is $[r_0,s_0]=[\frac{r_1+r_2}2,\frac{s_1+s_2}2]$. Let $\oplus$ denote the symmetric difference, and let $|\cdot|$ denote the Lebesgue measure on $\mathbb{R}$ or $\mathbb{R}^2$. Then $|E_j\oplus K|=\int_{\mathbb{R}}\delta_j(y)\,dy$ for $j=0,1,2$, where $\delta_j(y):=|E_j(y)\oplus K(y)|$. So, if we could show that \begin{equation} \delta_0(y)\le\tfrac12\,\delta_1(y)+\tfrac12\,\delta_2(y) \tag{*} \end{equation} and this inequality is strict for some $y$ given that $E_1$ and $E_2$ are distinct, then we would obtain the desired contradiction: $|E_0\oplus K|<\tfrac12\,|E_1\oplus K|+\tfrac12\,|E_2\oplus K|=|E_1\oplus K|=|E_2\oplus K|$.
Unfortunately, inequality $(*)$ does not hold in general, without any assumptions on the convex sets $E_j$. However, it will hold for all real $y$ such that $E_j(y)\cap K(y)\ne\emptyset\ \forall j\in\{1,2\}$ or $K(y)=\emptyset$.
Anyway, we need $(*)$ to hold on the average, given that both ellipses $E_1=D$ and $E_2$ are optimal approximations of $K$; this "average" inequality seems likely.