Definition of metric tensor and line elements
The textbook is a bit misleading, as it uses the natural scalar product on $\mathbb R^n$ to define the metric, but if we ignore this, everything is fine. Suppose we are working in polar coordinates, that is, $x= r \cos \theta$ and $y = r \sin \theta$. Then the "covariant basis vectors" would be (identifying $X_1 \leftrightarrow \hat r$ and $X_2 \leftrightarrow \hat \theta$)
$$\hat r = \frac{\partial x}{ \partial r} \hat x + \frac{\partial y}{ \partial r} \hat y = \cos \theta \,\hat x + \sin \theta \,\hat y $$
and similarly
$$\hat \theta = \frac{\partial x}{ \partial \theta} \hat x+ \frac{\partial y}{ \partial \theta} \hat y = -r\sin \theta \,\hat x + r \cos \theta \,\hat y$$
The components of the metric tensor are then
$$g_{rr} = \hat r \cdot \hat r =1 \qquad g_{\theta\theta} = \hat \theta\cdot\hat \theta = r^2 $$
and the off-diagonal components $g_{r\theta} = g_{\theta r} = \hat r \cdot \hat \theta$ are zero. So the total line element is
$$ \mathrm{d}l^2= \mathrm{d}r^2+ r^2 \mathrm{d}\theta^2$$
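As a quick numerical check of the recipe above, here is a minimal Python sketch. It approximates the basis vectors by central finite differences of the coordinate map and then takes their dot products. The helper names `basis_vectors`, `metric` and `polar`, the step size `h`, and the sample point are my own choices for illustration, not anything from the book.

```python
import math

def basis_vectors(coord_map, q, h=1e-6):
    """Approximate e_i = d(x, y)/dq_i at the point q by central differences."""
    vectors = []
    for i in range(len(q)):
        plus, minus = list(q), list(q)
        plus[i] += h
        minus[i] -= h
        p_plus = coord_map(*plus)
        p_minus = coord_map(*minus)
        vectors.append([(a - b) / (2 * h) for a, b in zip(p_plus, p_minus)])
    return vectors

def metric(coord_map, q, h=1e-6):
    """Metric components g_ij = e_i . e_j using the Euclidean dot product."""
    e = basis_vectors(coord_map, q, h)
    return [[sum(a * b for a, b in zip(ei, ej)) for ej in e] for ei in e]

def polar(r, theta):
    """Polar coordinates: x = r cos(theta), y = r sin(theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

# Sample point r = 2, theta = 0.7 (arbitrary illustrative values).
g = metric(polar, (2.0, 0.7))
print(g)  # close to [[1, 0], [0, r^2]] with r^2 = 4
```

Up to finite-difference error, the result matches $g_{rr}=1$, $g_{\theta\theta}=r^2$ and $g_{r\theta}=0$.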
To get back to your example, first note that the metric you gave is not a good one, because one of its components vanishes at $x_1=0$ and is even negative for $x_1<0$. Suppose we choose the line element
$$\mathrm{d}l^2=2 \mathrm{d}X_1^2+ X_1^2 \mathrm{d}X_2^2$$
instead. You can clearly see how the polar coordinate line element is related to this one. The book assumes that you first know a coordinate relation like the polar one, $x= r \cos \theta$ and $y = r \sin \theta$, and then tells you how to compute the metric from this information, which is what I did above.
Now of course, line elements are much more general than that and a priori you can write down a line element without referring to any "covariant basis vectors" or stuff like that (hence my complaint in the beginning). I think this was your main confusion. You wrote down a metric and asked where are the "covariant basis vectors"? The upshot is you don't need them!
The metric tensor's components $g_{\mu\nu}$ are defined as the inner products $(e_\mu,e_\nu)$, where $e_\mu$ and $e_\nu$ are basis vectors of some vector space $V$. The inner product defined on the vector space is bilinear. One can also show that $\mathbf{g}=g_{\mu\nu}\,e^\mu\otimes e^\nu$ (with $e^\mu$ the dual basis) is a bilinear map from $V\times V$ to $\mathbb{R}$. In differential geometry these basis vectors are defined as
$$e_\mu=\partial_\mu\vec{R}\tag{1}$$
where $\vec{R}$ is the position vector expressed in the coordinate system. The line element is defined as
$$ds^2=(dx^\mu e_\mu, dx^\nu e_\nu)=\mathbf{g}\langle dx,dx \rangle$$
Using bilinearity,
$$ds^2=dx^\mu dx^\nu g_{\mu\nu}$$
Using $(1)$,
$$ds^2=dx^\mu dx^\nu (\partial_\mu\vec{R},\partial_\nu\vec{R})$$
This is the relation between the line element and the definition of the metric in terms of the basis.
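As a rough numerical illustration of $ds^2=g_{\mu\nu}dx^\mu dx^\nu$, the sketch below evaluates the double sum for the polar metric $\mathrm{diag}(1, r^2)$ and compares it with the squared Cartesian distance between two nearby points. The function name `line_element` and the particular values of $r$, $\theta$, $dr$, $d\theta$ are assumptions chosen only for the example.

```python
import math

def line_element(g, dq):
    """Evaluate ds^2 = g_{mu nu} dq^mu dq^nu as an explicit double sum."""
    n = len(dq)
    return sum(g[m][k] * dq[m] * dq[k] for m in range(n) for k in range(n))

# Base point and a small coordinate displacement (illustrative values).
r, theta = 2.0, 0.7
dr, dtheta = 1e-4, 2e-4

# Polar metric at the base point: g_rr = 1, g_theta_theta = r^2.
g_polar = [[1.0, 0.0], [0.0, r ** 2]]
ds2 = line_element(g_polar, (dr, dtheta))

# Squared straight-line distance between the same two points in Cartesian form.
x0, y0 = r * math.cos(theta), r * math.sin(theta)
x1, y1 = (r + dr) * math.cos(theta + dtheta), (r + dr) * math.sin(theta + dtheta)
chord2 = (x1 - x0) ** 2 + (y1 - y0) ** 2

print(ds2, chord2)  # the two agree to leading order in the displacement
```

The two numbers differ only at higher order in the displacement, which is the sense in which the metric encodes infinitesimal distances.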
Why is $(X_1,X_1)=2$?
That is what you get when you take the inner product of the basis vector with itself. Let me take a well-known example to demonstrate. Consider the polar coordinate system $(r,\theta)$. In the Cartesian system, $x=r\cos(\theta)$ and $y=r\sin(\theta)$. Let the vector be $\vec{R}=xe_x + ye_y$. Computing $e_r=\partial_r\vec{R}$ and $e_\theta=\partial_\theta\vec{R}$:
$$e_r=\cos(\theta)e_x+\sin(\theta)e_y$$
$$e_\theta=-r\sin(\theta)e_x+r\cos(\theta)e_y$$
The metric components are
$$g_{rr}=(e_r,e_r)=1$$
$$g_{r\theta}=g_{\theta r}=(e_r,e_\theta)=0$$
$$g_{\theta\theta}=(e_\theta,e_\theta)=r^2$$
Just as $(e_r,e_r)=1$ in polar coordinates, a coordinate system can be defined in which $(X_1,X_1)=2$.
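To make that last statement concrete, here is one possible choice of coordinates that gives $(X_1,X_1)=2$. This particular embedding is my own illustrative assumption (a rescaling of polar coordinates, $r=\sqrt{2}\,X_1$ and $\theta=X_2/\sqrt{2}$), not something taken from the question or the book:
$$\vec{R}=\sqrt{2}\,X_1\cos\!\left(\tfrac{X_2}{\sqrt{2}}\right)e_x+\sqrt{2}\,X_1\sin\!\left(\tfrac{X_2}{\sqrt{2}}\right)e_y$$
The basis vectors $e_\mu=\partial_\mu\vec{R}$ are then
$$e_1=\sqrt{2}\cos\!\left(\tfrac{X_2}{\sqrt{2}}\right)e_x+\sqrt{2}\sin\!\left(\tfrac{X_2}{\sqrt{2}}\right)e_y$$
$$e_2=-X_1\sin\!\left(\tfrac{X_2}{\sqrt{2}}\right)e_x+X_1\cos\!\left(\tfrac{X_2}{\sqrt{2}}\right)e_y$$
so that
$$(e_1,e_1)=2\qquad (e_1,e_2)=0\qquad (e_2,e_2)=X_1^2$$
and the line element is $ds^2=2\,dX_1^2+X_1^2\,dX_2^2$.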
Hope this helps.
Note that in the notation I have used $(,)$ is the inner product and $\langle,\rangle$ is an ordered pair of vectors belonging to $V\times V$.