Questions about the degrees of freedom in General Relativity
The key point in all of this is that general relativity is a gauge theory, and, as the saying goes, "the gauge always hits twice" (apparently attributed to Claudio Teitelboim). What this means is that (1) you have an arbitrary freedom in defining your evolution, corresponding to the ability to make gauge transformations, and (2) some of the equations of motion are constraints rather than evolution equations. This second fact means that you are not allowed to choose arbitrary initial data for your theory; the initial data that you pick must satisfy the constraints, which arise because your action is gauge invariant.
It's usually easiest to start with vacuum electrodynamics. There the equations of motion read $$\partial^\mu(\partial_\mu A_\nu - \partial_\nu A_\mu)=0.$$ Not all of these equations are second order in time; just look at the $\nu=0$ component: $$\partial_t^2 A_0 - \nabla^2A_0 -\partial_t(\partial_t A_0 - \nabla\cdot\vec{A}) = 0 \\ \implies\partial_t\nabla\cdot\vec{A}-\nabla^2A_0 = 0.$$
This is basically the $\nabla\cdot \vec{E} = 0$ vacuum Maxwell equation (for example, in Coulomb gauge $\nabla\cdot\vec{A}=0$, so that $\nabla\cdot\vec{E} = -\nabla^2 A_0$). This is a constraint on your initial data, because you are not allowed to make an arbitrary choice for $(A_0, \vec{A})$ and $(\partial_t A_0, \partial_t \vec{A})$; rather, they need to satisfy this constraint. So this cuts down the number of initial conditions from 4 to 3. Then the gauge transformation $A_\mu \mapsto A_\mu + \partial_\mu \lambda$ allows you to remove another piece of initial data by imposing a gauge-fixing condition (e.g. $\nabla\cdot\vec{A}=0$). This brings us to 2 degrees of freedom.
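As a quick consistency check (a sketch in the notation above, with $\vec{A}$ standing for the components $A_i$ and $\lambda$ an arbitrary smooth function), the constraint itself is unchanged by the gauge transformation $A_0 \mapsto A_0 + \partial_t\lambda$, $\vec{A} \mapsto \vec{A} + \nabla\lambda$, because partial derivatives commute:

$$\begin{aligned}
\partial_t\nabla\cdot(\vec{A}+\nabla\lambda) - \nabla^2(A_0+\partial_t\lambda)
&= \partial_t\nabla\cdot\vec{A} - \nabla^2 A_0 + \left(\partial_t\nabla^2\lambda - \nabla^2\partial_t\lambda\right) \\
&= \partial_t\nabla\cdot\vec{A} - \nabla^2 A_0 .
\end{aligned}$$

So the constraint and the gauge choice can be imposed independently, and the count is $4-1-1=2$.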
For general relativity, you now have 4 gauge freedoms generated by diffeomorphisms described by a vector $\xi^\mu$. So applying the maxim, we should expect to cut down $4\times2=8$ degrees of freedom. In fact the Bianchi identity tells us where to look for the constraints. Let's expand it out a bit: $$0=\nabla_\mu G^{\mu\nu} = \partial_0 G^{0\nu}+\partial_i G^{i\nu} + \Gamma^\mu_{\mu\alpha}G^{\alpha \nu}+ \Gamma^{\nu}_{\mu\alpha}G^{\mu\alpha}.$$ This tells us that the first time derivative ($\partial_0$) of $G^{0\nu}$ is related to spatial derivatives of $G^{i\nu}$ as well as terms with no derivatives of $G^{\mu\alpha}$. The important thing here is that this is an identity, so it holds even if you don't impose the vacuum Einstein equations $G^{\mu\nu}=0$. The tensor $G^{\mu\nu}$ contains at most two derivatives of the metric. But if $G^{0\nu}$ contained second time derivatives, then $\partial_0 G^{0\nu}$ would contain third time derivatives of the metric, and there would be no way to satisfy the Bianchi identity, because no other term in the identity has three time derivatives acting on the metric. This means the equations $G^{0\nu}=0$ are not evolution equations; they involve at most first time derivatives of the metric, and thus are initial value constraints. So that kills 4 degrees of freedom, and you kill 4 more from gauge fixing. This is how you get the $10-4-4=2$ degrees of freedom in general relativity.
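The same bookkeeping can be written for a general spacetime dimension $D$. This is only a sketch, assuming the argument above carries over unchanged ($D$ diffeomorphism parameters, hence $D$ constraints and $D$ gauge conditions); the symbol $D$ is introduced here for illustration:

$$\underbrace{\frac{D(D+1)}{2}}_{\text{components of } g_{\mu\nu}} \;-\; \underbrace{D}_{\text{constraints}} \;-\; \underbrace{D}_{\text{gauge fixing}} \;=\; \frac{D(D-3)}{2}.$$

For $D=4$ this reproduces $10-4-4=2$, and for $D=3$ it gives $0$, consistent with three-dimensional gravity having no local propagating degrees of freedom.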
As for your second question: yes, general relativity describes the two degrees of freedom of a massless spin-2 particle.
It is interesting to look at the linearized version of gravity, with $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$.
If you choose the Lorenz gauge: $$\partial^\mu \bar h_{\mu\nu}=0 \quad\quad \bar h_{\mu\nu} = h_{\mu\nu} - \frac{1}{2} h^\alpha_\alpha \,\eta_{\mu\nu} \tag{0}$$ the equations of motion in vacuum are simply: $$\square \bar h_{\mu\nu}=0 \tag{1}$$
The Lorenz gauge kills $4$ degrees of freedom. Moreover, there is a residual gauge freedom compatible with the Lorenz gauge. Consider the transformation:
$$ h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu \xi_\nu+ \partial_\nu \xi_\mu \quad\quad \square \xi_\mu = 0\tag{2}$$ In terms of $\bar h_{\mu\nu}$, this gives:
$$\bar h_{\mu\nu} \to \bar h_{\mu\nu} + \partial_\mu \xi_\nu+ \partial_\nu \xi_\mu - (\partial^\alpha \xi_\alpha) \eta_{\mu\nu} \tag{3}$$
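To see where the last term in (3) comes from (a short worked step, with $h \equiv \eta^{\alpha\beta}h_{\alpha\beta}$ denoting the full four-dimensional trace), note that under (2) the trace shifts by twice the divergence of $\xi$:

$$\begin{aligned}
h &\to h + 2\,\partial^\alpha\xi_\alpha, \\
\bar h_{\mu\nu} &\to h_{\mu\nu} + \partial_\mu\xi_\nu + \partial_\nu\xi_\mu - \tfrac{1}{2}\left(h + 2\,\partial^\alpha\xi_\alpha\right)\eta_{\mu\nu} \\
&= \bar h_{\mu\nu} + \partial_\mu\xi_\nu + \partial_\nu\xi_\mu - (\partial^\alpha\xi_\alpha)\,\eta_{\mu\nu}.
\end{aligned}$$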
It is easy to see that this transformation is compatible with the Lorenz gauge, and the four functions $\xi_\mu$ are otherwise free (subject only to $\square \xi_\mu = 0$), so it kills $4$ more degrees of freedom.
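The compatibility claim can be checked in a couple of lines (a sketch using only (0), (2) and (3), with $\partial^\alpha\xi_\alpha$ the four-dimensional divergence). Taking the divergence of the transformed $\bar h_{\mu\nu}$:

$$\begin{aligned}
\partial^\mu \bar h_{\mu\nu} &\to \partial^\mu \bar h_{\mu\nu} + \square\xi_\nu + \partial_\nu(\partial^\mu\xi_\mu) - \partial_\nu(\partial^\alpha\xi_\alpha) \\
&= \partial^\mu \bar h_{\mu\nu} + \square\xi_\nu .
\end{aligned}$$

The gauge condition in (0) is therefore preserved exactly when $\square\xi_\nu = 0$, which is the restriction stated in (2).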
Finally, you will get $10-4-4=2$ degrees of freedom.