Why do Todd classes appear in the Grothendieck-Riemann-Roch formula?
You look at the case when $X=D$ is a Cartier divisor on $Y$, so that the relative tangent bundle -- as an element of the K-group -- is minus the normal bundle $\mathcal N_{D/Y}=\mathcal O_D(D)$ (conveniently a line bundle, so its first Chern class is its only Chern root), and $\mathcal F=\mathcal O_D$. And the Todd class pops out right away.
Indeed, from the exact sequence $0\to \mathcal O_Y(-D) \to \mathcal O_Y\to \mathcal O_D\to 0$, you get that $$ch(f_! \mathcal O_D)= ch(\mathcal O_Y) - ch(\mathcal O_Y(-D)) = 1- e^{-D}.$$ And you need to compare this with the pushforward of $[D]$ in the Chow group, which is $D$. The ratio $$ \frac{D}{1-e^{-D}} = Td( \mathcal O(D) )$$ is what you are after. Now you have just discovered the Todd class. I suspect that this is how Grothendieck discovered his formula, too -- after seeing that this case fits with Hirzebruch's formula, that the same Todd class appears in both cases.
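To see concretely what the ratio $D/(1-e^{-D})$ looks like, here is a minimal Python sketch that expands $x/(1-e^{-x})$ as a power series with exact rational coefficients. The function name `todd_series` is made up for this illustration; the only input is the Taylor series of the exponential.

```python
from fractions import Fraction
from math import factorial


def todd_series(order):
    """Coefficients c_0..c_order of x / (1 - exp(-x)) as exact fractions."""
    # (1 - exp(-x)) / x = sum_{k >= 0} (-1)^k x^k / (k + 1)!
    denom = [Fraction((-1) ** k, factorial(k + 1)) for k in range(order + 1)]
    # Invert the series degree by degree, using denom * inv = 1.
    inv = [Fraction(0)] * (order + 1)
    inv[0] = 1 / denom[0]
    for n in range(1, order + 1):
        s = sum(denom[k] * inv[n - k] for k in range(1, n + 1))
        inv[n] = -s / denom[0]
    return inv


# Low-order terms: 1 + x/2 + x^2/12 + 0*x^3 - x^4/720
print(todd_series(4))
```

The first coefficients, $1 + \tfrac{1}{2}D + \tfrac{1}{12}D^2 - \tfrac{1}{720}D^4 + \cdots$, are the familiar low-degree terms of the Todd class of a line bundle with first Chern class $D$.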
Yes, you are right! You can in fact prove that the Todd class is the only cohomology class satisfying a GRR-type formula.
Indeed, assume that for any smooth quasi-projective variety $X$, you have an invertible cohomology class $\alpha(X)$ satisfying that:
(i) for any proper morphism $f \colon X \rightarrow Y$ between smooth quasi-projective varieties and for any bounded complex $\mathcal{F}$ of coherent sheaves on $X$, $f_{*}(ch(\mathcal{F})\alpha(X))=ch(f_{!}\mathcal{F})\alpha(Y)$.
(ii) for any $X$ and $Y$, $\alpha(X \times Y)=pr_1^*\alpha(X) \otimes pr_2^*\alpha(Y) $ (this is a kind of base change compatibility condition).
Then for any $X$, $\alpha(X)$ is the Todd class of $X$. In fact, it is sufficient to know (i) for closed immersions and (ii) for $X = Y$.
Here is a quick proof:
1-First you prove GRR (with Todd classes) for arbitrary closed immersions. This is done in two steps:
(a) $Y$ is a vector bundle $E$ over $X$ and $f$ is the immersion of $X$ in $Y$ as the zero section. Then $\mathcal{O}_X$ admits a natural locally free resolution on $Y$, the Koszul resolution, built from the exterior powers of the pullback of $E^*$. A direct computation with the Chern roots $x_i$ of $E$ gives $ch(f_!\mathcal{O}_X)=\sum_p (-1)^p ch(\Lambda^p E^*)=\prod (1-e^{-x_i})=c_{top}(E)\, td(E)^{-1}$ (pulled back to $Y$), so the Todd class of the normal bundle $N_{X/Y}=E$ appears, through its inverse. Thus the Todd class pops out of this computation just like in the divisor case.
(b) For an arbitrary closed immersion $f \colon X \rightarrow Y$, a standard deformation technique (which is called deformation to the normal cone) allows one to deform $f$ to the immersion of $X$ in its normal bundle in $Y$, and then to use part (a).
2-Then you compare the two GRR formulas you have for the diagonal injection $\delta$ of $X$ in $X \times X$: the one with Todd classes and the one with alpha classes. It gives you the identity $\delta_* (td(X) \delta^* td(X \times X)^{-1}) = \delta_* (\alpha(X) \delta^* \alpha(X \times X)^{-1})$, so that $\delta_* td(X)^{-1} = \delta_* \alpha(X)^{-1}$. Then you get $\alpha(X)=td(X)$ by applying $pr_{1*}$, since $pr_1 \circ \delta$ is the identity of $X$.
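Spelled out, step 2 can be sketched as follows. This assumes the projection formula $\delta_*(a\, \delta^* b)=\delta_*(a)\, b$ and the identity $\delta^* td(X \times X)=td(X)^2$, which comes from $\delta^* T_{X \times X}=T_X \oplus T_X$; the analogous identity $\delta^* \alpha(X \times X)=\alpha(X)^2$ is condition (ii) for $X=Y$. Applying GRR for $\delta$ to $\mathcal{F}=\mathcal{O}_X$ gives
$$ch(\delta_! \mathcal{O}_X) = \delta_*(td(X))\, td(X \times X)^{-1} = \delta_*\left(td(X)\, \delta^* td(X \times X)^{-1}\right) = \delta_*\left(td(X)^{-1}\right),$$
and the same computation with $\alpha$ in place of $td$ gives $ch(\delta_! \mathcal{O}_X) = \delta_*\left(\alpha(X)^{-1}\right)$. Since $pr_1 \circ \delta = \mathrm{id}_X$,
$$td(X)^{-1} = pr_{1*}\delta_*\left(td(X)^{-1}\right) = pr_{1*}\delta_*\left(\alpha(X)^{-1}\right) = \alpha(X)^{-1}.$$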
Here is another sanity check. Consider the map $f$ from $X$ to a point. The (topological) Euler characteristic of $X$ is $$\sum (-1)^k h^k(X) = \sum (-1)^{p+q} h^{pq}(X) = \sum (-1)^{p+q} h^q(X, \Omega^p) = \sum (-1)^p f_{!} \Omega^p.$$
If the Chern roots of the tangent bundle are $r_1$, $r_2$, ... $r_n$, then the Chern character of $\Omega^p$ is $e_p(e^{-r_1}, e^{-r_2}, \ldots, e^{-r_n})$, where $e_p$ is the $p$-th elementary symmetric function.
So, writing $t_f$ for the as yet unknown correction class in the Riemann-Roch formula, the Euler characteristic of $X$ is $$\sum (-1)^p f_* \left( e_p(e^{-r_1}, e^{-r_2}, \ldots, e^{-r_n}) t_f \right) = f_* \left( \prod(1-e^{-r_i}) t_f \right).$$
Now, what class $u$ has the property that $f_*(u)$ is the Euler characteristic of $X$? The top Chern class of $T_X$, in other words, $\prod r_i$. So it seems very likely that we should have $$\prod(1-e^{-r_i}) t_f= \prod r_i$$ and $$t_f = \prod \frac{r_i}{1-e^{-r_i}}.$$
That's not a proof, because there are other classes with the same pushforward and because $\prod(1-e^{-r_i})$ is a zero-divisor, but I find it persuasive.
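As a small numerical illustration of the two pushforwards involved, here is a Python sketch for $X=\mathbb{P}^n$. It relies on standard facts that are not part of the argument above and are assumed here: the Euler sequence gives $td(T_X)=\left(h/(1-e^{-h})\right)^{n+1}$ and $c(T_X)=(1+h)^{n+1}$ with $h$ the hyperplane class, and $f_*(h^n)=1$. All function names are made up for this illustration.

```python
from fractions import Fraction
from math import factorial


def series_mul(a, b, order):
    """Product of two power series, truncated after degree `order`."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                out[i + j] += ai * bj
    return out


def series_power(a, m, order):
    """m-th power of a power series, truncated after degree `order`."""
    out = [Fraction(1)] + [Fraction(0)] * order
    for _ in range(m):
        out = series_mul(out, a, order)
    return out


def todd_series(order):
    """Coefficients of x / (1 - exp(-x)) up to degree `order`."""
    denom = [Fraction((-1) ** k, factorial(k + 1)) for k in range(order + 1)]
    inv = [Fraction(0)] * (order + 1)
    inv[0] = 1 / denom[0]
    for n in range(1, order + 1):
        inv[n] = -sum(denom[k] * inv[n - k] for k in range(1, n + 1)) / denom[0]
    return inv


def projective_space_check(n):
    """Return (f_* td(T), f_* c_n(T)) for P^n with n >= 1, under the assumptions above."""
    td = series_power(todd_series(n), n + 1, n)
    one_plus_h = [Fraction(1), Fraction(1)] + [Fraction(0)] * (n - 1)
    chern = series_power(one_plus_h, n + 1, n)
    # Pushing forward to a point picks out the coefficient of h^n.
    return td[n], chern[n]


for n in range(1, 5):
    chi_structure_sheaf, euler_characteristic = projective_space_check(n)
    print(n, chi_structure_sheaf, euler_characteristic)
```

For each $n$ the first value is $1$, the holomorphic Euler characteristic of $\mathcal{O}_{\mathbb{P}^n}$, and the second is $n+1$, the topological Euler characteristic of $\mathbb{P}^n$. This is only a consistency check on one family of examples, not a substitute for the argument.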