Distance between distributions and distance of moments

The natural thing to compare in this context seems to be the moment generating functions of $X_n$ and $X$. In particular, consider: \begin{align*} \left| \mathbf{E} \exp(t X_n) - \mathbf{E} \exp(t X) \right| &= \left| \sum_{k=0}^{\infty} \frac{t^k}{k!} \left( \mathbf{E} X_n^k - \mathbf{E} X^k \right) \right| \\ &\le \frac{1}{\sqrt{n}} \sum_{k=0}^{\infty} \frac{|t|^k}{k!} \, (k/2)! = \frac{1}{\sqrt{n}} \left( 1 + \frac{\sqrt{\pi}}{2} \, |t| \, e^{t^2/4} \left( 1 + \operatorname{erf}(|t|/2) \right) \right), \end{align*} where $(k/2)!$ is shorthand for $\Gamma(k/2+1)$ and the inequality uses the moment bound in the OP's hypothesis.
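For what it's worth, the closed form comes from splitting the series into odd and even indices; with $(k/2)! = \Gamma(k/2+1)$ one gets \begin{align*} \sum_{m=0}^{\infty} \frac{|t|^{2m+1}}{(2m+1)!} \, \Gamma(m+\tfrac{3}{2}) &= \frac{\sqrt{\pi}}{2} \, |t| \sum_{m=0}^{\infty} \frac{(t^2/4)^m}{m!} = \frac{\sqrt{\pi}}{2} \, |t| \, e^{t^2/4}, \\ \sum_{m=0}^{\infty} \frac{|t|^{2m}}{(2m)!} \, m! &= \sum_{m=0}^{\infty} \frac{(t^2/2)^m}{(2m-1)!!} = 1 + \frac{\sqrt{\pi}}{2} \, |t| \, e^{t^2/4} \operatorname{erf}(|t|/2), \end{align*} the second sum being the classical series for $x \mapsto \frac{d}{dx}\left( e^{x^2/2} \int_0^x e^{-u^2/2}\,du \right)$ evaluated at $x = |t|/\sqrt{2}$.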

Given this bound on their moment generating functions (or a similar bound on the characteristic function), to what extent do the laws of $X_n$ and $X$ agree? There seems to be an interesting discussion about this in the statistics literature.

  • McCullagh, Peter. "Does the moment-generating function characterize a distribution?" The American Statistician 48.3 (1994): 208.
  • Waller, Lance A. "Does the characteristic function numerically distinguish distributions?" The American Statistician 49.2 (1995): 150-152.
  • Luceño, Alberto. "Further evidence supporting the numerical usefulness of characteristic functions." The American Statistician 51.3 (1997): 233-234.

The classical approach to this is called "smoothing" and is described in Chapter XVI of Feller's An Introduction to Probability Theory and Its Applications, Vol. 2. Basically you use the bounds on the moment differences to bound the difference between the characteristic functions near the origin, and then tune a truncation parameter (called $T$ by Feller) to get the best possible rate.
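To make this concrete (and up to the precise constant, which I am quoting from memory), one standard form of the smoothing inequality is: for any $T > 0$, $$ \sup_x \left| F_n(x) - F(x) \right| \;\le\; \frac{1}{\pi} \int_{-T}^{T} \left| \frac{\varphi_n(t) - \varphi(t)}{t} \right| dt \;+\; \frac{24\, m}{\pi T}, $$ where $F_n, F$ are the distribution functions, $\varphi_n, \varphi$ the characteristic functions, and $m$ is a bound on the density of $F$ (e.g. $m = 1/\sqrt{2\pi}$ if $X$ is standard normal). The moment bounds control the integrand for $|t| \le T$, and one then lets $T$ grow with $n$ so that the two terms balance.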