Approximating the normal CDF for $0 \leq x \leq 7$
This isn't a definitive answer, but hopefully points you in the right direction. The short of it is, I don't know how the coefficients were derived, but it is likely related to some Pade, Remez, interpolating, or least-squares approximant. Read on for details.
The function we're talking about is $$ Q(x) = \frac{1}{\sqrt{2\pi}}\int_{x}^\infty e^{-t^2/2}\;dt = \frac{1}{2}\operatorname{erfc}\left(\frac{x}{\sqrt{2}}\right), $$ for $x>0$, where $\operatorname{erfc}$ is the complimentary error function.
The link you provide gives code for an approximation of the following form, slightly different than how it was stated in the original question (note the lack of normalization by $\sqrt{2\pi}$ and a different form of the polynomials). $$ \tilde{Q}(x) = e^{\frac{-x^2}{2}} \frac{(((((N_6 z+N_5)z+N_4)z+N_3)z+N_2)z+N_1)z+N_0}{((((((M_7 z+M_6)z+M_5)z+M_4)z+M_3)z+M_2)z+M_1)Z+M_0} $$ Why do we pick this form? Well, asymptotically, the $Q$ function is well approximated by $\frac{e^{-x^2/2}}{x\sqrt{2\pi}}$ as $x\rightarrow\infty$. See here for details on that. This justifies factoring out the asymptotic behavior and looking for a rational approximation of what remains. If we factored out the asymptotic form, we could approximate what remains with something that goes to $1$ for large $x$, i.e., something with equal order in the numerator and denominator. If we leave that extra factor of $x$ in the denominator from the asymptotic piece, we can account for it by absorbing it into the denominator of the rational function, giving one higher degree in the denominator polynomial. The same goes for the factor of $\sqrt{2\pi}$, it gets absorbed in to the approximation.
Now, how on earth did Hart come up with those $N_i$ and the $M_j$ coefficients? I thought it was by requiring that the Taylor series of $\tilde{Q}(x)e^{x^2/2}$ matches that of $Q(x)e^{x^2/2}$ at some a point, up to a high enough order to uniquely determine all the coefficients. This is the Pade approximation technique, and you should be able to find more about the error bounds for approximations resulting from this technique. I used Mathematica to calculate these below.
I thought that surely, if you calculate that Pade approximant around $x=0$, and scale the coefficients to match this weird $M_0=440.413735824752$ (Pade approximants are usually normalized so that the constant term in the denominator is $1$) you'd get the formula. Instead, I get:
$$ \begin{array}{ccc} i & N_i & M_i\\ 0 & 220.207 & 440.414\\ 1 & 152.909 & 711.438\\ 2 & 95.568 & 525.241 \\ 3 & 23.168 & 234.066 \\ 4 & 4.6336 & 61.9813\\ 5 & 0.34752 & 10.2661 \\ 6 & 0.154453 & 1.04799 \\ 7 & {} & 0.048886 \end{array} $$
It's a similar pattern to Hart's numbers, and the values are all in the same ballpark as Hart's numbers, but it is not a match. Then I thought that there is this weird breakpoint in the code that appears to be $\frac{10}{\sqrt{2}}\approx 7.07$. This lead me down the path of looking to see if the Pade approximant near $\frac{10}{\sqrt{2}}$ or at the halfway point of the interval, $\frac{10}{2\sqrt{2}}$, appropriately scaled, gave Hart's original coefficients. They don't.
So here are some possibilities.
- Hart's coefficients are for a Pade approximant around SOME point on the interval $[0,10/\sqrt{2}]$, and I don't have the patience to find which point.
- Hart used some other method to generate these coefficients, and they don't correspond to a Pade approximant at all. One candidate is the Remez algorithm for rational functions, which should guarantee the error oscillates equally across the domain of approximation. Another candidate is that the coefficients are derived from an interpolation between some fixed points with known values of the domain. Yet another is a least-squares fit between the rational approximant a a bunch of known values of the function.
This has been a long winded answer, and boils down to "I don't know". But I hope the techniques I mentioned can be fruitful avenues for you to continue looking into this. I would finally suggest getting the original "Algorithm 5666 for the error function" and seeing if there is an explanation of the methodology in book in which it appears.