Proportion of polynomials of a fixed degree with a certain number of real roots

This is a bit too long to be a comment. Let me phrase this probabilistically by saying that the coefficients $c_i$ are independent (discrete) random variables uniformly distributed on the finite set $$ \big\{\, -M,-(M-1),\dotsc, -1,0, 1, \dotsc, (M-1), M\,\big\}. $$ Denote by $N_d(f)$ the number of real zeros of the random degree $d$ polynomial $f$ whose coefficients $c_i$ are random variables as above. Your question asks for the probability distribution of the random variable $N_d(f)$. This is a very difficult question for fixed $d$. However, for fixed $M$ and $d\to\infty$ some nontrivial information is available.

A highly nontrivial result of Ibragimov and Maslova

The mean number of real zeros of random polynomials. I. Coefficients with zero mean. (Russian) Teor. Verojatnost. i Primenen. 16 1971 229–248.

generalizing earlier work of Kac, Erdős-Offord and D. C. Stevens shows that the expectation of $N_d(f)$ satisfies the asymptotic estimate $\newcommand{\bE}{\mathbb{E}}$

$$ \mu_d:=\bE\big[ N_d(f)\big]\sim \frac{2}{\pi}\log d\;\;\mbox{as $d\to\infty$}. $$

Recently (see this paper) this result was improved to $$ \mu_d:=\bE\big[ N_d(f)\big]= \frac{2}{\pi}\log d+O(1)\;\;\mbox{as $d\to\infty$}. $$

Remarkably, the variance of the random variable $N_d$ is about the same size. A result of N. B. Maslova

The variance of the number of real roots of random polynomials, (Russian) Teor. Verojatnost. i Primenen. 19 (1974), 36–51.

shows that $\DeclareMathOperator{\var}{var}$ the variance of $N_d(f)$ satisfies the asymptotic estimate $$ \sigma^2_d:=\var\big[ N_d(f)\big]\sim \frac{4}{\pi}\left(1-\frac{2}{\pi}\right)\log d\;\;\mbox{as $d\to\infty$}. $$
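For small degrees one can get a feel for these asymptotics by simulation. The following is only a sketch, under assumptions of my own: the helper names `count_real_roots` and `simulate` are hypothetical, real roots are counted without multiplicity (via Sturm's theorem in exact rational arithmetic), and a zero leading coefficient is allowed, so the sampled polynomial may have degree less than $d$. Since $\log d$ grows very slowly, the empirical values at small $d$ should not be expected to match the leading terms closely.

```python
from fractions import Fraction
import math
import random


def trim(p):
    """Drop leading zero coefficients (coefficients are highest degree first)."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]


def poly_rem(a, b):
    """Remainder of a divided by b (coefficient lists, highest degree first)."""
    a = list(a)
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a = trim(a[1:])  # the leading term has just been cancelled exactly
    return a


def sign_changes(signs):
    signs = [s for s in signs if s != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_real_roots(coeffs):
    """Number of distinct real roots, by Sturm's theorem on (-inf, +inf)."""
    p = trim([Fraction(c) for c in coeffs])
    if len(p) <= 1:
        return 0  # constant (or zero) polynomial: counted as having no roots
    deg = len(p) - 1
    dp = [c * (deg - i) for i, c in enumerate(p[:-1])]  # derivative
    chain = [p, dp]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            break
        chain.append([-c for c in r])
    at_plus = [1 if q[0] > 0 else -1 for q in chain]
    at_minus = [s if (len(q) - 1) % 2 == 0 else -s
                for s, q in zip(at_plus, chain)]
    return sign_changes(at_minus) - sign_changes(at_plus)


def simulate(d, M, trials, seed=0):
    """Empirical mean and variance of the number of real roots."""
    rng = random.Random(seed)
    counts = [count_real_roots([rng.randint(-M, M) for _ in range(d + 1)])
              for _ in range(trials)]
    mean = sum(counts) / trials
    var = sum((c - mean) ** 2 for c in counts) / trials
    return mean, var


mean, var = simulate(d=16, M=5, trials=200)
print("empirical mean:", mean, " empirical variance:", var)
print("(2/pi) log d  :", 2 / math.pi * math.log(16))
```

Exact arithmetic keeps the count reliable for polynomials with repeated or nearly repeated roots, at the cost of speed; for large $d$ a floating-point root finder would be faster but needs a tolerance to decide which roots are real.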

This suggests that the typical polynomial in your family does not have too many real roots compared to its degree: on the order of $\log d$ of them, out of $d$ complex roots in total. Finally, another result of N. B. Maslova

The distribution of the number of real roots of random polynomials, (Russian. English summary) Teor. Verojatnost. i Primenen. 19 (1974),488–500.

shows that the normalized random variable $$ Z_d=\frac{1}{\sigma_d}\Big( N_d(f)-\mu_d\Big) $$ converges in distribution to a standard normal random variable. You can use this result to estimate the probability that the number of zeros of $f$ lies in an interval of the form $$ [\mu_d+a\sigma_d, \mu_d+b\sigma_d], \;\; a, b\in\mathbb{R}, \;\; a<b. $$
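Concretely, the limiting probability of such an interval is $\Phi(b)-\Phi(a)$, where $\Phi$ is the standard normal distribution function. A minimal sketch of that computation follows; the function names are my own, and since the theorem is only a statement about the limit $d\to\infty$ (and $N_d(f)$ is integer-valued while $\log d$ grows slowly), the approximation may be rough for moderate $d$.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def limiting_interval_probability(a, b):
    """Limit of P(mu_d + a*sigma_d <= N_d <= mu_d + b*sigma_d) as d -> infinity."""
    if a >= b:
        raise ValueError("need a < b")
    return normal_cdf(b) - normal_cdf(a)


print(round(limiting_interval_probability(-1.0, 1.0), 4))  # 0.6827
print(round(limiting_interval_probability(-2.0, 2.0), 4))  # 0.9545
```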


The quadratic case can be dealt with as follows. A quadratic polynomial $f(x) = ax^2 + bx + c \in \mathbb{Z}[x]$ has two distinct real roots if and only if $\Delta(f) = b^2 - 4ac > 0$, and a pair of complex conjugate roots if and only if $\Delta(f) < 0$.

We now let $a,b,c$ vary in the box $[-X,X]^3$. We first pick a pair $(a,c) \in \mathbb{Z}^2 \cap [-X,X]^2$. If $ac < 0$, that is, if the pair $(a,c)$ lies in two of four quadrants, then $\Delta(f) > 0$; hence 100% of quadratic polynomials with $a,c$ coming from those two quadrants have two real roots. The remaining two quadrants are symmetric to each other, so we might as well consider only the positive quadrant. We can exploit symmetry once more to assume that $a \leq c$, and from density considerations we can carve out the 0-density sets corresponding to $a = 0$ and $a =c$; whence we assume $0 < a < c$.

The count of triples $(a,b,c)$ satisfying $0 < a < c \leq X$, $|b| \leq X$ and $\Delta(f) < 0$ can be estimated by the triple integral

$$\displaystyle \int_1^X \int_1^c \int_{-\min\{X,\,2\sqrt{ac}\}}^{\min\{X,\,2 \sqrt{ac}\}} db\, da\, dc = \frac{31 - 6\log 2}{36} X^3 + O(X^2),$$

where the truncation at $|b| \leq X$ is needed because $2\sqrt{ac}$ exceeds $X$ as soon as $4ac > X^2$.

Multiplying by 4 to account for the assumptions that $a \leq c$ and $a,c > 0$ (and using standard geometry of numbers arguments), we see that the total number of negative discriminant quadratic polynomials of height at most $X$ is

$$\displaystyle N^-(X) = \frac{31 - 6\log 2}{9} X^3 + O(X^2) \approx 2.98\, X^3.$$

The number of positive discriminant polynomials is then

$$\displaystyle N^{+}(X) = 8X^3 - \frac{31 - 6\log 2}{9} X^3 + O(X^2) = \frac{41 + 6\log 2}{9} X^3 + O(X^2),$$

so about $62.7\%$ of the quadratic polynomials in the box have two distinct real roots.
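A brute-force count over the box is a cheap sanity check on the quadratic case. The sketch below (the function name is hypothetical) counts all integer triples in $[-X,X]^3$ by the sign of $b^2-4ac$, including the density-zero triples with $a=0$, and compares the proportion with positive discriminant to $(41+6\log 2)/72\approx 0.627$, the known probability that $ax^2+bx+c$ has real roots when $a,b,c$ are independent and uniform on $[-1,1]$.

```python
import math


def count_by_discriminant(X):
    """Count integer triples (a, b, c) in [-X, X]^3 by the sign of b^2 - 4ac."""
    pos = neg = zero = 0
    for a in range(-X, X + 1):
        for c in range(-X, X + 1):
            four_ac = 4 * a * c
            for b in range(-X, X + 1):
                disc = b * b - four_ac
                if disc > 0:
                    pos += 1
                elif disc < 0:
                    neg += 1
                else:
                    zero += 1
    return pos, neg, zero


X = 30
pos, neg, zero = count_by_discriminant(X)
total = (2 * X + 1) ** 3
print("positive discriminant proportion:", pos / total)
print("negative discriminant proportion:", neg / total)
print("continuous real-root probability:", (41 + 6 * math.log(2)) / 72)
```

The discrete proportions differ from the continuous value by a term of relative size $O(1/X)$, so agreement at $X=30$ is only approximate.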

One can do a similar (but much more difficult) argument for cubic polynomials (binary forms), by exploiting the fact that for a cubic binary form $g(x,y) \in \mathbb{Z}[x,y]$, its Hessian covariant $q_g(x,y)$ (which is a quadratic form) has discriminant $-3\Delta(g)$; and hence the problem of counting cubic binary forms with three or one real linear factors is reduced to dealing with the Hessians. However, the inequalities involved are no longer linear in general, and hence the application of geometry of numbers methods will be more complicated. Cremona also worked out the exact conditions for quartic polynomials to have 0, 2, or 4 real roots in https://homepages.warwick.ac.uk/~masgaj/papers/r34jcm.pdf
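The relation between the Hessian and the discriminant is easy to check numerically. The sketch below assumes the normalization $q_g(x,y) = (b^2-3ac)x^2 + (bc-9ad)xy + (c^2-3bd)y^2$ for $g = ax^3+bx^2y+cxy^2+dy^3$ (other normalizations differ by a constant factor); the function names are my own.

```python
import random


def cubic_discriminant(a, b, c, d):
    """Discriminant of a x^3 + b x^2 y + c x y^2 + d y^3."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)


def hessian(a, b, c, d):
    """Coefficients (A, B, C) of the Hessian covariant A x^2 + B x y + C y^2."""
    return (b * b - 3 * a * c, b * c - 9 * a * d, c * c - 3 * b * d)


def quadratic_discriminant(A, B, C):
    return B * B - 4 * A * C


rng = random.Random(0)
for _ in range(1000):
    g = [rng.randint(-50, 50) for _ in range(4)]
    assert quadratic_discriminant(*hessian(*g)) == -3 * cubic_discriminant(*g)

# x^3 - x y^2 has three real linear factors: positive discriminant,
# hence a Hessian of negative discriminant (a definite quadratic form).
print(cubic_discriminant(1, 0, -1, 0), quadratic_discriminant(*hessian(1, 0, -1, 0)))
```

So a cubic form with three real linear factors ($\Delta(g) > 0$) has a definite Hessian, and one with a single real linear factor ($\Delta(g) < 0$) has an indefinite Hessian.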

I suspect the methods I used above become intractable very quickly with respect to the degree, so perhaps a different formulation is necessary to make progress.

Addendum: I should add that the answer to the question is known for degrees 3 and 4 if instead one counts $\operatorname{GL}_2(\mathbb{Z})$-classes of binary forms (of degrees 3 and 4 respectively) with respect to an appropriate $\operatorname{GL}_2(\mathbb{Z})$-invariant height. In particular, when $d = 3$ and we put the height as the discriminant, we have

\begin{align*} N_3(X)& = \# \big(\{F = a_3 x^3 + a_2 x^2 y + a_1 xy^2 + a_0 y^3 \in \mathbb{Z}[x,y]: 0 < |\Delta(F)| \leq X\}/\operatorname{GL}_2(\mathbb{Z})\big) \\ & = \frac{\pi^2}{18} X + O(X^{5/6}), \end{align*} and $N_3^{\pm}(X)$ (which counts the classes of forms of bounded positive/negative discriminant, respectively) is given by

$$\displaystyle N_3^+(X) = \frac{\pi^2}{72} X + O(X^{5/6}), \qquad N_3^-(X) = \frac{\pi^2}{24} X + O(X^{5/6}).$$

The main term was first obtained by Davenport, and the error term as given was obtained by Shintani. Taniguchi and Thorne and Bhargava-Shankar-Tsimerman independently obtained a secondary main term of order $X^{5/6}$. If one includes this secondary term, then the error term is $O(X^{3/4+\varepsilon})$.

For the degree 4 case, we take the height to be $H(F) = \max\{|I(F)|^3, J(F)^2/4\}$, as in Bhargava-Shankar (http://annals.math.princeton.edu/2015/181-1/p03), and write $N_4^{(0)}(X), N_4^{(1)}(X), N_4^{(2)}(X)$ for the number of $\operatorname{GL}_2(\mathbb{Z})$-classes of integral quartic forms of height at most $X$ with 0, 1, and 2 pairs of complex conjugate linear factors, respectively. They showed that

$$\displaystyle N_4^{(0)}(X) = \frac{4 \zeta(2)}{135} X^{5/6} + O_\varepsilon(X^{3/4 + \varepsilon}),$$ $$\displaystyle N_4^{(1)}(X) = \frac{32 \zeta(2)}{135} X^{5/6} + O_\varepsilon(X^{3/4 + \varepsilon}),$$ and $$\displaystyle N_4^{(2)}(X) = \frac{8 \zeta(2)}{135} X^{5/6} + O_\varepsilon(X^{3/4 + \varepsilon}).$$
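In terms of proportions, the common factors ($\pi^2$ for cubics, $\zeta(2)$ for quartics) cancel, so the limiting proportions of classes are rational numbers determined by the leading constants quoted in this answer. A small sketch of the arithmetic, with dictionary labels of my own choosing:

```python
from fractions import Fraction


def proportions(leading_constants):
    """Normalize leading constants so that they sum to 1."""
    total = sum(leading_constants.values())
    return {label: value / total for label, value in leading_constants.items()}


# Cubic forms ordered by discriminant (constants are multiples of pi^2).
cubic = proportions({
    "3 real factors (disc > 0)": Fraction(1, 72),
    "1 real factor (disc < 0)": Fraction(1, 24),
})

# Quartic forms ordered by the height H (constants are multiples of zeta(2)).
quartic = proportions({
    "4 real factors": Fraction(4, 135),
    "2 real factors": Fraction(32, 135),
    "0 real factors": Fraction(8, 135),
})

for label, p in {**cubic, **quartic}.items():
    print(label, p)
```

These are proportions of $\operatorname{GL}_2(\mathbb{Z})$-classes ordered by the invariant heights above, so they need not agree with the proportions obtained by ordering polynomials by the size of their coefficients.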

Counting with respect to a $\operatorname{GL}_2(\mathbb{Z})$-invariant height is likely a more natural problem, and possibly an easier one, than the original question.


Liviu's answer is very informative, though it answers a question orthogonal to the OP's. I believe (Liviu can correct me if I am wrong) that the results don't actually depend on the coefficients being discrete, and that qualitatively the results are not much different when the coefficients are centered, uniformly distributed real random numbers (experiment certainly bears this out). If so, then the obvious rescaling shows that the quantity is essentially independent of $M$ (the error terms, which essentially address discretization errors, will of course depend on $M$, but I am guessing that pinning them down is quite hard).