Why is the Chebyshev function relevant to the Prime Number Theorem?

There are several ideas here, some mentioned in the other answers:

One: When Gauss was a boy (judging by the dates on his notes, he was about 16), he noticed that the primes appear with density $\frac{1}{\log x}$ around $x$. So, instead of counting primes with the function $\pi(x)$, let's weight each prime by the reciprocal of that density and look at $\sum_{p\leq x} \log p$. Since each prime near $t$ is weighted by $\log t$, the reciprocal of what we think the density is, we expect the sum to be asymptotic to $x$.
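As a rough numerical illustration of this heuristic (a sketch only; the helper names `primes_up_to` and `theta` are my own and not from any library), one can compare $\pi(x)\log x / x$ and $\sum_{p\le x}\log p \,/\, x$ for a few values of $x$. Both ratios should drift toward $1$, the second one noticeably faster.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out i*i, i*i + i, ... (same length on both sides).
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def theta(x):
    """Sum of log p over primes p <= x (the log-weighted prime count)."""
    return sum(math.log(p) for p in primes_up_to(x))

for x in (10**3, 10**4, 10**5, 10**6):
    pi_x = len(primes_up_to(x))
    # Columns: x, pi(x) * log(x) / x, theta(x) / x
    print(x, pi_x * math.log(x) / x, theta(x) / x)
```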

Two: Differentiation of Dirichlet series. If $$ A(s)=\sum_{n=1}^{\infty} a_{n} n^{-s} $$ then $$ (A(s))'=-\sum_{n=1}^{\infty} a_{n} \log(n)\, n^{-s}.$$ The $\log$ term appears naturally in the differentiation of Dirichlet series. Taking the convolution of $\log n$ with the Möbius function (that is, multiplying $-\zeta'(s)$ by $\frac{1}{\zeta(s)}$) then gives the von Mangoldt function $\Lambda(n)$ mentioned in the other answers. The $\mu$ function is really the special thing here, not the logarithm.
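The identity $\Lambda = \mu * \log$ is easy to check by machine for small $n$. The following is a minimal sketch; the function names `mobius_table`, `von_mangoldt` and `mu_conv_log` are hypothetical helpers written for this check, and the convolution is computed by naive divisor enumeration.

```python
import math

def mobius_table(n):
    """Return a list mu with mu[k] = Moebius function of k for 1 <= k <= n."""
    mu = [1] * (n + 1)
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more prime factor: flip the sign
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # no divisor up to sqrt(n), so n is prime

def mu_conv_log(n, mu):
    """(mu * log)(n) = sum over divisors d of n of mu(d) * log(n / d)."""
    return sum(mu[d] * math.log(n // d) for d in range(1, n + 1) if n % d == 0)

mu = mobius_table(12)
for n in range(1, 13):
    # The last two columns should agree up to rounding.
    print(n, round(mu_conv_log(n, mu), 6), round(von_mangoldt(n), 6))
```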

Expanding on this, there are other weightings besides $\log p$ which arise naturally from taking derivatives. For instance, we can look at $\zeta''(s)=\sum_{n=1}^\infty (\log n)^2 n^{-s}$ and then multiply by $\frac{1}{\zeta(s)}$ as before. This leads us to examine the sum $$\sum_{n\leq x} (\mu*\log^2 )(n)$$ (here $*$ is Dirichlet convolution). By studying this sum, Selberg was able to prove his famous identity, which was at the center of the first elementary proof of the prime number theorem:

$$\sum_{p \leq x} \log^2 p +\sum_{pq\leq x}(\log p)(\log q) =2x\log x +O(x).$$
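To get a feel for Selberg's identity, here is a small numerical sketch. It assumes the usual reading in which the second sum runs over ordered pairs of primes $(p,q)$ with $pq\le x$; the names `primes_up_to` and `selberg_lhs` are my own. If the identity holds, the last printed column, (left-hand side $- 2x\log x)/x$, should stay bounded as $x$ grows.

```python
import math
from bisect import bisect_right

def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def selberg_lhs(x):
    """sum_{p<=x} log^2 p + sum over ordered pairs (p, q), pq <= x, of log p log q."""
    primes = primes_up_to(x)
    logs = [math.log(p) for p in primes]
    # prefix[i] = sum of the first i values of log p, so that the sum of
    # log q over primes q <= y equals prefix[bisect_right(primes, y)].
    prefix = [0.0]
    for value in logs:
        prefix.append(prefix[-1] + value)
    first = sum(value * value for value in logs)
    # For each p, the admissible q are exactly the primes q <= floor(x / p).
    second = sum(
        value * prefix[bisect_right(primes, x // p)]
        for p, value in zip(primes, logs)
    )
    return first + second

for x in (10**3, 10**4, 10**5, 10**6):
    lhs = selberg_lhs(x)
    print(x, lhs, (lhs - 2 * x * math.log(x)) / x)
```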

Three: The primes are intimately connected to the zeros of $\zeta(s)$, and to contour integrals of $\frac{1}{\zeta(s)}$. (Notice that $\frac{1}{\zeta(s)}$ has featured everywhere so far.) We can actually prove that

$$\sum_{p^k \leq x} \log p= x - \sum_{\rho :\ \zeta(\rho)=0} \frac{x^{\rho}}{\rho} -\frac{\zeta'(0)}{\zeta(0)} $$

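Notice that the above is an equality (valid for $x>1$ not a prime power, with the sum running over all zeros of $\zeta$, trivial ones included), which is remarkable since the left-hand side is a step function. (Somehow, at prime powers all of the zeros of zeta conspire and make the function jump.)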


The distribution of sums $\sum_{p\leq x}f(p)$ is intimately related to the analytic properties of the Dirichlet series $\sum_{p}f(p)p^{-s}$. If $f(p)$ equals the von Mangoldt function, perhaps twisted by some automorphic data like values of a Dirichlet character or Hecke eigenvalues of a cusp form, then $\sum_{p}f(p)p^{-s}$ equals, up to a Dirichlet series supported on higher prime powers, the negative logarithmic derivative of an automorphic $L$-function $L(s)$, as follows from the Euler product decomposition of $L(s)$. The function $L(s)$ itself is nice, in addition to being an Euler product: it has a meromorphic continuation with at most a few poles and good behavior in vertical strips, has a functional equation, and conjecturally all its zeros lie on the axis of symmetry (Generalized Riemann Hypothesis). The logarithmic derivative $L'(s)/L(s)$ is therefore also meromorphic, with simple poles at the zeros and poles of $L(s)$.

In the case of $f(p)=\Lambda(p)$, we have $L(s)=\zeta(s)$, so that $-\zeta'(s)/\zeta(s)$ has a simple pole at $s=1$ with residue $1$ and a simple pole at every zero of $\zeta(s)$ with residue equal to the negative of the multiplicity of the zero. Conjecturally the zeros all lie on $\Re(s)=1/2$, which makes $\sum_{n\leq x}\Lambda(n)$ change rather smoothly: it is $x+O(x^{1/2+\epsilon})$. Of course $\sum_{p\leq x}\log(p)$ differs from $\sum_{n\leq x}\Lambda(n)$ only by $O(x^{1/2+\epsilon})$.
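For $L(s)=\zeta(s)$ the Euler product computation alluded to above can be written out in one line. For $\Re(s)>1$, taking the logarithmic derivative of $\zeta(s)=\prod_p (1-p^{-s})^{-1}$ term by term gives

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_{p}\frac{d}{ds}\log\left(1-p^{-s}\right) = \sum_{p}\frac{(\log p)\,p^{-s}}{1-p^{-s}} = \sum_{p}\sum_{k\geq 1}\frac{\log p}{p^{ks}} = \sum_{p}\frac{\log p}{p^{s}} + \sum_{p}\sum_{k\geq 2}\frac{\log p}{p^{ks}}.$$

The last double sum is the "Dirichlet series supported on higher prime powers"; it converges absolutely for $\Re(s)>1/2$, so it does not affect the pole at $s=1$.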


It seems to me that the second Chebyshev function $\psi(x) = \sum_{n \le x} \Lambda(n)$ (where $\Lambda$ is the von Mangoldt function) is more natural, but as Gjergji says the two approximate each other. This is because the behavior of the second Chebyshev function is related by certain general theorems that I'm not familiar with to the behavior of the Dirichlet series

$$\sum_{n \ge 1} \frac{\Lambda(n)}{n^s}$$

and this Dirichlet series is precisely $\frac{-\zeta'(s)}{\zeta(s)}$, or the negative logarithmic derivative of $\zeta(s)$. This has the following intuitive interpretation: if we think of $\zeta(s)$ as the partition function

$$\zeta(s) = \sum_{n \ge 1} e^{-s \log n}$$

of the Riemann gas, then the negative logarithmic derivative of a partition function describes the expected value of energy at a given temperature, a fundamental property.
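To spell this out (a short sketch under the usual conventions, with $s>1$ real playing the role of inverse temperature and state $n$ having energy $E_n=\log n$): the Boltzmann probability of state $n$ is $e^{-sE_n}/\zeta(s) = n^{-s}/\zeta(s)$, so the expected energy is

$$\langle E\rangle = \frac{1}{\zeta(s)}\sum_{n\geq 1} (\log n)\, n^{-s} = -\frac{\zeta'(s)}{\zeta(s)} = -\frac{d}{ds}\log\zeta(s) = \sum_{n\geq 1}\frac{\Lambda(n)}{n^{s}}.$$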

Edit: This observation goes at least as far back as Mackey, Unitary Group Representations in Physics, Probability, and Number Theory (1978), who wrote:

...Our main point here is that one could have been led to the main outline of the proof of the prime number theorem by using the physical interpretation of Laplace transforms provided by statistical mechanics. In particular, the function $-\frac{\zeta'}{\zeta}$ whose representation as a Dirichlet series (Laplace transform with discrete measure) plays a central role in the proof has a direct physical interpretation as the internal energy function.

Regarding why it is natural to assign the state $n$ the energy $\log n$, the point is that for $s > 1$ we get the only probability distributions on the natural numbers which satisfy certain natural properties; see this blog post, for example.

Of course, a basic purely mathematical reason to consider the logarithmic derivative is the circle of ideas around the argument principle.