Proving $\sum_{k=1}^{n}{(-1)^{k+1} {{n}\choose{k}}\frac{1}{k}=H_n}$

In the coupon collector's problem, the expected value of the number $N$ of coupons required for a full collection of $m$ coupon types can be determined in two different ways. In the standard treatment, the expected numbers of coupons to get each new type are added up, yielding the result $mH_m$. But we can also determine this expectation more mundanely from the distribution:

\begin{align}
\def\stir#1#2{\left\{{#1\atop #2}\right\}}
E[N]={}&\sum_{n=0}^\infty P(N\gt n)\\
={}&\sum_{n=0}^\infty\left(1-\frac{m!}{m^n}\stir nm\right)\\
={}&\sum_{n=0}^\infty\left(1-\frac{m!}{m^n}\frac1{m!}\sum_{j=0}^m(-1)^{m-j}\binom mjj^n\right)\\
={}&\sum_{n=0}^\infty\frac1{m^n}\sum_{j=0}^{m-1}(-1)^{m-j+1}\binom mjj^n\\
={}&\sum_{j=0}^{m-1}(-1)^{m-j+1}\binom mj\sum_{n=0}^\infty\frac{j^n}{m^n}\\
={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\sum_{n=0}^\infty\frac{(m-j)^n}{m^n}\\
={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\frac mj\\
={}&m\sum_{j=1}^m(-1)^{j+1}\binom mj\frac1j\;.
\end{align}

(Here $\stir nm$ is a Stirling number of the second kind.) Equating the two results yields our identity.
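As a quick sanity check (not part of the derivation), both the Stirling-number formula and the resulting identity can be verified exactly with Python's `fractions`; the helper names `stirling2`, `harmonic`, and `alternating_sum` are mine, purely illustrative:

```python
from fractions import Fraction
from math import comb, factorial

def stirling2(n, m):
    """Stirling number of the second kind, via the inclusion-exclusion formula above."""
    return sum((-1) ** (m - j) * comb(m, j) * j ** n for j in range(m + 1)) // factorial(m)

def harmonic(n):
    """H_n as an exact rational."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def alternating_sum(n):
    """The left-hand side: sum_{k=1}^{n} (-1)^{k+1} C(n,k) / k."""
    return sum(Fraction((-1) ** (k + 1) * comb(n, k), k) for k in range(1, n + 1))

assert stirling2(4, 2) == 7  # known small value S(4,2) = 7

# The identity itself, checked exactly for small n.
for n in range(1, 12):
    assert alternating_sum(n) == harmonic(n)
print("identity verified for n = 1..11")
```

Since everything is computed in exact rational arithmetic, the asserts confirm the identity with no floating-point slack.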

Perhaps the derivation appears slightly less opaque if we avoid the detour through the Stirling numbers and directly apply inclusion-exclusion. There are $\binom mj$ ways to select $j$ coupon types that haven't been collected yet, and the probability that $j$ particular coupon types haven't been collected after $n$ draws is $\left(1-\frac jm\right)^n$. Thus

\begin{align}
E[N]={}&\sum_{n=0}^\infty P(N\gt n)\\
={}&\sum_{n=0}^\infty\sum_{j=1}^m(-1)^{j+1}\binom mj\left(1-\frac jm\right)^n\\
={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\sum_{n=0}^\infty\left(1-\frac jm\right)^n\\
={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\frac mj\\
={}&m\sum_{j=1}^m(-1)^{j+1}\binom mj\frac1j\;.
\end{align}
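The inclusion-exclusion expression for $P(N\gt n)$ can also be cross-checked against brute-force enumeration of all draw sequences for small $m$ and $n$; this is only a sketch, and the function names are my own:

```python
from fractions import Fraction
from itertools import product
from math import comb

def p_gt_brute(m, n):
    """P(N > n) by enumeration: fraction of length-n sequences missing some type."""
    bad = sum(1 for seq in product(range(m), repeat=n) if len(set(seq)) < m)
    return Fraction(bad, m ** n)

def p_gt_incl_excl(m, n):
    """P(N > n) = sum_{j=1}^{m} (-1)^{j+1} C(m,j) (1 - j/m)^n."""
    return sum(Fraction((-1) ** (j + 1) * comb(m, j) * (m - j) ** n, m ** n)
               for j in range(1, m + 1))

for m in (2, 3):
    for n in range(0, 7):
        assert p_gt_brute(m, n) == p_gt_incl_excl(m, n)
print("inclusion-exclusion tail probabilities match enumeration")
```

Exact rationals again make the agreement a strict equality rather than a numerical coincidence.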

We can generalise this to obtain an expression for $H_m-H_k$. The standard treatment shows that the expected value of the number $N_k$ of coupons required until only $k$ coupon types are missing is $m(H_m-H_k)$. Applying inclusion-exclusion in the form of corollary $(7)$ in this answer, this expected value can also be calculated as

\begin{align}
E[N_k] &= \sum_{n=0}^\infty P(N_k\gt n) \\
&= \sum_{n=0}^\infty\sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\left(1-\frac jm\right)^n \\
&= \sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\sum_{n=0}^\infty\left(1-\frac jm\right)^n \\
&= \sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\frac mj \\
&= m\sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\frac1j\;,
\end{align}

yielding

$$ H_m-H_k=\sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\frac1j\;. $$
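The generalised identity can likewise be checked exactly for small parameters; a minimal sketch, with illustrative helper names of my choosing:

```python
from fractions import Fraction
from math import comb

def harmonic(n):
    """H_n as an exact rational."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def rhs(m, k):
    """sum_{j=1}^{m} (-1)^{j+k+1} C(j-1,k) C(m,j) / j."""
    return sum(Fraction((-1) ** (j + k + 1) * comb(j - 1, k) * comb(m, j), j)
               for j in range(1, m + 1))

# Check H_m - H_k == rhs(m, k) for all 0 <= k <= m <= 9.
for m in range(1, 10):
    for k in range(0, m + 1):
        assert rhs(m, k) == harmonic(m) - harmonic(k)
print("generalised identity verified for m = 1..9")
```

Note that $k=0$ recovers the original identity, since $\binom{j-1}{0}=1$.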


Here is a rather elementary solution:

\begin{align*}
\sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} \frac{1}{k}
&= \sum_{k=1}^{n} \sum_{j=k}^{n} (-1)^{k-1} \binom{j-1}{k-1} \frac{1}{k} \tag{1} \\
&= \sum_{j=1}^{n} \sum_{k=1}^{j} (-1)^{k-1} \binom{j}{k} \frac{1}{j} \tag{2} \\
&= \sum_{j=1}^{n} \frac{1}{j}. \tag{3}
\end{align*}

  • In $\text{(1)}$, we utilized the identity $ \binom{n}{k} = \sum_{j=k}^{n} \binom{j-1}{k-1} $ for $1 \leq k \leq n$. This follows from the usual 'hockey stick argument'.

  • In $\text{(2)}$, we changed the order of summation and applied the identity $\binom{j-1}{k-1}\frac{1}{k} = \frac{(j-1)!}{(j-k)!k!} = \binom{j}{k}\frac{1}{j}$ for $1 \leq k \leq j$.

  • In $\text{(3)}$, the identity $ \sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0 $ ($n \geq 1$) is used to simplify the inner sum: dropping the $k=0$ term shows that $\sum_{k=1}^{j} (-1)^{k-1} \binom{j}{k} = 1$ for $j \geq 1$. Notice that this is a direct consequence of the binomial theorem.
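The two binomial facts used in steps $(1)$ and $(2)$ are easy to confirm mechanically; a small illustrative check, not part of the proof:

```python
from fractions import Fraction
from math import comb

# Step (1), the hockey stick identity: C(n,k) == sum_{j=k}^{n} C(j-1, k-1).
for n in range(1, 12):
    for k in range(1, n + 1):
        assert comb(n, k) == sum(comb(j - 1, k - 1) for j in range(k, n + 1))

# Step (2): C(j-1, k-1) / k == C(j, k) / j for 1 <= k <= j.
for j in range(1, 12):
    for k in range(1, j + 1):
        assert Fraction(comb(j - 1, k - 1), k) == Fraction(comb(j, k), j)

print("both binomial identities verified for parameters up to 11")
```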


Addendum. I realized that my solution can be considered as a shorthand version of @leonbloy's inductive proof (though I came up with this independently).


$\newcommand{\angles}[1]{\left\langle\,{#1}\,\right\rangle} \newcommand{\braces}[1]{\left\lbrace\,{#1}\,\right\rbrace} \newcommand{\bracks}[1]{\left\lbrack\,{#1}\,\right\rbrack} \newcommand{\dd}{\mathrm{d}} \newcommand{\ds}[1]{\displaystyle{#1}} \newcommand{\expo}[1]{\,\mathrm{e}^{#1}\,} \newcommand{\half}{{1 \over 2}} \newcommand{\ic}{\mathrm{i}} \newcommand{\iff}{\Leftrightarrow} \newcommand{\imp}{\Longrightarrow} \newcommand{\Li}[2]{\,\mathrm{Li}_{#1}\left(\,{#2}\,\right)} \newcommand{\ol}[1]{\overline{#1}} \newcommand{\pars}[1]{\left(\,{#1}\,\right)} \newcommand{\partiald}[3][]{\frac{\partial^{#1} #2}{\partial #3^{#1}}} \newcommand{\ul}[1]{\underline{#1}} \newcommand{\root}[2][]{\,\sqrt[#1]{\,{#2}\,}\,} \newcommand{\totald}[3][]{\frac{\mathrm{d}^{#1} #2}{\mathrm{d} #3^{#1}}} \newcommand{\verts}[1]{\left\vert\,{#1}\,\right\vert}$

Sangchul Lee's comment is related to the $\ul{first\ answer}$; prompted by it, I decided to add another one.

  1. \begin{align}
     &\color{#f00}{\sum_{k = 1}^{n}\pars{-1}^{k + 1}{n \choose k}\,{1 \over k}} =
     \sum_{k = 1}^{n}\pars{-1}^{k + 1}{n \choose k}\int_{0}^{1}x^{k - 1}\,\dd x =
     -\int_{0}^{1}\sum_{k = 1}^{n}{n \choose k}\pars{-x}^{k}\,{\dd x \over x}
     \\[3mm] = &\ -\int_{0}^{1}{\pars{1 - x}^{n} - 1\over x}\,\dd x\
     \stackrel{x\ \to\ 1 - x}{=}\ \int_{0}^{1}{x^{n} - 1\over x - 1}\,\dd x =
     \int_{0}^{1}\sum_{k = 1}^{n}x^{k - 1}\,\dd x =
     \sum_{k = 1}^{n}\int_{0}^{1}x^{k - 1}\,\dd x\
     \\[3mm] = &\ \sum_{k = 1}^{n}{1 \over k}\ \stackrel{\mathrm{def.}}{=} \color{#f00}{H_{n}}
     \end{align}
  2. \begin{align}
     &\color{#f00}{\sum_{k = 1}^{n}\pars{-1}^{k + 1}{n \choose k}\,{1 \over k}} =
     \sum_{k = 1}^{\infty}\pars{-1}^{k + 1}\,{1 \over k}{n \choose n - k} =
     \sum_{k = 1}^{\infty}\pars{-1}^{k + 1}\,{1 \over k}\oint_{\verts{z} = 1^{-}}
     {\pars{1 + z}^{n} \over z^{n - k + 1}}\,{\dd z \over 2\pi\ic}
     \\[3mm] = & \oint_{\verts{z} = 1^{-}}{\pars{1 + z}^{n} \over z^{n + 1}}\
     \overbrace{\sum_{k = 1}^{\infty}\pars{-1}^{k + 1}\,{z^{k} \over k}}^{\ds{=\ \ln\pars{1 + z}}}\
     \,{\dd z \over 2\pi\ic} =
     \oint_{\verts{z} = 1^{-}}{\pars{1 + z}^{n}\ln\pars{1 + z} \over z^{n + 1}}\,{\dd z \over 2\pi\ic}
     \\[3mm] = &\ \lim_{x \to 0}\partiald{}{x}\oint_{\verts{z} = 1^{-}}
     {\pars{1 + z}^{n + x} \over z^{n + 1}}\,{\dd z \over 2\pi\ic} =
     \lim_{x \to 0}\partiald{}{x}{x + n \choose n} =
     {1 \over n!}\lim_{x \to 0}\partiald{}{x}\bracks{\Gamma\pars{x + n + 1} \over \Gamma\pars{x + 1}}
     \\[3mm] = &\ {1 \over n!}\bracks{{\Gamma\pars{n + 1}\Psi\pars{n + 1}\Gamma\pars{1} - \Gamma\pars{1}\Psi\pars{1}\Gamma\pars{n + 1} \over \Gamma^{2}\pars{1}}} =
     \Psi\pars{n + 1} - \Psi\pars{1}
     \\[3mm] = &\ n\sum_{k = 0}^{\infty}{1 \over \pars{k + n + 1}\pars{k + 1}} =
     \sum_{k = 1}^{\infty}\pars{{1 \over k} - {1 \over k + n}} =
     \sum_{k = 1}^{n}{1 \over k}\ \stackrel{\mathrm{def.}}{=}\ \color{#f00}{H_{n}}
     \end{align}
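The key steps of both answers admit quick sanity checks: the integral $\int_0^1\frac{1-(1-x)^n}{x}\,\dd x=H_n$ from the first, and the partial-fraction/telescoping series from the second. A rough illustrative sketch (helper names and discretisation parameters are mine):

```python
from fractions import Fraction

def harmonic(n):
    """H_n as an exact rational."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Partial fractions behind the last step of the second answer:
# n / ((k+n+1)(k+1)) == 1/(k+1) - 1/(k+n+1).
for n in range(1, 8):
    for k in range(0, 25):
        assert Fraction(n, (k + n + 1) * (k + 1)) == Fraction(1, k + 1) - Fraction(1, k + n + 1)

# The telescoping partial sum: sum_{k=1}^{K} (1/k - 1/(k+n)) == H_n - (H_{K+n} - H_K),
# whose tail H_{K+n} - H_K vanishes as K -> infinity.
n, K = 5, 50
partial = sum(Fraction(1, k) - Fraction(1, k + n) for k in range(1, K + 1))
assert partial == harmonic(n) - (harmonic(K + n) - harmonic(K))

# Midpoint-rule approximation of the integral from the first answer, n = 5.
n, N = 5, 20000
h = 1.0 / N
approx = sum((1 - (1 - (i + 0.5) * h) ** n) / ((i + 0.5) * h) * h for i in range(N))
assert abs(approx - float(harmonic(n))) < 1e-6
print("series and integral checks passed")
```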