Why can't an improper transfer function be realized?
To realize an improper transfer function, derivatives of the input would be needed. The answer by Rodrigo de Azevedo helps make clear why. The problem is that a perfect differentiator cannot be built. Several arguments help in understanding why.
The modulus of the frequency response of a differentiator increases with frequency: for $G(s) = s$ we have $|G(j\omega)| = \omega$. However, it is not possible to construct an apparatus whose gain becomes arbitrarily large at high frequencies. On the contrary, any known device has a cutoff frequency beyond which its response falls.
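To make the gain argument concrete, here is a minimal sketch comparing the ideal differentiator $G(s) = s$ with a proper approximation $G(s) = s/(\tau s + 1)$. The filtered form and the value $\tau = 0.01$ are my own illustrative assumptions, not something from the answers here; a real device would have further poles that make the gain actually fall off.

```python
def ideal_gain(omega):
    """|G(j*omega)| for the ideal (improper) differentiator G(s) = s."""
    return abs(complex(0.0, omega))


def filtered_gain(omega, tau=0.01):
    """|G(j*omega)| for the proper approximation G(s) = s / (tau*s + 1).

    tau is a hypothetical filter time constant chosen for illustration.
    """
    s = complex(0.0, omega)
    return abs(s / (tau * s + 1.0))


if __name__ == "__main__":
    # The ideal gain grows without bound; the filtered gain levels off near 1/tau.
    for omega in (1.0, 1e2, 1e4, 1e6):
        print(f"omega={omega:9.0e}  ideal={ideal_gain(omega):.3e}  "
              f"filtered={filtered_gain(omega):.3e}")
```

At low frequencies the two agree closely, so the approximation differentiates well there; at high frequencies only the improper one keeps growing.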
Or, suppose you feed a discontinuous signal into a perfect differentiator. It would have to compute the derivative of the signal at the jump, where the derivative does not exist! So any "differentiator" will be at best an approximation.
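The same point can be seen numerically. The sketch below (my own illustration, using a unit step and a central difference as a stand-in for an approximate differentiator) shows that the estimate at the jump does not settle to any value as the step size `h` shrinks; it just keeps growing like $1/(2h)$.

```python
def step(t):
    """Unit step: discontinuous at t = 0."""
    return 1.0 if t >= 0.0 else 0.0


def central_difference(f, t, h):
    """Approximate f'(t) by (f(t + h) - f(t - h)) / (2h)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


if __name__ == "__main__":
    for h in (1e-1, 1e-3, 1e-6):
        at_jump = central_difference(step, 0.0, h)
        away = central_difference(step, 1.0, h)
        # At the jump the estimate diverges as h -> 0; away from it, it is 0.
        print(f"h={h:.0e}  at t=0: {at_jump:.3e}  at t=1: {away:.3e}")
```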
Suppose we have a state-space model
$$\begin{aligned} \dot{\mathrm x} &= \mathrm A \mathrm x + \mathrm B \mathrm u\\ \mathrm y &= \mathrm C \mathrm x + \mathrm D \mathrm u \end{aligned}$$
where $\mathrm A \in \mathbb R^{n \times n}$. Laplace-transforming both the state equation and the output equation, we conclude that the transfer function is the following matrix-valued function
$$\mathrm G (s) = \mathrm C (s \mathrm I_n - \mathrm A)^{-1} \mathrm B + \mathrm D$$
Note that
$$(s \mathrm I_n - \mathrm A)^{-1} = \frac{\mbox{adj} (s \mathrm I_n - \mathrm A)}{\det (s \mathrm I_n - \mathrm A)}$$
and that
- each entry of the adjugate is a polynomial in $s$ of degree at most equal to $n-1$.
- the determinant of $s \mathrm I_n - \mathrm A$ is a polynomial in $s$ of degree $n$.
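Thus, each entry of $\mathrm C (s \mathrm I_n - \mathrm A)^{-1} \mathrm B$ is a ratio of polynomials whose numerator has degree at most $n-1$ and whose denominator has degree $n$, and adding the constant matrix $\mathrm D$ can raise the numerator degree to at most $n$. We conclude that each of the SISO transfer functions in $\mathrm G (s)$, one per input-output pair, has the property that the degree of the numerator is less than or equal to the degree of the denominator.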
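The degree count can be checked on a small example. The sketch below is only an illustration under my own assumptions: it uses the Faddeev-LeVerrier recursion (a technique I am swapping in, not one used in the argument above) to expand $\mbox{adj}(s \mathrm I_n - \mathrm A)$ and $\det(s \mathrm I_n - \mathrm A)$ in powers of $s$, and the helper names and the $2 \times 2$ system are hypothetical.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def faddeev_leverrier(A):
    """Return (M, c) with
    adj(sI - A) = M[0] s^(n-1) + M[1] s^(n-2) + ... + M[n-1],
    det(sI - A) = c[0] s^n + c[1] s^(n-1) + ... + c[n],  c[0] = 1.
    """
    n = len(A)
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    M, c = [I], [Fraction(1)]
    for k in range(1, n + 1):
        AM = matmul(A, M[-1])
        ck = -sum(AM[i][i] for i in range(n)) / k  # minus trace over k
        c.append(ck)
        if k < n:
            M.append([[AM[i][j] + ck * I[i][j] for j in range(n)]
                      for i in range(n)])
    return M, c


def siso_transfer_function(A, B, C, D):
    """Numerator and denominator coefficients (highest power first) of
    G(s) = C (sI - A)^(-1) B + D, with B, C as flat lists and D a scalar."""
    n = len(A)
    M, den = faddeev_leverrier(A)
    # C adj(sI - A) B has degree at most n - 1: one coefficient per M[k].
    strictly_proper = [sum(C[i] * Mk[i][j] * B[j]
                           for i in range(n) for j in range(n)) for Mk in M]
    # Adding D multiplies it by the degree-n denominator.
    num = [D * den[0]] + [D * dk + nk
                          for dk, nk in zip(den[1:], strictly_proper)]
    return num, den


if __name__ == "__main__":
    A = [[0, 1], [-2, -3]]  # hypothetical example system
    for D in (0, 1):
        num, den = siso_transfer_function(A, [0, 1], [1, 0], D)
        print("D =", D, " num =", [str(x) for x in num],
              " den =", [str(x) for x in den])
```

With $\mathrm D = 0$ the leading numerator coefficients vanish, so the transfer function is strictly proper; with $\mathrm D \neq 0$ the numerator reaches degree $n$ but never exceeds it.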
How can an improper transfer function have a state-space realization, then?