Definition of Polynomials in Linear Algebra Done Right

The notation is very common. If one is very strict, one could see a problem, but I would say the main issue lies in saying "$\mathcal{P}({\bf{F}})$ is the vector space of all polynomials from $\bf{F}$ to $\bf{F}$." It should instead be "the vector space of all polynomials over $\bf{F}$" or "with coefficients from $\bf{F}$."

A polynomial is an object in its own right.

The polynomial $P$ then gives rise to a map from $\bf{F}$ to $\bf{F}$ that is usually also denoted by $P$.

In the same way the polynomial $P$ gives rise to a map from $\mathcal{L}(V)$ to $\mathcal{L}(V)$ that is usually also denoted by $P$.

Strictly speaking, the polynomial and the maps it induces are different objects, but usually one does not distinguish them notationally.

Let me add that if you really insist that "$\mathcal{P}({\bf{F}})$ is the vector space of all polynomial [maps] from $\bf{F}$ to $\bf{F}$" then indeed there would be a problem. But I doubt that is actually what is done in your source, and if it were done then this would be the main problem.

Well, it turns out this is what is done. Set up this way, the definition is not really coherent. The main problem is that the coefficients $a_i$ are not necessarily uniquely determined by the map $P$ from $\bf{F}$ to $\bf{F}$, and thus $P(T)$ is not well-defined: different choices of $a_i$ yielding the same function from $\bf{F}$ to $\bf{F}$ can yield different results for $P(T)$.
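To make the ambiguity described above concrete, here is a small sketch over the two-element field. The choice of field, the two coefficient lists, the matrix $T$, and all helper names are mine, picked purely for illustration; they are not from the book. The coefficient lists `[0]` and `[0, 1, 1]` (that is, $0$ and $z + z^2$) give the same function on the field, yet they give different results at a nilpotent $2 \times 2$ matrix.

```python
# Illustration over the two-element field F_2 (all arithmetic is mod 2).
# Coefficient lists are in increasing degree: [a_0, a_1, a_2, ...].
P_MOD = 2

def eval_scalar(coeffs, x):
    """Evaluate the polynomial at a field element x (Horner's rule)."""
    result = 0
    for a in reversed(coeffs):
        result = (result * x + a) % P_MOD
    return result

def mat_mul(A, B):
    """Product of two square matrices with entries reduced mod 2."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % P_MOD
             for j in range(n)] for i in range(n)]

def eval_matrix(coeffs, T):
    """Evaluate a_0 I + a_1 T + ... + a_n T^n (Horner's rule)."""
    n = len(T)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, T)
        result = [[(result[i][j] + a * identity[i][j]) % P_MOD
                   for j in range(n)] for i in range(n)]
    return result

zero_poly = [0]        # the zero polynomial
z2_plus_z = [0, 1, 1]  # z + z^2

# Both coefficient lists define the same function F_2 -> F_2 ...
same_function = all(eval_scalar(zero_poly, x) == eval_scalar(z2_plus_z, x)
                    for x in range(P_MOD))

# ... but they disagree at this nilpotent matrix (T^2 = 0, so T^2 + T = T).
T = [[0, 1], [0, 0]]
at_T_zero = eval_matrix(zero_poly, T)
at_T_other = eval_matrix(z2_plus_z, T)
```

Here `same_function` is `True`, while `at_T_zero` is the zero matrix and `at_T_other` is `T` itself, so "the value of the function at $T$" depends on which coefficients one happened to pick.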

However, this is only an actual problem if the field is finite. If the field is infinite, such as the reals or the complex numbers, then a polynomial map uniquely (up to spurious $0$ coefficients) determines the coefficients.

So in that case each polynomial function has a well-determined tuple of coefficients, and one can then define $P(T)$ as it is defined. (Michael Hardy makes a very good point there.)

Either way, there is no problem in considering a polynomial as a formal expression $a_0X^0+ a_1 X^1 + \dots + a_n X^n$ to begin with. Defining it as a function and then retrieving the coefficients is not my preferred way, but I can see why somebody might want to take that route.


Polynomials have their own independent existence - they are not inherently functions from $\mathbf{F}$ to $\mathbf{F}$ or functions at all for that matter.

For example, over the finite field $\mathbf{F}=\mathbb{F}_p$, the polynomial $z^p-z\in\mathbb{F}_p[z]$ evaluates to $0$ for any element of $\mathbb{F}_p$ that you plug in, but it is, importantly, very different from the zero polynomial $0\in\mathbb{F}_p[z]$.
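A quick check of this example, as a rough sketch: the prime $p = 5$ and the helper name `evaluate` are arbitrary choices of mine. The coefficient list of $z^p - z$ is visibly nonzero, but every value it takes on $\mathbb{F}_p$ is $0$ (this is Fermat's little theorem).

```python
p = 5  # any prime would do; 5 is an arbitrary choice

# Coefficients of z^p - z in increasing degree, reduced mod p.
coeffs = [0] * (p + 1)
coeffs[1] = -1 % p  # the coefficient -1 of z, represented as p - 1
coeffs[p] = 1       # the coefficient of z^p

def evaluate(coeffs, x, p):
    """Evaluate the polynomial at x in F_p (Horner's rule)."""
    result = 0
    for a in reversed(coeffs):
        result = (result * x + a) % p
    return result

values = [evaluate(coeffs, x, p) for x in range(p)]

is_zero_function = all(v == 0 for v in values)     # vanishes on all of F_p
is_zero_polynomial = all(a == 0 for a in coeffs)   # but has nonzero coefficients
```

So `is_zero_function` holds while `is_zero_polynomial` does not: the function cannot see the difference, but the polynomial ring can.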

In general, if $p$ is an element of the ring of polynomials $\mathbf{F}[z]$, the notation $p(\theta)$ is explained like this:

  • there is an $\mathbf{F}$-algebra $R$ (sometimes left implicit) containing the element $\theta$

  • there is a unique $\mathbf{F}$-algebra homomorphism $\varphi:\mathbf{F}[z]\to R$ with the property that $\varphi(z)=\theta$

  • the notation $p(\theta)$ refers to the element of $R$ that is the image of $p$ under that homomorphism $\varphi$

To evaluate a polynomial at an element of $\mathbf{F}$, you have $R=\mathbf{F}$.

In the definition you're citing, you have $R=\mathcal{L}(V)$.
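The two cases above can be sketched with one evaluation routine that takes the algebra $R$ as a parameter. This is only an illustration under my own assumptions: the function `evaluate`, its argument names, the field $\mathbf{Q}$ (modelled by `Fraction`), and the sample polynomial and matrix are all hypothetical choices, and $\mathcal{L}(V)$ is represented by $2 \times 2$ matrices after picking a basis.

```python
from fractions import Fraction

def evaluate(coeffs, theta, one, add, mul, scale):
    """Image of a_0 + a_1 z + ... + a_n z^n under the homomorphism z -> theta.

    one        : multiplicative identity of the algebra R
    add, mul   : addition and multiplication in R
    scale(a,r) : action of a scalar a from F on an element r of R
    """
    result = scale(Fraction(0), one)  # the zero element of R
    for a in reversed(coeffs):        # Horner's rule
        result = add(mul(result, theta), scale(a, one))
    return result

# p = z^2 - 3z + 2 = (z - 1)(z - 2), coefficients in increasing degree.
coeffs = [Fraction(2), Fraction(-3), Fraction(1)]

# Case R = F: ordinary evaluation at a scalar.
at_scalar = evaluate(coeffs, Fraction(3), Fraction(1),
                     lambda r, s: r + s,
                     lambda r, s: r * s,
                     lambda a, r: a * r)

# Case R = L(V), modelled as 2x2 matrices over F.
def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_scale(a, A):
    return [[a * A[i][j] for j in range(2)] for i in range(2)]

identity = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
T = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(2)]]  # eigenvalues 1 and 2

at_T = evaluate(coeffs, T, identity, mat_add, mat_mul, mat_scale)
```

With these choices `at_scalar` is $2$ and `at_T` is the zero matrix, since $T$ has eigenvalues $1$ and $2$. The same list of coefficients is fed to both calls; only the target algebra changes.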


The fact that the second expression does not make sense because $T$ is not in the domain of $p$ is precisely the reason why this definition is given.

Of course, the judgment that it does not make sense stems from our currently conventional ways of reasoning about mathematics, which will further evolve over time, just as our current modes of reasoning are not the same as those that Euler used.