What is functional analysis in simple words?

For me, doing functional analysis is best described as 'going beyond linear algebra'.

In linear algebra, the objects you deal with are (coordinate) vectors, i.e. objects from a vector space $V$ which you can multiply with a scalar or add together and again get a vector: For $v,w\in V$ and $\alpha \in \mathbb R$ we have $v + w \in V$ and $\alpha v \in V$.

Functional analysis answers the question 'What happens if $V$ is infinite-dimensional?'. The idea behind this is the observation that these vector axioms hold for other objects than coordinate vectors with a finite number of rows as well. For example, the sum of two differentiable functions is a differentiable function again (and a number times a differentiable function is differentiable, too). The same holds true for other classes of functions, e.g. polynomials or square-summable sequences (which are really just functions from $\mathbb N$ to $\mathbb R$/$\mathbb C$). Note that there are other examples of infinite-dimensional vector spaces which are not function spaces, and examples of function spaces which are finite-dimensional. But one of the things one wanted to do in the early 20th century to handle quantum mechanics was to get some kind of "linear algebra for functions, not row vectors".

When we allow functions instead of vectors from a finite-dimensional space, a lot of things work much as they do in linear algebra, but a lot of things don't. For instance:

  • We can still measure the length of these vectors, but suddenly it's important which norm we take (not all norms are equivalent on an infinite-dimensional vector space).

  • We can look at linear operators $A$, but in general they cannot be represented as a finite matrix (although, in the early days of functional analysis, Heisenberg did represent differential operators as matrices with an infinite number of rows and columns).

  • We can calculate eigenvalues $\lambda$, but since the rank-nullity theorem ($\dim V = \operatorname{rank}A + \dim \operatorname{ker}A $) doesn't help if $\dim V = \infty$, we're not only interested in cases where $(A-\lambda I)$ is not injective (eigenvalues), but also cases where $(A-\lambda I)$ is not surjective (so-called continuous spectrum). Also, calculating eigenvalues gets harder since we can't calculate a characteristic polynomial.

  • There's a lot of room in infinite-dimensional spaces. We can have Cauchy sequences which don't converge since we picked the 'wrong' norm. This is why Banach (and Hilbert) spaces are interesting.

  • Not all linear operators are continuous anymore. In fact, the most interesting operators (i.e. differential operators) are not continuous.

All of these things require a more rigorous analytical framework than linear algebra does and this is where the analysis part in functional analysis comes from.
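The claim that differential operators are not continuous can be checked numerically. The following is a minimal sketch, assuming the sup norm on $[0,2\pi]$ and the test functions $f_n(x)=\sin(nx)$ (both are choices made here for illustration, not something fixed by the discussion above). Each $f_n$ has norm $1$, but its derivative has norm $n$, so no single constant $C$ can satisfy $\|f'\| \le C\|f\|$ for all $f$.

```python
import numpy as np

def sup_norm(values):
    """Approximate the sup norm by the largest absolute value on the grid."""
    return float(np.max(np.abs(values)))

def derivative_norm_ratio(n, num_points=200_001):
    """Return ||f_n'|| / ||f_n|| in the sup norm for f_n(x) = sin(n x)."""
    x = np.linspace(0.0, 2.0 * np.pi, num_points)
    f = np.sin(n * x)
    df = n * np.cos(n * x)  # exact derivative of sin(n x)
    return sup_norm(df) / sup_norm(f)

# The ratio grows like n, so differentiation is unbounded in this norm.
ratios = {n: derivative_norm_ratio(n) for n in (1, 10, 100)}
for n, ratio in ratios.items():
    print(n, ratio)
```

In finite dimensions this cannot happen: every linear map between finite-dimensional normed spaces is bounded, whichever norm is used.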

Addendum: I just realized that I talked a lot about the 'what' and not the 'why'.

Essentially, these questions help to answer hard questions about functions, for example if you're interested in solving differential equations - eigenvalues of a differential operator $D$ are just the points where you can solve the differential equation $(D - \lambda)f = 0$.
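As a small worked example (the operator, the interval and the boundary conditions are assumptions chosen here for illustration), take $D = -\frac{d^2}{dx^2}$ acting on functions on $[0,1]$ with $f(0)=f(1)=0$. Then $(D-\lambda)f = 0$ reads $$ -f''(x) = \lambda f(x), \qquad f(0)=f(1)=0, $$ and nonzero solutions exist only for $$ \lambda_n = (n\pi)^2, \qquad f_n(x) = \sin(n\pi x), \qquad n=1,2,3,\dots $$ Indeed, $-f_n''(x) = (n\pi)^2\sin(n\pi x) = \lambda_n f_n(x)$ and $\sin(0)=\sin(n\pi)=0$. So this operator has infinitely many eigenvalues, and they are exactly the values of $\lambda$ for which the boundary value problem has a nontrivial solution.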


Imagine sampling the sounds in your environment over a period of time starting at time $0$ and ending at time $T$. One way to encode those sounds would be to measure the air pressure levels at your ear drum at each time $0 \le t \le T$. The sounds would correspond to a function $p(t)$, where $p(t)$ is the air pressure level at time $t$. You can add two sounds by adding the functions. You can increase the volume of a sound by multiplying all the pressure levels by a constant. These are linear operations on the function space of sounds: $$ \alpha_1 p_1(t) + \alpha_2 p_2(t). $$ So the sounds that you hear over a time interval $[0,T]$ can be described in terms of functions that quantify the sound pressure levels on your ear drum as a function of $t$, and the collection of sounds is a linear space because you can multiply a sound by a scalar (change the volume) and you can superimpose two sounds by adding their corresponding sound pressure functions.

A pure tone would be $\cos(2\pi f t+\phi)$ where $f$ is the frequency in units of cycles per second, and $\phi$ is an offset. For example, if $f=400$, then $\cos(2\pi f t+\phi)$ would cycle through 400 complete cycles as $t$ varies over an interval of $1$ second. One cycle per second is referred to as one Hertz, named after the German physicist Heinrich Hertz. The extremes of the typical human hearing range are 20 Hz and 20,000 Hz. Middle C on the piano is about 261.6 Hz. (Middle C is designated C4 in scientific pitch notation because of the note's position as the fourth C key on a standard 88-key piano keyboard.)

Suppose a sound pressure level function $p(t)$ starts at $0$ at $t=0$ and ends at $0$ at $t=T$ for some fixed interval of time $[0,T]$. The first remarkable thing you learn about such a sound pressure level function $p$ is that $p$ can be written as an infinite sum of pure tones of the form $$ \sin(\pi t/T),\sin(2\pi t/T),\sin(3\pi t/T),\sin(4\pi t/T),\cdots. $$ That is, there are unique amplitudes $A_1,A_2,A_3,\cdots$ such that $$ p(t) = A_1\sin(\pi t/T)+A_2\sin(2\pi t/T)+A_3\sin(3\pi t/T)+\cdots . $$ This may not seem significant, but imagine that $p$ is the sound pressure function of your favorite song, complete with instrumentation and/or voices over a 3 minute period of time. Then you can reconstruct the entire song by adding together pure tones, starting with the lowest being 1/2 cycle in $3\cdot 60=180$ seconds, which translates to $(1/2\mbox{ cycle})/(180\mbox{ sec})=\frac{1}{360}\mbox{ Hz}$. As you begin adding the tones of $\frac{1}{360}\mbox{ Hz, }\frac{2}{360}\mbox{ Hz, }\frac{3}{360}\mbox{ Hz, }\cdots$ with just the right amplitudes, the entire sound pressure level function is duplicated entirely over that 3 minute period of time. In other words, every sound function $p$ can be written as a linear combination of pure tones. The set of functions $$ \{ \sin(\pi t/T),\sin(2\pi t/T),\sin(3\pi t/T),\cdots \} $$ is a basis of functions from which all sound pressure functions $p$ can be written.

Functional Analysis deals with such function spaces, with the determination of the amplitudes $A_n$, and with the convergence of the infinite sums to the original sound pressure function. The decomposition of a sound pressure function into "harmonics" (pure tones that are integer multiples of a common base frequency) is the original meaning of "Harmonic Analysis."
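To make the determination of the amplitudes concrete, here is a minimal numerical sketch. It uses the standard sine-series formula $A_n = \frac{2}{T}\int_0^T p(t)\sin(n\pi t/T)\,dt$, and it assumes a made-up pressure function $p(t)=t(T-t)$ with $T=1$ (a hypothetical example that vanishes at both ends, not a real recording). The helper names `sine_amplitude` and `partial_sum` are illustrative only.

```python
import numpy as np

T = 1.0                                  # length of the time interval (assumed)
t = np.linspace(0.0, T, 10_001)          # sample times on [0, T]
dt = t[1] - t[0]

def p(t):
    """Hypothetical sound pressure function with p(0) = p(T) = 0."""
    return t * (T - t)

def sine_amplitude(n):
    """A_n = (2/T) * integral of p(t) sin(n pi t / T), by the trapezoid rule."""
    integrand = p(t) * np.sin(n * np.pi * t / T)
    integral = dt * (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))
    return 2.0 / T * integral

def partial_sum(num_terms):
    """Sum of the first num_terms pure tones with their amplitudes."""
    return sum(sine_amplitude(n) * np.sin(n * np.pi * t / T)
               for n in range(1, num_terms + 1))

# Largest pointwise error of the partial sums; it shrinks as terms are added.
errors = [float(np.max(np.abs(p(t) - partial_sum(N)))) for N in (1, 5, 25)]
print(errors)
```

For this particular $p$ the amplitudes can also be computed by hand: $A_n = 8T^2/(n\pi)^3$ for odd $n$ and $A_n = 0$ for even $n$, which the numerical values reproduce. The shrinking errors illustrate the convergence question mentioned above, here in the sup norm.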

You can apply the same analysis to a picture as well, where you view the scan lines of an image as pressure functions. The scan lines are then written in terms of basic periodic variations by determining amplitudes. The very "high frequencies" of pixel data are eliminated (called filtering) and the changes from one line to the next are then stored after some compression. This is the basic idea behind the JPG format.
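The filtering step can be sketched on a single scan line. This is only an illustration under stated assumptions: the eight pixel values are invented, the cosine basis used is the orthonormal discrete cosine transform (DCT-II), and the helper names `dct_matrix` and `low_pass` are hypothetical. Real JPG encoding works on two-dimensional 8×8 blocks and quantizes the coefficients rather than simply dropping them.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k is the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)      # frequency index (rows)
    j = np.arange(n).reshape(1, -1)      # sample index (columns)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)           # constant (zero-frequency) row
    return C

def low_pass(samples, keep):
    """Keep the `keep` lowest-frequency amplitudes and zero out the rest."""
    C = dct_matrix(len(samples))
    coeffs = C @ samples                 # amplitudes in the cosine basis
    coeffs[keep:] = 0.0                  # discard the high frequencies
    return C.T @ coeffs                  # inverse transform (C is orthogonal)

# Invented brightness values for one short scan line.
scan_line = np.array([52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0])
smoothed = low_pass(scan_line, keep=4)
print(np.round(smoothed, 1))
```

Keeping all eight amplitudes reproduces the scan line exactly; keeping fewer gives a smoother approximation with the same average brightness, which needs fewer numbers to store.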

The decomposition of functions into basic independent modes has numerous generalizations.


Algebraic analysis is finding an unknown [ function ] in terms of an infinite polynomial. The unknown function is specified by some kind of differential equation.

Functional analysis is finding an unknown in terms of an infinite series of functions. The simplest example is Fourier analysis, which gives a general solution of the empty-space wave equation.

Why would we do this? Surely it involves much more computation, especially if the end result has to be evaluated numerically, as needed in experimental physics and engineering problems? Remember, this is an infinite series of transcendental functions, which themselves have infinite representations and are slow to evaluate numerically.

In some cases, the basis functions may be easy, such as Legendre polynomials, which often appear in quantum theory; other times they are difficult functions such as Bessel functions, which even many undergraduate-level mathematicians don't understand well.

Even in the simplest case, Fourier series, pure mathematicians in the early days were unsure whether a Fourier decomposition of an unknown function was an accurate representation of it, and under what conditions: what ranges of dependent variables were safe? And how many terms should be used for a specified degree of accuracy?

Later, the mathematicians Sturm and Liouville showed that for ALL second-order linear differential equations, which are the vast majority used in science and engineering, the basis functions, also known as eigenfunctions, are always orthogonal (hence linearly independent), and the functional decomposition of any solution is a unique and accurate representation of the true solution.
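For reference, here is the statement in the usual textbook notation (the symbols $p$, $q$, $w$, $a$, $b$ are standard-notation assumptions and do not appear in the text above). A regular Sturm-Liouville problem on $[a,b]$ has the form $$ -\big(p(x)\,y'(x)\big)' + q(x)\,y(x) = \lambda\, w(x)\, y(x), $$ together with suitable boundary conditions at $a$ and $b$. Eigenfunctions $y_m, y_n$ belonging to different eigenvalues are then orthogonal with respect to the weight $w$: $$ \int_a^b y_m(x)\,y_n(x)\,w(x)\,dx = 0 \qquad (\lambda_m \ne \lambda_n). $$ The Fourier sine case is the special choice $p = w = 1$, $q = 0$ on $[0,T]$ with $y(0)=y(T)=0$, where $$ \int_0^T \sin(m\pi t/T)\,\sin(n\pi t/T)\,dt = 0 \qquad (m \ne n). $$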

Further, eigenfunctions always have recurrence relations, a good example being the Chebyshev (Tchebyshef) polynomials, which help in further analysis, both algebraic and numerical.
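As an illustration of such a recurrence, the Chebyshev polynomials of the first kind satisfy $T_0(x)=1$, $T_1(x)=x$ and $T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x)$. The sketch below (the function name `chebyshev_T` is made up for this example) evaluates $T_n$ using only the recurrence and compares it with the closed form $T_n(x)=\cos(n\arccos x)$ on $[-1,1]$.

```python
import numpy as np

def chebyshev_T(n, x):
    """Evaluate T_n(x) via the recurrence T_{k+1} = 2 x T_k - T_{k-1}."""
    x = np.asarray(x, dtype=float)
    t_prev = np.ones_like(x)             # T_0(x) = 1
    if n == 0:
        return t_prev
    t_curr = x.copy()                    # T_1(x) = x
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

x = np.linspace(-1.0, 1.0, 201)
closed_form = np.cos(5 * np.arccos(x))   # T_5 written via the cosine formula
max_difference = float(np.max(np.abs(chebyshev_T(5, x) - closed_form)))
print(max_difference)
```

The recurrence needs only multiplications and subtractions, so no transcendental function has to be evaluated, which is part of why such relations are useful numerically.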