Why does notation for functions seem to be abused and ambiguous?

$f(x)$ means both the map $x \mapsto \textrm{whatever}$ and the image of $x$ under $f$, depending on the context.

Some people would prefer a stricter convention of always writing the function as $f$. In practice I find there is usually little room for confusion, and saying "the function $f(x)$" conveniently reminds the reader what the independent variable of $f$ is (for instance, when the definition of $f$ also contains many constants).

However, as you point out there are exceptions where confusion does arise, particularly when taking derivatives. For example, is $$\frac{\partial f(x^2)}{\partial x}$$ the derivative of $f$ evaluated at $x^2$? Or the derivative of the composition of $f$ with $x^2$? What about $$\frac{\partial f}{\partial x}(x^2)?$$ Again, one can usually figure out what is meant, but here there is definitely a potential for confusion. With functions of multiple variables it gets even worse; for instance in physics you often define functions $L(x^i, x^{i+1})$ and then need to differentiate $$\frac{\partial}{\partial x^i} \sum_{j=0}^n L(x^{j}, x^{j+1}).$$ It's hard to write down an expression for this derivative that's not a complete abomination. You could go back and rename the independent variables of $L$ using placeholders less likely to lead to confusion, but perhaps better is to switch to notation like $D_1 f$ to denote partial differentiation of $f$ with respect to its first parameter.
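To make that last suggestion concrete, here is a sketch of how the examples above might read in such a notation. The assumptions are: $D_k$ denotes differentiation with respect to the $k$-th argument, $D$ is the ordinary derivative of a one-argument function, $s$ is a name introduced here for the squaring map $s(x) = x^2$, and the variables $x^0, \dots, x^{n+1}$ are independent. The two readings of the first example then look visibly different, $$(Df)(x^2) \qquad\text{versus}\qquad D(f\circ s)(x) = 2x\,(Df)(x^2),$$ and for an interior index $1 \le i \le n$ the sum differentiates to $$\frac{\partial}{\partial x^i} \sum_{j=0}^n L(x^{j}, x^{j+1}) = (D_1 L)(x^i, x^{i+1}) + (D_2 L)(x^{i-1}, x^i),$$ since only the terms $j=i$ and $j=i-1$ involve $x^i$.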


I think the question of notation being abused and ambiguous applies to many more things in mathematics than functions. I could (and I think this has been done before on this very site) make a list of notations (or expressions) whose meaning depends on context. In practice, a shorthand or convenience notation is rarely a problem.

However, I do think there are times when context is not sufficient. Consider how:

$f^2(x) = f(f(x))$

for most functions, but I have also seen the (very strange to me):

$\sin^2(x) = (\sin(x))^2$

which puzzled me greatly when I first saw it. But then again, I'm more of a programmer than a mathematician, so I do like it when expressions are unambiguous. I once asked a mathematician (back in the 1980s) "How do I know when $f^n(x)$ is iteration and when it is raising the result of application to a power?" and he thought for a few minutes and then said: "I think it is the latter when the function is transcendental." But I am not so sure of that answer: I've seen $\log^2x$ mean both $\log(\log x)$ and $(\log x)^2$ and it drives me crazy! (By the way $\log \log n$ appears often in algorithm analysis.)
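To see how far apart the two readings are, here is a minimal Python sketch. The helper names `iterate` and `power` are hypothetical, and base-2 logarithms are used only so that the numbers come out cleanly; nothing here is a standard library feature beyond `math.log2`.

```python
import math

def iterate(f, n):
    """Return the n-fold composition of f with itself (the 'iteration' reading)."""
    def composed(x):
        for _ in range(n):
            x = f(x)
        return x
    return composed

def power(f, n):
    """Return the function x -> (f(x))**n (the 'power' reading)."""
    return lambda x: f(x) ** n

x = 8.0
log_of_log = iterate(math.log2, 2)(x)   # log2(log2(8)) = log2(3), about 1.585
log_squared = power(math.log2, 2)(x)    # (log2(8))**2 = 3**2 = 9
```

In code the two meanings need two different names, which is exactly the distinction that the superscript notation leaves to context.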

I point this out because this is a case where the context is often insufficient to disambiguate the notation. So why did someone use the notation for raising the result to a power to begin with? I believe it was to save time writing parentheses! Yes, they traded convenience for ambiguity! But back when this notation became popular, there were more engineers than there were pure mathematicians and functional programmers concerned with function iteration. :)

EDIT: In the comments below someone said that the notation has a ring theory justification.

Now to return to your question, in the case of $f(x)$ referring to the function versus the result of application, personally, as one who does functional programming, it does make me sad to see "the function $f(x)$" when the codomain of $f$ is the real numbers, because I so badly want the codomain to be functions! Yes, I like higher-order functions, and I almost feel bad for those who.... oh, never mind.

The probable reason people write $f(x)$ when they mean only $f$ is that the former gives an indication of the arity of the function. That said, it does create an ambiguity, which you must resolve yourself, but the surrounding text usually makes that possible. Human beings are not obliged to be unambiguous and super-precise all the time, so we take notational liberties.

Now, obviously a problem can occur if you take this abuse of notation and try to use it in a program. I don't know many programming languages that would tolerate that kind of ambiguity.
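As a small illustration, here is a minimal Python sketch of what happens when the value $f(x)$ is handed over where the function $f$ is expected. Python is chosen only as an example language, and `apply_twice` is a hypothetical helper.

```python
def f(x):
    return 2 * x + 5

value = f(3)   # the image of 3 under f: the integer 11
func = f       # the map itself: a callable object

def apply_twice(g, x):
    """Apply the function g to x two times."""
    return g(g(x))

result = apply_twice(f, 3)   # f(f(3)) = f(11) = 27

# Passing "f(3)" where "f" is expected is rejected at run time,
# because an integer cannot be called.
passed_value_instead_of_function = False
try:
    apply_twice(f(3), 3)
except TypeError:
    passed_value_instead_of_function = True
```

The language forces a choice between `f` and `f(3)`; there is no context from which it could infer the intended one.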

As to your other point, yes, if someone says that $2x+5$ is a function, they are probably only doing so because they do not want to type or write $\lambda x. 2x+5$ (perhaps because they don't like Greek letters, just kidding) or any of the other countless representations for anonymous functions. Again, people are allowed to do this because they are being informal. When writing programs, yes we must say:

  • (x) -> 2*x + 5 // CoffeeScript
  • function(x){return 2*x+5} // JavaScript
  • (LAMBDA (x) (+ (* 2 x) 5)) ; Lisp
  • fn x => 2 * x + 5 (* ML *)
  • #(+ (* 2 %) 5) ; Clojure

and so on.

TL;DR It is allowed because it is informal, and yes you are expected to infer it from context. I've given some thoughts as to why some of the ambiguity might have arisen: the same reasons that people shortcut anything in communication! We can live with this in mathematical communication between people but not for programming.

ANSWER TO YOUR EDIT QUESTION:

You asked why we define functions using variables like $x$ and $y$ instead of defining them without reference to any variables. Now if your question was one of distinguishing

$$f =_{def} \lambda x. \lambda y. 2x+y$$

from

$$f(x,y) =_{def} 2x+y$$

then the answer is that the second is probably easier to write. However, we can do something more interesting, as is done in the programming language Clojure: let %1 be the first argument to the function, and %2 be the second argument, and so on, and define the brackets #( and ) to wrap a function expression. Now we can write:

$$f =_{def} \#(2(\%1) +\%2)$$

and in fact we can use anonymous function expressions. That particular notation might be a bit ugly, but I would encourage you to try to invent a nice notation, and change the world for the better. If it catches on, that is.
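For comparison, here is a rough Python sketch of the three styles of definition discussed above. The names `f_uncurried`, `f_curried`, `placeholder` and `f_positional` are all hypothetical, and `placeholder` only loosely imitates the `%1`, `%2` idea by giving the body access to its arguments by position; it is not how Clojure implements it.

```python
# f(x, y) = 2x + y, with named variables
def f_uncurried(x, y):
    return 2 * x + y

# f = lambda x. lambda y. 2x + y, one argument at a time
f_curried = lambda x: lambda y: 2 * x + y

def placeholder(body):
    """Hypothetical helper: build a function whose body sees its
    arguments only by position (args[0] plays %1, args[1] plays %2)."""
    return lambda *args: body(args)

f_positional = placeholder(lambda args: 2 * args[0] + args[1])

a = f_uncurried(3, 4)    # 2*3 + 4 = 10
b = f_curried(3)(4)      # 10
c = f_positional(3, 4)   # 10
```

All three denote the same map; they differ only in how the arguments are named and supplied.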


Most times, functions can be defined by an expression (e.g., $2x+5$), because using the expression we can guess many things about the function (like its domain, and the "rule" for getting an output given an input). The phrasing "$f(x)$" is just a generic expression.

Also, context and notation help a lot, so if you are reading a book about vector spaces, you know that when the author says

the function $Ax+b$

he really means something like:

the function $f:E\to F,x\mapsto Ax+b$, where $E$ and $F$ are vector spaces, $b\in F$ and $A$ is a matrix.

You might even know that $E=F$, or that $E=\mathbb R^n$, or something like that. Anyway, as you can see, the first is much simpler.

As for the derivatives, you can differentiate an expression. I've never seen a formalization of this, but people do write $$\frac{d}{dx}\left\{2x^2 + 3x + 5\right\}=4x+3$$ without ever mentioning functions. So $df(x)/dx$ is an expression, while $df/dx$ is a function $df/dx:D\to Y$, where $D$ is the set of points where $f$ is differentiable, and $Y$ is $\mathbb R$, or more generally, a space of linear operators.
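The function/value distinction for derivatives can be sketched in Python as follows. This is only an illustration under assumptions: `derivative` is a hypothetical helper that approximates the derivative numerically with a central difference (step `h` chosen arbitrarily), not a symbolic differentiation of the expression.

```python
def derivative(f, h=1e-6):
    """Return a function approximating df/dx by a central difference."""
    def df(x):
        return (f(x + h) - f(x - h)) / (2 * h)
    return df

def f(x):
    return 2 * x**2 + 3 * x + 5

df = derivative(f)     # plays the role of df/dx: a function
slope_at_1 = df(1.0)   # a number, close to 4*1 + 3 = 7
```

Here `df` corresponds to $df/dx$, while `df(1.0)` is a single value of the expression $4x+3$.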

EDIT: as for your added question, you can define a function without mentioning a variable ("Let $f$ be the function that takes a real number and gives the quintuple of its square"), but usually, it's easier to write "Let $f(x)=5x^2$". Also, the expression "$5x^2$" is much more familiar for most readers than writing "the quintuple of the square of a real number".