Why do we say function "parameterized by" vs just function of (x,y,z,...)?

Strictly speaking, you're right. It's sometimes difficult to explain clearly what a parameter is and what a variable is. When you write $f(x;a,b,c)$, there is no mathematical difference between $x$ and $a$, for example: they're both variables of the function $f$. I think the real difference lies in how we interpret things. When we have an expression such as $f(x;\theta)$ or $f_\theta(x)$, $x$ and $\theta$ don't play exactly the same role, which is why we use different notations for them. It's really for the sake of clarity: when an expression involves many different variables, using different notations is an easy way to signal what role each of them plays.

In statistics, the notion of a parameter is even more important. As you may know, there's a whole branch of statistics called parametric statistics, in which we introduce families of probability distributions indexed by a parameter: $$\mathcal{P}=\{P_\theta,\theta\in\Theta\}$$ where $\Theta$ is some subset of $\mathbb{R}^d$. Note that you could actually see this as a function from $\Theta$ to the set of probability distributions on a certain space. In many cases, $P_\theta$ has a density w.r.t. the Lebesgue measure, and we denote it by $f(x;\theta)$ or $f_\theta(x)$. You see in this example that $x$ and $\theta$ have very different meanings: $\theta$ is the parameter that comes from our statistical model, whereas $x$ is a variable of integration.
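For instance (a standard illustration, not tied to any particular application), take $\theta=(\mu,\sigma^2)$ ranging over $\Theta=\mathbb{R}\times(0,\infty)$ and $$f(x;\theta)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$ Each choice of $\theta$ picks out one distribution $P_\theta$ from the Gaussian family, while $x$ is merely the point at which we evaluate (or integrate) that density.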

In other cases, we use the "opposite" convention. For example, the likelihood function $l(\theta;x_1,\dots,x_n)$ is considered primarily as a function of $\theta$ (which, most of the time, we want to maximize), while $x_1,\dots,x_n$ are treated as fixed quantities (they are the observations; we can't change them). Once again, it's a question of context and of the meaning of the variables.
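As a concrete illustration (using the Gaussian family again, purely as an example): with known variance $\sigma^2=1$ and observations $x_1,\dots,x_n$, the likelihood $$l(\mu;x_1,\dots,x_n)=\prod_{i=1}^n\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{(x_i-\mu)^2}{2}\right)$$ is read as a function of $\mu$ alone; maximizing it gives $\hat\mu=\frac1n\sum_{i=1}^n x_i$, with the data entering only as fixed numbers in the formula.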


While the answers already given are correct, I suspect that they may be bringing in terminology and notations you are unfamiliar with. If so, let me try to recast the answer more simply.

You are correct that $f(x; a, b, c)$ is exactly the same thing as $f(x, a, b, c)$. The difference is in how we are using $x$ versus how we are using $a, b, c$. With $f(x, a, b, c)$, all four values are equally important, equally of interest. With $f(x; a, b, c)$, we are considering $f$ as a function of $x$ only; $a, b, c$ are simply values that we use in defining that function. By letting these values vary as well, the mathematics we do applies to all the different functions of $x$ defined by the various possible choices of $a, b, c$, instead of our having to work out each particular case separately.
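If a programming analogy helps, this is essentially partial application: fix $a, b, c$ once and you get a function of $x$ alone. Here is a minimal sketch in Python (the quadratic used for $f$ is just a placeholder chosen for illustration):

```python
def make_f(a, b, c):
    """Fix the parameters a, b, c and return a function of x only."""
    def f(x):
        # An arbitrary stand-in for f(x; a, b, c); the point is that
        # a, b, c are baked in, and only x remains free.
        return a * x**2 + b * x + c
    return f

# Each choice of (a, b, c) yields a different function of x:
f1 = make_f(1, 0, -1)   # x^2 - 1
f2 = make_f(2, 3, 5)    # 2x^2 + 3x + 5
print(f1(2), f2(2))     # 3 19
```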

For example, when talking of linear functions, we write things such as $$y = mx + b.$$ You could consider this as defining $y$ as a function of three variables, $x, m, b$, but what is important is the dependence of $y$ on $x$. The slope $m$ and intercept $b$ are just values specifying which line is represented by the relationship between $y$ and $x$. So if we want to find the root, where $y = 0$, the answer is $x = \frac{-b}{m}$, not $b = -mx$ or $m = \frac{-b}{x}$.
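To put numbers on it: for the particular line with $m = 2$ and $b = 6$, the root is $x = -6/2 = -3$. The result is a value of $x$, expressed in terms of the fixed values $m$ and $b$.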

That is what the notation $f(x; a, b, c)$ means. It is just telling you that $x$ is the one whose variation is important to us. The others we are going to treat as fixed (although not necessarily known) values.


Think of something like this: $f(x;\sigma) = e^{-x^2/\sigma^2}$.

If you read, for example, "the derivative of $f$", you probably immediately understand that we are speaking of $\partial f/\partial x$, because you recognize $f$ as a function of $x$ whose definition contains some $\sigma$ which is not a "variable".

On the other hand, if you read $f(x, \sigma) = e^{-x^2/\sigma^2}$, a sentence like "the derivative of $f$" would be ambiguous: is it $\partial f/\partial x$? $\partial f/\partial \sigma$? $\nabla f = (\partial f/\partial x, \partial f/\partial \sigma)$?

So without this distinction, one would have to write explicitly, every time, "the derivative of $f$ with respect to the variable $x$".
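If it helps to see this on a computer (a small sketch with sympy, offered only as an illustration), note that the software has no notion of "parameter": you must always say which variable you differentiate with respect to, which is exactly the information the semicolon conveys to a human reader.

```python
import sympy as sp

x, sigma = sp.symbols("x sigma", positive=True)
f = sp.exp(-x**2 / sigma**2)

# Both derivatives are equally legitimate computations; the notation
# f(x; sigma) is what tells the reader that the first one is meant.
print(sp.diff(f, x))      # -2*x*exp(-x**2/sigma**2)/sigma**2
print(sp.diff(f, sigma))  #  2*x**2*exp(-x**2/sigma**2)/sigma**3
```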