The usage of chain rule in physics

You are correct that you cannot (globally) write velocity as a function of distance. For example, as one commenter has already mentioned, throw a ball directly up in the air and wait for it to come down. When the ball is at height $h$ on the way up, it has a positive (upward directed) velocity. When it is at the same height $h$ on the way down, it has a negative (downward directed) velocity. So velocity is definitely not a (global) function of distance.

But this much is true: For any height $h$ except for the maximum height the ball ever reaches, there is some open interval around $h$ --- some range of heights from $h-\epsilon$ to $h+\epsilon$ --- in which you can treat velocity as a well-defined function of height while the ball is on its way up, and another well-defined function of height while the ball is on its way back down. And moreover that function is differentiable and obeys the chain rule. All of this is part of the content of the implicit function theorem, which you can google for.
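
As a concrete sketch, assume the ball is launched from $h = 0$ with speed $v_0$ under constant gravitational acceleration $g$ and no air resistance (the symbols $v_0$, $g$ and $h_{\max}$ are introduced here for illustration only). Energy conservation then gives the two local functions explicitly, for $0 \le h < h_{\max} = v_0^2/(2g)$: $$ v_{\pm}(h) = \pm\sqrt{v_0^2 - 2 g h} $$ with the plus sign on the way up and the minus sign on the way down. On either branch the chain rule gives $$ \frac{dv}{dt} = \frac{dv}{dh}\,\frac{dh}{dt} = \frac{dv}{dh}\,v = \left(-\frac{g}{v}\right) v = -g $$ as expected. The factor $dv/dh = -g/v$ diverges as $h \to h_{\max}$, where $v \to 0$, which is why the maximum height has to be excluded.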

If you just write velocity as a function of height, you do have to be careful to make it clear from context which of the two functions --- the "on the way up" function and the "on the way down" function --- you're referring to. You also have to make sure you don't try to pull this stunt when the ball is at the very top of its trajectory (or more generally, at points where its velocity is zero). Many books take it for granted that you're being careful about this, so they don't have to worry about it on your behalf.
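
To make the "which branch" caveat tangible, here is a minimal numerical sketch in Python. It assumes an idealized throw (constant gravity, no drag, launch from $h = 0$); the values of `G` and `V0` and all function names are hypothetical choices for illustration. It compares the two velocity-versus-height branches with the globally defined velocity-versus-time function, checks the chain rule on each branch, and shows the slope $dv/dh$ growing as the ball nears the top.

```python
import math

G = 9.81   # gravitational acceleration, m/s^2 (assumed constant, no drag)
V0 = 12.0  # launch speed, m/s (hypothetical value)
H_MAX = V0 * V0 / (2.0 * G)  # apex height, where the velocity is zero


def height(t):
    """Height of the ball at time t after launch from h = 0."""
    return V0 * t - 0.5 * G * t * t


def velocity(t):
    """Velocity as a function of time; this one is globally well defined."""
    return V0 - G * t


def v_up(h):
    """Velocity as a function of height, valid only on the way up."""
    return math.sqrt(V0 * V0 - 2.0 * G * h)


def v_down(h):
    """Velocity as a function of height, valid only on the way down."""
    return -math.sqrt(V0 * V0 - 2.0 * G * h)


def slope(branch, h, dh=1e-6):
    """Central-difference estimate of dv/dh on the chosen branch."""
    return (branch(h + dh) - branch(h - dh)) / (2.0 * dh)


# Two instants at which the ball is at the same height.
t_up = 0.4
t_down = 2.0 * V0 / G - t_up
h = height(t_up)

# Same height, opposite velocities: each branch matches its own instant.
print(velocity(t_up), v_up(h))
print(velocity(t_down), v_down(h))

# Chain rule on each branch: dv/dt = (dv/dh) * (dh/dt) = (dv/dh) * v.
print(slope(v_up, h) * v_up(h), slope(v_down, h) * v_down(h))

# Approaching the apex, the magnitude of dv/dh keeps growing.
print(slope(v_up, H_MAX - 1e-2), slope(v_up, H_MAX - 1e-4))
```

Both chain-rule products should come out close to `-G`, while the last line illustrates why the trick fails exactly where the velocity vanishes.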

Well, this is the most common thing for which mathematicians make fun of physicists: we don't hesitate to cancel differentials, and we "NEVER" check whether we are allowed to apply some rule to our equations. The thing is that almost all functions that appear in nature or in real-life systems are, most of the time, continuous and differentiable. There are, for sure, some special cases. But for most simple tasks, e.g. in mechanics, this is quite valid.

So, in the case of $v$: in order to define velocity, the object has to change its position over some amount of time. Furthermore, we don't have infinite speeds in real life. This implies that $dx/dt$ always has some finite value. From this it follows that $v$ can be rewritten as a function of either $t$ or $x$.

I am not sure whether there are special cases, but for physicists this is not important, because in 99.9% of cases this will be true. If there are special cases, they would be "obviously strange". Keep in mind that, at least in theory, we always check our calculations against experiment, so we (generally) have an experimental proof instead of a mathematical one.

It is true that in nature there is only one truly independent variable: time. All others are "pseudo-independent". They are variables humans bless as independent in order to answer what-if scenarios and to establish mathematical models of systems by way of separation of variables. The common term for these "pseudo-independent" quantities is generalized coordinates.

Consider a complex mechanical system, such as a person launching a ball while riding a skateboard. First, we decide what the degrees of freedom are and assign generalized coordinates to them. These are simple measurable quantities (distances, angles, or other geometric quantities) that form a generalized coordinate vector $$\boldsymbol{q} = \pmatrix{x_1 \\ \theta_2 \\ \vdots \\ q_j \\ \vdots} \tag{1}$$ In this example there are $n$ degrees of freedom. The positions of all the important points on the mechanism can be found from these $n$ quantities. If there are $k$ kinematic hardpoints (such as joints, geometric centers, etc.), then the Cartesian position vector of the $i$-th hardpoint, $i = 1 \ldots k$, is some function of the generalized coordinates and time $$ \boldsymbol{r}_i = \boldsymbol{\mathrm{pos}}_i(t,\, \boldsymbol{q}) \tag{2}$$

Here comes the chain rule part. Assuming that (2) is differentiable with respect to the generalized coordinates, and that contact conditions do not change due to separation or loss of traction, the velocity vector of each hardpoint is found by the chain rule

$$ \boldsymbol{v}_i = \boldsymbol{\mathrm{vel}}_i(t,\,\boldsymbol{q},\,\boldsymbol{\dot{q}}) = \frac{\partial \boldsymbol{r}_i}{\partial t} + \frac{\partial \boldsymbol{r}_i }{\partial x_1} \dot{x}_1 + \frac{\partial \boldsymbol{r}_i }{\partial \theta_2} \dot{\theta}_2 + \ldots + \frac{\partial \boldsymbol{r}_i }{\partial q_j} \dot{q}_j + \ldots \tag{3} $$ where $q_j$ is the $j$-th element of $\boldsymbol{q}$, and $\dot{q}_j$ is its speed (linear or angular).

The above is not a division of infinitesimals, but the multiplication of a partial derivative $\tfrac{\partial \boldsymbol{r}_i }{\partial q_j}$ with the particular coordinate degree of freedom speed $\dot{q}_j$.
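
As an illustration of (3), here is a minimal Python sketch for one hypothetical hardpoint: the bob of a pendulum (angle $\theta_2$) hanging from a cart (position $x_1$) that rides on a platform drifting at constant speed, so that the position depends explicitly on $t$ as well. The mechanism, the constants `LENGTH` and `DRIFT`, and all function names are assumptions made for this example, not part of the answer above. Each partial derivative is estimated by varying one quantity while holding the others fixed, and the results are multiplied by the coordinate speeds and summed.

```python
import math

LENGTH = 0.5  # pendulum length in m (hypothetical)
DRIFT = 0.2   # platform drift speed in m/s (hypothetical)


def pos(t, q):
    """Cartesian position r(t, q) of the bob; q = (x1, theta2)."""
    x1, theta2 = q
    return (DRIFT * t + x1 + LENGTH * math.sin(theta2),
            -LENGTH * math.cos(theta2))


def d_pos_dt(t, q, step=1e-6):
    """Partial derivative of pos with respect to t, holding q fixed."""
    hi, lo = pos(t + step, q), pos(t - step, q)
    return [(a - b) / (2.0 * step) for a, b in zip(hi, lo)]


def d_pos_dq(t, q, j, step=1e-6):
    """Partial derivative of pos with respect to q[j], holding t and the rest of q fixed."""
    q_hi, q_lo = list(q), list(q)
    q_hi[j] += step
    q_lo[j] -= step
    hi, lo = pos(t, q_hi), pos(t, q_lo)
    return [(a - b) / (2.0 * step) for a, b in zip(hi, lo)]


def vel(t, q, q_dot):
    """Equation (3): partial time derivative plus sum over j of (dr/dq_j) * q_dot_j."""
    v = d_pos_dt(t, q)
    for j, speed in enumerate(q_dot):
        dr_dqj = d_pos_dq(t, q, j)
        v = [vi + di * speed for vi, di in zip(v, dr_dqj)]
    return tuple(v)


def vel_exact(t, q, q_dot):
    """Hand-differentiated velocity of the same hardpoint, for comparison."""
    x1, theta2 = q
    x1_dot, theta2_dot = q_dot
    return (DRIFT + x1_dot + LENGTH * math.cos(theta2) * theta2_dot,
            LENGTH * math.sin(theta2) * theta2_dot)


# Arbitrary test state: time, generalized coordinates, generalized speeds.
state = (1.0, (0.3, 0.7), (1.5, -2.0))
print(vel(*state))
print(vel_exact(*state))
```

The two printed vectors should agree to within the finite-difference error, even though no quotient of infinitesimals is ever "cancelled".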

Maybe you are more comfortable with this more rigorous notation using partial derivatives than with what you have seen so far. The term partial derivative means: take the derivative by varying only one quantity while holding all others constant. This is what allows us to use the pseudo-independent quantities $q_j$ for the evaluation of the true derivative with respect to time (the one actual independent quantity).

The same logic is applied to higher derivatives as well

$$ \boldsymbol{a}_i = \boldsymbol{\mathrm{acc}}_i(t,\,\boldsymbol{q},\,\boldsymbol{\dot{q}},\,\boldsymbol{\ddot{q}}) = \frac{\partial \boldsymbol{v}_i}{\partial t} + \ldots + \frac{ \partial \boldsymbol{v}_i}{\partial q_j}\, \dot{q}_j + \ldots + \frac{ \partial \boldsymbol{v}_i}{\partial \dot{q}_j} \,\ddot{q}_j + \ldots \tag{4} $$

The last term might be a bit confusing, but it becomes clearer when expressed in terms of actual degrees of freedom. Consider the degree of freedom $\theta_2$ and its first and second time derivatives $\omega_2 = \dot{\theta}_2$ and $\alpha_2 = \ddot{\theta}_2$. Then the terms $\frac{ \partial \boldsymbol{v}_i}{\partial \theta_2} \omega_2 $ and $\frac{ \partial \boldsymbol{v}_i}{\partial \omega_2} \alpha_2 $ are, I hope, clearer, since $\boldsymbol{v}_i$ depends on both the position $\theta_2$ and the speed $\omega_2$.
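
As a worked sketch, take a hypothetical hardpoint: the bob of a pendulum of length $\ell$ (angle $\theta_2$) hanging from a cart at position $x_1$, with no explicit time dependence. The mechanism and the symbol $\ell$ are assumptions made for this illustration. Its position and velocity are $$ \boldsymbol{r} = \pmatrix{x_1 + \ell \sin\theta_2 \\ -\ell \cos\theta_2}, \qquad \boldsymbol{v} = \pmatrix{\dot{x}_1 + \ell \cos\theta_2\, \omega_2 \\ \ell \sin\theta_2\, \omega_2} $$ The nonzero terms of (4) are then $$ \frac{\partial \boldsymbol{v}}{\partial \theta_2}\,\omega_2 = \pmatrix{-\ell \sin\theta_2\, \omega_2^2 \\ \ell \cos\theta_2\, \omega_2^2}, \qquad \frac{\partial \boldsymbol{v}}{\partial \omega_2}\,\alpha_2 = \pmatrix{\ell \cos\theta_2\, \alpha_2 \\ \ell \sin\theta_2\, \alpha_2}, \qquad \frac{\partial \boldsymbol{v}}{\partial \dot{x}_1}\,\ddot{x}_1 = \pmatrix{\ddot{x}_1 \\ 0} $$ and their sum is the acceleration $$ \boldsymbol{a} = \pmatrix{\ddot{x}_1 + \ell \cos\theta_2\, \alpha_2 - \ell \sin\theta_2\, \omega_2^2 \\ \ell \sin\theta_2\, \alpha_2 + \ell \cos\theta_2\, \omega_2^2} $$ The $\omega_2^2$ terms are the centripetal part, which comes from the dependence of $\boldsymbol{v}$ on the position $\theta_2$, and the $\alpha_2$ terms are the tangential part, which comes from its dependence on the speed $\omega_2$.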