Conjugacy in $S_n$ with composing permutations left to right vs. right to left
The "ugly" formula $\tau \sigma \tau^{-1} = (\tau^{-1}(\sigma_1)\ \tau^{-1}(\sigma_2) \ldots\ \tau^{-1}(\sigma_n))$ for a cycle $\sigma = (\sigma_1\ \sigma_2 \ldots\ \sigma_n)$ can easily be rewritten as $\tau^{-1} \sigma \tau = (\tau(\sigma_1)\ \tau(\sigma_2) \ldots\ \tau(\sigma_n))$ by replacing $\tau$ with $\tau^{-1}$. So I would not say that the left-to-right convention is "inferior". Each convention has advantages that depend on the context. In an abstract group $G$, words are usually written from left to right as $w=a_1^{e_1}\cdots a_r^{e_r}$, but if the elements are maps, it sometimes seems better to use the right-to-left convention in order to have $(fg)(x)=f(g(x))$.
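For anyone who wants to check the two conjugation formulas numerically, here is a minimal Python sketch. It assumes permutations of $\{0,\ldots,n-1\}$ stored as tuples and products composed left to right; the helper names (`compose_ltr`, `inverse`, `from_cycle`) and the sample permutations are my own choices for illustration, not part of any library.

```python
# Permutations of {0, ..., n-1} stored as tuples: p[i] is the image of i.

def compose_ltr(p, q):
    """Left-to-right product pq: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    """Inverse permutation: if p sends i to j, the inverse sends j to i."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def from_cycle(cycle, n):
    """Permutation of {0..n-1} sending cycle[k] to cycle[k+1] (cyclically)."""
    p = list(range(n))
    for k, a in enumerate(cycle):
        p[a] = cycle[(k + 1) % len(cycle)]
    return tuple(p)

n = 5
sigma_cycle = (0, 1, 2)               # sample cycle sigma (an assumption)
sigma = from_cycle(sigma_cycle, n)
tau = from_cycle((0, 3, 1, 4), n)     # sample tau (an assumption)
tau_inv = inverse(tau)

# tau^{-1} sigma tau, read left to right, relabels the cycle by tau.
conj = compose_ltr(compose_ltr(tau_inv, sigma), tau)
relabelled = from_cycle(tuple(tau[a] for a in sigma_cycle), n)
assert conj == relabelled

# tau sigma tau^{-1}, read left to right, relabels the cycle by tau^{-1}.
conj2 = compose_ltr(compose_ltr(tau, sigma), tau_inv)
relabelled2 = from_cycle(tuple(tau_inv[a] for a in sigma_cycle), n)
assert conj2 == relabelled2
```

Swapping `compose_ltr` for a right-to-left product exchanges the roles of the two formulas, which is the point of the remark above.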
Left-to-right is basically a by-product of Western languages being read this way. The convention has groups acting on the right (group actions are the raison d'être of group theory), and the permutation product is a consequence.
If you have a group $G$ acting on a set $\Omega$, then $\omega^{gh}=(\omega^g)^h$ in left-to-right action, while when acting on the left we have at best that $(gh)(\omega)=g(h(\omega))$ (or in some convention even the perverse $h(g(\omega))$). For many group theorists (the use seems to differ between group theory and other areas) the first version is easier to write down, especially if the product gets longer.
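To illustrate how the right action reads for a longer product, here is a small Python sketch. It assumes permutations of $\{0,1,2,3\}$ stored as tuples; the function names and the three sample permutations are hypothetical, chosen only for this example.

```python
from functools import reduce

def act_right(omega, g):
    """Right action omega^g: the image of the point omega under g."""
    return g[omega]

def compose_ltr(g, h):
    """Left-to-right product gh: apply g first, then h."""
    return tuple(h[g[i]] for i in range(len(g)))

def compose_rtl(g, h):
    """Right-to-left product gh: (gh)(x) = g(h(x))."""
    return tuple(g[h[i]] for i in range(len(g)))

# Three sample permutations of {0, 1, 2, 3} (assumptions for illustration).
g = (1, 2, 0, 3)
h = (0, 3, 2, 1)
k = (3, 0, 1, 2)
word = [g, h, k]

omega = 0

# Right action: the word is consumed in reading order, ((omega^g)^h)^k.
step_by_step = reduce(act_right, word, omega)
product_ltr = reduce(compose_ltr, word)
assert act_right(omega, product_ltr) == step_by_step

# Left action: the same word is applied innermost-first, g(h(k(omega))).
product_rtl = reduce(compose_rtl, word)
assert product_rtl[omega] == g[h[k[omega]]]
```

With the right action, folding over the word from left to right matches the order in which it is written; with the left action the rightmost factor acts first.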
A free bonus is that right action corresponds to row vectors, which are easier to typeset than column vectors.
In my (biased) view, the main reasons for the right-to-left convention are its historical use in calculus (one writes $\sin(a)$, not $a^{\sin}$, even though the second order is how many pocket calculators take their input nowadays) and in linear algebra textbooks, which tend to write systems of equations universally as $Ax=b$, not $xA=b$.