Why is modular arithmetic defined as a "similarity" and not an operation?
The "more useful" claim comes about because:

- it follows a pattern that we see often in other parts of math, and
- it saves you the trouble of having to pick out a "special" element of each equivalence class to use as the "name" of that class.
In your example, for $m = 3$, the class $\{0, \pm 3, \pm 6, \pm 9, \ldots \}$ is called $0$, and similarly for the things named $1$ and $2$, and that works out fine. In this case, the element "0" is the "special" one in its equivalence class.
But if you look at something like $S^3$, the set of pairs of complex numbers $(z, w)$ with $|z|^2 + |w|^2 = 1$, there's a lovely equivalence relation on these: we say that $(z, w) \sim (z', w')$ whenever $zw' = wz'$. It turns out that geometrically $S^3$ consists of all points in 4-dimensional space whose distance from the origin is 1, and each equivalence class turns out to be a circle within this "3-sphere". So we can write the 3-sphere as a union of these circles. But for each circle, there's no 'preferred' choice of $(z, w)$ to represent that circle, so a naming scheme like the one you've used isn't really available. But the decomposition is still very useful: it's called the Hopf Fibration, and is one of the fundamental examples in homotopy theory.
In computer science, it's very common to want to pick out special elements so as to deal with individual things rather than sets, so "mod" (as an operation on ints) is very often defined in computer languages in exactly the way you suggest (although what $n \bmod k$ means when $n$ or $k$ is negative turns out to be profoundly undecided among computer scientists -- almost every imaginable definition has been used, and no two seem to agree).
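To illustrate the disagreement over negative operands, here is a small Python sketch (function names are mine, chosen for illustration) contrasting two of the most common conventions: the floor-based remainder that Python's built-in `%` uses, and the truncation-based remainder found in C, Java, and Rust.

```python
# Two common conventions for "n mod k" with negative operands.
# Both results are congruent to n mod k; they just pick different reps.

def mod_floor(n, k):
    """Remainder with the sign of the divisor (Python's built-in %)."""
    return n - k * (n // k)          # // floors toward -infinity

def mod_trunc(n, k):
    """Remainder with the sign of the dividend (C/Java/Rust '%')."""
    return n - k * int(n / k)        # int() truncates toward zero

print(-7 % 3)             # Python's answer: 2
print(mod_floor(-7, 3))   # also 2
print(mod_trunc(-7, 3))   # -1, which is what C's  -7 % 3  gives
```

Note that $2$ and $-1$ differ by $3$, so both answers are valid representatives of the same equivalence class — which is exactly why the languages can disagree without either being "wrong."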
Let me make one more argument in favor of the "equivalence class" definition. If you think of $\Bbb Z$ and $\Bbb Z/ n \Bbb Z$, with the latter being the set of equivalence classes, then there's a very natural function from the first to the second: $$ p: \Bbb Z \to \Bbb Z/ n \Bbb Z: k \mapsto \text{the class containing $k$} $$
By contrast, writing down the "rule" that maps $k$ to its modular remainder is a (small) pain. The formula I've given for "projection to the quotient" is the same for any equivalence relation, and is so simple that we hardly need to remember it.
When it comes time to define operations on the quotient (like "modular addition"), the structure of the proof that the operation makes sense is (in the equivalence class case) always the same: you say
"Suppose that $a \sim b$ and $a' \sim b'$; we will show that $$ p(a+a') = p(b + b') $$ thus showing that addition on the original set ($\Bbb Z$) 'passes to the quotient.' Since $a \sim b$, there's an integer $u$ with $a = b + un$; since $a' \sim b'$, there's an integer $v$ with $a' = b' + vn$. But then $a+a' = (b + b') + (u+v)n$, showing that $a+a' \sim b + b'$."
For some equivalence relations, the details of that middle paragraph are far more complicated, but the structure remains constant, which can be a big help in letting you know what you must prove.
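The proof above can be checked mechanically. Here is a small Python sketch (the helper name `same_class` is mine) that verifies, for a range of integers, that swapping each summand for a different representative of its class never changes the class of the sum:

```python
# Check that addition "passes to the quotient" mod n:
# if a ~ b and a' ~ b', then a + a' ~ b + b'.

def same_class(x, y, n):
    """x ~ y  iff  n divides x - y."""
    return (x - y) % n == 0

n = 5
for a in range(-10, 11):
    for ap in range(-10, 11):
        b, bp = a + 3 * n, ap - 7 * n          # other reps of the same classes
        assert same_class(a, b, n) and same_class(ap, bp, n)
        assert same_class(a + ap, b + bp, n)   # the sum's class is unchanged
print("addition is well-defined on Z/5Z")
```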
Operational use of mod (remainder) is often more convenient in (mechanical) computational contexts, whereas the relational use (congruences) yields more flexibility in theoretical contexts.
The difference amounts to whether it is more convenient to work with general equivalence classes, or canonical / normal representatives ("reps") thereof. For example, it would be quite cumbersome to state the laws of fraction arithmetic if one required all fractions to be in normal (reduced) form, i.e. in lowest terms. Instead, it proves more convenient to have the flexibility to work with arbitrary equivalent fractions. For example, this allows us to state the fraction addition rule in very simple form by first choosing convenient reps having a common denominator.
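The fraction-addition rule described above can be sketched in a few lines of Python (the function name is mine): we deliberately work with *unreduced* equivalent fractions sharing a common denominator, and reduce only at the very end.

```python
# a/b + c/d via the common-denominator reps (a*d)/(b*d) and (c*b)/(d*b).
from math import gcd

def add_fractions(a, b, c, d):
    """Add a/b + c/d; returns the reduced (numerator, denominator)."""
    num, den = a * d + c * b, b * d   # equivalent, generally unreduced reps
    g = gcd(num, den)
    return num // g, den // g         # reduce only once, at the end

print(add_fractions(1, 6, 1, 4))      # (5, 12), i.e. 1/6 + 1/4 = 5/12
```

The intermediate fractions $\frac{4}{24}$ and $\frac{6}{24}$ are not in lowest terms, and that is precisely the flexibility that keeps the rule simple.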
Analogously, in modular arithmetic the remainder $\,a\ {\rm mod}\ m\,$ may not be the most convenient choice of representative of the equivalence class $\,[a]_m =\, a + m\,\Bbb Z.\,$ For example, the formula for casting out elevens exploits that $\ {\rm mod}\ 11\!:\ 10\equiv -1\,\Rightarrow\,10^n\equiv (-1)^n\equiv \pm1,\,$ which involves choosing the rep of least magnitude $\,\color{#c00}{\bf -1}\,$ vs. $\,\color{#0a0}{10}\in [10]_{11}\! = \{\ldots,\, -23,-12,\color{#c00}{\bf -1},\color{#0a0}{10},21,\,\ldots\}.\,$ Hence, analogous to fraction addition, we choose a rep $\,-1\,$ whose powers are easier to compute. Using least-magnitude reps often simplifies other computations too, e.g. the Euclidean algorithm.
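Casting out elevens is easy to express in code. The sketch below (function name mine) uses the rep $10\equiv -1\pmod{11}$, so $10^n\equiv(-1)^n$ and a number is congruent mod $11$ to the alternating sum of its digits:

```python
# Casting out elevens: n is congruent mod 11 to the alternating sum of its
# digits, because 10 = -1 (mod 11) gives 10^k = (-1)^k.

def mod11_by_alternating_sum(n):
    """Alternating digit sum (units digit weighted +1), reduced mod 11."""
    digits = [int(d) for d in str(n)]
    s = sum(d * (-1) ** i for i, d in enumerate(reversed(digits)))
    return s % 11

for n in (0, 11, 121, 918273, 123456789):
    assert mod11_by_alternating_sum(n) == n % 11
print("casting out elevens agrees with % 11")
```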
Thus the use of congruence classes (vs. canonical reps) provides much greater flexibility, which often yields great simplifications.
I think that there are not great distinctions to be made here. $\equiv$ tends to be used when emphasising the underlying equivalence classes. Of course operations can be defined on equivalence classes provided it doesn't matter which member of the class you take as a representative. The equivalence sign tends to be used until this is securely established.
$\equiv$ is also used in modular arithmetic to state that the result of a calculation falls within a certain equivalence class, without implying that every member of the class is involved. An example would be the Chinese Remainder Theorem. The same idea could be expressed using $\in$ and the notation associated with sets. $\equiv$ tends to be more convenient.
As an example, the usual algebraic way of defining the rationals as the field of fractions of the integers defines rational numbers as equivalence classes, but no-one in elementary work would use anything other than $=$ for arithmetic operations.