What can I do with measure theory that I can't with probability and statistics?
First, there are things that are much easier given the abstract formulation of measure theory. For example, let $X,Y$ be independent random variables and let $f:\mathbb{R}\to\mathbb{R}$ be a continuous function. Are $f\circ X$ and $f\circ Y$ independent random variables? The answer is utterly trivial in the measure-theoretic formulation of probability, but very hard to express in terms of cumulative distribution functions. Similarly, convergence in distribution is really hard to work with in terms of cumulative distribution functions but is easily expressed with measure theory.
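To see why, here is a sketch of the standard measure-theoretic argument; the only ingredients are that a continuous $f$ is Borel measurable and that independence of $X$ and $Y$ means independence of the $\sigma$-algebras $\sigma(X)$ and $\sigma(Y)$:
$$\sigma(f\circ X)=\bigl\{(f\circ X)^{-1}(B):B\in\mathcal{B}(\mathbb{R})\bigr\}=\bigl\{X^{-1}\bigl(f^{-1}(B)\bigr):B\in\mathcal{B}(\mathbb{R})\bigr\}\subseteq\sigma(X),$$
and likewise $\sigma(f\circ Y)\subseteq\sigma(Y)$. Sub-$\sigma$-algebras of independent $\sigma$-algebras are independent, so $f\circ X$ and $f\circ Y$ are independent. Try expressing the same argument through the joint cumulative distribution function of $(f\circ X, f\circ Y)$ to see the contrast.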
Then there are things that one can consume without much understanding, but that require measure theory to actually understand and be comfortable with. It may be easy to get a good intuition for sequences of coin flips, but what about continuous-time stochastic processes? How irregular can their sample paths be?
Then there are powerful methods that actually require measure theory. One can get a lot from a little measure theory: the Borel-Cantelli lemmas or the Kolmogorov 0-1 law are not hard to prove, but hard to even state without measure theory. Yet they are immensely useful. Some results in probability theory require very deep measure theory; the two-volume book Probability With a View Towards Statistics by Hoffmann-Jørgensen contains a lot of very advanced measure theory.
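To illustrate the point about statements, here is the first Borel-Cantelli lemma in its usual measure-theoretic form (the standard textbook statement, not taken from the book above): for events $A_1,A_2,\dots$ in a probability space $(\Omega,\mathcal{F},P)$,
$$\sum_{n=1}^{\infty}P(A_n)<\infty \quad\Longrightarrow\quad P\Bigl(\limsup_{n\to\infty}A_n\Bigr)=P\Bigl(\bigcap_{n=1}^{\infty}\bigcup_{k\geq n}A_k\Bigr)=0,$$
that is, with probability one only finitely many of the $A_n$ occur. The event "$A_n$ occurs infinitely often" is a countable intersection of countable unions of events, exactly the kind of set a $\sigma$-algebra is built to handle; without that machinery it is already awkward to say precisely what this event is.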
All that being said, there are a lot of statisticians who live happily avoiding any measure theory. There are, however, no probabilists who can really do without it.
The usual answer is that, of course, measure theory not only provides the right language for rigorous statements, but also allows one to achieve progress not possible without it. The only place I found a different point of view is a remarkable book by Edwin Jaynes, Probability Theory: The Logic of Science, which is a real pleasure to read. Here is an extract from Appendix B.3: Willy Feller on measure theory:
In contrast to our policy, many expositions of probability theory begin at the outset to try to assign probabilities on infinite sets, both countable and uncountable. Those who use measure theory are, in effect, supposing the passage to an infinite set already accomplished before introducing probabilities. For example, Feller advocates this policy and uses it throughout his second volume (Feller, 1966).
In discussing this issue, Feller (1966) notes that specialists in various applications sometimes ‘deny the need for measure theory because they are unacquainted with problems of other types and with situations where vague reasoning did lead to wrong results’. If Feller knew of any case where such a thing has happened, this would surely have been the place to cite it – yet he does not. Therefore we remain, just as he says, unacquainted with instances where wrong results could be attributed to failure to use measure theory.
But, as noted particularly in Chapter 15, there are many documentable cases where careless use of infinite sets has led to absurdities. We know of no case where our ‘cautious approach’ policy leads to inconsistency or error; or fails to yield a result that is reasonable.
We do not use the notation of measure theory because it presupposes the passage to an infinite limit already carried out at the beginning of a derivation – in defiance of the advice of Gauss, quoted at the start of Chapter 15. But in our calculations we often pass to an infinite limit at the end of a derivation; then we are in effect using ‘Lebesgue measure’ directly in its original meaning. We think that failure to use current measure theory notation is not ‘vague reasoning’; quite the opposite. It is a matter of doing things in the proper order.
You should read the whole text in the reference I gave above.