Difference between power law distribution and exponential decay
$$ \begin{array}{rl} \text{power law:} & y = x^{(\text{constant})}\\ \text{exponential:} & y = (\text{constant})^x \end{array} $$
That's the difference.
As for "looking the same", they're pretty different. Both are positive and go asymptotically to $0$, but with, for example, $y=(1/2)^x$, the value of $y$ is cut in half every time $x$ increases by $1$. With $y = x^{-2}$, on the other hand, notice what happens as $x$ increases from $1\text{ million}$ to $1\text{ million}+1$: the factor by which $y$ gets multiplied is barely less than $1$ (about $1 - 2\times 10^{-6}$), and if you put "billion" in place of "million", it's even closer to $1$. With the exponential function, $y$ always gets multiplied by $1/2$, no matter how big $x$ gets.
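A quick numerical check of that comparison, as a minimal sketch. The helper names `exp_step_ratio` and `power_step_ratio` are made up for this illustration; the exponential ratio is computed in its simplified form because `0.5 ** 1_000_000` underflows to zero in floating point.

```python
def exp_step_ratio(x, base=0.5):
    """Factor by which y = base**x changes when x increases by 1.

    base**(x + 1) / base**x simplifies to base**1; it is computed in that
    form so that huge x does not underflow to 0.0 / 0.0.
    """
    return base ** ((x + 1) - x)


def power_step_ratio(x, k=-2.0):
    """Factor by which y = x**k changes when x increases by 1."""
    return ((x + 1) / x) ** k


for x in (1, 10, 1_000_000, 1_000_000_000):
    print(x, exp_step_ratio(x), power_step_ratio(x))
```

The exponential column stays at `0.5` for every `x`, while the power-law column creeps toward `1` as `x` grows.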
Also, notice that the exponential probability distribution is memoryless, a property that power-law distributions do not share.
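To spell out the memorylessness remark, here is a short worked comparison. The rate $\lambda$, the tail exponent $\alpha$, and the minimum value $x_m$ are notation assumed here for illustration, and the Pareto distribution is used as a stand-in for a typical power-law distribution. For an exponential distribution with $P(X > x) = e^{-\lambda x}$,
$$ P(X > s+t \mid X > s) = \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t), $$
so having already waited $s$ tells you nothing about the remaining wait. For a Pareto tail $P(X > x) = (x_m/x)^{\alpha}$ with $s \ge x_m$ and $t \ge 0$,
$$ P(X > s+t \mid X > s) = \frac{(x_m/(s+t))^{\alpha}}{(x_m/s)^{\alpha}} = \left(\frac{s}{s+t}\right)^{\alpha}, $$
which depends on $s$ and tends to $1$ as $s$ grows: the longer you have waited, the longer you should expect to keep waiting.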
How is a power law different from an exponential? (I'm putting this answer here mainly for my own future reference. Hopefully someone else may find it useful.)
Power Law function
(notice the exponent, $k,$ is a constant)
$$
y = x^k
$$
Exponential function
(notice the exponent is a variable)
$$
y = a^x
$$
Technical definition of Power Law:
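A power law is any relationship of the form $f(x) = ax^k$ with constant $a$ and $k$ (the exponent $k$ need not be an integer). Every such relationship exhibits the property of scale invariance.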
Scale invariance (from Wikipedia)
One attribute of power laws is their scale invariance. Given a relation $f(x) = ax^k,$ scaling the argument $x$ by a constant factor $c$ causes only a proportionate scaling of the function itself. That is,
$$ f(c x) = a(c x)^k = c^k f(x) \propto f(x) $$
In other words, scaling the argument by a constant $c$ simply multiplies the original power-law relation by the constant $c^k.$
Thus, it follows that all power laws with a particular scaling exponent are equivalent up to constant factors, since each is simply a scaled version of the others. This behavior is what produces the linear relationship when logarithms are taken of both $f(x)$ and $x,$ and the straight-line on the log-log plot is often called the signature of a power law.
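Scale invariance is easy to check numerically. The sketch below uses arbitrary illustrative constants ($a = 3$, $k = -1.5$ for the power law, base $0.5$ for the exponential, scale factor $c = 2$); none of these values come from the answers above.

```python
import math


def power_law(x, a=3.0, k=-1.5):
    """f(x) = a * x**k"""
    return a * x ** k


def exponential(x, base=0.5):
    """g(x) = base**x"""
    return base ** x


c = 2.0
xs = [1.0, 5.0, 20.0]

# For a power law, f(c*x) / f(x) equals c**k at every x.
power_ratios = [power_law(c * x) / power_law(x) for x in xs]

# For an exponential, g(c*x) / g(x) = base**((c - 1) * x), which depends on x.
exp_ratios = [exponential(c * x) / exponential(x) for x in xs]

print("c**k         =", c ** -1.5)
print("power ratios =", power_ratios)
print("exp ratios   =", exp_ratios)
```

The power-law ratios all agree with $c^k$, while the exponential ratios change with $x$, so rescaling $x$ does not simply rescale an exponential.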
If you have a limited data set, one way to tell the difference is to put the data into spreadsheet software capable of exponential and power regressions and see which gives the better correlation coefficient. Presumably each coefficient comes from a least-squares line fit to transformed data: the semi-log plot for the exponential regression and the log-log plot for the power regression. A bit more on that...
Let's call an exponential law one like $y = Ca^x$ and a power function one like $y = Cx^p$. If we take the logarithm of both sides of an exponential function, we get $$ \log y = \log C + x \log a. $$ That is, the collection of ordered pairs $(x, \log y)$ (the semi-log plot) should be roughly linear for exponential data.
On the other hand, for a power function we get $$ \log y = \log C + p \log x, $$ so the collection of ordered pairs $(\log x, \log y)$ (the log-log plot) should be roughly linear for power law data.
Determining which of these two plots is more line-like tells you whether an exponential or a power law better models the original data.
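The comparison of the two plots can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: "more line-like" is measured by $R^2$ (for a straight-line fit, the squared Pearson correlation), the helper names `r_squared` and `compare_models` are made up here, and the data are synthetic and noise-free. Whether a spreadsheet computes its coefficient the same way is not something the text confirms.

```python
import math


def r_squared(u, v):
    """Squared Pearson correlation of the paired samples u and v."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv * s_uv / (s_uu * s_vv)


def compare_models(xs, ys):
    """Return (R^2 of the semi-log plot, R^2 of the log-log plot).

    Requires every x and y to be strictly positive.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    return r_squared(xs, log_y), r_squared(log_x, log_y)


xs = list(range(1, 11))
exp_data = [4.0 * 0.5 ** x for x in xs]     # y = C a^x with C = 4, a = 1/2
pow_data = [4.0 * x ** -2.0 for x in xs]    # y = C x^p with C = 4, p = -2

print("exponential data (semi-log, log-log):", compare_models(xs, exp_data))
print("power-law data   (semi-log, log-log):", compare_models(xs, pow_data))
```

For the exponential data the semi-log $R^2$ is essentially $1$ and the log-log $R^2$ is lower; for the power-law data the roles are reversed. With real, noisy data the gap can be much smaller, especially over a narrow range of $x$.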