How to calculate percentage when old value is ZERO
If you're required to show growth as a percentage it's customary to display [NaN]
or something similar in these cases. A growth rate, on the other hand, would be reported in this case as $/month. So in your example for April
the growth rate would be calculated as ((20-0)/1
.
In any event, determining the correct method for reporting this special case is a user decision. Is it covered in your user requirements?
There is no rate of growth from 0 to any other number. That is to say, there is no percentage of increase from zero to greater than zero and there is no percentage of decrease from zero to less than zero (a negative number). What you have to decide is what to put as an output when this situation happens. Here are two possibilities I am comfortable with:
- Any time you have to show a rate of increase from zero, output the infinity symbol (∞). That's Alt + 236 on your number pad, in case you're wondering. You could also use negative infinity (-∞) for a negative growth rate from zero.
- Output a statement such as "[Increase/Decrease] From Zero" or something along those lines.
Unfortunately, if you need the growth rate for further calculations, the above options will not work, but, on the other hand, any number would give your following calculations incorrect data any way so the point is moot. You'd need to update your following calculations to account for this eventuality.
As an aside, the ((New-Old)/Old) function will not work when your new and old values are both zero. You should create an initial check to see if both values are zero and, if they are, output zero percent as the growth rate.
How to deal with Zeros when calculating percentage changes is the researcher's call and requires some domain expertise. If the researcher believes that it would not be distorting the data, s/he may simply add a very small constant to all values to get rid of all zeros. In financial series, when dealing with trading volume, for example, we may not want to do this because trading volume = 0 just means that: the asset did not trade at all. The meaning of volume = 0 may be very different from volume = 0.00000000001. This is my preferred strategy in cases whereby I can not logically add a small constant to all values. Consider the percentage change formula ((New-Old)/Old) *100. If New = 0, then percentage change would be -100%. This number indeed makes financial sense as long as it is the minimum percentage change in the series (This is indeed guaranteed to be the minimum percentage change in the series). Why? Because it shows that trading volume experiences maximum possible decrease, which is going from any number to 0, -100%. So, I'll be fine with this value being in my percentage change series. If I normalize that series, then even better since this (possibly) relatively big number in absolute value will be analyzed on the same scale as other variables are. Now, what if the Old value = 0. That's a trickier case. Percentage change due to going from 0 to 1 will be equal to that due to going from 0 to a million: infinity. The fact that we call both "infinity" percentage change is problematic. In this case, I would set the infinities equal to np.nan and interpolate them.
The following graph shows what I discussed above. Starting from series 1, we get series 4, which is ready to be analyzed, with no Inf or NaNs.
One more thing: a lot of the time, the reason for calculating percentage change is to stationarize the data. So, if your original series contains zero and you wish to convert it to percentage change to achieve stationarity, first make sure it is not already stationary. Because if it is, you don't have to calculate percentage change. The point is that series that take the value of 0 a lot (the problem OP has) are very likely to be already stationary, for example the volume series I considered above. Imagine a series oscillating above and below zero, thus hitting 0 at times. Such a series is very likely already stationary.
This is most definitely a programming problem. The problem is that it cannot be programmed, per se. When P is actually zero then the concept of percentage change has no meaning. Zero to anything cannot be expressed as a rate as it is outside the definition boundary of rate. Going from 'not being' into 'being' is not a change of being, it is instead creation of being.