Setting a relative frequency in a matplotlib histogram
The normed option of hist (called density in current matplotlib) returns the density of points, i.e. dN/dx, not the fraction of points in each bin.
What you need is something like this:
# assuming that mydata is a NumPy array
ax.hist(mydata, weights=np.zeros_like(mydata) + 1. / mydata.size)
# every point now contributes 1/N, so each bar height is the fraction of points in that bin
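The weights trick can be checked without drawing anything, since np.histogram takes the same weights argument and lets you inspect the bar heights directly. This is only a sketch: the seed and the default of 10 bins are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
mydata = rng.normal(5, 2.0, 1000)

# same weights as in the ax.hist call: every point contributes 1/N
weights = np.zeros_like(mydata) + 1. / mydata.size
heights, edges = np.histogram(mydata, weights=weights)

# raw counts for comparison
counts, _ = np.histogram(mydata)

print(heights.sum())  # close to 1, up to floating-point rounding
```

Each entry of heights equals the corresponding raw count divided by the number of points, which is what "relative frequency" means here.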
Or you can use set_major_formatter to adjust the scale of the y-axis, as follows:
from matplotlib import ticker as tick
def adjust_y_axis(x, pos):
    # x is the tick value (a raw count); return it as a fraction of all points
    return f"{x / len(mydata):g}"
ax.yaxis.set_major_formatter(tick.FuncFormatter(adjust_y_axis))
Just set the formatter as above before calling plt.show().
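To see what such a formatter does to individual tick values, you can call it by hand. A minimal sketch with a string-returning variant of the function: n_points and the sample tick values are made up for the illustration and stand in for len(mydata) and the real ticks.

```python
from matplotlib import ticker as tick

n_points = 1000  # hypothetical stand-in for len(mydata)

def adjust_y_axis(x, pos):
    # x is the tick value (a raw count), pos is the tick index (unused)
    return f"{x / n_points:g}"

formatter = tick.FuncFormatter(adjust_y_axis)

# FuncFormatter simply forwards (x, pos) to the wrapped function
labels = [formatter(count, i) for i, count in enumerate([0, 50, 250])]
print(labels)
```

Only the tick labels change with this approach. The bars themselves still hold raw counts, so anything that reads the returned values of ax.hist sees counts, not fractions.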
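Another option is density=True. Note that this normalizes the histogram so that the bar areas sum to 1 (a probability density); the bar heights equal relative frequencies only when the bins have unit width. The figure below shows a histogram for 1000 samples taken from a normal distribution with mean 5 and standard deviation 2.0.
The code is: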
import numpy as np
import matplotlib.pyplot as plt
# Generate data from a normal distribution
mu, sigma = 5, 2.0 # mean and standard deviation
mydata = np.random.normal(mu, sigma, 1000)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.hist(mydata, bins=100, density=True)
plt.show()
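The difference between a density and a relative frequency can be checked numerically. The sketch below uses np.histogram, which takes the same bins and density arguments. The seed is an arbitrary choice for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
mydata = rng.normal(5, 2.0, 1000)

density, edges = np.histogram(mydata, bins=100, density=True)
widths = np.diff(edges)  # width of each bin

# density * width gives the fraction of points that fall in each bin
rel_freq = density * widths
area = rel_freq.sum()  # total area under the bars

print(area)           # close to 1, up to floating-point rounding
print(density.sum())  # generally not 1, because the bins are not 1 unit wide
```

So with density=True it is the areas, not the heights, that add up to 1. Multiplying by the bin width recovers the relative frequencies.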
If you want % on the y-axis, you can use PercentFormatter, as shown below:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
# Generate data from a normal distribution
mu, sigma = 5, 2.0 # mean and standard deviation
mydata = np.random.normal(mu, sigma, 1000)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.hist(mydata, bins=100, density=False)
# xmax is the value shown as 100%; for raw counts that is the total number of samples
ax.yaxis.set_major_formatter(PercentFormatter(xmax=len(mydata)))
plt.show()
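PercentFormatter treats xmax as the data value that corresponds to 100%, so for raw counts the choice of xmax decides what the labels mean. A small sketch of the conversion it applies. The sample size and the bar height of 30 are made-up numbers for the illustration.

```python
from matplotlib.ticker import PercentFormatter

n_samples = 1000  # assumed sample size, as in the example above
count = 30        # hypothetical bar height (a raw count)

# xmax = number of samples: the count is read as a share of all samples
as_share = PercentFormatter(xmax=n_samples).convert_to_pct(count)

# xmax = 100: the raw count itself is read as a percentage
as_raw = PercentFormatter(xmax=100).convert_to_pct(count)

print(as_share, as_raw)
```

With xmax equal to the number of samples, a bin holding 30 of 1000 points is labeled as about 3%. With xmax=100 the same bin would be labeled as about 30%, which is the raw count with a percent sign attached.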