Vertical line at the end of a CDF histogram using matplotlib

An alternative way to plot a CDF would be as follows (in my example, X is a bunch of samples drawn from the unit normal):

import numpy as np
import matplotlib.pyplot as plt

X = np.random.randn(10000)
n = np.arange(1, len(X) + 1) / len(X)  # cumulative fractions 1/N, 2/N, ..., 1
Xs = np.sort(X)
fig, ax = plt.subplots()
ax.step(Xs, n)
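
A slightly more careful variant of the same idea, as a sketch rather than a definitive recipe: `ax.step` defaults to `where='pre'`, whereas an empirical CDF is usually drawn so that the curve jumps at each sample and stays flat until the next one, which corresponds to `where='post'`. With 10000 samples the difference is invisible, but it matters for small samples. The seed and the axis labels are my own additions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
X = rng.standard_normal(10000)

Xs = np.sort(X)
n = np.arange(1, len(Xs) + 1) / len(Xs)  # fraction of samples <= Xs[i]

fig, ax = plt.subplots()
# where='post' holds n[i] on [Xs[i], Xs[i+1]), so the curve jumps at each sample
(line,) = ax.step(Xs, n, where='post')
ax.set_xlabel('x')
ax.set_ylabel('empirical CDF')
```

Because the curve is an ordinary line rather than a histogram polygon, there is no closing segment back to zero and hence no vertical line at the right edge. As far as I know, Matplotlib 3.8 and later also offer `Axes.ecdf`, which may be worth checking if you can rely on a recent version.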


I needed a solution that did not require changing the rest of my code (which uses plt.hist(...) or, with pandas, dataframe.plot.hist(...)) and that I could easily reuse many times in the same Jupyter notebook.

I now use this little helper function to do so:

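def fix_hist_step_vertical_line_at_end(ax):
    # Collect every Polygon on the axes; each histtype='step' histogram is drawn as one
    axpolygons = [poly for poly in ax.get_children() if isinstance(poly, mpl.patches.Polygon)]
    for poly in axpolygons:
        # The last vertex is the drop back to the baseline; removing it removes the vertical line
        poly.set_xy(poly.get_xy()[:-1])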

Which can be used like this (without pandas):

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

X = np.sort(np.random.randn(1000))

fig, ax = plt.subplots()
plt.hist(X, bins=100, cumulative=True, density=True, histtype='step')

fix_hist_step_vertical_line_at_end(ax)

Or like this (with pandas):

import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.randn(1000))

fig, ax = plt.subplots()
ax = df.plot.hist(ax=ax, bins=100, cumulative=True, density=True, histtype='step', legend=False)

fix_hist_step_vertical_line_at_end(ax)

This works well even if you have multiple cumulative density histograms on the same axes.
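
To illustrate the multiple-histogram case, here is a minimal sketch with two cumulative step histograms on one axes. The two normal samples and the seed are arbitrary choices of mine; the helper is the same one as above, repeated so the block runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

def fix_hist_step_vertical_line_at_end(ax):
    # Same helper as above: trim the last vertex of every Polygon on the axes
    axpolygons = [poly for poly in ax.get_children() if isinstance(poly, mpl.patches.Polygon)]
    for poly in axpolygons:
        poly.set_xy(poly.get_xy()[:-1])

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
fig, ax = plt.subplots()
for loc in (0.0, 2.0):
    # One step polygon is added to the axes per call
    ax.hist(rng.normal(loc, 1.0, 1000), bins=100, cumulative=True,
            density=True, histtype='step')

# Call the helper once, after all histograms have been drawn
fix_hist_step_vertical_line_at_end(ax)

# Height of the final vertex of each step polygon after the fix
last_heights = [poly.get_xy()[-1, 1] for poly in ax.get_children()
                if isinstance(poly, mpl.patches.Polygon)]
```

One thing to keep in mind: the helper is not idempotent. Every call removes one more vertex from each polygon, so calling it twice on the same axes also cuts off the last horizontal segment of the curve.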

Warning: this may not give the desired result if your axes contain other patches that are instances of mpl.patches.Polygon, because the helper trims the last vertex of every such patch. That was not the case for me, so I am happy to use this little helper function in my plots.
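
If other polygons on the axes are a concern and you are able to call `ax.hist` directly, one possible workaround is to trim only the patches that `hist` itself returns. This is a sketch under that assumption: `drop_last_vertex` is a hypothetical name of mine, the extra triangle drawn with `ax.fill` only stands in for "some unrelated polygon", and the code assumes a single dataset per `hist` call. It does not help with `dataframe.plot.hist(...)`, which returns the axes rather than the patches.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Polygon

def drop_last_vertex(patches):
    # Hypothetical helper: trim the final vertex of the given step polygons only
    for poly in patches:
        if isinstance(poly, Polygon):
            poly.set_xy(poly.get_xy()[:-1])

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
X = rng.standard_normal(1000)

fig, ax = plt.subplots()

# An unrelated polygon that should be left exactly as it is
(other,) = ax.fill([-1.0, 0.0, 1.0], [0.0, 0.5, 0.0], alpha=0.2)
other_before = other.get_xy().copy()

# For a single dataset with histtype='step', the third return value
# is a list holding the step polygon
counts, bins, step_patches = ax.hist(X, bins=100, cumulative=True,
                                     density=True, histtype='step')

drop_last_vertex(step_patches)
last_x, last_y = step_patches[0].get_xy()[-1]
```

After the call, the step curve ends at the right-hand bin edge at its full height instead of dropping back to zero, and the triangle is never touched because it is never passed to the helper.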