Two-sample Kolmogorov-Smirnov Test in Python Scipy
This is what the scipy docs say:
If the K-S statistic is small or the p-value is high, then we cannot reject the hypothesis that the distributions of the two samples are the same.
Cannot reject doesn't mean we confirm.
You are using the one-sample KS test. You probably want the two-sample test ks_2samp
:
>>> from scipy.stats import ks_2samp
>>> import numpy as np
>>>
>>> np.random.seed(12345678)
>>> x = np.random.normal(0, 1, 1000)
>>> y = np.random.normal(0, 1, 1000)
>>> z = np.random.normal(1.1, 0.9, 1000)
>>>
>>> ks_2samp(x, y)
Ks_2sampResult(statistic=0.022999999999999909, pvalue=0.95189016804849647)
>>> ks_2samp(x, z)
Ks_2sampResult(statistic=0.41800000000000004, pvalue=3.7081494119242173e-77)
Results can be interpreted as following:
You can either compare the
statistic
value given by python to the KS-test critical value table according to your sample size. Whenstatistic
value is higher than the critical value, the two distributions are different.Or you can compare the
p-value
to a level of significance a, usually a=0.05 or 0.01 (you decide, the lower a is, the more significant). If p-value is lower than a, then it is very probable that the two distributions are different.