Count occurrences of True/False in a column of a DataFrame
Use pd.Series.value_counts():
>>> import pandas as pd
>>> df = pd.DataFrame({'boolean_column': [True, False, True, False, True]})
>>> df['boolean_column'].value_counts()
True     3
False    2
Name: boolean_column, dtype: int64
If you want to count False and True separately, you can use pd.Series.sum() together with ~:
>>> df['boolean_column'].values.sum()  # number of True values
3
>>> (~df['boolean_column']).values.sum()  # number of False values
2
With Pandas, the natural way is to use value_counts:
import pandas as pd

df = pd.DataFrame({'A': [True, False, True, False, True]})
print(df['A'].value_counts())
# True     3
# False    2
# Name: A, dtype: int64
To calculate True or False values separately, don't compare against True / False explicitly; just sum, and take the reverse Boolean via ~ to count False values:
print(df['A'].sum()) # 3
print((~df['A']).sum()) # 2
This works because bool is a subclass of int, and the behaviour also holds true for Pandas series / NumPy arrays.
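To make the bool-as-int point concrete, here is a minimal sketch. The variable names (flags, true_count, false_count) are only illustrative and not part of the original answer:

```python
import pandas as pd

# bool is a subclass of int, so True counts as 1 and False as 0
print(issubclass(bool, int))  # True
print(True + True)            # 2

# Same values as the example column above (name "flags" is illustrative)
flags = pd.Series([True, False, True, False, True])

true_count = int(flags.sum())      # adds 1 for every True
false_count = int((~flags).sum())  # ~ flips each value, so this adds 1 for every False

print(true_count, false_count)  # 3 2
```

The int(...) calls just turn the NumPy integers into plain Python ints, which can be handy for further calculations.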
Alternatively, you can calculate counts using NumPy:
import numpy as np

print(np.unique(df['A'], return_counts=True))
# (array([False, True], dtype=bool), array([2, 3], dtype=int64))
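Since np.unique returns a tuple of two arrays, it can be easier to work with if you unpack it. A rough sketch, assuming the same data as above (count_by_value is just an illustrative name); np.count_nonzero is another NumPy option if you only need the True count:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [True, False, True, False, True]})

# np.unique returns the sorted unique values and, with return_counts=True, their counts
values, counts = np.unique(df['A'].to_numpy(), return_counts=True)

# Pair each value with its count as plain Python objects
count_by_value = dict(zip(values.tolist(), counts.tolist()))
print(count_by_value)  # {False: 2, True: 3}

# If only the number of True values is needed, count_nonzero is enough
true_count = int(np.count_nonzero(df['A'].to_numpy()))
print(true_count)  # 3
```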
This alternative works for multiple columns and/or rows as well.
df[df==True].count(axis=0)
This will get you the total number of True values per column. For a row-wise count, set axis=1.
df[df==True].count().sum()
Adding a sum() at the end will get you the total number in the entire DataFrame.
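Here is a small sketch of the per-column, per-row and overall counts on a made-up two-column DataFrame (the columns A and B are hypothetical). It assumes every column is strictly boolean, in which case a plain sum gives the same numbers as the masked count:

```python
import pandas as pd

# Hypothetical all-boolean DataFrame, for illustration only
df = pd.DataFrame({
    'A': [True, False, True],
    'B': [False, False, True],
})

# Masked count: non-True cells become NaN, and count() skips NaN
masked_per_column = df[df == True].count(axis=0)  # A: 2, B: 1
masked_per_row = df[df == True].count(axis=1)     # 1, 0, 2
masked_total = int(df[df == True].count().sum())  # 3

# For purely boolean columns, summing gives the same results
per_column = df.sum(axis=0)
per_row = df.sum(axis=1)
total = int(df.to_numpy().sum())

print(masked_per_column.tolist(), masked_per_row.tolist(), masked_total)
```

One difference to keep in mind: with numeric columns in the mix, df == True also matches cells equal to 1, so the masked version counts those too.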
I couldn't find exactly what I needed here. I needed the number of True and False occurrences for further calculations, so I used:
true_count = df['column'].value_counts()[True]
false_count = df['column'].value_counts()[False]
Where df is your DataFrame and column is the name of the column with booleans.
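One thing to watch out for: value_counts() only lists values that actually occur, so indexing with [False] raises a KeyError when the column has no False values at all. A possible workaround (just a sketch, not the only way) is to convert the counts to a dict and fall back to 0:

```python
import pandas as pd

# Hypothetical column that happens to contain no False values
df = pd.DataFrame({'column': [True, True, True]})

# value_counts() has no entry for False here, so use a dict with a default of 0
counts = df['column'].value_counts().to_dict()
true_count = counts.get(True, 0)
false_count = counts.get(False, 0)

print(true_count, false_count)  # 3 0
```

Calling value_counts() once and reusing the result also avoids computing the counts twice.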