pandas count number of duplicate rows code example
Example: pandas add count of repeated elements from column
# Basic syntax:
# Get counts of duplicated elements in one column:
dataframe.pivot_table(index=['column_name'], aggfunc='size')
# Get counts of duplicated elements across multiple columns:
dataframe.pivot_table(index=['column_1', 'column_2',...], aggfunc='size')
# Note, the column (column_name) doesn't need to be sorted
# Note, this returns a Series indexed by the values of column_name (a
# MultiIndex for multiple columns) whose values are the occurrence counts
# One approach to adding the counts back to the original dataframe:
counts = dataframe.pivot_table(index=['column_name'], aggfunc='size')
counts = counts.reset_index(name='counts') # Convert Series to a DataFrame with columns column_name and counts
# Merge on column_name only; how='left' keeps every row of the original dataframe
dataframe = dataframe.merge(counts, on='column_name', how='left')
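
The steps above can be sketched end to end on a small made-up dataframe. The column names `fruit` and `price` and all values are illustrative assumptions, not from the original. The `groupby(...).transform('size')` line is an alternative that skips the merge, shown only for comparison.

```python
import pandas as pd

# Hypothetical sample data; 'fruit' and 'price' are illustrative names
dataframe = pd.DataFrame({
    'fruit': ['apple', 'pear', 'apple', 'kiwi', 'apple', 'pear'],
    'price': [1.0, 2.0, 1.5, 3.0, 1.2, 2.5],
})

# Count occurrences of each value in 'fruit' (returns a Series)
counts = dataframe.pivot_table(index=['fruit'], aggfunc='size')

# Turn the Series into a two-column DataFrame and merge it back
counts_df = counts.reset_index(name='counts')
merged = dataframe.merge(counts_df, on='fruit', how='left')

# Alternative without a merge: broadcast each group's size to its rows
dataframe['counts'] = dataframe.groupby('fruit')['fruit'].transform('size')

print(merged)
```

The merge route is convenient when the counts are also wanted as a separate table. The `transform` route writes the counts straight into the original dataframe and keeps its index unchanged.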