pandas count number of duplicate rows code example

Example: pandas add count of repeated elements from column

# Basic syntax:
# Get counts of duplicated elements in one column:
dataframe.pivot_table(index=['column_name'], aggfunc='size')
# Get counts of duplicated elements across multiple columns:
dataframe.pivot_table(index=['column_1', 'column_2',...], aggfunc='size')

# Note, the column (column_name) doesn't need to be sorted
# Note, this returns a Series indexed by the distinct values of column_name
#	(a MultiIndex when several columns are given); each entry is the number of
#	rows with that value, so values that occur only once appear with a count of 1
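
To make the shape of the result concrete, here is a minimal sketch on a small made-up dataframe. The column names (city, year, sales) and the data are hypothetical and only stand in for column_name, column_1 and column_2.

```python
import pandas as pd

# Hypothetical sample data, not from the original example
dataframe = pd.DataFrame({
    "city": ["Oslo", "Rome", "Oslo", "Lima", "Oslo", "Rome"],
    "year": [2020, 2020, 2020, 2021, 2021, 2020],
    "sales": [10, 20, 30, 40, 50, 60],
})

# Counts per distinct value of one column (a Series indexed by city)
counts = dataframe.pivot_table(index=["city"], aggfunc="size")

# Counts per distinct combination of two columns (a Series with a MultiIndex)
pair_counts = dataframe.pivot_table(index=["city", "year"], aggfunc="size")

print(counts)
print(pair_counts)
```

Look up a single count with counts["Oslo"] or pair_counts[("Oslo", 2020)].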

# One approach to adding the counts back to the original dataframe:
counts = dataframe.pivot_table(index=['column_name'], aggfunc='size')
# Convert the Series to a DataFrame with columns column_name and counts
counts = counts.reset_index(name='counts')
# Merge on the shared column; how='left' keeps every original row
dataframe = dataframe.merge(counts, on='column_name', how='left')
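
The merge approach can be sketched end to end as follows. The data and the column names (city, sales) are hypothetical. The last statement shows a different technique, groupby with transform, which is not part of the original example but should give the same counts without a merge.

```python
import pandas as pd

# Hypothetical sample data, not from the original example
dataframe = pd.DataFrame({
    "city": ["Oslo", "Rome", "Oslo", "Lima", "Oslo", "Rome"],
    "sales": [10, 20, 30, 40, 50, 60],
})

# Count occurrences of each city, then turn the Series into a two-column DataFrame
counts = dataframe.pivot_table(index=["city"], aggfunc="size").reset_index(name="counts")

# Attach the counts to every original row
merged = dataframe.merge(counts, on="city", how="left")

# Alternative (a swapped-in technique): compute the group size per row directly
with_transform = dataframe.assign(
    counts=dataframe.groupby("city")["city"].transform("size")
)

print(merged)
print(with_transform)
```

The transform version avoids building an intermediate counts table, which may be convenient when only the per-row count is needed.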
