numpy sort acting weirdly when sorting on a pandas DataFrame

data[genres].sum() returns a Series. The genre names aren't a column of that Series; they are its index, and the sums are its values.

np.sort just looks at the values of the DataFrame or Series, not at the index, and it returns a new NumPy array with the sorted data[genres].sum() values. The index information is lost.
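To see the index being dropped, here is a minimal sketch. The data frame and the genres list below are made up for illustration (the question's real data isn't shown), so treat the column names and values as assumptions:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: one 0/1 flag column per genre.
data = pd.DataFrame({
    "Action": [1, 0, 1, 0],
    "Comedy": [1, 1, 1, 0],
    "Drama":  [0, 0, 1, 0],
})
genres = ["Action", "Comedy", "Drama"]  # assumed list of genre column names

genre_count = data[genres].sum()        # Series: index = genre names, values = sums
sorted_values = np.sort(genre_count)    # plain ndarray; the genre labels are gone

print(type(genre_count).__name__)       # Series
print(type(sorted_values).__name__)     # ndarray
print(sorted_values)                    # only the numbers, in ascending order
```

Once you have sorted_values, there is no way to tell which count belongs to which genre.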

The way to sort data[genres].sum() and keep the index information would be to do something like:

genre_count = data[genres].sum()
genre_count = genre_count.sort_values(ascending=False) # sorted high to low, index kept (the older Series.sort() is gone from current pandas)

You can then turn the sorted genre_count Series back into a DataFrame if you like:

pd.DataFrame({'Genre Count': genre_count})
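Putting the steps together, a small end-to-end sketch might look like the following. Again, the data and the genres list are invented for illustration, and to_frame is used here as one possible alternative to the pd.DataFrame constructor above:

```python
import pandas as pd

# Hypothetical stand-in for the question's data: one 0/1 flag column per genre.
data = pd.DataFrame({
    "Action": [1, 0, 1, 0],
    "Comedy": [1, 1, 1, 0],
    "Drama":  [0, 0, 1, 0],
})
genres = ["Action", "Comedy", "Drama"]  # assumed list of genre column names

# Sum each genre column, then sort by value while keeping the genre labels.
genre_count = data[genres].sum().sort_values(ascending=False)

# Turn the sorted Series into a one-column DataFrame; the index stays as genre names.
genre_df = genre_count.to_frame(name="Genre Count")
print(genre_df)
```

If you would rather have the genre names as an ordinary column instead of the index, calling reset_index() on the result is one way to get that.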