How can I sort each group so that the largest value is in the first row, the smallest in the second, the second largest in the third, and so on?
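The answers below operate on a DataFrame df with Group, Name and Performance columns. The original frame is not shown in the question; a sketch reconstructed from the sample output in the answers (the values match the output, the construction is an assumption) would be:

import pandas as pd

# Reconstructed from the answers' sample output; the asker's actual code may differ.
df = pd.DataFrame({
    'Group': ['A'] * 6 + ['B'] * 4,
    'Name': ['Sheldon Webb', 'Traci Dean', 'Chad Webster', 'Ora Harmon',
             'Elijah Mendoza', 'June Strickland',
             'Beth Vasquez', 'Betty Sutton', 'Joel Gill', 'Vernon Stone'],
    'Performance': [33, 64, 142, 116, 122, 68, 95, 127, 132, 80],
})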
Take the sorted order and then apply a quadratic function to it whose root is at half the length of the array (plus some small offset). This way the highest rank is given to the extremal values (the sign of the eps offset determines whether the highest value is ranked above the lowest value). I added a small group at the end to show how it properly handles repeated values and an odd group size.
import numpy as np
import pandas as pd

def extremal_rank(s):
    eps = 10**-4  # the sign of eps decides whether the high value of each pair outranks the low one
    # distance of each sorted position from the middle position (n+1)/2, squared:
    # the extremes get the largest rank, the middle values the smallest
    y = (pd.Series(np.arange(1, len(s) + 1), index=s.sort_values().index)
         - (len(s) + 1) / 2 + eps) ** 2
    return y.reindex_like(s)
df['rnk'] = df.groupby('Group', group_keys=False)['Performance'].apply(extremal_rank)
df = df.sort_values(['Group', 'rnk'], ascending=[True, False])
Group Name Performance rnk
2 A Chad Webster 142 6.2505
0 A Sheldon Webb 33 6.2495
4 A Elijah Mendoza 122 2.2503
1 A Traci Dean 64 2.2497
3 A Ora Harmon 116 0.2501
5 A June Strickland 68 0.2499
8 B Joel Gill 132 2.2503
9 B Vernon Stone 80 2.2497
7 B Betty Sutton 127 0.2501
6 B Beth Vasquez 95 0.2499
11 C b 110 9.0006
12 C c 68 8.9994
10 C a 110 4.0004
13 C d 68 3.9996
15 C f 70 1.0002
16 C g 70 0.9998
14 C e 70 0.0000
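As a quick sanity check of the rnk column (not part of the answer's code, just the formula evaluated by hand for a group of six, whose middle position is 3.5):

import numpy as np

eps = 10**-4
print((np.arange(1, 7) - 3.5 + eps) ** 2)
# roughly [6.2495 2.2497 0.2499 0.2501 2.2503 6.2505]
# the lowest value gets ~6.2495 and the highest ~6.2505, so sorting rnk in
# descending order alternates highest, lowest, second highest, second lowest, ...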
You can avoid groupby if you use sort_values on Performance twice, once in descending and once in ascending order, concat both sorted dataframes, then use sort_index and drop_duplicates to get the expected output:
df_ = (pd.concat([df.sort_values(['Group', 'Performance'], ascending=[True, False])
.reset_index(), #need the original index for later drop_duplicates
df.sort_values(['Group', 'Performance'], ascending=[True, True])
.reset_index()
.set_index(np.arange(len(df))+0.5)], # for later sort_index
axis=0)
.sort_index()
.drop_duplicates('index', keep='first')
.reset_index(drop=True)
[['Group', 'Name', 'Performance']]
)
print(df_)
Group Name Performance
0 A Chad Webster 142
1 A Sheldon Webb 33
2 A Elijah Mendoza 122
3 A Traci Dean 64
4 A Ora Harmon 116
5 A June Strickland 68
6 B Joel Gill 132
7 B Vernon Stone 80
8 B Betty Sutton 127
9 B Beth Vasquez 95
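To see why this works, it helps to look at the concatenated frame before drop_duplicates: the descending copy sits at integer positions 0, 1, 2, ... and the ascending copy at 0.5, 1.5, 2.5, ..., so sort_index interleaves them and keep='first' then retains the copy of each original row that appears earlier. A small sketch (assuming the same df as above) that prints this intermediate result:

import numpy as np
import pandas as pd

interleaved = (pd.concat([df.sort_values(['Group', 'Performance'], ascending=[True, False])
                            .reset_index(),
                          df.sort_values(['Group', 'Performance'], ascending=[True, True])
                            .reset_index()
                            .set_index(np.arange(len(df)) + 0.5)])
                 .sort_index())
print(interleaved.head(4))  # positions 0.0, 0.5, 1.0, 1.5: highest, lowest, 2nd highest, 2nd lowest of group A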
Apply the sorted concatenation of nlargest and nsmallest for each group:
>>> (df.groupby('Group')[df.columns[1:]]
       .apply(lambda x:
              pd.concat([x.nlargest(x.shape[0]//2, 'Performance').reset_index(),
                         x.nsmallest(x.shape[0] - x.shape[0]//2, 'Performance').reset_index()])
              .sort_index()
              .drop(columns='index'))
       .reset_index()
       .drop(columns='level_1'))
Group Name Performance
0 A Chad Webster 142
1 A Sheldon Webb 33
2 A Elijah Mendoza 122
3 A Traci Dean 64
4 A Ora Harmon 116
5 A June Strickland 68
6 B Joel Gill 132
7 B Vernon Stone 80
8 B Betty Sutton 127
9 B Beth Vasquez 95
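For a single group the two halves are easy to picture: nlargest picks the top half and nsmallest the rest, and after reset_index both halves occupy positions 0, 1, 2, ..., which is what sort_index uses to pair them up. A minimal illustration on group A of the sample data:

grp = df[df['Group'] == 'A']
top = grp.nlargest(len(grp) // 2, 'Performance')                  # 142, 122, 116
bottom = grp.nsmallest(len(grp) - len(grp) // 2, 'Performance')   # 33, 64, 68
print(top['Performance'].tolist(), bottom['Performance'].tolist())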