Example 1: df sort values
>>> df.sort_values(by=['col1'], ascending=False)
  col1  col2  col3
4    D     7     2
5    C     4     3
2    B     9     9
0    A     2     0
1    A     1     1
3  NaN     8     4
Example 2: sorting by column in pandas
# Returns a new, sorted DataFrame; df itself is unchanged
df.sort_values(by=["col1"])
# Sorts df in place and returns None
df.sort_values(by=["col1"], inplace=True)
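The difference between the two calls above can be checked with a small sketch. The frame and its values are made up for illustration; only sort_values and its inplace parameter come from pandas.

```python
import pandas as pd

# Hypothetical three-row frame, deliberately out of order.
df = pd.DataFrame({"col1": ["B", "A", "C"], "col2": [1, 2, 3]})

# Default call: returns a new, sorted DataFrame and leaves df untouched.
sorted_copy = df.sort_values(by=["col1"])
copy_order = sorted_copy["col1"].tolist()
original_order = df["col1"].tolist()

# inplace=True: reorders df itself and returns None.
result = df.sort_values(by=["col1"], inplace=True)
inplace_order = df["col1"].tolist()
```

Assigning the result of an inplace=True call back to df would replace the frame with None, so use either the return value or inplace=True, not both.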
Example 3: how to sort in pandas
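# Single sort
>>> df.sort_values(by=['col1'], ascending=False)
# ascending: True (default) sorts low to high; False reverses the order
# Multiple sort: col1 ascending, then col2 descending within ties
>>> df.sort_values(by=['col1', 'col2'], ascending=[True, False])
# With apply(): sorts the values within each row (they must be mutually comparable)
>>> df[['col1', 'col2']].apply(sorted, axis=1)
# axis: 1 or 'columns' applies the function to each row; 0 or 'index' applies it to each column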
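A runnable sketch of the multi-column and row-wise sorts from Example 3. The frames df and nums are invented for illustration, and result_type="expand" is added here (it is not in the original snippet) so that apply() returns a DataFrame rather than a Series of lists.

```python
import pandas as pd

# Hypothetical data with ties in col1.
df = pd.DataFrame({
    "col1": ["B", "A", "A", "B"],
    "col2": [1, 2, 3, 4],
})

# col1 ascending, then col2 descending within each col1 value.
multi = df.sort_values(by=["col1", "col2"], ascending=[True, False])
multi_col1 = multi["col1"].tolist()
multi_col2 = multi["col2"].tolist()

# Row-wise sort: sorted() runs on each row (axis=1), and
# result_type="expand" spreads each sorted list back into columns.
nums = pd.DataFrame({"x": [3, 1], "y": [2, 5], "z": [1, 4]})
row_sorted = nums.apply(sorted, axis=1, result_type="expand")
row_values = row_sorted.values.tolist()
```

The row-wise version uses an all-numeric frame because sorted() raises a TypeError when a row mixes strings and integers.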
Example 4: sort values pandas
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({
... 'col1': ['A', 'A', 'B', np.nan, 'D', 'C'],
... 'col2': [2, 1, 9, 8, 7, 4],
... 'col3': [0, 1, 9, 4, 2, 3],
... 'col4': ['a', 'B', 'c', 'D', 'e', 'F']
... })
>>> df
  col1  col2  col3 col4
0    A     2     0    a
1    A     1     1    B
2    B     9     9    c
3  NaN     8     4    D
4    D     7     2    e
5    C     4     3    F
>>> df.sort_values(by=['col1'])
  col1  col2  col3 col4
0    A     2     0    a
1    A     1     1    B
2    B     9     9    c
5    C     4     3    F
4    D     7     2    e
3  NaN     8     4    D
Example 5: sort a dataframe
# Put rows with a missing lifeExp first (the default is na_position='last')
sort_na_first = gapminder.sort_values('lifeExp', na_position='first')
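A minimal sketch of how na_position changes where missing values land. The gapminder data is not shown in the snippet above, so the small frame here is a made-up stand-in with the same lifeExp column name.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the gapminder data: one missing lifeExp.
df = pd.DataFrame({
    "country": ["X", "Y", "Z"],
    "lifeExp": [70.1, np.nan, 65.3],
})

# NaN rows first, then the remaining rows in ascending order.
na_first = df.sort_values("lifeExp", na_position="first")
first_countries = na_first["country"].tolist()

# Default behaviour: NaN rows go last.
na_last = df.sort_values("lifeExp")
last_countries = na_last["country"].tolist()
```

na_position only moves the missing values; the non-missing rows still follow the ascending setting.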
Example 6: Returns a new DataFrame sorted by the specified column(s) (PySpark)
from pyspark.sql.functions import asc, desc
# df is a PySpark DataFrame here; sort() and orderBy() are aliases
df.sort(df.age.desc()).collect()
df.sort("age", ascending=False).collect()
df.orderBy(df.age.desc()).collect()
df.sort(asc("age")).collect()
df.orderBy(desc("age"), "name").collect()
# ascending takes one flag per column: age descending, name ascending
df.orderBy(["age", "name"], ascending=[0, 1]).collect()