Sorting Data in Pandas

Sorting in Pandas rearranges the rows of a DataFrame, either by the values in one or more columns or by the index.

Sorting DataFrame by a Single Column

Use sort_values() to sort by a specific column. By default it returns a new, sorted DataFrame and leaves the original unchanged.

import pandas as pd

# Sample DataFrame
data = {
    "Name": ["Jasmeet", "Bob", "Charlie", "David", "Emma"],
    "Age": [25, 30, 35, 40, 29],
    "Salary": [50000, 60000, 55000, 70000, 65000]
}

df = pd.DataFrame(data)

# Sort by Age (default: ascending order)
sorted_df = df.sort_values(by="Age")
print(sorted_df)

Output:

      Name  Age  Salary
0  Jasmeet   25   50000
4     Emma   29   65000
1      Bob   30   60000
2  Charlie   35   55000
3    David   40   70000
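
Notice that the output keeps the original row labels (0, 4, 1, 2, 3). The sketch below, which uses a trimmed copy of the sample data, illustrates two related points: the original DataFrame is not modified, and the ignore_index=True option (available in pandas 1.0 and later) renumbers the sorted rows from 0. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jasmeet", "Bob", "Charlie", "David", "Emma"],
    "Age": [25, 30, 35, 40, 29],
})

# sort_values() returns a new DataFrame; df itself is left unchanged.
by_age = df.sort_values(by="Age")
original_order = df["Name"].tolist()    # same order as when df was created
sorted_labels = by_age.index.tolist()   # original row labels travel with the rows

# ignore_index=True relabels the sorted rows 0..n-1 instead.
renumbered = df.sort_values(by="Age", ignore_index=True)
renumbered_labels = renumbered.index.tolist()

print(sorted_labels)
print(renumbered_labels)
```

Keeping the original labels is useful when you need to trace rows back to their source positions; renumbering is usually more convenient when the sorted order is the one you will work with from then on.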

Sorting in Descending Order

Use ascending=False to sort in descending order.

sorted_df = df.sort_values(by="Age", ascending=False)
print(sorted_df)

Output:

      Name  Age  Salary
3    David   40   70000
2  Charlie   35   55000
1      Bob   30   60000
4     Emma   29   65000
0  Jasmeet   25   50000

Sorting by Multiple Columns

Sort using multiple columns by passing a list of column names.

# Sort by Age (Ascending), then by Salary (Descending)
sorted_df = df.sort_values(by=["Age", "Salary"], ascending=[True, False])
print(sorted_df)

Output:

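      Name  Age  Salary
0  Jasmeet   25   50000
4     Emma   29   65000
1      Bob   30   60000
2  Charlie   35   55000
3    David   40   70000

  • The DataFrame is sorted by Age (ascending) first; rows that share the same Age are then ordered by Salary (descending). In this sample every Age is unique, so the Salary key never comes into play and the result matches a plain sort by Age.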
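
To see the second sort key take effect, the data needs repeated values in the first key. The following is a minimal sketch using a hypothetical DataFrame (the names and figures are invented for illustration and are not part of the sample data above) in which two pairs of rows share the same Age.

```python
import pandas as pd

# Hypothetical data: two pairs of rows share the same Age.
tied = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen", "Dara"],
    "Age": [30, 25, 30, 25],
    "Salary": [60000, 50000, 72000, 58000],
})

# Age ascending first; within each Age group, Salary descending.
result = tied.sort_values(by=["Age", "Salary"], ascending=[True, False])
names_in_order = result["Name"].tolist()
print(result)
```

Within each Age group the higher Salary comes first, so the rows come out as Dara, Ben (Age 25) followed by Chen, Asha (Age 30).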

Sorting by Index

Use sort_index() to sort based on the index.

# Sort by Index (Descending)
sorted_df = df.sort_index(ascending=False)
print(sorted_df)
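
A common use of sort_index() is to put rows back in their original order after sorting by a column. The sketch below assumes the default integer index (0 to 4) and a trimmed copy of the sample data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jasmeet", "Bob", "Charlie", "David", "Emma"],
    "Age": [25, 30, 35, 40, 29],
})

# Sorting by a column shuffles the row labels along with the rows.
by_age_desc = df.sort_values(by="Age", ascending=False)

# Sorting by index (ascending by default) restores the original order.
restored = by_age_desc.sort_index()

# Descending index order simply reverses the default integer labels.
descending_labels = df.sort_index(ascending=False).index.tolist()

print(restored)
print(descending_labels)
```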

Sorting with Missing Values

Missing values (NaN) are placed at the end by default, whether the sort is ascending or descending.

# Sorting with NaN values (None is stored as NaN, so Age becomes a float column)
data = {
    "Name": ["Jasmeet", "Bob", "Charlie", "David", "Emma"],
    "Age": [25, None, 35, 40, 29]
}
df = pd.DataFrame(data)

# Sort by Age
sorted_df = df.sort_values(by="Age")
print(sorted_df)

Output:

      Name   Age
0  Jasmeet  25.0
4     Emma  29.0
2  Charlie  35.0
3    David  40.0
1      Bob   NaN

  • To place NaN at the beginning, use na_position="first".

sorted_df = df.sort_values(by="Age", na_position="first")
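
The placement of NaN is controlled by na_position alone and does not depend on the sort direction. The short sketch below reuses the Name and Age data from this section to illustrate this with a descending sort; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jasmeet", "Bob", "Charlie", "David", "Emma"],
    "Age": [25, None, 35, 40, 29],
})

# Descending sort: NaN still goes last, because na_position defaults to "last".
desc = df.sort_values(by="Age", ascending=False)
desc_names = desc["Name"].tolist()

# Descending sort with NaN moved to the top.
desc_nan_first = df.sort_values(by="Age", ascending=False, na_position="first")
desc_nan_first_names = desc_nan_first["Name"].tolist()

print(desc)
print(desc_nan_first)
```

In both results the non-missing ages run from 40 down to 25; only the position of Bob's row (the NaN) changes.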

Sorting Methods

Method                                                        Description
df.sort_values(by="col")                                      Sort by a column (ascending).
df.sort_values(by="col", ascending=False)                     Sort by a column (descending).
df.sort_values(by=["col1", "col2"], ascending=[True, False])  Sort by multiple columns.
df.sort_index()                                               Sort by index.
df.sort_values(by="col", na_position="first")                 Place NaN values first.