Data Frames in Pandas

Pandas DataFrame Explained.

A Pandas DataFrame is a two-dimensional, labeled data structure similar to an Excel spreadsheet or SQL table. It consists of rows and columns, where:

  • Rows represent individual records (indexed by default with numbers 0, 1, 2, …)
  • Columns represent attributes/features, each with a column label
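The row and column labels described above can be inspected directly. The following is a minimal sketch, assuming a small DataFrame whose sample values mirror the ones used later in this article; `index`, `columns`, and `shape` are standard DataFrame attributes.

```python
import pandas as pd

# Small sample table; values mirror the article's later examples.
df = pd.DataFrame({
    "Name": ["Jasmeet", "Chris", "Charlie"],
    "Age": [25, 30, 35],
})

print(list(df.index))    # row labels; by default they count up from 0
print(list(df.columns))  # column labels
print(df.shape)          # (number of rows, number of columns)
```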

Creating a DataFrame

1. Creating a DataFrame from a Dictionary

import pandas as pd

data = {
    "Name": ["Jasmeet", "Chris", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"]
}

df = pd.DataFrame(data)
print(df)

Output:

      Name  Age         City
0  Jasmeet   25     New York
1    Chris   30  Los Angeles
2  Charlie   35      Chicago

  • Each column represents an attribute (Name, Age, City).
  • Each row represents a record (Jasmeet, Chris, Charlie).

2. Creating a DataFrame from a List of Lists

data = [
    ["Jasmeet", 25, "New York"],
    ["Chris", 30, "Los Angeles"],
    ["Charlie", 35, "Chicago"]
]

df = pd.DataFrame(data, columns=["Name", "Age", "City"])
print(df)

3. Creating a DataFrame from a CSV File

df = pd.read_csv("data.csv")  # Read a CSV file into a DataFrame
print(df.head())  # Display the first 5 rows
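Since the contents of data.csv are not shown, here is a self-contained sketch that reads the same kind of data from an in-memory string instead. The CSV text is an assumption made up for illustration; `pd.read_csv` accepts a file path or a file-like object such as `io.StringIO`.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of "data.csv".
csv_text = """Name,Age,City
Jasmeet,25,New York
Chris,30,Los Angeles
Charlie,35,Chicago
"""

# StringIO wraps the text so read_csv can treat it like an open file.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))  # head(n) returns the first n rows; the default is 5
print(df.dtypes)   # column types are inferred from the text
```

The first line of the text is used as the column labels, and the Age column is parsed as integers rather than strings.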

Accessing Data in a DataFrame

1. Accessing Columns

print(df["Name"])  # Access 'Name' column
print(df.Age)  # Attribute access; works only if the name is a valid identifier and does not clash with a DataFrame attribute

2. Accessing Rows

print(df.loc[0])  # Access the row whose index label is 0 (the first row with the default index)
print(df.iloc[1])  # Access the second row by integer position
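With the default 0, 1, 2, … index, labels and positions coincide, so loc and iloc look interchangeable. The difference shows up once the index is something else. A minimal sketch, assuming custom string labels "a", "b", "c" chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Jasmeet", "Chris", "Charlie"], "Age": [25, 30, 35]},
    index=["a", "b", "c"],  # hypothetical row labels for this example
)

by_label = df.loc["b"]     # look up the row by its index label
by_position = df.iloc[1]   # look up the row by its integer position

print(by_label["Name"])     # both lookups reach the same row here
print(by_position["Name"])
```

In this DataFrame, df.loc[1] would raise a KeyError because no row carries the label 1, while df.iloc[1] still works.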

3. Accessing Multiple Columns

print(df[["Name", "Age"]])  # Select multiple columns

4. Filtering Data

print(df[df["Age"] > 30])  # Get all rows where Age > 30
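The expression inside the brackets is a boolean Series (a mask) with one True/False value per row, and only the True rows are kept. A short sketch of how that works and how several conditions might be combined, reusing the sample data from the dictionary example:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jasmeet", "Chris", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

mask = df["Age"] > 30   # boolean Series, one entry per row
over_30 = df[mask]      # keeps only the rows where mask is True

# Combine conditions with & (and) or | (or); wrap each one in parentheses.
young_or_chicago = df[(df["Age"] < 30) | (df["City"] == "Chicago")]

print(mask.tolist())
print(young_or_chicago)
```

Python's `and` and `or` keywords do not work element-wise on a Series, which is why the `&` and `|` operators are used.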

Modifying a DataFrame

1. Adding a New Column

df["Salary"] = [50000, 60000, 70000]  # Add a new column
print(df)

2. Updating Column Values

df["Age"] = df["Age"] + 1  # Increase all ages by 1

3. Deleting a Column

df.drop("Salary", axis=1, inplace=True)  # Remove 'Salary' column

4. Deleting a Row

df.drop(1, axis=0, inplace=True)  # Remove the row with index label 1 (the second row here)
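As an alternative to inplace=True, drop can return a new DataFrame and leave the original untouched. A minimal sketch, assuming the sample data with the Salary column added; the `columns=` and `index=` keywords are equivalent to passing `axis=1` and `axis=0`.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jasmeet", "Chris", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

no_salary = df.drop(columns="Salary")  # same as df.drop("Salary", axis=1)
no_chris = df.drop(index=1)            # drops the row labeled 1

# Dropping a row leaves a gap in the labels; reset_index renumbers them.
renumbered = no_chris.reset_index(drop=True)

print(list(no_chris.index))
print(list(renumbered.index))
```

Because nothing is modified in place, df still holds all three rows and the Salary column afterwards.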

DataFrame vs. Series

Feature      Pandas Series           Pandas DataFrame
Structure    1D (Single Column)      2D (Multiple Columns)
Data Type    Single data type        Multiple data types
Indexing     One index per value     Row & Column indexing
Example      pd.Series([1, 2, 3])    pd.DataFrame({"A": [1, 2], "B": [3, 4]})
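The relationship between the two structures can be seen by pulling a column out of a DataFrame. A brief sketch, assuming a two-column DataFrame with made-up values (one integer column and one float column):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3.5, 4.5]})

col = df["A"]    # single label -> Series (1D)
sub = df[["A"]]  # list of labels -> DataFrame (2D), even with one column

print(type(col).__name__, col.shape)
print(type(sub).__name__, sub.shape)
print(df.dtypes)  # each column keeps its own data type
```

Each column of a DataFrame is a Series that shares the same row index, which is why a DataFrame can hold a different data type per column.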