Loading data
Loading data into DataFrames
Pandas provides several functions for loading data from files, databases, and in-memory Python objects into a DataFrame for analysis. The most common ones are covered below.
Loading Data from a CSV File
CSV (Comma-Separated Values) files are one of the most commonly used formats for storing tabular data.
Reading a CSV File
import pandas as pd
df = pd.read_csv("data.csv") # Load CSV file into DataFrame
print(df.head()) # Display the first 5 rows
Common Parameters:
sep=";": Use if the file uses a different delimiter (e.g., a semicolon instead of a comma).
header=None: Use if the file has no column names.
names=['A', 'B', 'C']: Assign custom column names.
index_col=0: Use the first column as the index.
usecols=['Name', 'Age']: Load only selected columns.
dtype={"Age": int}: Specify data types.
# Example: Reading specific columns
df = pd.read_csv("data.csv", usecols=['Name', 'Age'])
print(df)
Output
Name Age
0 Jasmeet 25
1 Chris 30
2 Charlie 35
3 David 40
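Several of the parameters listed above can be combined in a single call. The following is a minimal sketch rather than a definitive recipe: the semicolon-delimited sample data and the column names ID, Name, and Age are assumptions made for illustration, and the data is read from an in-memory buffer so the example runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical semicolon-delimited data with no header row,
# held in memory so no file on disk is needed.
raw = io.StringIO("1;Jasmeet;25\n2;Chris;30\n3;Charlie;35\n")

df = pd.read_csv(
    raw,
    sep=";",                      # fields are separated by semicolons
    header=None,                  # the data has no header row
    names=["ID", "Name", "Age"],  # so column names are supplied explicitly
    index_col=0,                  # use the first column (ID) as the index
    dtype={"Age": int},           # parse Age as integers
)

print(df)
```

Passing a file path such as "data.csv" instead of the buffer works the same way.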
Loading Data from an Excel File
Reading an Excel File
df = pd.read_excel("data.xlsx", sheet_name="Sheet1")
print(df.head())
Common Parameters:
sheet_name="Sheet1": Read a specific sheet.
usecols="A:C": Read only selected columns.
Loading Data from a JSON File
JSON (JavaScript Object Notation) is widely used for storing structured data.
Reading a JSON File
df = pd.read_json("data.json")
print(df.head())
Common Parameters:
orient="records": Use if JSON is formatted as a list of dictionaries.
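To make the "list of dictionaries" layout concrete, here is a small sketch. The JSON payload is made up for illustration and is read from an in-memory buffer instead of a file such as data.json.

```python
import io

import pandas as pd

# Hypothetical JSON payload: a list of dictionaries, one per row.
raw = io.StringIO(
    '[{"Name": "Jasmeet", "Age": 25},'
    ' {"Name": "Chris", "Age": 30}]'
)

# Each dictionary becomes a row; its keys become the column names.
df = pd.read_json(raw, orient="records")

print(df)
```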
Loading Data from a SQL Database
Reading from a SQL Table
import sqlite3
conn = sqlite3.connect("database.db") # Connect to SQLite database
df = pd.read_sql("SELECT * FROM customers", conn) # Run the query and load the result
conn.close() # Close the connection once the data is loaded
print(df.head())
Common Parameters:
index_col="id": Use a specific column as the index.
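The effect of index_col can be tried without an existing database file. The sketch below assumes a hypothetical customers table with id, name, and age columns, created in an in-memory SQLite database purely for illustration.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a hypothetical customers table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (id, name, age) VALUES (?, ?, ?)",
    [(1, "Jasmeet", 25), (2, "Chris", 30), (3, "Charlie", 35)],
)
conn.commit()

# index_col="id" makes the id column the DataFrame index
# instead of leaving it as an ordinary column.
df = pd.read_sql("SELECT * FROM customers", conn, index_col="id")
conn.close()

print(df)
```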
Loading Data from a Web URL
Reading CSV from a Web URL
url = "https://example.com/data.csv"
df = pd.read_csv(url)
print(df.head())
Loading Data from a Python Dictionary
Reading from a Dictionary
data = {
"Name": ["Jasmeet", "Chris", "Charlie"],
"Age": [25, 30, 35],
"City": ["New York", "Los Angeles", "Chicago"]
}
df = pd.DataFrame(data)
print(df)
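A quick way to confirm how the dictionary was interpreted is to inspect the resulting DataFrame. This sketch uses a shortened version of the dictionary above; the checks shown are just one reasonable way to look at the result.

```python
import pandas as pd

data = {
    "Name": ["Jasmeet", "Chris", "Charlie"],
    "Age": [25, 30, 35],
}

df = pd.DataFrame(data)

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels come from the dictionary keys
print(df.dtypes)         # one dtype per column, inferred from the values
```

Every list in the dictionary must have the same length; otherwise pd.DataFrame raises a ValueError.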
Summary
Data Source | Method |
---|---|
CSV | pd.read_csv("file.csv") |
Excel | pd.read_excel("file.xlsx") |
JSON | pd.read_json("file.json") |
SQL | pd.read_sql("SQL Query", connection) |
Web URL | pd.read_csv("https://example.com/data.csv") |
Dictionary | pd.DataFrame(data_dict) |