Loading data
Loading Data into DataFrames
Pandas provides several functions for loading data from different file formats and sources into a DataFrame for analysis. The most common ones are described below.
Loading Data from a CSV File
CSV (Comma-Separated Values) files are one of the most commonly used formats for storing tabular data.
Reading a CSV File
import pandas as pd
df = pd.read_csv("data.csv") # Load CSV file into DataFrame
print(df.head()) # Display the first 5 rows
Common Parameters:
sep=";": Use if the file uses a different delimiter (e.g., semicolon instead of a comma).
header=None: Use if the file has no column names.
names=['A', 'B', 'C']: Assign custom column names.
index_col=0: Use the first column as the index.
usecols=['Name', 'Age']: Load only selected columns.
dtype={"Age": int}: Specify data types.
# Example: Reading specific columns
df = pd.read_csv("data.csv", usecols=['Name', 'Age'])
print(df)
Output
Name Age
0 Jasmeet 25
1 Chris 30
2 Charlie 35
3 David 40
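The other parameters listed above can be combined in a single call. The following is a minimal sketch, assuming a semicolon-delimited file with no header row; the inline data and the column names ID, Name, and Age are hypothetical and stand in for a real file.

```python
import io

import pandas as pd

# Hypothetical semicolon-delimited data with no header row,
# wrapped in StringIO so it can be read like a file.
raw = io.StringIO("1;Jasmeet;25\n2;Chris;30\n3;Charlie;35\n")

df = pd.read_csv(
    raw,
    sep=";",                      # fields are separated by semicolons
    header=None,                  # the data has no header row
    names=["ID", "Name", "Age"],  # so column names are supplied here
    index_col=0,                  # use the first column (ID) as the index
    dtype={"Age": int},           # parse Age as integers
)
print(df)
```

In practice the first argument would be a file path such as "data.csv" instead of the in-memory buffer.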
Loading Data from an Excel File
Reading an Excel File
df = pd.read_excel("data.xlsx", sheet_name="Sheet1")
print(df.head())
Common Parameters:
sheet_name="Sheet1": Read a specific sheet.
usecols="A:C": Read only selected columns.
Loading Data from a JSON File
JSON (JavaScript Object Notation) is widely used for storing structured data.
Reading a JSON File
df = pd.read_json("data.json")
print(df.head())
Common Parameters:
orient="records": Use if JSON is formatted as a list of dictionaries.
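To illustrate what the "records" layout looks like, here is a small sketch that reads a list of dictionaries from an in-memory buffer. The JSON payload is hypothetical; with a real file, the path would be passed instead of the buffer.

```python
import io

import pandas as pd

# Hypothetical JSON payload: a list of dictionaries, one per row.
raw = io.StringIO(
    '[{"Name": "Jasmeet", "Age": 25}, {"Name": "Chris", "Age": 30}]'
)

# Each dictionary becomes a row; its keys become the column names.
df = pd.read_json(raw, orient="records")
print(df)
```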
Loading Data from a SQL Database
Reading from a SQL Table
import sqlite3
conn = sqlite3.connect("database.db") # Connect to SQLite database
df = pd.read_sql("SELECT * FROM customers", conn)
print(df.head())
Common Parameters:
index_col="id": Use a specific column as the index.
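The following sketch shows index_col together with a parameterized query. It assumes an in-memory SQLite database and a hypothetical customers table with id, name, and age columns, created here only so the example runs on its own.

```python
import sqlite3

import pandas as pd

# Build a throwaway in-memory database (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (id, name, age) VALUES (?, ?, ?)",
    [(1, "Jasmeet", 25), (2, "Chris", 30), (3, "Charlie", 35)],
)
conn.commit()

# The "?" placeholder is filled from params, which avoids building
# the SQL string by hand; index_col makes "id" the DataFrame index.
df = pd.read_sql(
    "SELECT id, name, age FROM customers WHERE age >= ?",
    conn,
    index_col="id",
    params=(30,),
)
conn.close()  # release the connection once the data is loaded
print(df)
```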
Loading Data from a Web URL
Reading CSV from a Web URL
url = "https://example.com/data.csv"
df = pd.read_csv(url)
print(df.head())
Loading Data from a Python Dictionary
Reading from a Dictionary
data = {
    "Name": ["Jasmeet", "Chris", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"]
}
df = pd.DataFrame(data)
print(df)
Summary
| Data Source | Method |
|---|---|
| CSV | pd.read_csv("file.csv") |
| Excel | pd.read_excel("file.xlsx") |
| JSON | pd.read_json("file.json") |
| SQL | pd.read_sql("SQL Query", connection) |
| Web URL | pd.read_csv("https://example.com/data.csv") |
| Dictionary | pd.DataFrame(data_dict) |