Common Table Expressions
Create, Reuse and Repeat.
Introduction
A Common Table Expression (CTE) is a temporary result set that you can reference within a SELECT
,
INSERT
, UPDATE
, or DELETE
statement. CTEs make complex queries easier to read and maintain by breaking
them into smaller, modular components. Unlike subqueries, CTEs provide a more readable structure and can be reused
within the same query.
CTEs are defined using the WITH
keyword, followed by the query they represent.
They are particularly useful in customer management for tasks like hierarchical data representation, recursive queries, and simplifying complex joins.
Syntax of a CTE
The general syntax for a CTE in MS SQL is:
WITH CTE_Name (Column1, Column2, ...)
AS
(
-- CTE Query
SELECT Column1, Column2, ...
FROM TableName
WHERE Condition
)
-- Main query using the CTE
SELECT Column1, Column2, ...
FROM CTE_Name;
Tables
Customers
Orders
Example
Let’s start with a simple example where we list all customers who have spent more than $200 in total.
WITH CustomerTotalSpend AS
(
SELECT customer_id, SUM(total_amount) AS total_spent
FROM Orders
GROUP BY customer_id
)
SELECT c.customer_id, c.first_name, t.total_spent
FROM Customers c
JOIN CustomerTotalSpend t ON c.customer_id = t.customer_id
WHERE t.total_spent > 200;
In this example, the CTE CustomerTotalSpend
calculates the total amount spent by each customer.
The main query retrieves the customer details for those whose total spending exceeds $200.
Multiple CTEs
You can define multiple CTEs in a single query by separating them with commas. For example, if you want to find customers who made a purchase in 2023 and spent over $300, you can use two CTEs.
WITH Customers2023 AS
(
SELECT customer_id
FROM Orders
WHERE YEAR(order_date) = 2023
),
CustomerTotalSpend AS
(
SELECT customer_id, SUM(total_amount) AS total_spent
FROM Orders
GROUP BY customer_id
)
SELECT c.customer_id, c.first_name, t.total_spent
FROM Customers c
JOIN Customers2023 y ON c.customer_id = y.customer_id
JOIN CustomerTotalSpend t ON c.customer_id = t.customer_id
WHERE t.total_spent > 300;
Here, the first CTE (Customers2023
) filters customers who made purchases in 2023,
while the second CTE (CustomerTotalSpend
) calculates the total spending for those customers.
Recursive CTEs
A recursive CTE is a CTE that refers to itself. Recursive CTEs are useful for working with hierarchical data, such as customer referral systems or organizational structures.
- Recursive CTE Example: Customer Referrals
Assume we have a table that tracks customer referrals, where each customer can refer others. We want to find all customers referred by a specific customer (let’s say customer_id = 1), including customers referred by those they referred, and so on.
WITH RecursiveReferrals (customer_id, referred_by) AS
(
-- Anchor member: Start with the original customer
SELECT customer_id, referred_by
FROM Customers
WHERE referred_by= 1
UNION ALL
-- Recursive member: Find customers referred by the customers in the anchor member
SELECT cr.customer_id, cr.referred_by
FROM Customers cr
INNER JOIN RecursiveReferrals rr ON cr.referred_by= rr.customer_id
)
SELECT Top (10)customer_id, referred_by
FROM RecursiveReferrals
option (maxrecursion 0);
In this query, the CTE recursively finds first top 10 customers referred directly or indirectly by the original customer (customer_id = 1).
- Recursive CTE Example: Sales Team Hierarchy
If your customer management system tracks a sales team structure, where each salesperson reports to another, you can use a recursive CTE to explore this hierarchy.
WITH SalesHierarchy (salesperson_id, manager_id, Level) AS
(
-- Base case: Start with the top-level manager
SELECT salesperson_id, manager_id, 0 AS Level
FROM Sales
WHERE manager_id IS NULL
UNION ALL
-- Recursive step: Find employees reporting to the current level of managers
SELECT s.salesperson_id, s.manager_id, h.Level + 1
FROM Sales s
INNER JOIN SalesHierarchy h ON s.manager_id = h.salesperson_id
)
SELECT salesperson_id, manager_id, Level
FROM SalesHierarchy
ORDER BY Level;
Here, the CTE builds a hierarchy of salespersons, where each level represents the depth in the organizational chart.
Benefits of Using CTEs
- Readability: CTEs break down complex queries into manageable parts, improving query readability.
- Reusability: A CTE can be referenced multiple times within the same query, reducing repetition.
- Modularity: CTEs enable a modular approach to query building, allowing you to structure your query logic step by step.
- Simplifying Recursive Queries: Recursive CTEs are a powerful tool for dealing with hierarchical and self-referential data.
Limitations of CTEs
- Scope: CTEs are only valid within the query in which they are defined. Once the query execution is complete, the CTE result is discarded.
- Performance: For very large datasets, complex recursive CTEs may have performance overhead. In such cases, using indexed tables or optimizing joins might be more efficient.
Conclusion
CTEs provide a powerful way to simplify complex queries and make them more readable and maintainable. They are especially useful in customer management for calculating aggregated data, handling hierarchical structures, and constructing recursive queries. When used effectively, CTEs can significantly enhance the flexibility of your MS SQL queries, whether you're analyzing customer purchases, tracking referrals, or managing team hierarchies.
Asked in Interview
- What is a Common Table Expression (CTE) in SQL, and why is it used?
- What is the difference between a CTE and a derived table?
- Can a CTE reference itself in SQL?
- What is the difference between a CTE and a temporary table?
- Can you update data in a table using a CTE?
- Can a CTE contain multiple queries or multiple CTEs?
- What is a Common Table Expression (CTE) in SQL, and how is it different from a subquery?
- What are the advantages of using a CTE over a derived table or subquery?
- What is a recursive CTE, and how does it work?
- How does recursion terminate in a recursive CTE?
- How does a CTE affect query performance compared to a subquery?