Window Functions

Calculations across a set of table rows.

Introduction

Window functions in SQL perform calculations across a set of table rows that are related to the current row. Unlike aggregate functions, which collapse the result set into one row, window functions retain the individual rows while applying calculations.

They are particularly useful in analytical queries for tasks such as running totals, ranking, and calculating moving averages. In a customer management scenario, window functions can help analyze customer behavior and trends over time.

Window functions include:

  • Ranking functions (ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()).
  • Aggregate functions (SUM(), AVG(), COUNT(), MIN(), MAX()).
  • Value functions (LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE()).

Syntax of Window Functions

The basic structure of a window function is as follows:

<function_name>() OVER ([PARTITION BY <columns>] ORDER BY <columns>)
  • PARTITION BY: Divides the result set into partitions, performing the calculation on each partition separately.
  • ORDER BY: Defines the order in which rows are processed.

Ranking Functions

  1. ROW_NUMBER()

Assigns a unique row number to each row in the result set, starting from 1.

Example: Rank customers based on their total purchase amount.

SELECT Customer_id, first_name, 
       ROW_NUMBER() OVER (ORDER BY SUM(total_amount) DESC) AS Rank
FROM Customers c
JOIN Purchases p ON c.Customer_id = p.Customer_id
GROUP BY Customer_id, first_name;

This query ranks customers by their total purchase amount, with the highest spender getting Rank = 1.

  1. RANK()

Similar to ROW_NUMBER(), but if two rows have the same value, they get the same rank, and the next rank is skipped.

Example: Rank customers by total purchases, with ties getting the same rank.

SELECT Customer_id, first_name, 
       RANK() OVER (ORDER BY SUM(total_amount) DESC) AS Rank
FROM Customers c
JOIN Purchases p ON c.Customer_id = p.Customer_id
GROUP BY Customer_id, first_name;
  1. DENSE_RANK() Like RANK(), but the ranks are consecutive even if there are ties.

  2. NTILE(n)

  • Divides the result set into n equal parts and assigns a bucket number to each row.

Example: Divide customers into 4 groups based on their spending.

SELECT Customer_id, first_name, 
       NTILE(4) OVER (ORDER BY SUM(total_amount) DESC) AS SpendGroup
FROM Customers c
JOIN Purchases p ON c.Customer_id = p.Customer_id
GROUP BY Customer_id, first_name;

Aggregate Functions with Windowing

Window aggregate functions allow you to compute cumulative totals, running averages, or other aggregates without grouping the result set.

  1. SUM() Over a Window

Example: Calculate the running total of purchases for each customer.

SELECT Customer_id, purchase_date, total_amount,
       SUM(total_amount) OVER (PARTITION BY Customer_id ORDER BY purchase_date) AS RunningTotal
FROM Purchases;

This query calculates the running total of purchases for each customer based on the date of purchase.

  1. AVG() Over a Window

Example: Compute the average purchase amount per customer and show it next to each purchase.

SELECT Customer_id, purchase_date, total_amount,
       AVG(total_amount) OVER (PARTITION BY Customer_id) AS AvgPurchase
FROM Purchases;

Value Functions

Accessing Data from Other Rows

  1. LAG() and LEAD()
  • LAG(): Accesses data from a previous row.
  • LEAD(): Accesses data from a following row.

Example: Compare a customer's current purchase to their previous purchase.

SELECT Customer_id, purchase_date, total_amount,
       LAG(total_amount, 1, 0) OVER (PARTITION BY Customer_id ORDER BY purchase_date) AS PreviousPurchase
FROM Purchases;

In this query, LAG() retrieves the previous purchase amount for each customer.

  1. FIRST_VALUE() and LAST_VALUE()
  • FIRST_VALUE(): Returns the first value in an ordered set.
  • LAST_VALUE(): Returns the last value.

Example: Show the first purchase for each customer along with every other purchase.

SELECT Customer_id, purchase_date, total_amount,
       FIRST_VALUE(total_amount) OVER (PARTITION BY Customer_id ORDER BY purchase_date) AS FirstPurchase
FROM Purchases;

Practical Example

Let’s bring all the window functions together for an analysis of customer purchases.

Scenario: You want to rank customers by spending, see their running total, and compare each purchase to the previous one.

SELECT Customer_id, purchase_date, total_amount,
       ROW_NUMBER() OVER (PARTITION BY Customer_id ORDER BY purchase_date) AS PurchaseRank,
       SUM(total_amount) OVER (PARTITION BY Customer_id ORDER BY purchase_date) AS RunningTotal,
       LAG(total_amount) OVER (PARTITION BY Customer_id ORDER BY purchase_date) AS PreviousPurchase,
       FIRST_VALUE(total_amount) OVER (PARTITION BY Customer_id ORDER BY purchase_date) AS FirstPurchase
FROM Purchases;
  • ROW_NUMBER() ranks each purchase for a customer.
  • SUM() calculates the running total for each customer.
  • LAG() compares the current purchase with the previous one.
  • FIRST_VALUE() shows the first purchase for context.

Conclusion

Window functions in MS SQL provide powerful tools for analyzing data over specific ranges or "windows" without collapsing the result set. In customer management, they allow for in-depth behavioral analysis such as tracking spending patterns, identifying top customers, and comparing purchases over time. Mastering these functions equips you to build advanced queries for real-world data analysis tasks.

Asked in Interview

  • What are window functions in SQL?
  • Explain the difference between aggregate functions and window functions.
  • What is the difference between GROUP BY and PARTITION BY in SQL?
  • What is the use of OVER() in window functions?
  • Explain the ROW_NUMBER() function and give an example of its use.
  • What is the difference between ROW_NUMBER(), RANK(), and DENSE_RANK()?
  • Explain LEAD() and LAG() functions with an example.
  • How do you use NTILE() in SQL?
  • Explain the PARTITION BY clause and give an example.