Pivot tables

Summarizing and reporting data.

Introduction

Pivoting data refers to transforming rows into columns, a technique often used to restructure datasets for better analysis and reporting. In SQL Server (MS SQL), the PIVOT operator allows you to perform this transformation easily. This is especially useful for summarizing and reporting data, such as viewing customer purchases across different months or analyzing product sales across regions.

In this section, we will explore how to pivot data using MS SQL, with customer management examples.

Understanding the PIVOT Operator

The PIVOT operator rotates table data by converting values from rows into columns. It is commonly used with aggregate functions (such as SUM, AVG, COUNT, etc.) to group data.

The basic syntax of the PIVOT operator:

SELECT <non-pivoted column>, 
       [first pivoted column], 
       [second pivoted column], 
       ...
FROM 
    (<source query>) 
PIVOT 
(
    <aggregate function>(<column to aggregate>)
    FOR <column to pivot>
    IN ([first pivoted column], [second pivoted column], ...)
) AS <alias>;

A Practical Example

Let’s consider a customer management example where we have data about customer purchases made in different months, and we want to see the total purchases made by each customer across these months.

Example: Pivoting Monthly Purchases Suppose we have a table CustomerPurchases:

customer_idcustomer_namepurchase_monthtotal_amount
1John DoeJanuary150
2Jane SmithFebruary200
1John DoeFebruary180
2Jane SmithJanuary220

We want to transform this data to display the total purchases for each customer across months in separate columns.

SELECT customer_name, [January], [February], [March]
FROM 
    (SELECT customer_name, purchase_month, total_amount
     FROM CustomerPurchases) AS SourceTable
PIVOT 
(
    SUM(total_amount) 
    FOR purchase_month IN ([January], [February], [March])
) AS PivotTable;

Output:

customer_nameJanuaryFebruaryMarch
John Doe150180NULL
Jane Smith220200NULL

This query turns the values of purchase_month into separate columns (January, February, March), and the SUM function is used to aggregate total_amount for each customer.

Using Dynamic PIVOT

In some cases, the values that need to be pivoted are not known in advance. For example, you may not know the exact months or categories present in your data. In such scenarios, you can use dynamic SQL to generate a dynamic PIVOT query.

Example: Dynamic Pivoting for Monthly Purchases

DECLARE @cols NVARCHAR(MAX), @query NVARCHAR(MAX);

-- Get the distinct months dynamically
SELECT @cols = STRING_AGG(QUOTENAME(purchase_month), ',')
FROM (SELECT DISTINCT purchase_month FROM CustomerPurchases) AS MonthList;

-- Construct the pivot query
SET @query = 
'SELECT customer_name, ' + @cols + '
FROM 
    (SELECT customer_name, purchase_month, total_amount
     FROM CustomerPurchases) AS SourceTable
PIVOT 
(
    SUM(total_amount) 
    FOR purchase_month IN (' + @cols + ')
) AS PivotTable;';

-- Execute the dynamic query
EXEC sp_executesql @query;

This query dynamically pivots data based on the distinct months available in the dataset, making it adaptable to changing datasets.

Unpivoting Data

Unpivoting is the reverse of pivoting, where columns are transformed back into rows. This can be useful for flattening a dataset for further analysis or exporting data in a normalized form.

Example: Unpivoting Monthly Purchases

Suppose we have the pivoted table showing total purchases by month, and we want to transform it back into the original row format.

SELECT customer_name, purchase_month, total_amount
FROM 
    (SELECT customer_name, [January], [February], [March]
     FROM CustomerPurchasesPivot) AS PivotTable
UNPIVOT
(
    total_amount FOR purchase_month IN ([January], [February], [March])
) AS UnpivotTable;

This will return the data in its original form, transforming the columns back into rows.

Common Use Cases for Pivoting

  • Sales Analysis: Pivoting can help create sales reports where rows of sales data are transformed into summary columns, such as total sales by product or region.
  • Customer Behavior Analysis: Pivoting allows for tracking customer purchases, visits, or interactions across different time periods or categories.
  • Financial Reporting: Pivoting data is useful for financial reporting to display data like monthly revenues, expenses, or profits.

Key Considerations When Using PIVOT

  • Aggregation: The PIVOT operator requires an aggregate function like SUM, COUNT, or AVG. Always ensure the data you are pivoting makes sense for the chosen aggregate.
  • Null Values: If there are no matching rows for a specific pivoted column, the result will be NULL. You can handle these with COALESCE to replace NULL values.
  • Performance: Pivoting large datasets can be resource-intensive. Ensure your queries are optimized, especially when using dynamic SQL.

Conclusion

Pivoting data in MS SQL transforms row-based data into column-based formats for better analysis and reporting. It's an essential technique for generating insightful summaries and reports in customer management systems. Whether you’re analyzing monthly purchases or tracking customer behavior, the PIVOT and UNPIVOT operators provide flexible tools for data transformation.

Asked in Interview

  • What is a pivot table in SQL?
  • How do you create a pivot table in SQL?
  • How do you perform multiple aggregations in a pivot table?
  • How do you achieve pivot functionality without using the PIVOT keyword?
  • What happens if the pivot table contains NULL values? How do you handle them?
  • Does pivoting affect performance? If yes, how can you optimize it?
  • Can you combine a pivot with JOIN, WHERE, or ORDER BY clauses?
  • Given a table Sales with columns Product, Year, and SalesAmount, how would you pivot this data so that the years become columns and you can see the sales per product per year?