MERGE statement

INSERT, UPDATE, and DELETE

Introduction

SQL logo

The MERGE statement in SQL is used for combining INSERT, UPDATE, and DELETE operations in one statement. It is especially useful when synchronizing two tables, such as when you need to update one table based on changes in another.

In customer management systems, for example, you might need to synchronize customer information between a master customer database and a new set of updates from an external source.

Why Use MERGE?

MERGE simplifies the process of applying multiple operations in a single query. Instead of writing separate queries for INSERT, UPDATE, and DELETE, you can use one MERGE statement to:

  • Update existing rows.
  • Insert new rows that don’t exist.
  • Optionally delete rows that are no longer needed.

Basic Syntax of MERGE

MERGE INTO target_table AS target
USING source_table AS source
ON target.matching_column = source.matching_column
WHEN MATCHED THEN 
    UPDATE SET target.column = source.column
WHEN NOT MATCHED BY TARGET THEN
    INSERT (column1, column2) VALUES (source.column1, source.column2)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

Components:

  • Target table: The table you want to modify (update, insert, or delete).
  • Source table: The table you’re using as the basis for making the changes.
  • Matching condition: A condition that defines how rows from the target and source tables are matched.
  • Actions: You can define what happens when rows are matched, when they exist in the source but not in the target, and when they exist in the target but not in the source.

Example: Customer Information Synchronization

Let’s say we have two tables: Customers (the main customer data) and CustomerUpdates (new or modified customer data). We want to update the Customers table with any changes, add any new customers from the updates, and possibly remove customers who are no longer in the update data.

Step 1: Define the Schema

  • Customers table:

    • customer_id: Primary key
    • customer_name
    • email
    • city
  • CustomerUpdates table:

    • customer_id: Corresponding to Customers.customer_id
    • customer_name
    • email
    • city

Step 2: Write the MERGE Statement

MERGE INTO Customers AS target
USING CustomerUpdates AS source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN 
    UPDATE SET 
        target.customer_name = source.customer_name,
        target.email = source.email,
        target.city = source.city
WHEN NOT MATCHED BY TARGET THEN
    INSERT (customer_id, customer_name, email, city)
    VALUES (source.customer_id, source.customer_name, source.email, source.city)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

Explanation:

  • WHEN MATCHED: When a customer exists in both Customers and CustomerUpdates, the UPDATE operation is performed. The customer_name, email, and city fields are updated with the new data.
  • WHEN NOT MATCHED BY TARGET: If a row exists in CustomerUpdates but not in Customers, it inserts the new customer into the Customers table.
  • WHEN NOT MATCHED BY SOURCE: If a customer is found in the Customers table but not in CustomerUpdates, that customer is deleted, as they are no longer present in the updated data.

Conditional Logic with MERGE

You can add additional conditions to customize how each of the operations works.

Example: Only update customer email if the new email is not null.

WHEN MATCHED AND source.email IS NOT NULL THEN
    UPDATE SET target.email = source.email

Performance Considerations

  • Batch Processing: MERGE can be used to perform bulk updates and inserts efficiently in one operation. However, care should be taken with large datasets as it can consume considerable resources.

  • Indexing: Ensure that both the target and source tables are indexed on the matching columns. This will significantly improve the performance of the MERGE operation.

  • Locking: MERGE may cause locking on the target table, so in high-traffic systems, it’s important to be mindful of how it affects database concurrency.

Common Use Cases for MERGE

  • Data Synchronization: Keeping customer data up to date across different systems.
  • Bulk Loading of Data: When importing data from an external source (such as a customer signup list), MERGE helps avoid duplicates and maintain data consistency.
  • Archiving or Purging Data: Use MERGE to identify and remove obsolete records, like customers who are no longer active.

Conclusion

The MERGE statement in MS SQL provides a powerful way to synchronize and update data with a single, flexible query. It simplifies the process of handling multiple data modifications, especially in customer management scenarios, where records need to be accurately maintained. By combining INSERT, UPDATE, and DELETE operations, MERGE helps reduce the complexity of writing and maintaining SQL scripts while ensuring data integrity and performance.

Asked in Interview

  • Explain MERGE statement (UPSERT).
  • What is the difference between UPDATE and MERGE?