Picture this: you're building an ETL pipeline that syncs customer data from a staging table to production. The old approach involves three separate queries: check if the row exists, update if found, insert if not. Sounds simple, right? Until another process inserts the same record between your SELECT and your INSERT, and suddenly you've got duplicate key violations or inconsistent data.
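Here's the racy three-statement pattern in sketch form (table and column names are illustrative):

```sql
-- Check-then-write: looks fine, breaks under concurrency
SELECT 1 FROM customers WHERE customer_id = 42;

-- If the SELECT found a row:
UPDATE customers SET email = 'new@example.com' WHERE customer_id = 42;

-- If it didn't (but another process may have inserted customer_id = 42 by now):
INSERT INTO customers (customer_id, email) VALUES (42, 'new@example.com');
```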
The SQL MERGE statement solves this by combining all three operations into a single atomic statement. It's been available in SQL Server since 2008 and in PostgreSQL since version 15, but I still see developers writing three-statement patterns that are just waiting to break under concurrency.
Here's when MERGE actually makes sense, when it doesn't, and why it's become my default for data synchronization tasks.
The MERGE statement lets you perform INSERT, UPDATE, and DELETE operations in a single atomic statement based on matching conditions between a source and target table. According to the PostgreSQL documentation, it was added in PostgreSQL 15 to match SQL:2003 standard behavior.
The elegant part is that the database engine handles the logic atomically—you don't have to worry about what happens between checking for existence and performing the operation.
The syntax varies slightly between database vendors, but the core structure is consistent:
```sql
MERGE INTO target_table AS target
USING source_table AS source
    ON target.key_column = source.key_column
WHEN MATCHED THEN
    -- Left side of SET stays unqualified; PostgreSQL rejects target.column1 here
    UPDATE SET column1 = source.column1
WHEN NOT MATCHED THEN
    INSERT (column1, column2)
    VALUES (source.column1, source.column2);
```
PostgreSQL and SQL Server use similar syntax. Oracle has some differences, and MySQL doesn't support MERGE directly (you'd use INSERT ... ON DUPLICATE KEY UPDATE instead).
Important gotcha: SQL Server requires the MERGE statement itself to be terminated with a semicolon (;), unlike most other T-SQL statements where the terminator is optional. PostgreSQL doesn't treat MERGE specially here, but it's good practice to terminate statements consistently anyway.
Let's say you're managing inventory. Every night, a batch process updates your product catalog from an external system. Some products are new, some have price changes, and discontinued items should be marked inactive.
Target Table (products):
```sql
CREATE TABLE products (
    product_id   INTEGER PRIMARY KEY,
    product_name VARCHAR(100),
    price        DECIMAL(10, 2),
    status       VARCHAR(20) DEFAULT 'ACTIVE',
    last_updated TIMESTAMP
);
```
Source Table (staging_updates):
```sql
CREATE TABLE staging_updates (
    product_id      INTEGER,
    product_name    VARCHAR(100),
    price           DECIMAL(10, 2),
    is_discontinued BOOLEAN
);
```
Here's the MERGE statement that handles all three scenarios:
```sql
MERGE INTO products AS p
USING staging_updates AS s
    ON p.product_id = s.product_id
WHEN MATCHED AND s.is_discontinued = true THEN
    UPDATE SET status = 'DISCONTINUED',
               last_updated = CURRENT_TIMESTAMP
WHEN MATCHED AND s.is_discontinued = false THEN
    UPDATE SET product_name = s.product_name,
               price = s.price,
               status = 'ACTIVE',
               last_updated = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, price, status, last_updated)
    VALUES (s.product_id, s.product_name, s.price, 'ACTIVE', CURRENT_TIMESTAMP);
```
The trick here is using multiple WHEN MATCHED clauses with conditions. PostgreSQL and SQL Server both support this. You can have conditional updates based on different criteria—something you can't easily do with separate UPDATE statements without risking race conditions.
The biggest win is atomicity. With separate statements:
```sql
-- Problematic approach
UPDATE products SET price = 100 WHERE product_id = 123;
-- What if another transaction inserts product_id = 123 here?
INSERT INTO products (product_id, price) VALUES (123, 100);
```
If your application logic decides to update first, then insert on failure, you've got a race condition. Another process could insert the row between your UPDATE and INSERT, causing a duplicate key error or inconsistent state.
MERGE handles this as a single atomic operation. The database engine evaluates all matches first, then applies changes in one transaction.
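As a minimal sketch, here's the MERGE equivalent of that single-row upsert, reusing the illustrative products table; the inline VALUES row stands in for whatever your real source is:

```sql
-- Atomic upsert: no gap between the existence check and the write
MERGE INTO products AS p
USING (VALUES (123, 100.00)) AS s (product_id, price)
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET price = s.price
WHEN NOT MATCHED THEN
    INSERT (product_id, price)
    VALUES (s.product_id, s.price);
```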
Performance is where MERGE really shines at scale. Consider this alternative:
```sql
-- Slow approach: multiple scans
UPDATE products p
SET price = s.price
FROM staging_updates s
WHERE p.product_id = s.product_id;

INSERT INTO products (product_id, product_name, price)
SELECT product_id, product_name, price
FROM staging_updates s
WHERE NOT EXISTS (
    SELECT 1 FROM products p WHERE p.product_id = s.product_id
);
```
This scans both tables twice. With MERGE, the database engine can read each table once, because the optimizer sees the full operation. Don't take that as a blanket win, though: Microsoft's documentation recommends benchmarking MERGE against the equivalent separate statements, and for simple update-or-insert workloads the separate statements sometimes come out ahead.
With MERGE, you can have multiple conditional branches:
```sql
MERGE INTO customer_orders AS co
USING order_updates AS ou
    ON co.order_id = ou.order_id
WHEN MATCHED AND ou.status = 'CANCELLED' THEN
    DELETE
WHEN MATCHED AND ou.amount > co.amount THEN
    UPDATE SET amount = ou.amount,
               notes = 'Price increased after initial order'
WHEN MATCHED THEN
    UPDATE SET status = ou.status
WHEN NOT MATCHED THEN
    INSERT (order_id, customer_id, amount, status)
    VALUES (ou.order_id, ou.customer_id, ou.amount, ou.status);
```
Try doing that with separate UPDATE and INSERT statements while maintaining the same atomicity guarantees. You'd need complex transaction logic and still risk race conditions.
The order of WHEN clauses matters. SQL Server evaluates them top to bottom and applies the first clause whose condition matches; PostgreSQL behaves the same way. Always put more specific conditions before general ones. One portability caveat: SQL Server allows at most two WHEN MATCHED clauses, so a multi-branch statement like the one above needs restructuring before it will run there.
Not every data sync needs MERGE. Here are scenarios where simpler approaches are better:
Simple inserts only: If you're just adding new records and duplicates are impossible, use INSERT:
```sql
-- Don't use MERGE for this
INSERT INTO audit_log (event_type, event_data)
SELECT event_type, event_data
FROM staging_events;
```
Single-row updates: For application-level updates where you're working with one row at a time, use standard UPDATE:
```sql
-- Better than MERGE for single-row operations
UPDATE users SET last_login = NOW() WHERE user_id = 123;
```
Database-specific alternatives: MySQL's INSERT ... ON DUPLICATE KEY UPDATE is simpler and more idiomatic for that database:
```sql
-- MySQL: simpler than MERGE
INSERT INTO products (product_id, price)
VALUES (123, 99.99)
ON DUPLICATE KEY UPDATE price = 99.99;
```
One powerful but dangerous feature is WHEN NOT MATCHED BY SOURCE, which deletes rows in the target that don't exist in the source. This is supported in SQL Server and PostgreSQL 17+ (not available in PostgreSQL 15-16):
```sql
-- SQL Server / PostgreSQL 17+
MERGE INTO products AS p
USING staging_updates AS s
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET price = s.price
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, price)
    VALUES (s.product_id, s.product_name, s.price)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;  -- Deletes products not in staging_updates
```
The danger: If staging_updates only contains a subset of products (say, just today's updates), this would delete all other products from the target table. Always scope your source:
```sql
-- Safer approach: only sync products that should exist
MERGE INTO products AS p
USING (
    SELECT product_id, product_name, price
    FROM staging_updates
    WHERE sync_date = CURRENT_DATE  -- Only sync today's updates
) AS s
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET price = s.price
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, price)
    VALUES (s.product_id, s.product_name, s.price);
-- Intentionally omitting WHEN NOT MATCHED BY SOURCE
```
Use WHEN NOT MATCHED BY SOURCE only when you're doing a full sync where the source represents the complete desired state.
SQL Server supports returning information about what was changed using the OUTPUT clause:
```sql
-- SQL Server OUTPUT clause
MERGE products AS p
USING staging_updates AS s
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET price = s.price
WHEN NOT MATCHED THEN
    INSERT (product_id, price)
    VALUES (s.product_id, s.price)
OUTPUT $action,
       inserted.product_id,
       deleted.price AS old_price,
       inserted.price AS new_price;
```
The $action column returns 'INSERT', 'UPDATE', or 'DELETE' depending on what operation was performed on each row.
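You can also capture that output for audit logging with OUTPUT ... INTO. A minimal sketch; the merge_log table and its columns are hypothetical:

```sql
-- merge_log is a hypothetical audit table
CREATE TABLE merge_log (
    action     NVARCHAR(10),
    product_id INT,
    old_price  DECIMAL(10, 2),
    new_price  DECIMAL(10, 2)
);

MERGE products AS p
USING staging_updates AS s
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET price = s.price
WHEN NOT MATCHED THEN
    INSERT (product_id, price)
    VALUES (s.product_id, s.price)
OUTPUT $action, inserted.product_id, deleted.price, inserted.price
INTO merge_log (action, product_id, old_price, new_price);
```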
PostgreSQL 17+ added a RETURNING clause for MERGE:
```sql
-- PostgreSQL 17+ RETURNING clause
MERGE INTO products AS p
USING staging_updates AS s
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET price = s.price
WHEN NOT MATCHED THEN
    INSERT (product_id, price)
    VALUES (s.product_id, s.price)
RETURNING merge_action(), product_id, price;
```
Note: The RETURNING clause for MERGE is only available in PostgreSQL 17 and later. PostgreSQL 15-16 don't support it.
This is useful for audit logging or triggering downstream processes based on what actually changed.
One gotcha: if your source has NULL values and you want to preserve existing target values:
```sql
MERGE INTO products AS p
USING staging_updates AS s
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET
        -- Only update if source has a value
        product_name = COALESCE(s.product_name, p.product_name),
        price = COALESCE(s.price, p.price)
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, price)
    VALUES (s.product_id, s.product_name, s.price);
```
Without COALESCE, NULL values in the source would overwrite existing data in the target.
The performance of MERGE depends heavily on proper indexing. The columns used in the ON clause should be indexed:
```sql
-- Essential index for MERGE performance
-- (products.product_id is already covered by its PRIMARY KEY,
-- so only the staging side needs an explicit index here)
CREATE INDEX idx_staging_product_id ON staging_updates(product_id);
```
Without indexes, MERGE degrades to full table scans. I've seen production queries that worked fine with 1,000 rows but crawled with 1 million—missing indexes on the join columns.
For large batch operations, consider filtering the source table:
```sql
-- Better: filter source to only changed rows
MERGE INTO products AS p
USING (
    SELECT *
    FROM staging_updates
    WHERE last_modified > CURRENT_DATE - INTERVAL '1 day'
) AS s
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET price = s.price
WHEN NOT MATCHED THEN
    INSERT (product_id, price)
    VALUES (s.product_id, s.price);
```
If your ON clause matches multiple source rows to a single target row, you'll get an error. This commonly happens when the join condition isn't unique enough:
```sql
-- Problem: multiple source rows match one target
MERGE INTO products AS p
USING staging_updates AS s
    ON p.product_name = s.product_name  -- Not unique!
WHEN MATCHED THEN
    UPDATE SET price = s.price;
-- Error: MERGE statement attempted to UPDATE or DELETE
-- the same row more than once
```
The fix: ensure your ON clause uses a unique key or add conditions to filter duplicates:
```sql
-- PostgreSQL: use DISTINCT ON
MERGE INTO products AS p
USING (
    SELECT DISTINCT ON (product_id) product_id, product_name, price
    FROM staging_updates
    ORDER BY product_id, last_modified DESC  -- Take latest if duplicates
) AS s
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET price = s.price;

-- SQL Server: use ROW_NUMBER()
MERGE INTO products AS p
USING (
    SELECT product_id, product_name, price
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY product_id
                                  ORDER BY last_modified DESC) AS rn
        FROM staging_updates
    ) ranked
    WHERE rn = 1
) AS s
    ON p.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET price = s.price;
```
MERGE can cause deadlocks in high-concurrency scenarios, especially when multiple transactions modify the same rows. SQL Server is particularly prone to this because MERGE acquires locks on both source and target.
If you're seeing deadlocks, consider:
Using appropriate isolation levels: READ COMMITTED (the default) is usually fine, but SERIALIZABLE prevents phantom reads at the cost of more blocking. In SQL Server specifically, concurrent MERGE upserts can still hit duplicate key errors under READ COMMITTED; a common mitigation is the WITH (HOLDLOCK) hint on the target table.
Error handling: Wrap MERGE in TRY-CATCH (SQL Server) or an equivalent transaction block so failures roll back cleanly:
```sql
-- SQL Server error handling
BEGIN TRY
    BEGIN TRANSACTION;

    MERGE products AS p
    USING staging_updates AS s
        ON p.product_id = s.product_id
    WHEN MATCHED THEN
        UPDATE SET price = s.price
    WHEN NOT MATCHED THEN
        INSERT (product_id, price)
        VALUES (s.product_id, s.price);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    -- Handle error (log, retry, etc.)
    THROW;
END CATCH;
```
Batch processing: Process data in smaller batches to reduce lock contention (see the sketch after this list).
Avoiding hot rows: If multiple processes update the same row frequently, consider batching or queuing updates instead of processing them immediately.
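To make the batching idea concrete, here's a minimal T-SQL sketch; the processed flag column and the 5,000-row chunk size are assumptions for illustration:

```sql
-- Batched MERGE: claim a chunk of staging rows, merge it, repeat.
-- Assumes product_id is unique within staging_updates.
DECLARE @claimed TABLE (product_id INT PRIMARY KEY, price DECIMAL(10, 2));

WHILE 1 = 1
BEGIN
    DELETE FROM @claimed;

    -- Mark the next chunk as processed and capture it
    UPDATE TOP (5000) staging_updates
    SET processed = 1
    OUTPUT inserted.product_id, inserted.price
    INTO @claimed (product_id, price)
    WHERE processed = 0;

    IF @@ROWCOUNT = 0 BREAK;

    MERGE products AS p
    USING @claimed AS s
        ON p.product_id = s.product_id
    WHEN MATCHED THEN
        UPDATE SET price = s.price
    WHEN NOT MATCHED THEN
        INSERT (product_id, price)
        VALUES (s.product_id, s.price);
END;
```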
Support varies by database:
PostgreSQL: MERGE supported since version 15 (released 2022). PostgreSQL 17 added the RETURNING clause and the WHEN NOT MATCHED BY SOURCE/BY TARGET syntax. Before version 15, use INSERT ... ON CONFLICT for upsert operations.
SQL Server: Supported since SQL Server 2008, including the OUTPUT clause and BY SOURCE/BY TARGET syntax. Remember that the statement must be terminated with a semicolon.
MySQL: No MERGE support; use INSERT ... ON DUPLICATE KEY UPDATE instead (similar functionality, different syntax).
SQLite: No MERGE support; use INSERT OR REPLACE, upsert via ON CONFLICT, or application logic with separate statements.
PostgreSQL alternative: If you're on PostgreSQL < 15, INSERT ... ON CONFLICT covers most upsert use cases:
```sql
-- PostgreSQL alternative to MERGE (upsert only)
INSERT INTO products (product_id, product_name, price)
SELECT product_id, product_name, price
FROM staging_updates
ON CONFLICT (product_id) DO UPDATE
SET product_name = EXCLUDED.product_name,
    price = EXCLUDED.price;
```
In data warehousing, MERGE is essential for handling Slowly Changing Dimensions (SCD). You're tracking historical changes while keeping current records accessible:
```sql
-- SCD Type 2: Track history with effective dates
MERGE INTO dim_customer AS target
USING staging_customer AS source
    ON target.customer_id = source.customer_id
   AND target.current_flag = 'Y'  -- Only match current record
WHEN MATCHED AND target.email <> source.email THEN
    -- Close the current record; a new version is inserted below
    UPDATE SET current_flag = 'N',
               end_date = CURRENT_DATE
WHEN NOT MATCHED THEN
    INSERT (customer_id, email, start_date, current_flag)
    VALUES (source.customer_id, source.email, CURRENT_DATE, 'Y');

-- Then insert the new current version for customers whose
-- old record was just closed
INSERT INTO dim_customer (customer_id, email, start_date, current_flag)
SELECT customer_id, email, CURRENT_DATE, 'Y'
FROM staging_customer s
WHERE EXISTS (
    SELECT 1 FROM dim_customer d
    WHERE d.customer_id = s.customer_id
      AND d.current_flag = 'N'
      AND d.end_date = CURRENT_DATE
);
```
This is a simplified example—real SCD implementations can get complex, but MERGE makes the logic manageable.
Use MERGE when you need to synchronize data between tables with conditional logic and atomicity guarantees. It's not always the simplest solution, but for ETL pipelines, data warehousing (especially slowly changing dimensions), and any scenario where you're syncing data between systems, it's usually the right choice.
The pattern I see working best: start with MERGE for any multi-operation sync, then simplify only if profiling shows it's a bottleneck. Watch out for deadlocks in high-concurrency scenarios and always scope your WHEN NOT MATCHED BY SOURCE clauses carefully. Most of the time, the atomicity and clarity are worth it.