Yet another SQL Server enthusiast

2014-07-08

sp_DeadlockAnalyzer v01

Filed under: concurrency,deadlock,sp_DeadlockAnalyzer,sql server — ---- @ 7:09 AM

 

sp_DeadlockAnalyzer is a stored procedure created with one goal: to help DBAs to debug deadlocks in SQL Server. It has one parameter: @xdl which is the XML content from [deadlock graph event] intercepted with SQL Trace (SQL Profiler). It shows information about isolation level, blocked resources, connections and T-SQL statements with their execution plans. The stored procedure should be executed in the context of the original database (ex. SimpleDeadlock in my example).

Future plans:

  1. Information about execution plan operators (and corresponding T-SQL statements) which cause every lock (see section [blocked resources]).
  2. To correlate information about statement’s type (INSERT, UPDATE, DELETE, MERGE, SELECT), table hints (ex. XLOCK), isolation levels, foreign keys.
  3. To show information about how can be deadlock solved.
  4. To test using deadlock graph events generated by SQL Trace/SQL Profiler != SQL Server 2012 (I’ve used only this version for tests) and by Extended Events sessions (including here system_health; >= SQL Server 2008).

Output:

sp_DeadlockAnalyzer.Example

Source code:

Code Snippet
  1. USE master;
  2. GO
  3.  
  4. SET ANSI_NULLS ON;
  5. GO
  6. SET QUOTED_IDENTIFIER ON;
  7. GO
  8.  
  9. IF NOT EXISTS (
  10.     SELECT    *
  11.     FROM    sys.procedures p
  12.     JOIN    sys.schemas s ON p.schema_id = s.schema_id
  13.     WHERE    s.name = ‘dbo’
  14.     AND        p.name = ‘sp_DeadlockAnalyzer’
  15. )
  16. BEGIN
  17.     EXEC (‘CREATE PROCEDURE dbo.sp_DeadlockAnalyzer AS SELECT 1′);
  18. END
  19. GO
  20.  
  21. ALTER PROCEDURE dbo.sp_DeadlockAnalyzer
  22. (
  23.     @xdl XML – Deadlock graph event
  24. )
  25. AS
  26. BEGIN
  27.     DECLARE @Resources TABLE (
  28.         Id                            INT IDENTITY PRIMARY KEY,
  29.         Resource_LockGranularity    SYSNAME NOT NULL,
  30.         Resource_DatabaseId            INT NOT NULL,
  31.         Resource_HoBtID                BIGINT NOT NULL,
  32.         Resource_ObjectName            SYSNAME NULL,
  33.         Resource_IndexName            SYSNAME NULL,
  34.     
  35.         ProcessOwner_Id                SYSNAME NOT NULL,
  36.         ProcessOwner_LockMode        SYSNAME NOT NULL,
  37.  
  38.         ProcessWaiter_Id            SYSNAME NOT NULL,
  39.         ProcessWaiter_LockMode        SYSNAME NOT NULL
  40.     );
  41.  
  42.     INSERT    @Resources (
  43.         Resource_LockGranularity    ,
  44.         Resource_DatabaseId            ,
  45.         Resource_HoBtID                ,
  46.         Resource_ObjectName            ,
  47.         Resource_IndexName            ,
  48.     
  49.         ProcessOwner_Id                ,
  50.         ProcessOwner_LockMode        ,
  51.  
  52.         ProcessWaiter_Id            ,
  53.         ProcessWaiter_LockMode
  54.     )
  55.     SELECT    Resource_LockGranularity= x.XmlCol.value(‘local-name(.)’,‘SYSNAME’),
  56.             Resource_Database        = x.XmlCol.value(‘@dbid’,‘INT’),
  57.             Resource_HoBtID            = x.XmlCol.value(‘@hobtid’,‘BIGINT’),
  58.             Resource_ObjectName        = QUOTENAME(PARSENAME(x.XmlCol.value(‘@objectname’,‘SYSNAME’), 3)) + ‘.’ + QUOTENAME(PARSENAME(x.XmlCol.value(‘@objectname’,‘SYSNAME’), 2)) + ‘.’ + QUOTENAME(PARSENAME(x.XmlCol.value(‘@objectname’,‘SYSNAME’), 1)),
  59.             Resource_IndexName        = QUOTENAME(x.XmlCol.value(‘@indexname’,‘SYSNAME’)),
  60.  
  61.             ProcessOwner_Id            = own.XmlCol.value(‘@id’, ‘SYSNAME’),
  62.             ProcessOwner_LockMode    = own.XmlCol.value(‘@mode’, ‘SYSNAME’),
  63.  
  64.             ProcessWaiter_Id        = wtr.XmlCol.value(‘@id’, ‘SYSNAME’),
  65.             ProcessWaiter_Mode        = wtr.XmlCol.value(‘@mode’, ‘SYSNAME’)
  66.     FROM    @xdl.nodes(‘deadlock-list/deadlock/resource-list/*’) x(XmlCol)
  67.     OUTER APPLY x.XmlCol.nodes(‘owner-list/owner’) own(XmlCol)
  68.     OUTER APPLY x.XmlCol.nodes(‘waiter-list/waiter’) wtr(XmlCol);
  69.  
  70.     – Information about databases
  71.     SELECT    [Object]        = db.name + ‘ (‘ + CONVERT(NVARCHAR(11), db.database_id) + ‘)’,
  72.             [Object_Type]    = ‘database’,
  73.             IsolationLevel    = ‘*snapshot_isolation_state ‘ + db.snapshot_isolation_state_desc
  74.     FROM    sys.databases db
  75.     WHERE    db.database_id IN (SELECT r.Resource_DatabaseId FROM @Resources r)
  76.     UNION ALL
  77.     SELECT    [Object]        = x.XmlCol.value(‘(@id)[1]‘, ‘SYSNAME’),
  78.             [Object_Type]    = ‘process’,
  79.             IsolationLevel    = x.XmlCol.value(‘(@isolationlevel)[1]‘, ‘SYSNAME’)
  80.     FROM    @xdl.nodes(‘deadlock-list/deadlock/process-list/process’) x(XmlCol)
  81.  
  82.  
  83.     – Information about resources
  84.     SELECT    *
  85.     FROM    @Resources
  86.  
  87.     DECLARE @Processes TABLE (Id SYSNAME PRIMARY KEY);
  88.     INSERT    @Processes (Id)
  89.     SELECT    DISTINCT p.Id
  90.     FROM    @Resources r
  91.     CROSS APPLY (
  92.         SELECT    r.ProcessOwner_Id
  93.         UNION ALL
  94.         SELECT r.ProcessWaiter_Id
  95.     ) p (Id);
  96.  
  97.     – Information about serve processes / conections
  98.     SELECT    Process_Id        = x.XmlCol.value(‘(@id)[1]‘, ‘SYSNAME’),
  99.             SPID            = x.XmlCol.value(‘(@spid)[1]‘, ‘INT’),
  100.             IsDeadlockVictim = x.XmlCol.exist(‘(.)[(@id)[1] = (../../@victim)[1]]‘),
  101.  
  102.             LogUsed            = x.XmlCol.value(‘(@logused)[1]‘, ‘INT’),
  103.             TransactionName    = x.XmlCol.value(‘(@transactionname)[1]‘, ‘SYSNAME’),
  104.             IsolationLevel    = x.XmlCol.value(‘(@isolationlevel)[1]‘, ‘SYSNAME’),
  105.             DeadlockPriority= x.XmlCol.value(‘(@priority)[1]‘, ‘SMALLINT’),
  106.         
  107.             CurrentDatabase    =  DB_NAME(x.XmlCol.value(‘(@currentdb)[1]‘, ‘SMALLINT’)) + N’ (‘ + x.XmlCol.value(‘(@currentdb)[1]‘, ‘NVARCHAR(5)’) + N’)’,
  108.         
  109.             ClientApp        = x.XmlCol.value(‘(@clientapp)[1]‘, ‘SYSNAME’),
  110.             HostName        = x.XmlCol.value(‘(@hostname)[1]‘, ‘SYSNAME’),
  111.             LoginName        = x.XmlCol.value(‘(@loginname)[1]‘, ‘SYSNAME’),
  112.             InputBuffer        = x.XmlCol.query(‘inputbuf’)
  113.     FROM    @xdl.nodes(‘deadlock-list/deadlock/process-list/process’) x(XmlCol)
  114.  
  115.     – Statements & execution plan
  116.     DECLARE @Statements TABLE (
  117.         Process_Id            SYSNAME NOT NULL,
  118.         SPID                INT NOT NULL,
  119.         [Statement]            SYSNAME NOT NULL,   
  120.         [SqlHandle]            VARBINARY(64)   
  121.     );
  122.     INSERT    @Statements
  123.     SELECT    Process_Id        = x.XmlCol.value(‘(@id)[1]‘, ‘SYSNAME’),
  124.             SPID            = x.XmlCol.value(‘(@spid)[1]‘, ‘INT’),
  125.             [Statement]        = y.XmlCol.value(‘(@procname)[1]‘, ‘SYSNAME’),
  126.             [SqlHandle]        = CONVERT(VARBINARY(64), y.XmlCol.value(‘(@sqlhandle)[1]‘, ‘VARCHAR(128)’), 1)
  127.     FROM    @xdl.nodes(‘deadlock-list/deadlock/process-list/process’) x(XmlCol)
  128.     CROSS APPLY x.XmlCol.nodes(‘executionStack/frame’) y(XmlCol)
  129.  
  130.     SELECT    stm.Process_Id, stm.SPID,
  131.             Batch            = stm.[Statement],
  132.             StatementText    = SUBSTRING(src.[text], ISNULL(NULLIF(sts.statement_start_offset, -1) / 2, 1),  ISNULL((NULLIF(sts.statement_end_offset, -1) - NULLIF(sts.statement_start_offset, -1)) / 2, 200)),
  133.             DatabasebId        = pln.dbid,
  134.             DatabaseName    = plnattr.query_db_name,
  135.             ObjectId        = pln.objectid,
  136.             ObjectName        = plnattr.query_object_name,
  137.             QueryPlan        = pln.query_plan,
  138.             SqlHandle        = stm.SqlHandle,
  139.             QueryPlanHandle    = sts.plan_handle
  140.     FROM    @Statements stm
  141.     LEFT JOIN sys.dm_exec_query_stats sts ON stm.SqlHandle = sts.sql_handle
  142.     OUTER APPLY sys.dm_exec_query_plan(sts.plan_handle) pln
  143.     OUTER APPLY (
  144.         SELECT    DB_NAME(pvt.dbid) AS query_db_name, OBJECT_NAME(pvt.objectid, pvt.dbid) query_object_name
  145.         FROM (
  146.             SELECT    fplanatt.attribute, CONVERT(INT, fplanatt.value) AS id
  147.             FROM    sys.dm_exec_plan_attributes(sts.plan_handle) fplanatt    – ref: http://msdn.microsoft.com/en-us/library/ms189472.aspx
  148.             WHERE    fplanatt.attribute IN (‘dbid’, ‘objectid’)
  149.         ) srcpvt
  150.         PIVOT( MAX(srcpvt.id) FOR srcpvt.attribute IN ([dbid], [objectid]) ) pvt
  151.     ) plnattr
  152.     OUTER APPLY sys.dm_exec_sql_text(stm.SqlHandle) src
  153. END;
  154. GO

History:

2014-06-10 Small revision of source code (SUBSTRING: ISNULL, NULLIF)

2014-06-12

SQL Server: deadlocks

Filed under: concurrency,conference,sql server — ---- @ 7:01 AM

I gave a presentation at .Net Bucharest software developers meeting about deadlocks (resources, locks, blocking, deadlocks, how to simulate a simple deadlock, deadlocks detection, *.xdl – XML deadlock graph, how to minimize deadlocks).

Files: Net_User_Group_Bucharest_2014-06-10.zip

2014-03-16

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #3

Filed under: concurrency,EXISTS,safe,UPSERT — ---- @ 1:25 PM

I presented here and here the reasons why one of the standard approach used for UPSERT

Code Snippet
IF EXISTS(SELECT * FROM dbo.MyTable WHERE ID = @param1)
UPDATE    dbo.MyTable
SET        Col1 = @param2
WHERE    ID = @param1
ELSE
INSERT    dbo.MyTable(ID, Col1)
VALUES    (@param1, @param2)

isn’t safe from concurrency point of view if we use default setting for transactions isolation level: read committed.

This means that

  1. We might get duplicate values (see #2) if we don’t use an unique index or
  2. We might get “Cannot insert duplicate key row…” errors if we use an unique index to prevent duplicate values. In this last blog on this subject and I want to show you how simple is to get above error.

The only difference between this test and previous test is the next index:

Code Snippet
  1. CREATE UNIQUE NONCLUSTERED INDEX [IUN_Customer_Email]
  2. ON [dbo].[Customer] ([Email] ASC)

The ostress.exe tool is executed with the same parameters:

sqlserver.ostress.upsert.test

The parameter -o”d:\SqlOstressOutput” represents the output folder. This folder is used to write the results of every T-SQL script and connection plus ostress.log file which collects overall results.

This time, this unique index prevents insertion of duplicate emails:

Code Snippet
  1. CustomerID  Email Note OrdersCount
  2. ———– —– —- ———–
  3. 1           [email protected] NULL 1
  4. 2           [email protected] NULL 1001
  5. 4           [email protected] NULL 995
  6. 9           [email protected] NULL 994

and, also, D:\SqlOstressOutput\ostress.log contains some errors generated by T-SQL scripts because at some moments they try to insert duplicate emails (see lines 34 – 46) but IUN_Customer_Email index prevents duplicate emails :

Code Snippet
  1. 03/13/14 15:10:37.875 [0x00002878] OSTRESS, A Generic ODBC-Based Stress/Replay Utility.
  2. Version 9.04.0004 built for x64.
  3.   Copyright1997-2013 Microsoft. All Rights Reserved
  4. 03/13/14 15:10:37.876 [0x00002878]             Computer: SGBD
  5. 03/13/14 15:10:37.877 [0x00002878]          Base Module: C:\\cf2 Program Files\\cf2 Microsoft Corporation\\cf2 RMLUtils\\cf2 ostress.exe
  6. 03/13/14 15:10:37.877 [0x00002878]           Process Id: 6848
  7. 03/13/14 15:10:37.877 [0x00002878]  Active proc mask(0): 0x0000000F
  8. 03/13/14 15:10:37.877 [0x00002878]         Architecture: 9
  9. 03/13/14 15:10:37.878 [0x00002878]            Page size: 4096
  10. 03/13/14 15:10:37.878 [0x00002878]                 CPUs: 4
  11. 03/13/14 15:10:37.878 [0x00002878]     Processor groups: 1
  12. 03/13/14 15:10:37.879 [0x00002878]         Highest node: 0
  13. 03/13/14 15:10:37.879 [0x00002878] —————————————
  14. 03/13/14 15:10:37.879 [0x00002878]                Group: 0
  15. 03/13/14 15:10:37.880 [0x00002878] —————————————
  16. 03/13/14 15:10:37.880 [0x00002878]         Processor(s): 0x00000003 Function units: Shared
  17. 03/13/14 15:10:37.881 [0x00002878]         Package mask: 0x0000000F
  18. 03/13/14 15:10:37.882 [0x00002878]         Processor(s): 0x0000000C Function units: Shared
  19. 03/13/14 15:10:37.884 [0x00002878]         Processor(s): 0x0000000F assigned to Numa node: 0
  20. 03/13/14 15:10:37.887 [0x00002878] Current time bias: -120 minutes -2.00 hours DST Standard
  21. 03/13/14 15:10:37.888 [0x00002878] Max threads setting: 10000
  22. 03/13/14 15:10:37.889 [0x00002878] Arguments:
  23. 03/13/14 15:10:37.891 [0x00002878] -S(local)\\cf2 SQL2012
  24. 03/13/14 15:10:37.892 [0x00002878] -E
  25. 03/13/14 15:10:37.893 [0x00002878] -dTestUPSERT
  26. 03/13/14 15:10:37.894 [0x00002878] -od:\\cf2 SqlOstressOutput
  27. 03/13/14 15:10:37.896 [0x00002878] -id:\\cf2 SqlOstressInput\\cf2 SQLQuery*.sql
  28. 03/13/14 15:10:37.897 [0x00002878] -n10
  29. 03/13/14 15:10:37.898 [0x00002878] -r100
  30. 03/13/14 15:10:37.900 [0x00002878] -q
  31. 03/13/14 15:10:37.901 [0x00002878] Using language id (LCID): 1024 [English_United States.1252] for character formatting with NLS: 0x00060101 and Defined: 0x00060101
  32. 03/13/14 15:10:37.903 [0x00002878] Default driver: SQL Server Native Client 11.0
  33. 03/13/14 15:10:39.857 [0x0000327C] [spid 69] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  34. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (d@d.d).
  35.  
  36. 03/13/14 15:10:39.857 [0x00001988] [spid 71] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  37. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (c@c.c).
  38.  
  39. 03/13/14 15:10:39.858 [0x00000A14] [spid 54] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  40. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (d@d.d).
  41.  
  42. 03/13/14 15:10:39.858 [0x00003888] [spid 79] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  43. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (d@d.d).
  44.  
  45. 03/13/14 15:10:39.874 [0x0000038C] [spid 62] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  46. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (d@d.d).
  47.  
  48. more errors

Solutions: to prevent these concurrency problems here I’ve presented few solutions. Please bear in mind that you should rigorous test selected solution. Next blog will present pros and cons for every solution.

See also:

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ?

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #2

2014-03-04

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #2

Filed under: concurrency,EXISTS,safe,UPSERT — ---- @ 8:45 AM

One of previous articles showed how easily we might get duplicate rows if we use following approach for UPSERT:

Code Snippet
IF EXISTS(SELECT  FROM MyTable WHERE ID = @param1)
    UPDATE    MyTable
    SET     Col1 = NewValue
    WHERE   ID = @param1
ELSE
    INSERT  MyTable(ID, Col1)
    VALUES  (@param1, @param2)    

This time I decided to do other tests but without WAITFOR. I’ve used following table (the same table) without any non-clustered indexes (like IUN_Customer_Email) and with just two rows:

Code Snippet
CREATE DATABASE TestUPSERT
GO

DBCC FREEPROCCACHE;
GO

USE TestUPSERT
GO

IF OBJECT_ID(N'dbo.Customer') IS NOT NULL
    DROP TABLE dbo.Customer;
CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    Email NVARCHAR(100) NOT NULL,
    Note NVARCHAR(50) NULL,    
    OrdersCount INT NOT NULL DEFAULT (1)
);
GO

INSERT dbo.Customer (Email) VALUES (N'[email protected]');
INSERT dbo.Customer (Email) VALUES (N'[email protected]');
GO

For my test I’ve used three scripts which try to insert or updated multiple times three emails: [email protected] (SQLQuery1.sql), [email protected] (SQLQuery2.sql) and [email protected] (SQLQuery3.sql). These files are available in D:\SqlOstressInput folder.

Content of SQLQuery{1|2|3}.sql script:

Code Snippet
– Parameters
DECLARE @Email NVARCHAR(100);
SET @Email = '[email protected]'; – or '[email protected]' or '[email protected]'
– End of Parameters
BEGIN TRANSACTION;
    IF EXISTS(SELECT * FROM dbo.Customer WHERE Email = @Email)
    BEGIN
        UPDATE  dbo.Customer
        SET        OrdersCount = OrdersCount + 1
        WHERE   Email = @Email;
    END
    ELSE
    BEGIN
        INSERT    dbo.Customer(Email)
        VALUES    (@Email) – OrdersCount = Default 1
    END    
COMMIT;

To run these scripts multiple times (100 times) using 10 concurrent connections I’ve used ostress.exe which is included into RML utilities for SQL Server. Before test I’ve used DBCC DROPCLEANBUFFERS and for ostress.exe I’ve used following arguments:

Code Snippet
[RML] C:\Program Files\Microsoft Corporation\RMLUtils>
ostress -S(local)\SQL2012 -E -dTestUPSERT -o"d:\SqlOstressOutput" -i"d:\SqlOstressInput\SQLQuery*.sql" -n10 -r100 -q

sqlserver.ostress.upsert.test

After few tries I succeeded to get duplicate rows (sample output, you might get different results) and for some emails ([email protected] and [email protected]) the OrderCount is wrong:

Code Snippet
CustomerID  Email Note OrdersCount
———– —– —- ———–
1           [email protected] NULL 1
2           [email protected] NULL 1001
3           [email protected] NULL 999
4           [email protected] NULL 999
5           [email protected] NULL 996
6           [email protected] NULL 996
7           [email protected] NULL 996
8           [email protected] NULL 996
9           [email protected] NULL 996

Conclusion: This approach for UPSERT (without any unique indexes and table hints or without serializable transaction isolation level) isn’t safe.

Note: for this test I used

Code Snippet
Microsoft SQL Server 2012 (SP1) – 11.0.3000.0 (X64)
    Oct 19 2012 13:38:57
    Copyright (c) Microsoft Corporation
    Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

Transaction isolation level was READ COMMITTED (default).

See also:

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ?

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #3

2014-01-22

UPSERT – safe solutions for SQL Server [DRAFT]

Filed under: concurrency,EXISTS,MERGE,sql server,UPSERT — ---- @ 9:39 PM

This article tackles how to implement safe UPSERTs from concurrency point of view. A previous blog has presented a common solution for UPSERT and it has demonstrated that this solution isn’t safe. Current blog specifies common solutions for UPSERT operations with a single row.

To discuss these solutions I will use the same table dbo.Customer:

Code Snippet
IF OBJECT_ID(N'dbo.Customer') IS NOT NULL
    DROP TABLE dbo.Customer;
CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    Email NVARCHAR(100) NOT NULL,
    Note NVARCHAR(50) NULL,    
    OrdersCount INT NOT NULL DEFAULT (1)
);
GO

If we want to insert a new customer then the identifier used for customers is the email address. If there is no email address stored within dbo.Customer then we need to insert a new customer with 1 order. Otherwise (the customer’s email already exists) we need to increment by 1 the OrdersCount column.

From this point of view we need also an unique index on Email column:

Code Snippet
CREATE UNIQUE INDEX IUN_Customer_Email
ON dbo.Customer(Email);

This index has two important roles:

  1. It prevents duplicated emails and
  2. It helps those queries which tries to find customers using the email address as identifier (WHERE Email = constant / @variable).

Solutions for a single row

Code Snippet
– Solution #1: IF EXISTS
– Parameters
DECLARE @Email NVARCHAR(100)
SET @Email = '[email protected]';
– End of Parameters
BEGIN TRANSACTION;
    IF EXISTS(SELECT * FROM dbo.Customer WITH(XLOCK, HOLDLOCK) WHERE Email = @Email)
    BEGIN
        UPDATE  dbo.Customer
        SET        OrdersCount = OrdersCount + 1
        WHERE   Email = @Email;
    END
    ELSE
    BEGIN
        INSERT    dbo.Customer(Email)
        VALUES    (@Email) – OrdersCount = Default 1
    END    
COMMIT; – Use ROLLBACK for testing
GO

– Solution #2: MERGE
– Parameters
DECLARE @Email NVARCHAR(100)
SET @Email = '[email protected]';
– End of Parameters
BEGIN TRANSACTION;
    MERGE    dbo.Customer WITH(HOLDLOCK) trg
    USING    (VALUES(@Email)) src(Email) ON trg.Email = src.Email
    WHEN MATCHED
        THEN    
        UPDATE
        SET        OrdersCount = trg.OrdersCount + 1
    WHEN NOT MATCHED BY TARGET
        THEN    
        INSERT (Email)
        VALUES    (@Email);
COMMIT;
GO

– Solution #3: UPDATE + INSERT
– Parameters
DECLARE @Email NVARCHAR(100)
– End of Parameters
SET @Email = '[email protected]';
BEGIN TRANSACTION;
    UPDATE    trg
    SET        OrdersCount = OrdersCount + 1
    FROM    dbo.Customer trg WITH(HOLDLOCK)
    WHERE    trg.Email = @Email;

    IF @@ROWCOUNT = 0
        INSERT    dbo.Customer(Email)
        VALUES    (@Email);        
COMMIT;
GO

Solutions for multiple rows

Code Snippet
– Solution #1: EXISTS
– Parameters
DECLARE @CustomerEmail TABLE( Email NVARCHAR(100) NOT NULL PRIMARY KEY );
INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
– End of Parameters
DECLARE @CustomerEmailForUPDATE TABLE(
    Email NVARCHAR(100) NOT NULL PRIMARY KEY
);

BEGIN TRANSACTION;
    INSERT    @CustomerEmailForUPDATE (Email)
    SELECT    ce.Email
    FROM    @CustomerEmail ce
    WHERE    EXISTS(SELECT * FROM dbo.Customer c WITH(XLOCK, HOLDLOCK) WHERE c.Email = ce.Email);

    UPDATE  dbo.Customer
    SET        OrdersCount = OrdersCount + 1
    FROM    dbo.Customer c
    INNER JOIN @CustomerEmailForUPDATE cefu ON c.Email = cefu.Email
    
    INSERT  dbo.Customer(Email)
    SELECT    ce.Email – OrdersCount = Default 1
    FROM    @CustomerEmail ce
    WHERE    NOT EXISTS(SELECT * FROM @CustomerEmailForUPDATE cefu WHERE cefu.Email = ce.Email)
COMMIT; – Use ROLLBACK for testing
GO

– Solution #2: MERGE
– Parameters
DECLARE @CustomerEmail TABLE( Email NVARCHAR(100) NOT NULL PRIMARY KEY );
INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
– End of Parameters
BEGIN TRANSACTION;
    MERGE    dbo.Customer WITH(HOLDLOCK) trg
    USING    @CustomerEmail src ON trg.Email = src.Email
    WHEN MATCHED
        THEN    
        UPDATE
        SET        OrdersCount = trg.OrdersCount + 1
    WHEN NOT MATCHED BY TARGET
        THEN    
        INSERT (Email)
        VALUES    (src.Email);
COMMIT; – Use ROLLBACK for testing
GO

– Solution #3: UPDATE + INSERT
– Parameters
DECLARE @CustomerEmail TABLE( Email NVARCHAR(100) NOT NULL PRIMARY KEY );
INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
– End of Parameters

DECLARE @CustomerEmailUPDATED TABLE(
    Email NVARCHAR(100) NOT NULL PRIMARY KEY
);
BEGIN TRANSACTION;
    UPDATE    trg
    SET        OrdersCount = OrdersCount + 1
    OUTPUT    inserted.Email INTO @CustomerEmailUPDATED
    FROM    dbo.Customer trg WITH(HOLDLOCK)
    INNER JOIN @CustomerEmail src ON trg.Email = src.Email;

    INSERT    dbo.Customer(Email)
    SELECT    ce.Email
    FROM    @CustomerEmail ce
    WHERE    NOT EXISTS(SELECT * FROM @CustomerEmailUPDATED upd WHERE upd.Email = ce.Email);
COMMIT;
GO

The next blog

  • will present solutions for scenarios where we need to insert or update multiple rows (2014-02-18) and
  • will discuss pros and cons of these solutions.

Revision History:

2014-02-08: First update. I have added a new section for multiple rows. Solutions for a single row: some minor changes.

2014-01-05

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ?

Filed under: concurrency,EXISTS,index,MERGE,safe,sql server,UPSERT — ---- @ 10:47 PM

Is it safe the following approach for UPSERT ?

Code Snippet
IF EXISTS (SELECT FROM MyTable WHERE ID = @param1)
    UPDATE    MyTable
    SET        Col1 = NewValue
    WHERE    ID = @param1
ELSE
    INSERT    MyTable(ID, Col1)
    VALUES    (@param1, @param2)

The short answer is no!

Example:

  1. The transaction isolation level is READ COMMITTED which is the default isolation level.
  2. For this demo I’ve used a customer table to store the customer’s email and the number of sales orders (for every customer).
  3. I’ve inserted two customers (1 – [email protected]; 2 – [email protected]).
  4. I’ve created a stored procedure for UPSERT (dbo.InsertOrder). This stored procedure receive the customer’s email, it checks if exists this email and if not then it inserts the new customer (sales orders 1). Otherwise it increments the number of sales orders by 1.

T-SQL script:

Code Snippet
IF OBJECT_ID(N'dbo.Customer') IS NOT NULL
    DROP TABLE dbo.Customer;
IF OBJECT_ID(N'dbo.InsertOrder') IS NOT NULL
    DROP PROCEDURE dbo.InsertOrder;
GO

CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    Email NVARCHAR(100) NOT NULL,
    Note NVARCHAR(50) NULL,    
    OrdersCount INT NOT NULL DEFAULT (1)
);
GO

INSERT    dbo.Customer (Email)
VALUES    (N'[email protected]'), – CustomerID 1
        (N'[email protected]'); – CustomerID 2
GO

CREATE PROCEDURE dbo.InsertOrder (
    @Email NVARCHAR(100)
)
AS
BEGIN
    IF EXISTS(SELECT * FROM dbo.Customer WHERE Email = @Email)
    BEGIN
        UPDATE    dbo.Customer
        SET        OrdersCount = OrdersCount + 1
        WHERE    Email = @Email;
    END
    ELSE
    BEGIN
        WAITFOR DELAY '00:00:03';
        INSERT    dbo.Customer (Email)
        VALUES    (@Email) – OrdersCount = Default 1
    END
END;
GO

Note: WAITFOR DELAY is used to delay the INSERTion of the new customers with just 3 seconds.

Now we can start this test executing following statements in separate windows:

Is.NOT.Safe.If.Exists.Update.Insert.UPSERT

Query window #1 (Step 1):

Code Snippet
BEGIN TRANSACTION
    UPDATE    dbo.Customer
    SET        Note = N'Note#2'
    WHERE    Email = N'[email protected]';
– ROLLBACK — or COMMIT

At this moment the CustomerID = 2 is locked X (exclusive) and no other transaction can read (excepting NOLOCK and READ UNCOMMITTED), update or delete this record. Because this transaction remains open (no COMMIT or ROLLBACK) this record will be locked till the end of transaction.

Query window #2 & #3 (Step 2 & Step 3):

Code Snippet
EXEC dbo.InsertOrder N'[email protected]';
SELECT * FROM dbo.Customer;

At this moment the IF EXISTS(SELECT * FROM dbo.Customer WHERE Email = @Email) statement will try to read the record with CustomerID = 1 and then the record with CustomerID = 2 requesting a S lock (shared). Because X and S locks aren’t compatible (see Lock Compatibility) both (Step 2 & 3) SELECT * FROM dbo.Customer WHERE Email = @Email statements will be blocked. Note: you can see the “Executing ….” message within tab title.

Query window #1 (Step 4):

I execute the ROLLBACK or COMMIT statement which ends the initial transaction. The X lock for CustomerID = 2 record is released and the SELECT statements (SELECT * FROM dbo.Customer WHERE Email = @Email; Step 2 & 3)  continue to be executed. Every EXISTS(SELECT …. WHERE Email = [email protected]) statement will return FALSE and both stored procedures will INSERT the customer `[email protected]`.

This is the reason why I get duplicated emails at the end (Step 5) of my example:

Results.Step5Solutions ?

Simplest solution is to create an unique index/constraint on Email column.

Code Snippet
CREATE UNIQUE INDEX IUN_Customer_Email
ON dbo.Customer(Email);

This UNIQUE index/constraint guarantees that we can’t have duplicated emails. But this doesn’t means this UPSERT approach is 100% safe because dbo.InsertOrder stored procedure still tries to INSERT duplicated emails (at Step 5):

Results.Step5.With.UNIQUE.index

Next blog On Jan 27 I will discuss what solutions do we have for safe UPSERTs.

See also:

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #2

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #3

Revision History:

2014-01-12: I changed the last paragraph.

2014-01-27: I changed the last paragraph.

2014-04-06: I added the [See also] section

The Rubric Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.