Yet another SQL Server enthusiast

2014-03-04

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #2

Filed under: concurrency,EXISTS,safe,UPSERT — ---- @ 8:45 AM

One of previous articles showed how easily we might get duplicate rows if we use following approach for UPSERT:

Code Snippet
IF EXISTS(SELECT  FROM MyTable WHERE ID = @param1)
     UPDATE    MyTable
     SET     Col1 = NewValue
     WHERE   ID = @param1
 ELSE
     INSERT  MyTable(ID, Col1)
     VALUES  (@param1, @param2)    

This time I decided to do other tests but without WAITFOR. I’ve used following table (the same table) without any non-clustered indexes (like IUN_Customer_Email) and with just two rows:

Code Snippet
CREATEDATABASETestUPSERT
GO
 DBCC FREEPROCCACHE;
 GO
 USE TestUPSERT
 GO
 IF OBJECT_ID(N'dbo.Customer') IS NOT NULL
     DROP TABLE dbo.Customer;
 CREATE TABLE dbo.Customer (
     CustomerID INT IDENTITY(1,1) PRIMARY KEY,
     Email NVARCHAR(100) NOT NULL,
     Note NVARCHAR(50) NULL,    
     OrdersCount INT NOT NULL DEFAULT (1)
 );
 GO
 INSERT dbo.Customer (Email) VALUES (N'[email protected]');
 INSERT dbo.Customer (Email) VALUES (N'[email protected]');
 GO

For my test I’ve used three scripts which try to insert or updated multiple times three emails: [email protected] (SQLQuery1.sql), [email protected] (SQLQuery2.sql) and [email protected] (SQLQuery3.sql). These files are available in D:\SqlOstressInput folder.

Content of SQLQuery{1|2|3}.sql script:

Code Snippet
– Parameters
 DECLARE @Email NVARCHAR(100);
 SET @Email = '[email protected]'; – or '[email protected]' or '[email protected]' 
 – End of Parameters
 BEGIN TRANSACTION;
     IF EXISTS(SELECT * FROM dbo.Customer WHERE Email = @Email)
     BEGIN
         UPDATE  dbo.Customer
         SET        OrdersCount = OrdersCount + 1
         WHERE   Email = @Email;
     END
     ELSE
     BEGIN
         INSERT    dbo.Customer(Email)
         VALUES    (@Email) – OrdersCount = Default 1
     END    
 COMMIT;

To run these scripts multiple times (100 times) using 10 concurrent connections I’ve used ostress.exe which is included into RML utilities for SQL Server. Before test I’ve used DBCC DROPCLEANBUFFERS and for ostress.exe I’ve used following arguments:

Code Snippet
[RML] C:\Program Files\Microsoft Corporation\RMLUtils> 
 ostress -S(local)\SQL2012 -E -dTestUPSERT -o"d:\SqlOstressOutput" -i"d:\SqlOstressInput\SQLQuery*.sql" -n10 -r100 -q

After few tries I succeeded to get duplicate rows (sample output, you might get different results) and for some emails ([email protected] and [email protected]) the OrderCount is wrong:

Code Snippet
CustomerID  Email Note OrdersCount
 ———– —– —- ———–
 1           [email protected] NULL 1
 2           [email protected] NULL 1001
 3           [email protected] NULL 999
 4           [email protected] NULL 999
 5           [email protected] NULL 996
 6           [email protected] NULL 996
 7           [email protected] NULL 996
 8           [email protected] NULL 996
 9           [email protected] NULL 996

Conclusion: This approach for UPSERT (without any unique indexes and table hints or without serializable transaction isolation level) isn’t safe.

Note: for this test I used

Code Snippet
Microsoft SQL Server 2012 (SP1) – 11.0.3000.0 (X64) 
     Oct 19 2012 13:38:57 
     Copyright (c) Microsoft Corporation
     Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

Transaction isolation level was READ COMMITTED (default).

2014-02-12

Diviziunea relațională

Filed under: Uncategorized — ---- @ 8:52 AM

1) PPTX + scripturi SQL + funcție SQL CLR download

Funcția SQL CLR DivideBy este prezentată doar cu titlu de exemplu. Nu este optimizată și nu este pe deplin testată.

2) Exemplu de utilizare a funcției DivideBy:

Code Snippet
SELECT    *
FROM    dbo.DivideBy('dbo.Rezultat','dbo.Disciplina','IdStudent','IdDisciplina')dr;
/*
Id
–
1
2
3
*/
 SELECT    s.IdStudent, s.Nume, s.Prenume
 FROM    dbo.DivideBy('dbo.Rezultat', 'dbo.Disciplina', 'IdStudent', 'IdDisciplina') dr
 JOIN    dbo.Student s ON dr.Id = s.IdStudent;
 /*
 IdStudent Nume  Prenume
 ——— —– ———-
 1         A     Alexandru
 2         B     Bogdan
 3         C     Constantin
 */

3) Resurse:

Divided We Stand: The SQL of Relational Division

Proper Relational Division With Sets

Mulțumesc

Comments (1)

2014-02-10

How to avoid SQL injections: one simple solution

Filed under: Uncategorized — ---- @ 8:30 AM

SQL injections are one of the top threats for web applications. This kind of attack is possible if application concatenates and executes strings received from the end-users. This article shows a simple technique to avoid this attacks: [at database level] instead of receiving (and concatenating) strings (for example column’s or table’s name) from the client app it receives only integers. Every column will have an INT identifier associated.

Example:

If the column Name included into Person.PhoneNumberType could have the ID 4

Code Snippet
CREATETABLEdbo.AllowedTable (
IDAllowedTableINTPRIMARYKEY,
TableSchemaSYSNAMENOTNULL,
TableNameSYSNAMENOTNULL,
UNIQUE (TableSchema,TableName),
CaptionNVARCHAR(50)NOTNULL– This is what end-user will see
);
INSERT    INTOdbo.AllowedTable
(IDAllowedTable,TableSchema,TableName,Caption)
VALUES    (1,N'Person',N'AddressType','Type of address'),
(2,N'Person',N'PhoneNumberType','Type of phone number');
 CREATE TABLE dbo.AllowedColumn (
     IDAllowedColumn INT PRIMARY KEY, – This is what client-all will send to SQL Server
     IDAllowedTable INT NOT NULL REFERENCES dbo.AllowedTable(IDAllowedTable),
     ColumnName SYSNAME NOT NULL,
         UNIQUE(IDAllowedTable, ColumnName),
     IsPKColumn BIT NOT NULL,
     Caption NVARCHAR(50) NOT NULL – This is what end-user will see
 );
 – Every table has a simple PK (single column)
 CREATE UNIQUE INDEX IUF_AllowedColumn_IDAllowedTable_WhereIsPKColumnIs1
 ON dbo.AllowedColumn(IDAllowedTable)
 WHERE IsPKColumn = 1;
 GO
 INSERT    INTO dbo.AllowedColumn
         (IDAllowedColumn, IDAllowedTable, ColumnName, IsPKColumn, Caption)
 VALUES    (1, 1, N'AddressTypeID', 1, 'ID'),
         (2, 1, N'Name', 0, 'Name'),
         (3, 2, N'PhoneNumberTypeID', 1, 'ID'),
         (4, 2, N'Name', 0, 'Name'),
         (5, 2, N'ModifiedDate', 0, 'The last modification date');
 GO

then if the client app needs a list of all values from this column it could execute a stored procedure having an INT parameter instead of [N]VARCHAR parameter:

Code Snippet
EXEC dbo.ExecuteDynamicQuery @IDColumn = 4;

Code Snippet
CREATEPROCEDUREdbo.ExecuteDynamicQuery
(
@IDColumnINT– Client app will send only the ID of object (the id of column in this example)
)
AS
BEGIN
DECLARE    @IDTableINT,@SecondColumnNameSYSNAME;

SELECT    @IDTable    =ac.IDAllowedTable,
@SecondColumnName=ac.ColumnName
FROM    dbo.AllowedColumnac
WHERE    ac.IDAllowedColumn=@IDColumn;
     IF @IDTable IS NULL – This will check if the source table exists
         RAISERROR('Error', 16, 1); – Object not found
     ELSE 
     BEGIN
         DECLARE  @PkColumnName SYSNAME;
         SELECT    @PkColumnName = ac.ColumnName
         FROM    dbo.AllowedColumn ac
         WHERE    ac.IDAllowedTable = @IDTable
         AND        ac.IsPKColumn = 1;
         DECLARE @SqlStatement NVARCHAR(MAX);
         SELECT    @SqlStatement =
                 N'SELECT ' + QUOTENAME(@PkColumnName) 
                 + N' , ' + QUOTENAME(@SecondColumnName) 
                 + N' FROM '  + QUOTENAME(at.tableSchema) + N'.' + QUOTENAME(at.TableName)
         FROM    dbo.AllowedTable at
         WHERE    at.IDAllowedTable = @IDTable;
         PRINT @SqlStatement;
         EXEC sp_executesql @SqlStatement;
     END
 END;
 GO

First, this stored procedure will validate the value of parameter (the column’s ID: @IDColumn INT) and then if the requested ID exists will generate and execute a SQL statement.

Usage:

Code Snippet
EXEC dbo.ExecuteDynamicQuery @IDColumn = 100;
 /*
 Msg 50000, Level 16, State 1, Procedure ExecuteDynamicQuery, Line 16
 Error
 */
 EXEC dbo.ExecuteDynamicQuery @IDColumn = NULL;
 /*
 Msg 50000, Level 16, State 1, Procedure ExecuteDynamicQuery, Line 16
 Error
 */
 EXEC dbo.ExecuteDynamicQuery @IDColumn = 2;
 /*
 Executed @SqlStatement:
 SELECT [AddressTypeID] , [Name] FROM [Person].[AddressType]
 */
 EXEC dbo.ExecuteDynamicQuery @IDColumn = 4;
 /*
 Executed @SqlStatement:
 SELECT [PhoneNumberTypeID] , [Name] FROM [Person].[PhoneNumberType]
 */

Note: Microsoft has published a comprehensive article on this subject here .

Leave a Comment

2014-01-22

UPSERT – safe solutions for SQL Server [DRAFT]

Filed under: concurrency,EXISTS,MERGE,sql server,UPSERT — ---- @ 9:39 PM

This article tackles how to implement safe UPSERTs from concurrency point of view. A previous blog has presented a common solution for UPSERT and it has demonstrated that this solution isn’t safe. Current blog specifies common solutions for UPSERT operations with a single row.

To discuss these solutions I will use the same table dbo.Customer:

Code Snippet
IF OBJECT_ID(N'dbo.Customer') IS NOT NULL
     DROP TABLE dbo.Customer;
 CREATE TABLE dbo.Customer (
     CustomerID INT IDENTITY(1,1) PRIMARY KEY,
     Email NVARCHAR(100) NOT NULL,
     Note NVARCHAR(50) NULL,    
     OrdersCount INT NOT NULL DEFAULT (1)
 );
 GO

If we want to insert a new customer then the identifier used for customers is the email address. If there is no email address stored within dbo.Customer then we need to insert a new customer with 1 order. Otherwise (the customer’s email already exists) we need to increment by 1 the OrdersCount column.

From this point of view we need also an unique index on Email column:

Code Snippet
CREATE UNIQUE INDEX IUN_Customer_Email
 ON dbo.Customer(Email);

This index has two important roles:

It prevents duplicated emails and
It helps those queries which tries to find customers using the email address as identifier (WHERE Email = constant / @variable).

Solutions for a single row

Code Snippet
– Solution #1: IF EXISTS
– Parameters
DECLARE@EmailNVARCHAR(100)
SET@Email='[email protected]';
– End of Parameters
BEGINTRANSACTION;
IFEXISTS(SELECT*FROMdbo.CustomerWITH(XLOCK,HOLDLOCK)WHEREEmail=@Email)
BEGIN
UPDATEdbo.Customer
SET        OrdersCount=OrdersCount+ 1
WHEREEmail=@Email;
END
ELSE
BEGIN
INSERTdbo.Customer(Email)
VALUES    (@Email)– OrdersCount = Default 1
END    
COMMIT;– Use ROLLBACK for testing
GO
 – Solution #2: MERGE
 – Parameters
 DECLARE @Email NVARCHAR(100)
 SET @Email = '[email protected]';
 – End of Parameters
 BEGIN TRANSACTION;
     MERGE    dbo.Customer WITH(HOLDLOCK) trg 
     USING    (VALUES(@Email)) src(Email) ON trg.Email = src.Email
     WHEN MATCHED 
         THEN    
         UPDATE 
         SET        OrdersCount = trg.OrdersCount + 1
     WHEN NOT MATCHED BY TARGET 
         THEN    
         INSERT (Email)
         VALUES    (@Email);
 COMMIT;
 GO
 – Solution #3: UPDATE + INSERT
 – Parameters
 DECLARE @Email NVARCHAR(100)
 – End of Parameters
 SET @Email = '[email protected]';
 BEGIN TRANSACTION;
     UPDATE    trg
     SET        OrdersCount = OrdersCount + 1
     FROM    dbo.Customer trg WITH(HOLDLOCK) 
     WHERE    trg.Email = @Email;
     IF @@ROWCOUNT = 0
         INSERT    dbo.Customer(Email)
         VALUES    (@Email);        
 COMMIT;
 GO

Solutions for multiple rows

Code Snippet
– Solution #1: EXISTS
– Parameters
DECLARE@CustomerEmailTABLE(EmailNVARCHAR(100)NOTNULLPRIMARYKEY);
INSERT    @CustomerEmail (Email)VALUES (N'[email protected]');
INSERT    @CustomerEmail (Email)VALUES (N'[email protected]');
INSERT    @CustomerEmail (Email)VALUES (N'[email protected]');
– End of Parameters
DECLARE@CustomerEmailForUPDATETABLE(
EmailNVARCHAR(100)NOTNULLPRIMARYKEY
);
 BEGIN TRANSACTION;
     INSERT    @CustomerEmailForUPDATE (Email)
     SELECT    ce.Email
     FROM    @CustomerEmail ce
     WHERE    EXISTS(SELECT * FROM dbo.Customer c WITH(XLOCK, HOLDLOCK) WHERE c.Email = ce.Email);
     UPDATE  dbo.Customer
     SET        OrdersCount = OrdersCount + 1
     FROM    dbo.Customer c 
     INNER JOIN @CustomerEmailForUPDATE cefu ON c.Email = cefu.Email
     
     INSERT  dbo.Customer(Email)
     SELECT    ce.Email – OrdersCount = Default 1
     FROM    @CustomerEmail ce
     WHERE    NOT EXISTS(SELECT * FROM @CustomerEmailForUPDATE cefu WHERE cefu.Email = ce.Email)
 COMMIT; – Use ROLLBACK for testing
 GO
 – Solution #2: MERGE
 – Parameters
 DECLARE @CustomerEmail TABLE( Email NVARCHAR(100) NOT NULL PRIMARY KEY );
 INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
 INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
 INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
 – End of Parameters
 BEGIN TRANSACTION;
     MERGE    dbo.Customer WITH(HOLDLOCK) trg 
     USING    @CustomerEmail src ON trg.Email = src.Email
     WHEN MATCHED 
         THEN    
         UPDATE 
         SET        OrdersCount = trg.OrdersCount + 1
     WHEN NOT MATCHED BY TARGET 
         THEN    
         INSERT (Email)
         VALUES    (src.Email);
 COMMIT; – Use ROLLBACK for testing
 GO
 – Solution #3: UPDATE + INSERT
 – Parameters
 DECLARE @CustomerEmail TABLE( Email NVARCHAR(100) NOT NULL PRIMARY KEY );
 INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
 INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
 INSERT    @CustomerEmail (Email) VALUES (N'[email protected]');
 – End of Parameters
 DECLARE @CustomerEmailUPDATED TABLE( 
     Email NVARCHAR(100) NOT NULL PRIMARY KEY
 );
 BEGIN TRANSACTION;
     UPDATE    trg
     SET        OrdersCount = OrdersCount + 1
     OUTPUT    inserted.Email INTO @CustomerEmailUPDATED
     FROM    dbo.Customer trg WITH(HOLDLOCK) 
     INNER JOIN @CustomerEmail src ON trg.Email = src.Email;
     INSERT    dbo.Customer(Email)
     SELECT    ce.Email
     FROM    @CustomerEmail ce
     WHERE    NOT EXISTS(SELECT * FROM @CustomerEmailUPDATED upd WHERE upd.Email = ce.Email);
 COMMIT;
 GO

The next blog

will present solutions for scenarios where we need to insert or update multiple rows (2014-02-18) and
will discuss pros and cons of these solutions.

Revision History:

2014-02-08: First update. I have added a new section for multiple rows. Solutions for a single row: some minor changes.

Leave a Comment

2014-01-13

Conditional joins

Filed under: conditional,execution plan,join,optimization,predicate,sql server — ---- @ 12:36 AM

Let’s start with following example (based on Supertypes and Subtypes modeling approach) which creates three tables:

dbo.BankAccountOwner (BankAccountOwnerID-PK, OwnerType, CreateDate)
dbo.Person (BankAccountOwnerID – PK, FK, FirstName, LastName)
dbo.Company (BankAccountOwnerID – PK, FK, CompanyName)

T-SQL Script:

Code Snippet
CREATETABLEdbo.BankAccountOwner (
BankAccountOwnerIDINTPRIMARYKEY,
OwnerTypeCHAR(1)NOTNULLCHECK(OwnerTypeIN ('P','C')),– P=Person, C=Company
CreateDateDATETIMENOTNULLDEFAULT (GETDATE())
);
CREATETABLEdbo.Person (
BankAccountOwnerIDINTPRIMARYKEY
REFERENCESdbo.BankAccountOwner(BankAccountOwnerID),
FirstNameNVARCHAR(50)NOTNULL,
LastNameNVARCHAR(50)NOTNULL
);
CREATETABLEdbo.Company (
BankAccountOwnerIDINTPRIMARYKEY
REFERENCESdbo.BankAccountOwner(BankAccountOwnerID),
CompanyNameNVARCHAR(100)NOTNULL
);
 INSERT    dbo.BankAccountOwner (BankAccountOwnerID, OwnerType) 
 VALUES    (1, 'P'), (2, 'P'), (3, 'C'), (4, 'C'), (5, 'C');
 INSERT    dbo.Person (BankAccountOwnerID, FirstName, LastName) 
 VALUES    (1, N'John', N'Doe'), (2, N'Mary', N'Doe');
 INSERT    dbo.Company (BankAccountOwnerID, CompanyName) 
 VALUES (3, N'MyComputer'), (4, N'Control Panel'), (5, N'Device Manager');
 GO

Problem: how can we get FirstName, LastName and CompanyName values for the following owners: 1, 2, 3, 4, 5 ?

First solution:

Code Snippet
SET NOCOUNT ON;
 SET STATISTICS IO ON;
 PRINT 'Test #1';
 SELECT    bao.*, p.FirstName, p.LastName, c.CompanyName
 FROM    dbo.BankAccountOwner bao
 LEFT JOIN dbo.Person p ON bao.BankAccountOwnerID = p.BankAccountOwnerID
 LEFT JOIN dbo.Company c ON bao.BankAccountOwnerID = c.BankAccountOwnerID
 WHERE    bao.BankAccountOwnerID IN (1,2,3,4,5);

My solution:

Because

dbo.Person table contains only rows with OwnerType = ‘P’ and dbo.Company table contains only rows with OwnerType = ‘C’ and
SQL Server doesn’t knows this

I added these predicates to every left join thus:

Code Snippet
PRINT 'Test #2';
 SELECT    bao.*, p.FirstName, p.LastName, c.CompanyName
 FROM    dbo.BankAccountOwner bao
 LEFT JOIN dbo.Person p ON bao.OwnerType = 'P' AND bao.BankAccountOwnerID = p.BankAccountOwnerID
 LEFT JOIN dbo.Company c ON bao.OwnerType = 'C' AND bao.BankAccountOwnerID = c.BankAccountOwnerID
 WHERE    bao.BankAccountOwnerID IN (1,2,3,4,5);

These are the execution plans:

First solution: as you can see from the properties of Index Seek operators SQL Server will try to find every bank account owner (1, 2, …, 5) within dbo.Person table and within dbo.Company table (Number of Execution = 5).

Second solution: this time because SQL Server knows that dbo.Person table contains only OwnerType = ‘P’ rows it will read from dbo.Person table using an Index Seek (on Person.PK_Person_…) just two times (Number of Executions = 2) because there are only two persons among those five owners. This was possible because of Filter operator which will verify before executing Index Seek if the current owner is ‘P':

Also, SQL Server will read from dbo.Company table using an Index Seek (on Company.PK_Company_…) three times (Number of Executions = 3) because there are three companies.

Because we have reduced the number of execution for Index Seek operators, this is means that we have reduced also the number of logical reads as we can see from the output of STATISTICS IO ON:

Code Snippet
Test #1
 Table 'Company'. Scan count 0, logical reads 10
 Table 'Person'. Scan count 0, logical reads 10
 Table 'BankAccountOwner'. Scan count 5, logical reads 10
 Test #2
 Table 'Company'. Scan count 0, logical reads 6
 Table 'Person'. Scan count 0, logical reads 4
 Table 'BankAccountOwner'. Scan count 5, logical reads 10

Leave a Comment

2014-01-05

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ?

Filed under: concurrency,EXISTS,index,MERGE,safe,sql server,UPSERT — ---- @ 10:47 PM

Is it safe the following approach for UPSERT ?

Code Snippet
IF EXISTS (SELECT … FROM MyTable WHERE ID = @param1)
     UPDATE    MyTable 
     SET        Col1 = NewValue 
     WHERE    ID = @param1
 ELSE
     INSERT    MyTable(ID, Col1)
     VALUES    (@param1, @param2)

The short answer is no!

Example:

The transaction isolation level is READ COMMITTED which is the default isolation level.
For this demo I’ve used a customer table to store the customer’s email and the number of sales orders (for every customer).
I’ve inserted two customers (1 – [email protected]; 2 – [email protected]).
I’ve created a stored procedure for UPSERT (dbo.InsertOrder). This stored procedure receive the customer’s email, it checks if exists this email and if not then it inserts the new customer (sales orders 1). Otherwise it increments the number of sales orders by 1.

T-SQL script:

Code Snippet
IFOBJECT_ID(N'dbo.Customer')ISNOTNULL
DROPTABLEdbo.Customer;
IFOBJECT_ID(N'dbo.InsertOrder')ISNOTNULL
DROPPROCEDUREdbo.InsertOrder;
GO
 CREATE TABLE dbo.Customer (
     CustomerID INT IDENTITY(1,1) PRIMARY KEY,
     Email NVARCHAR(100) NOT NULL,
     Note NVARCHAR(50) NULL,    
     OrdersCount INT NOT NULL DEFAULT (1)
 );
 GO
 INSERT    dbo.Customer (Email)
 VALUES    (N'[email protected]'), – CustomerID 1
         (N'[email protected]'); – CustomerID 2
 GO
 CREATE PROCEDURE dbo.InsertOrder (
     @Email NVARCHAR(100)
 )
 AS
 BEGIN
     IF EXISTS(SELECT * FROM dbo.Customer WHERE Email = @Email)
     BEGIN
         UPDATE    dbo.Customer
         SET        OrdersCount = OrdersCount + 1
         WHERE    Email = @Email;
     END
     ELSE
     BEGIN
         WAITFOR DELAY '00:00:03';
         INSERT    dbo.Customer (Email)
         VALUES    (@Email) – OrdersCount = Default 1
     END
 END;
 GO

Note: WAITFOR DELAY is used to delay the INSERTion of the new customers with just 3 seconds.

Now we can start this test executing following statements in separate windows:

Query window #1 (Step 1):

Code Snippet
BEGIN TRANSACTION
     UPDATE    dbo.Customer
     SET        Note = N'Note#2'
     WHERE    Email = N'[email protected]';
 – ROLLBACK — or COMMIT

At this moment the CustomerID = 2 is locked X (exclusive) and no other transaction can read (excepting NOLOCK and READ UNCOMMITTED), update or delete this record. Because this transaction remains open (no COMMIT or ROLLBACK) this record will be locked till the end of transaction.

Query window #2 & #3 (Step 2 & Step 3):

Code Snippet
EXEC dbo.InsertOrder N'[email protected]';
 SELECT * FROM dbo.Customer;

At this moment the IF EXISTS(SELECT * FROM dbo.Customer WHERE Email = @Email) statement will try to read the record with CustomerID = 1 and then the record with CustomerID = 2 requesting a S lock (shared). Because X and S locks aren’t compatible (see Lock Compatibility) both (Step 2 & 3) SELECT * FROM dbo.Customer WHERE Email = @Email statements will be blocked. Note: you can see the “Executing ….” message within tab title.

Query window #1 (Step 4):

I execute the ROLLBACK or COMMIT statement which ends the initial transaction. The X lock for CustomerID = 2 record is released and the SELECT statements (SELECT * FROM dbo.Customer WHERE Email = @Email; Step 2 & 3) continue to be executed. Every EXISTS(SELECT …. WHERE Email = ‘[email protected]’) statement will return FALSE and both stored procedures will INSERT the customer `[email protected]`.

This is the reason why I get duplicated emails at the end (Step 5) of my example:

Solutions ?

Simplest solution is to create an unique index/constraint on Email column.

Code Snippet
CREATE UNIQUE INDEX IUN_Customer_Email
 ON dbo.Customer(Email);

This UNIQUE index/constraint guarantees that we can’t have duplicated emails. But this doesn’t means this UPSERT approach is 100% safe because dbo.InsertOrder stored procedure still tries to INSERT duplicated emails (at Step 5):

Next blog On Jan 27 I will discuss what solutions do we have for safe UPSERTs.

Revision History:

2014-01-12: I changed the last paragraph.

2014-01-27: I changed the last paragraph.

2014-04-06: I added the [See also] section

Comments (1)

2014-01-02

Happy New Year!

Filed under: sql server — ---- @ 5:48 PM

Code Snippet

SELECT (
SELECT CHAR(h+i)
FROM (
SELECT    e.f.value('(@value)', 'TINYINT'), ROW_NUMBER() OVER(ORDER BY e.f DESC) 
FROM (VALUES (N'<row value="57"/><row value="83"/><row value="99"/><row value="100"/><row value="110"/><row value="22"/><row value="69"/><row value="93"/><row value="112"/><row value="26"/><row value="84"/><row value="97"/><row value="94"/><row value="112"/><row value="32"/>')) a(b)
CROSS APPLY (VALUES (CONVERT(XML, a.b))) c(d)
CROSS APPLY c.d.nodes('/row') e(f)
) g(h,i)
ORDER BY i DESC
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')

SQL Fiddle demo

Leave a Comment

2013-12-25

Notes regarding SQL Server and BETWEEN operator

Filed under: between,optimization,SARG,sql server — ---- @ 12:49 AM

Note #1

According to ANSI SQL ’92 "X BETWEEN Y AND Z" is equivalent to "X>=Y AND X<=Z". This is the reason why "WHERE h.OrderDate BETWEEN @StartDate AND @EndDate" is translated into "WHERE h.OrderDate >= @StartDate AND h.OrderDate <= @EndDate":

This means also that Y must be less or equal than Z (Y <= Z) and @StartDate must be less or equal than @EndDate (@StartDate <= @EndDate). Now, if we run following query having @StartDate = Dec 31, 2005 and @EndDate = Jan 1, 2005,

Code Snippet

SET STATISTICS IO ON;
 
DECLARE @StartDate DATETIME = '20051231 23:59:59.997',
        @EndDate DATETIME = '20050101 00:00:00.000';
        
SELECT    COUNT(*) AS OrderCount1
FROM    Sales.SalesOrderHeader h
WHERE    h.OrderDate BETWEEN @StartDate AND @EndDate
/*
STATISTICS IO ON output:
Table 'SalesOrderHeader'. Scan count 1, logical reads 686
*/

the result of COUNT(*) is 0 sales orders and the output of STATISTICS IO ON (the number of logical reads = the number of 8K data pages read from buffer pool) is 686 [logical reads]. This means that SQL Server reads data from Sales.SalesOrderHeader table though @StartDate <= @EndDate predicate is False for current values ("WHERE h.OrderDate BETWEEN ‘20051231 23:59:59.997′ AND ‘20050101 00:00:00.000’" or "WHERE h.OrderDate >= ‘20051231 23:59:59.997′ AND h.OrderDate <= ‘20050101 00:00:00.000’").

What options do we have in such cases ?

We validate the values of parameters
We tell SQL Server that Y must be less or equal than Z:

Code Snippet

SET STATISTICS IO ON;
DECLARE @StartDate DATETIME = '20051231 23:59:59.997',
        @EndDate DATETIME = '20050101 00:00:00.000';
        
SELECT    COUNT(*) AS OrderCount1
FROM    Sales.SalesOrderHeader h
WHERE    h.OrderDate BETWEEN @StartDate AND @EndDate
AND        @StartDate <= @EndDate
/*
STATISTICS IO ON output:
Table 'Worktable'. Scan count 0, logical reads 0
*/

This time, because SQL Server knows that @StartDate must be less or equal than @EndDate the output of STATISTICS IO ON shows that it never reads data from Sales.SalesOrderHeader table (0 logical reads). The execution plan of this query includes a Filter operator which checks @StartDate <= @EndDate predicate (‘20051231 23:59:59.997′ <= ‘20050101 00:00:00.000′) preventing (in this case) the execution of Index Scan / Index Seek operator and thus preventing reading data from Sales.SalesOrderHeader table:

Note #2

What if we want to count sales orders when @StartDate <= @EndDate but also when @StartDate > @EndDate thus avoiding the limitation of BETWEEN operator from ANSI SQL ? One solution is to rewrite the BETWEEN predicate in such way that BETWEEN always will bet BETWEEN @MinimumValue AND @MaximumValue.

Example:

Code Snippet

SET STATISTICS IO ON;
DECLARE @StartDate DATETIME = '20051231 23:59:59.997',
        @EndDate DATETIME = '20050101 00:00:00.000';
        
SELECT    COUNT(*) AS OrderCount2
FROM    Sales.SalesOrderHeader h
WHERE    h.OrderDate 
        BETWEEN 
        – Minimum value
        (CASE WHEN @StartDate <= @EndDate THEN @StartDate ELSE @EndDate END) 
        AND 
        – Maximum value
        (CASE WHEN @StartDate <= @EndDate THEN @EndDate ELSE @StartDate END)

This solution has an advantage: is SARG-able and if there is an index on OrderDate column

Code Snippet

CREATE INDEX IX_SalesOrderHeader_OrderDate
ON Sales.SalesOrderHeader (OrderDate);

then the execution plan will include an Index Seek operator:

Leave a Comment

2013-12-16

SQL Server conferences with recorded sessions

Filed under: conference,sql server — ---- @ 7:52 AM

SQLRally Nordic 2013
SQLBits X (also see Previous Events)

Leave a Comment

2013-12-09

Non SARG-able predicates #1

Filed under: execution plan,index,optimization,predicate,SARG,scan,seek,sql server — ---- @ 7:51 AM

Sometimes isn’t enough to create an index. to optimize a query. Even we create the proper index we may discover by examining the execution plan that SQL Server doesn’t use that an Index Seek to find the rows using instead an Index Scan.

For example, if we want to count the number of sales order for December 2005 a first attempt can use following query and an index on OrderDate:

Code Snippet
CREATE INDEX IX_SalesOrderHeader_OrderDate
 ON Sales.SalesOrderHeader (OrderDate);
 GO
 SET STATISTICS IO ON;
 DBCC FREEPROCCACHE;
 GO
 SELECT COUNT(*)
 FROM Sales.SalesOrderHeader h
 WHERE YEAR(h.OrderDate) = 2005 AND MONTH(h.OrderDate) = 12
 GO
 /*
 SET STATISTICS IO ON output:
 Table 'SalesOrderHeader'. Scan count 1, logical reads 73
 */

We can see that for this query the execution plan includes an Index Scan instead of Index Seek. The reason is these two predicates aren’t SARG-able. A SARG-able predicates allows an Index Seek if an index is available.

The second attempt uses a BETWEEN predicate (which is SARG-able) thus:

Code Snippet
DBCC FREEPROCCACHE;
 GO
 SELECT    COUNT(*)
 FROM    Sales.SalesOrderHeader h
 WHERE    h.OrderDate BETWEEN '20051201' AND '20051231 23:59:59.997' 
 – or WHERE    h.OrderDate >= '20051201' AND h.OrderDate < '20060101'
 GO
 /*
 SET STATISTICS IO ON output:
 Table 'SalesOrderHeader'. Scan count 1, logical reads 3
 */

This time the execution plan includes an Index Seek and the performance (Logical Reads) is better: just 3 pages / logical reads (Seek) instead of 73 pages logical reads (Index Scan).

More information on SARG-able predicates can be found here.

Note: For this test I used Adventure Works for SQL Server 2012 (data file) database

Leave a Comment

« Previous Page — Next Page »

Yet another SQL Server enthusiast

2014-03-04

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #2

See also:

2014-02-12

Diviziunea relațională

2014-02-10

How to avoid SQL injections: one simple solution

2014-01-22

UPSERT – safe solutions for SQL Server [DRAFT]

Solutions for a single row

Solutions for multiple rows

Revision History:

2014-01-13

Conditional joins

2014-01-05

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ?

See also:

Revision History:

2014-01-02

Happy New Year!

2013-12-25

Notes regarding SQL Server and BETWEEN operator

Note #1

Note #2

2013-12-16

SQL Server conferences with recorded sessions

2013-12-09

Non SARG-able predicates #1

Archives

Categories

Certifications

Follow “Yet another SQL Server enthusiast”