Yet another SQL Server enthusiast

2014-10-13

One solution to optimize SQL queries that use scalar functions

Filed under: function,optimization,scalar,UDF — ---- @ 7:05 AM

Scenario

  1. There is a query that uses big tables,
  2. It’s executed rarely (ex. it’s executed once or twice per month),
  3. This query has many conditions, some of them uses UDF scalar functions. The combined selectivity of non-UDF conditions is high (ex. 3.295 rows from 31.263.601).

Problem

The query is slow. How can be optimized quickly this query without rewriting scalar function ?

Scalar UDF used for this test:

Code Snippet
CREATE FUNCTION dbo.SlowScalarFunction(@Name NVARCHAR(250))
RETURNS NVARCHAR(250)
AS
BEGIN
    DECLARE @Result NVARCHAR(250), @Index TINYINT = 1, @Len TINYINT = LEN(@Name + '$') - 1;
    SET @Result = CASE WHEN @Name IS NOT NULL THEN '' END
    WHILE @Index <= @Len
    BEGIN
        SET @Result += UPPER(SUBSTRING(@Name, @Index, 1))
        SET @Index += 1
    END
    RETURN @Result
END;

Note: this scalar function was wrote intentionally to be slow.

I’ve used following query for this test:

Code Snippet
– Test #1
SELECT    dbo.SlowScalarFunction(h.ProductName) AS UpperName, h.Quantity
FROM    [dbo].[BigProduct] p JOIN BigTransactionHistory h ON p.ProductID = h.ProductID
WHERE    p.ListPrice > 3550 AND h.Quantity >= 100 AND dbo.SlowScalarFunction(h.ProductName) LIKE N'ROAD%'
GO

Execution plan

Serial.Execution.Plan.Scalar.User.Defined.Function.WHERE.01

As you can see,

  • The WHERE clause contains one condition that calls a scalar UDF ( dbo.SlowScalarFunction(h.ProductName) LIKE N’ROAD%’ ).
  • The same scalar UDF is used also by SELECT clause.
  • The execution plan is serial.
  • There is a “missing index” warning that tells we should create a non-clustered index:

Code Snippet
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[BigTransactionHistory] ([ProductID],[Quantity])
INCLUDE ([ProductName])

[One] Solution

1) Because this query is executed rarely, the suggested “missing” index will be / could be used rarely (for read operations). This means that after creation of this index will be an overhead due to maintenance of this index after every INSER, UPDATE or DELETE operation.

2) The usage of scalar function forces SQL Server to choose a serial plan.

Because of these two reasons I decided to not create the suggested index and to decompose the initial query thus:

  • One query without scalar functions calls. This query inserts data that into a temporary table (#Results). This table will contain, usually, a small amount of data. Because of lack of scalar functions, the execution plan will use parallelism.
  • For that small amount of rows inserted into temporary table I call the scalar UDF.

Code Snippet
SELECT    ProductName, Quantity
INTO    #Results
FROM    [dbo].[BigProduct] p JOIN BigTransactionHistory h ON p.ProductID = h.ProductID
WHERE    ListPrice > 3550 AND Quantity >= 100

ALTER TABLE #Results
ADD UpperName NVARCHAR(250)

UPDATE    #Results
SET        UpperName = dbo.SlowScalarFunction(ProductName)

SELECT    UpperName, Quantity
FROM    #Results
WHERE    UpperName LIKE N'ROAD%'

DROP TABLE #Results

Execution plan:

Paralelism.Plan.Scalar.User.Defined.Function.WHERE.02

Using a warm buffer, after these changes, the execution time dropped from 10 seconds to 1 second .

About these ads

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: