Yet another SQL Server enthusiast

2013-08-09

Seek vs. Scan (I)

Filed under: index,optimization,scan,seek,sql server — ---- @ 8:55 PM

Overview

There is an old debate in SQL Server community: is always Index Seek better than Index Scan ?

First, I would like to  explain what represents each of them. Visually, difference between them can be described thus:

SeekScan1 As you can see, Seek has two subtypes: singleton lookup and range scan. A singleton lookup is specific to an unique index and it searches a single key value starting from root page towards leaf pages. This means that a singleton lookup will touch at most one record from leaf pages.

Usually, a Seek of range scan subtype supposes searching key values from root page towards first leaf page (which includes the first or last key value in range) and then from that point it scans forward or backward the leaf pages. This means that a range scan could touch more records from leaf pages.

An Index Scan starts from root page and navigates towards first or last record (leaf page) and from that point it reads forward or backward all (usually) records within leaf pages.

What you have seen above is a simplified description for what is happens for a single execution. But these operators could have many executions. For example, an Index Seek – singleton lookup could search 100 key values:SeekCIProperties

Above picture shows two properties regarding to number of executions: the estimated value (1; which is generated at compile time before the query is executed) and the actual number of executions (100; which is generated after the query is executed). Because of that, it’s important the performance of the operator as a all. In this example,  because the index (PK_Customer_CustomerID) has two levels

Code Snippet
USE AdventureWorks2008R2;
GO
SELECT    f.page_count,f.index_level
FROM    sys.dm_db_index_physical_stats
(
    DB_ID(),
    OBJECT_ID(N'Sales.Customer'),
    1, – Clustered index
    NULL,
    'DETAILED'
) f;
/*
page_count index_level
———- ———–
121        0
1          1
*/

every singleton lookup generates two logical reads and the overall performance is 100 key values x 2 logical reads = 200 logical reads. But this value is greater than 1 + 121 logical reads + 1 logical read (IAM) = 123 logical reads which is the I/O performance of a simple Index Scan on the same index.

It’s just about logical reads ? No. What it matters is the overall performance (logical reads, CPU and duration) of current query and of every operator. To better understand I have done the following test.

Test

For this test I used AdventureWorks2008R2 database. I created a table named dbo.Rows

Code Snippet
CREATE TABLE dbo.[Rows](CustomerID INT NOT NULL PRIMARY KEY);

which is used to store 164 rows (for the first test) or 1640 rows (for the second test). Every row from dbo.Rows table is sought in Sales.Customer table (19820 rows) using these queries:

Code Snippet
– Query 1
SELECT @cnt=COUNT(*)
FROM dbo.[Rows] r
INNER JOIN Sales.Customer c WITH(FORCESEEK (PK_Customer_CustomerID(CustomerID)))
ON c.CustomerID=r.CustomerID;

– Query 2
SELECT @cnt=COUNT(*)
FROM dbo.[Rows] r
INNER HASH JOIN Sales.Customer c WITH(INDEX(1),FORCESCAN)
ON c.CustomerID=r.CustomerID;

– Query 3
SELECT @cnt=COUNT(*)
FROM dbo.[Rows] r
INNER MERGE JOIN Sales.Customer c WITH(INDEX(1),FORCESCAN)
ON c.CustomerID=r.CustomerID;

Execution plans:

ActualExecutionPlans

First execution plan (Query 1) includes a Seek operator on PK_Customer_CustomerID clustered index. Second execution plan (Query 2) includes an Index Scan on PK_Customer_CustomerID and a Hash Join and third execution plan (Query 3) includes an Index Scan on the same index and a Merge Join.

Test methodology

1. Insert into dbo.Rows 164 rows (the first 164 rows from Sales.Customer consumes almost one 8K page without B-Tree overhead). Second test inserts into dbo.Rows first 1640 rows.

2. Execute 100 interations for Query 1. Execute 100 interations for Query 2. Execute 100 interations for Query 3

3.  Execute 1000 interations for Query 1. Execute 1000 interations for Query 2. Execute 1000 interations for Query 3.

4. Execute 10000 interations for Query 1. Execute 10000 interations for Query 2. Execute 10000 interations for Query 3.

Download script

Note: 164 is the average number of rows in every leaf page of PK_Customer_CustomerID clustered index.

Results

Below are the results (SQL Profiler, SQL:BatchCompleted) :

Results

Remarks

Query 2 and Query 3 (Scan) have a better [Logical] Reads performance than Query 1 (Seek) but Query 2 (Scan + HASH JOIN) has a bigger CPU time and the Duration is also bigger than Query 1. So, when we have to optimize a query, it isn’t enough to optimize Logical Reads but we have to pay attention also at CPU time and Duration. On the other hand, Query 3 (Scan + Merge Join) seems to offer the best performance in a consistently manner. If I would say that for this test Scan is a better than Seek it would be misleading.

Instead, my remark is that for this test Scan + Merge Join is a better option than Seek + Nested Loops.

In following post I will try to explain these results.

Revision History

2013-08-18: Updated [Download script] link and the Results table (column Interations#)

About these ads

1 Comment »

  1. […] output of STATISTICS IO shows only 3 logical reads representing 1 root page + 1 leaf page + 1 IAM = 123 logical reads. This output shows that after reading 164 records from the “first” leaf page the […]

    Pingback by Seek vs. Scan (II) | Yet another SQL Server enthusiast — 2013-08-24 @ 8:41 PM | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: