Let’s start with following example (based on Supertypes and Subtypes modeling approach) which creates three tables:
- dbo.BankAccountOwner (BankAccountOwnerID-PK, OwnerType, CreateDate)
- dbo.Person (BankAccountOwnerID – PK, FK, FirstName, LastName)
- dbo.Company (BankAccountOwnerID – PK, FK, CompanyName)
T-SQL Script:
BankAccountOwnerID INT PRIMARY KEY,
OwnerType CHAR(1) NOT NULL CHECK( OwnerType IN ('P', 'C') ), – P=Person, C=Company
CreateDate DATETIME NOT NULL DEFAULT (GETDATE())
);
CREATE TABLE dbo.Person (
BankAccountOwnerID INT PRIMARY KEY
REFERENCES dbo.BankAccountOwner(BankAccountOwnerID),
FirstName NVARCHAR(50) NOT NULL,
LastName NVARCHAR(50) NOT NULL
);
CREATE TABLE dbo.Company (
BankAccountOwnerID INT PRIMARY KEY
REFERENCES dbo.BankAccountOwner(BankAccountOwnerID),
CompanyName NVARCHAR(100) NOT NULL
);
INSERT dbo.BankAccountOwner (BankAccountOwnerID, OwnerType)
VALUES (1, 'P'), (2, 'P'), (3, 'C'), (4, 'C'), (5, 'C');
INSERT dbo.Person (BankAccountOwnerID, FirstName, LastName)
VALUES (1, N'John', N'Doe'), (2, N'Mary', N'Doe');
INSERT dbo.Company (BankAccountOwnerID, CompanyName)
VALUES (3, N'MyComputer'), (4, N'Control Panel'), (5, N'Device Manager');
GO
Problem: how can we get FirstName, LastName and CompanyName values for the following owners: 1, 2, 3, 4, 5 ?
First solution:
SET STATISTICS IO ON;
PRINT 'Test #1';
SELECT bao.*, p.FirstName, p.LastName, c.CompanyName
FROM dbo.BankAccountOwner bao
LEFT JOIN dbo.Person p ON bao.BankAccountOwnerID = p.BankAccountOwnerID
LEFT JOIN dbo.Company c ON bao.BankAccountOwnerID = c.BankAccountOwnerID
WHERE bao.BankAccountOwnerID IN (1,2,3,4,5);
My solution:
Because
- dbo.Person table contains only rows with OwnerType = ‘P’ and dbo.Company table contains only rows with OwnerType = ‘C’ and
- SQL Server doesn’t knows this
I added these predicates to every left join thus:
SELECT bao.*, p.FirstName, p.LastName, c.CompanyName
FROM dbo.BankAccountOwner bao
LEFT JOIN dbo.Person p ON bao.OwnerType = 'P' AND bao.BankAccountOwnerID = p.BankAccountOwnerID
LEFT JOIN dbo.Company c ON bao.OwnerType = 'C' AND bao.BankAccountOwnerID = c.BankAccountOwnerID
WHERE bao.BankAccountOwnerID IN (1,2,3,4,5);
These are the execution plans:
First solution: as you can see from the properties of Index Seek operators SQL Server will try to find every bank account owner (1, 2, …, 5) within dbo.Person table and within dbo.Company table (Number of Execution = 5).
Second solution: this time because SQL Server knows that dbo.Person table contains only OwnerType = ‘P’ rows it will read from dbo.Person table using an Index Seek (on Person.PK_Person_…) just two times (Number of Executions = 2) because there are only two persons among those five owners. This was possible because of Filter operator which will verify before executing Index Seek if the current owner is ‘P':
Also, SQL Server will read from dbo.Company table using an Index Seek (on Company.PK_Company_…) three times (Number of Executions = 3) because there are three companies.
Because we have reduced the number of execution for Index Seek operators, this is means that we have reduced also the number of logical reads as we can see from the output of STATISTICS IO ON:
Table 'Company'. Scan count 0, logical reads 10
Table 'Person'. Scan count 0, logical reads 10
Table 'BankAccountOwner'. Scan count 5, logical reads 10
Test #2
Table 'Company'. Scan count 0, logical reads 6
Table 'Person'. Scan count 0, logical reads 4
Table 'BankAccountOwner'. Scan count 5, logical reads 10