Index Key Column VS Index Included Column

Index key columns are part of the b-tree of the index. Included columns are not.

Take two indexes:

CREATE INDEX index1 ON table1 (col1, col2, col3)
CREATE INDEX index2 ON table1 (col1) INCLUDE (col2, col3)

index1 is better suited for this kind of query:

SELECT * FROM table1 WHERE col1 = x AND col2 = y AND col3 = z

Whereas index2 is better suited for this kind of query:

SELECT col2, col3 FROM table1 WHERE col1 = x

In the first query, index1 provides a mechanism for quickly identifying the rows of interest. The query will (probably) execute as an index seek, followed by a bookmark lookup to retrieve the full row(s).

In the second query, index2 acts as a covering index. SQL Server doesn't have to hit the base table at all, since the index provides all the data it needs to satisfy the query. index1 could also act as a covering index in this case.

If you want a covering index, but don't want to add all columns to the b-tree because you don't seek on them, or can't because they aren't an allowed datatype (eg, XML), use the INCLUDE clause.


Included columns don't form part of the key for the index, but they do exist on the index. Essentially the values will be duplicated, so there is a storage overhead, but there is a greater chance that your index will cover (i.e. be selected by the query optimizer for) more queries. This duplication also improves performance when querying, since the database engine can return the value without having to look at the table itself.

Only nonclustered indexes can have included columns, because in a clustered index, every column is effectively included.


Let's think about the book. Every page in the book has the page number. All information in this book is presented sequentially based on this page number. Speaking in the database terms, a page number is the clustered index. Now think about the glossary at the end of the book. This is in alphabetical order and allows you to quickly find the page number specific glossary term belongs to. This represents a non-clustered index with a glossary term as the key column.

Now assuming that every page also shows "chapter" title at the top. If you want to find in what chapter is the glossary term, you have to look up what page # describes glossary term, next - open the corresponding page, and see the chapter title on the page. This clearly represents key lookup - when you need to find the data from a non-indexed column, you have to find actual data record (clustered index) and look at this column value. Included column helps in terms of performance - think about glossary where each chapter title includes in addition to glossary term. If you need to find out what chapter the glossary term belongs - you don't need to open actual page - you can get it when you look up the glossary term.

So included columns are like those chapter titles. Non clustered Index (glossary) has an additional attribute as part of the non-clustered index. An index is not sorted by included columns - it just additional attributes that help to speed up the lookup (e.g. you don't need to open the actual page because the information is already in the glossary index).

Example:

Create Table Script

CREATE TABLE [dbo].[Profile](
    [EnrollMentId] [int] IDENTITY(1,1) NOT NULL,
    [FName] [varchar](50) NULL,
    [MName] [varchar](50) NULL,
    [LName] [varchar](50) NULL,
    [NickName] [varchar](50) NULL,
    [DOB] [date] NULL,
    [Qualification] [varchar](50) NULL,
    [Profession] [varchar](50) NULL,
    [MaritalStatus] [int] NULL,
    [CurrentCity] [varchar](50) NULL,
    [NativePlace] [varchar](50) NULL,
    [District] [varchar](50) NULL,
    [State] [varchar](50) NULL,
    [Country] [varchar](50) NULL,
    [UIDNO] [int] NOT NULL,
    [Detail1] [varchar](max) NULL,
    [Detail2] [varchar](max) NULL,
    [Detail3] [varchar](max) NULL,
    [Detail4] [varchar](max) NULL,
PRIMARY KEY CLUSTERED 
(
    [EnrollMentId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO

Stored procedure script

CREATE Proc [dbo].[InsertIntoProfileTable]
As
BEGIN
SET NOCOUNT ON
Declare @currentRow int
Declare @Details varchar(Max)
Declare @dob Date
set @currentRow =1;
set @Details ='Let''s think about the book. Every page in the book has the page number. All information in this book is presented sequentially based on this page number. Speaking in the database terms, a page number is the clustered index. Now think about the glossary at the end of the book. This is in alphabetical order and allows you to quickly find the page number specific glossary term belongs to. This represents non-clustered index with glossary term as the key column.        Now assuming that every page also shows "chapter" title at the top. If you want to find in what chapter is the glossary term, you have to look up what page # describes glossary term, next - open the corresponding page, and see the chapter title on the page. This clearly represents key lookup - when you need to find the data from non-indexed column, you have to find actual data record (clustered index) and look at this column value. Included column helps in terms of performance - think about glossary where each chapter title includes in addition to glossary term. If you need to find out what chapter the glossary term belongs - you don''t need to open the actual page - you can get it when you look up the glossary term.        So included columns are like those chapter titles. Non clustered Index (glossary) has an additional attribute as part of the non-clustered index. Index is not sorted by included columns - it just additional attributes that help to speed up the lookup (e.g. you don''t need to open the actual page because the information is already in the glossary index).'
while(@currentRow <=200000)
BEGIN
insert into dbo.Profile values( 'FName'+ Cast(@currentRow as varchar), 'MName' + Cast(@currentRow as varchar), 'MName' + Cast(@currentRow as varchar), 'NickName' + Cast(@currentRow as varchar), DATEADD(DAY, ROUND(10000*RAND(),0),'01-01-1980'),NULL, NULL, @currentRow%3, NULL,NULL,NULL,NULL,NULL, 1000+@currentRow,@Details,@Details,@Details,@Details)
set @currentRow +=1;
END

SET NOCOUNT OFF
END

GO

Using the above SP you can insert 200000 records at one time.

You can see that there is a clustered index on column “EnrollMentId”.

Now Create a non-Clustered index on “ UIDNO” Column.

Script

CREATE NONCLUSTERED INDEX [NonClusteredIndex-20140216-223309] ON [dbo].[Profile]
(
    [UIDNO] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Now Run the following Query

select UIDNO,FName,DOB, MaritalStatus, Detail1 from dbo.Profile 
--Takes about 30-50 seconds and return 200,000 results.

Query 2

select UIDNO,FName,DOB, MaritalStatus, Detail1 from dbo.Profile
where DOB between '01-01-1980' and '01-01-1985'
 --Takes about 10-15 seconds and return 36,479 records.

Now drop the above non-clustered index and re-create with the following script

CREATE NONCLUSTERED INDEX [NonClusteredIndex-20140216-231011] ON [dbo].[Profile]
(
    [UIDNO] ASC,
    [FName] ASC,
    [DOB] ASC,
    [MaritalStatus] ASC,
    [Detail1] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

It will throw the following error

Msg 1919, Level 16, State 1, Line 1 Column 'Detail1' in table 'dbo.Profile' is of a type that is invalid for use as a key column in an index.

Because we can not use varchar(Max) datatype as key column.

Now Create a non-Clustered Index with included columns using following script

CREATE NONCLUSTERED INDEX [NonClusteredIndex-20140216-231811] ON [dbo].[Profile]
(
    [UIDNO] ASC
)
INCLUDE (   [FName],
    [DOB],
    [MaritalStatus],
    [Detail1]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Now Run the following Query

select UIDNO,FName,DOB, MaritalStatus, Detail1 from dbo.Profile --Takes about 20-30 seconds and return 200,000 results.

Query 2

select UIDNO,FName,DOB, MaritalStatus, Detail1 from dbo.Profile
where DOB between '01-01-1980' and '01-01-1985'
 --Takes about 3-5 seconds and return 36,479 records.