How to choose columns when creating index?
In perfect world, you'd like to index columns, that appear in WHERE
clause or JOIN
condition. In your case it would be Email
and Password
columns.
So you could go for a nonclustered index on User table and on Email and Password.
So pretty much this index:
CREATE NONCLUSTERED INDEX idx_User_Email_Password
ON dbo.User (Email, Password);
So if you will run this query:
SELECT UserID, UserName
FROM User
WHERE Email = 'something'
AND Password = 'something';
You will end up using just created index (most likely) or Clustered index and it will seek trough it. However, your query selects UserID and UserName, which are not included in your index, as a result, your query will do a Key Lookup (it will find records in a created index and will look back at your dbo.User
table to find matching values for SELECT statement (UserID and UserName). To avoid that, you could create index with INCLUDED
columns to remove a Key Lookup (and you would want to do that).
CREATE NONCLUSTERED INDEX idx_User_Email_Password
ON dbo.User (Email, Password)
INCLUDE (UserID, UserName);
Using this index you will have a nice NON CLUSTERED INDEX seek in your execution plan.
Also, choosing indexed columns order matters. Let's say, your table would contain UserTypeID (there are not many of them). So you would pass some specific UserTypeIDs and a list of UserIDs, then SQL Server would probably want to pick an index, which has UserTypeID as first indexed column.
So some tests:
CREATE TABLE #Users
(
UserId INT
, UserName VARCHAR(500)
, Email VARCHAR(500)
, Password VARCHAR(500)
);
CREATE CLUSTERED INDEX idx_Users_UserID
ON #Users (UserID);
-- Some test data from my DB
INSERT INTO #Users (UserId, UserName, Email, Password)
SELECT TOP (10000) UserId, UserName, Email, 'password'
FROM Users;
So this is the query:
SELECT *
FROM #Users;
This will perform index Scan, since we don't specify any details.
Now if we specify UserId it will Seek your Clustered index (we have UserId as key):
SELECT *
FROM #Users
WHERE UserID = 602;
Now let's create index without included columns and query something:
CREATE NONCLUSTERED INDEX idx_Users_Email_Password
ON #Users (Email, Password);
SELECT *
FROM #Users
WHERE Email = '[email protected]';
As I've told, it uses created index and does a Key Lookup, it finds matching Email and password and finds rest of the columns in your table to output them (P.S. If you would be ouputting, let's say, only Email, it wouldn't do a Key Lookup, it wouldn't be needed):
Now let's create index with included UserName and run query above. It will produce this nice execution plan with plain NonClustered Index seek as I told you before:
CREATE NONCLUSTERED INDEX idx_Users_Email_Password_iUserName
ON #Users (Email, Password)
INCLUDE (UserName);
This is a high-quality article and I'd recommended reading it: https://www.simple-talk.com/sql/performance/index-selection-and-the-query-optimizer/