Covering indexes and included columns

Lesson 21 covered clustered and non-clustered indexes. Today we dig into the single most impactful indexing pattern in SQL Server: covering indexes. The pattern where your non-clustered index contains every column the query needs, so the engine never has to go back to the clustered index for more data.

When Runehold’s customer-support dashboard does “find all orders for this customer, with status and tracking,” that query can either be one clean seek or a seek plus ten thousand key lookups. The difference is a single INCLUDE clause in an index definition. Not kidding — one line of SQL can cut the query’s runtime by 50×.

What a Key Lookup costs

Recap from lesson 21: a non-clustered index contains the key columns plus a pointer to the clustered index. When a query needs columns not in the non-clustered index, the engine does a Key Lookup (in an older name, Bookmark Lookup) for each row — a per-row round trip back to the clustered index to fetch the missing columns.

For one row: fine. For five: fine. For 5,000: catastrophic. Each lookup is random I/O, can’t be batched efficiently, and the cost scales linearly.

In an execution plan:

Index Seek (NonClustered) → Nested Loops → Key Lookup (Clustered) → SELECT

That Key Lookup is the tax. Every time you see it on a hot query, your first instinct should be: can I make the non-clustered index cover this query?

`INCLUDE`: the magic word

-- Before: seeks by Status, but needs CustomerName and OrderTotal
-- which aren't in the index
CREATE NONCLUSTERED INDEX IX_Orders_Status
    ON Sales.Orders (Status);

-- After: every column the dashboard wants is now in the index
CREATE NONCLUSTERED INDEX IX_Orders_Status
    ON Sales.Orders (Status)
    INCLUDE (OrderDate, Total, CountryCode, CustomerId);

INCLUDE columns live only in the leaf pages of the non-clustered index. They aren’t sortable, they aren’t in the upper levels of the B-tree, they don’t change the index’s seek behavior. They’re just along for the ride, ensuring the query can be answered entirely from this index.

The execution plan shape after adding INCLUDE:

Index Seek (NonClustered) → SELECT

One operator. Thin arrow. Done.

Key columns vs included columns

Quick rule that covers 95% of cases:

Key columns — things you filter on, join on, or sort on.
INCLUDE columns — things you only return (project).

-- Query:
SELECT OrderId, Total, OrderDate
FROM Sales.Orders
WHERE CustomerId = @cid AND Status = @status
ORDER BY OrderDate DESC;

-- Index designed for it:
CREATE NONCLUSTERED INDEX IX_Orders_Customer_Status
    ON Sales.Orders (CustomerId, Status, OrderDate DESC)   -- keys: filter + sort
    INCLUDE (Total);                                          -- include: returned only

CustomerId and Status are in the key because they’re the filter. OrderDate DESC is in the key because of the sort (matching sort order means the engine doesn’t need a separate sort operator). Total is in the include because the query just returns it.

Key column order matters

Keys are ordered. If your index is (CustomerId, Status), it’s useful for:

WHERE CustomerId = @c — seek on the first column.
WHERE CustomerId = @c AND Status = @s — seek on both.

It’s not useful for:

WHERE Status = @s alone — can’t seek, has to scan.

The rule: put the most selective equality column first. Or: “equality, range, sort, include.”

-- For:
SELECT ...
FROM Sales.Orders
WHERE CustomerId = @c AND OrderDate >= @start AND OrderDate < @end
ORDER BY OrderDate DESC;

-- The right index:
CREATE NONCLUSTERED INDEX IX_Orders_Customer_Date
    ON Sales.Orders (CustomerId, OrderDate DESC)
    INCLUDE (Status, Total);

CustomerId first (equality). OrderDate DESC second (range + sort, same direction as the query). Everything else included.

Flipping the key order — (OrderDate, CustomerId) — breaks this. A range scan on OrderDate makes CustomerId non-seekable under the filter. We covered this in lesson 21; it bears repeating because it bites everyone once.

Missing-index DMVs

SQL Server tracks every query that could have benefited from an index it didn’t have. Those “missing index” suggestions are available via DMVs:

SELECT TOP (20)
    [Impact] = mig.avg_total_user_cost * (mig.avg_user_impact / 100.0) * (mid.user_seeks + mid.user_scans),
    mig.group_handle,
    mid.statement                AS table_name,
    mid.equality_columns,
    mid.inequality_columns,
    mid.included_columns,
    mid.user_seeks + mid.user_scans AS [seeks+scans]
FROM sys.dm_db_missing_index_groups             AS mig
JOIN sys.dm_db_missing_index_group_stats         AS mig_stats ON mig_stats.group_handle = mig.index_group_handle
JOIN sys.dm_db_missing_index_details             AS mid       ON mid.index_handle    = mig.index_handle
ORDER BY [Impact] DESC;

Shows the top-impact missing indexes. The columns equality_columns, inequality_columns, included_columns are exactly the shape an INDEX ... INCLUDE (...) needs.

Important caveat: SQL Server’s missing-index recommendations are suggestions, not laws. They’re generated by the optimizer based on individual query plans. Blindly creating every suggestion is how you end up with 40 indexes on one table and a write-heavy workload that’s in agony.

Rule: look at the top 5-10 suggestions, evaluate each, and create the ones that match your hot queries. Ignore the rest.

How many indexes is too many?

Rules of thumb that have stood up for ten years:

OLTP tables: 4-6 non-clustered indexes is usually enough. More than 10 and your write performance is likely suffering.
Reporting / DW tables: can tolerate more, because writes are bulk-loaded and reads dominate.
Wide heap of SKUs that everyone searches by different columns: consider columnstore (next lesson).

Every non-clustered index adds cost on every INSERT, UPDATE that touches indexed columns, and DELETE. Ten indexes means 11× the write cost of the base table.

Tool for the job: find indexes that aren’t being used:

SELECT
    OBJECT_NAME(i.object_id) AS table_name,
    i.name                   AS index_name,
    i.type_desc,
    ius.user_seeks + ius.user_scans + ius.user_lookups AS reads,
    ius.user_updates                                      AS writes
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS ius
    ON ius.object_id = i.object_id
   AND ius.index_id   = i.index_id
   AND ius.database_id = DB_ID()
WHERE i.object_id > 100            -- skip system tables
  AND i.type_desc = 'NONCLUSTERED'
ORDER BY reads, writes DESC;

Indexes at the top have zero or almost-zero reads but real writes. They’re costing you and giving nothing back. Drop them.

Caveat: dm_db_index_usage_stats resets on server restart. Run this on a server that’s been up at least a week, ideally through a business cycle (month-end reporting, etc.) so you see the full usage pattern.

Covering for JOINs

Join queries often benefit from covering indexes on both sides. Runehold’s “orders with customer name” query:

SELECT o.OrderId, o.Total, c.Name
FROM Sales.Orders   AS o
JOIN Sales.Customer AS c ON c.CustomerId = o.CustomerId
WHERE o.OrderDate >= '2026-04-01';

On the Orders side, an index on (OrderDate, CustomerId) with included Total covers the orders part. The join to Customer uses Customer.CustomerId (the PK, already indexed). c.Name is returned; we need it covered too. If Customer.CustomerId is the clustered PK, Name is already in the leaf of the clustered index — no extra non-clustered needed. Clean plan: two seeks, no lookups, done.

Run this on your own machine

USE Runehold;
GO

-- Create a realistic hot table
IF OBJECT_ID('dbo.HotOrders', 'U') IS NOT NULL DROP TABLE dbo.HotOrders;

CREATE TABLE dbo.HotOrders (
    OrderId     BIGINT IDENTITY(1,1) PRIMARY KEY,
    CustomerId  INT     NOT NULL,
    Status      TINYINT NOT NULL,
    Total       DECIMAL(19,4) NOT NULL,
    CountryCode CHAR(2) NOT NULL,
    OrderDate   DATETIME2(0) NOT NULL
);

-- Fill with 500k rows
INSERT INTO dbo.HotOrders (CustomerId, Status, Total, CountryCode, OrderDate)
SELECT TOP (500000)
    ABS(CHECKSUM(NEWID())) % 1000,
    ABS(CHECKSUM(NEWID())) % 5,
    ABS(CHECKSUM(NEWID())) % 1000 / 10.0,
    CHAR(65 + ABS(CHECKSUM(NEWID())) % 5) + CHAR(65 + ABS(CHECKSUM(NEWID())) % 5),
    DATEADD(DAY, -ABS(CHECKSUM(NEWID())) % 365, GETDATE())
FROM sys.all_objects a CROSS JOIN sys.all_objects b;

SET STATISTICS IO, TIME ON;

-- 1. No index: full scan
SELECT Total, OrderDate, Status
FROM dbo.HotOrders
WHERE CustomerId = 42;

-- 2. Add a narrow index: seek + key lookup
CREATE NONCLUSTERED INDEX IX_HotOrders_Customer
    ON dbo.HotOrders (CustomerId);

SELECT Total, OrderDate, Status
FROM dbo.HotOrders
WHERE CustomerId = 42;

-- 3. Make it covering: clean seek
DROP INDEX IX_HotOrders_Customer ON dbo.HotOrders;
CREATE NONCLUSTERED INDEX IX_HotOrders_Customer_Cover
    ON dbo.HotOrders (CustomerId)
    INCLUDE (Total, OrderDate, Status);

SELECT Total, OrderDate, Status
FROM dbo.HotOrders
WHERE CustomerId = 42;

SET STATISTICS IO, TIME OFF;

-- Check suggestions
SELECT TOP (5) mid.statement, mid.equality_columns, mid.inequality_columns, mid.included_columns
FROM sys.dm_db_missing_index_details AS mid
ORDER BY mid.index_handle DESC;

-- Cleanup
DROP TABLE dbo.HotOrders;

Run each SELECT with Include Actual Execution Plan (Ctrl+M) enabled. Watch the plan go from scan → seek+lookup → seek. Watch the STATISTICS IO logical reads drop from thousands to a handful.

Next lesson: filtered indexes and columnstore. The two weird-specialty indexes that fit specific problem shapes like a glove.