Statistics and cardinality estimation

SQL Server’s optimizer is a guesser. Before it runs your query, it asks: “How many rows will this filter match? How many rows will this join produce?” Its answers drive every plan choice — whether to seek or scan, hash or merge or nested-loop, spill to tempdb or keep in memory.

The guesses are based on statistics — histograms and density samples stored alongside each index and sometimes on individual columns. When the stats are fresh and representative, the optimizer’s guesses are accurate and the plan is good. When the stats are stale or skewed, the guesses are wrong, and you get a plan that worked beautifully on 1,000 rows falling apart on 10 million.

This is lesson 26. It’s the one that turns “why is my query slow?” into “ah, the optimizer guessed 1 row and got 2 million.”

What statistics look like

Run this to see a histogram:

DBCC SHOW_STATISTICS ('Sales.Orders', 'IX_Orders_CustomerId');

Three results come back:

Header — when the stats were last updated, how many rows, how many sampled, how many distinct values.
Density vector — average selectivity for prefixes of the key.
Histogram — up to 200 bucket ranges, each with row counts for an “equal” range and a “surrounding” range.

The histogram is the interesting part. For a column CustomerId, it might say: “For CustomerId values 1 to 100, there are about 50 rows each. For 101 to 500, about 10 rows each.” When you run WHERE CustomerId = 42, the optimizer estimates ~50 rows. If you have 1,000 real rows with that CustomerId (because 42 is your biggest customer), the estimate is 20× off, and the plan will reflect that wrongness.

Estimated vs actual rows

This is the #1 diagnostic tool in plan reading (lesson 25):

Turn on “Include Actual Execution Plan” (Ctrl+M) in SSMS.
Run the query.
Hover over each operator.
Compare “Estimated Number of Rows” vs “Actual Number of Rows.”

Off by 2×? Meh. Off by 10×? Suspicious. Off by 100×? Stats are almost certainly the cause of your slow query.

When stats get stale

SQL Server updates statistics automatically when:

500 + 20% of the table’s rows have changed since the last update (old threshold).
SQL Server 2016+ with compatibility level 130+ uses a dynamic threshold that triggers more often on big tables.

Problem: on a billion-row table with 20% threshold, 200 million rows have to change before stats update. That’s a lot of drift.

Run UPDATE STATISTICS ... WITH FULLSCAN on a schedule. We mentioned it in lesson 24. For hot tables, nightly full-scan stats is worth the maintenance cost.

-- One table, full scan
UPDATE STATISTICS Sales.Orders WITH FULLSCAN;

-- All stats on the table
UPDATE STATISTICS Sales.Orders;   -- default sample

-- All tables in the database
EXEC sp_updatestats;

Ola’s IndexOptimize has smart defaults that sample most tables and full-scan the critical ones. Use it.

Auto create and auto update

Two database settings control automatic statistics behavior:

SELECT name,
       is_auto_create_stats_on,
       is_auto_update_stats_on,
       is_auto_update_stats_async_on
FROM sys.databases
WHERE name = 'Runehold';

AUTO_CREATE_STATISTICS — on by default; the engine creates single-column statistics on the fly when a query needs them.
AUTO_UPDATE_STATISTICS — on by default; the engine refreshes stats when the threshold is crossed.
AUTO_UPDATE_STATISTICS_ASYNC — off by default; when on, queries don’t wait for the stats update, they run with the old stats while the update happens in the background.

Recommendation for most OLTP workloads: auto-update on, auto-update-async on. Queries don’t stall waiting for the stats engine.

Multi-column statistics

Single-column stats are fine for single-column predicates. For multi-column predicates, SQL Server can create multi-column stats automatically (on index leading columns) or you can create them manually:

CREATE STATISTICS STAT_Orders_CountryStatus
    ON Sales.Orders (CountryCode, Status);

Useful when a two-column combination is skewed — most CountryCode = 'DE' orders have Status = 1, but most CountryCode = 'IT' orders have Status = 3. A stats object capturing both columns helps the optimizer estimate combined selectivity.

Usually a niche need. Don’t create these by default; add them if a specific query’s cardinality estimate is off.

The Legacy vs Modern Cardinality Estimator

SQL Server 2014 rewrote the Cardinality Estimator (CE). Behavior changed in lots of subtle ways. Some queries got faster, some got slower. The compatibility level of the database controls which CE is used:

Compatibility level 110 (SQL 2012) and below: Legacy CE.
Compatibility level 120 (SQL 2014) and above: Modern CE.

Check and set:

SELECT name, compatibility_level FROM sys.databases WHERE name = 'Runehold';

-- Change (carefully, test first)
ALTER DATABASE Runehold SET COMPATIBILITY_LEVEL = 160;

Upgrading the compatibility level — often done after migrating to a newer SQL Server version — can cause plan regressions. Use Query Store (lesson 28) to detect them and pin good plans.

To force the legacy CE for a specific query, add a hint:

SELECT ... OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

Or database-wide:

ALTER DATABASE SCOPED CONFIGURATION
    SET LEGACY_CARDINALITY_ESTIMATION = ON;

Use as a surgical fix for known-regressed queries, not as a blanket default.

When the optimizer truly can’t estimate

Some cases make the optimizer guess blindly:

Table-valued parameters — treated as 1-row estimates unless you use a recompile hint.
OPTION (RECOMPILE) — fresh plan every time but expensive compile.
OPTIMIZE FOR UNKNOWN — uses average density instead of specific values.
Very complex WHERE clauses — the optimizer bails out to a default 10% estimate.

When you know a specific value will produce wildly different row counts, consider:

Parameterizing differently (separate procs for different ranges).
Using OPTION (OPTIMIZE FOR (@p = 'DE')) to hint at a representative value.
Forcing a specific plan via Query Store.

We’ll cover these patterns thoroughly in the next lesson on parameter sniffing.

How to diagnose “the estimate is wrong”

Standard workflow:

Run the query with Actual Execution Plan on.
Find the operator where estimated vs actual diverges the most.
Look at what stats back that operator (the index or column it’s seeking/scanning).
DBCC SHOW_STATISTICS on that index.
Check when the stats were last updated.
UPDATE STATISTICS ... WITH FULLSCAN on that table.
Re-run. Did the estimate improve? If yes, you’ve found your problem.

If stats are fresh and accurate but the estimate is still wrong, you might have:

A skewed distribution the histogram can’t represent (200 buckets might not be enough for a 100-million-value column).
A correlation between columns that single-column stats don’t see.
A functional dependency the optimizer doesn’t know about.

For those cases: multi-column stats, filtered stats (stats with a WHERE clause, like filtered indexes), or rewriting the query.

Run this on your own machine

USE Runehold;
GO

-- 1. Look at the stats on one index
DBCC SHOW_STATISTICS ('Sales.Orders', 'pk_Orders');

-- 2. Find tables with ancient stats
SELECT
    OBJECT_NAME(s.object_id) AS table_name,
    s.name                   AS stat_name,
    STATS_DATE(s.object_id, s.stats_id) AS last_updated,
    sp.rows,
    sp.rows_sampled,
    sp.modification_counter  AS rows_changed_since
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id > 100
ORDER BY STATS_DATE(s.object_id, s.stats_id);

-- 3. Force a full refresh on a specific table
UPDATE STATISTICS Sales.Orders WITH FULLSCAN;

-- 4. Turn on auto-update-stats-async (usually recommended)
ALTER DATABASE Runehold SET AUTO_UPDATE_STATISTICS_ASYNC ON;

-- 5. See cardinality estimate in action
-- Before: small table, rough estimate
SELECT Name FROM Sales.Customer WHERE Country = 'IT';

-- Check the plan, hover over Index Scan / Seek, look at "Estimated Number of Rows"
-- vs "Actual Number of Rows" in the actual plan.

When you start paying attention to estimates-vs-actuals on every slow query, you’ll find more bugs than you expected, faster than any other diagnostic.

Next lesson: parameter sniffing — the “runs fast in SSMS, slow in the app” problem, why it happens, and the modern fixes.