tempdb: the one shared resource that kills everything

Every user database is an island. Its own files, its own users, its own backups. But every SQL Server instance has one shared database that every session touches: tempdb. It’s the scratch pad. It’s the session’s coffee table. It’s where temp tables live, where sort operations spill when they don’t fit in memory, where version data for RCSI and SNAPSHOT isolation is stored.

It’s also the single most common source of “mysteriously slow” performance issues on a busy server. If you’re managing a SQL Server and you’ve never spent an afternoon tuning tempdb, you will. Today’s lesson is that afternoon.

What uses tempdb

More than you’d expect:

#temp_tables and table variables (@vars) — go in tempdb by default.
Sort operations that can’t fit in memory — spill to tempdb.
Hash joins that need more memory than granted — spill to tempdb.
Worktables the engine builds internally for complex query plans.
Version store — rows kept alive for RCSI, SNAPSHOT isolation, online index rebuilds, triggers.
Temporary stored procedures, functions, indexes — anything with a # prefix.
SSIS / ETL workloads — often hammer tempdb.
Index rebuild operations — WITH (ONLINE = ON, SORT_IN_TEMPDB = ON) uses tempdb as a workspace.

All of these share one tempdb on the instance. When one session fills it, everyone else stalls.

tempdb resets on restart

Fun fact: tempdb is recreated fresh every time SQL Server starts. No backups needed (you can’t back it up anyway). When the engine restarts, whatever you had in tempdb is gone.

This also means tempdb’s size at startup is determined by the model database plus the file sizes you explicitly set. If you pre-size your tempdb files to 4 GB each at creation, that’s the starting size every time.

File count: the big one

Default tempdb on older SQL Server versions: one data file. Every session ends up contending on the same in-memory allocation pages (PFS, GAM, SGAM — the three little pages that track free space).

Contention shows up as PAGELATCH_* waits on pages 2:1:1, 2:1:2, 2:1:3 — tempdb (database 2), first file, first three pages. When you see those waits, you don’t have enough tempdb files.

Rule: one tempdb data file per logical CPU core, up to 8. Beyond 8 you get diminishing returns; add more only if you still see contention after the first 8.

Check current:

SELECT name, size * 8 / 1024 AS size_mb, type_desc
FROM tempdb.sys.database_files;

Add files:

-- Add 7 more so you have 8 total on a server with 8 logical cores
USE master;

ALTER DATABASE tempdb ADD FILE
    (NAME = 'tempdev2', FILENAME = 'D:\tempdb\tempdb_2.ndf', SIZE = 4096MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb ADD FILE
    (NAME = 'tempdev3', FILENAME = 'D:\tempdb\tempdb_3.ndf', SIZE = 4096MB, FILEGROWTH = 512MB);
-- ... and so on.

All tempdb data files should be the same size with the same autogrowth settings. Unequal sizes cause SQL Server’s allocation to favor the biggest file, defeating the point of multiple files.

Instant file initialization (lesson 30) applies to tempdb too — enable it if possible.

Trace Flag 1118 — now default

For many years, you had to enable TF 1118 to get “uniform extent allocation” for tempdb — which prevented a specific kind of SGAM contention.

SQL Server 2016+ enables this by default for tempdb, making the trace flag unnecessary. You don’t have to set it. But if you work on older versions and see mixed-extent allocation contention, this is what people mean when they say “turn on TF 1118.”

Memory-Optimized tempdb metadata (2019+)

SQL Server 2019 introduced an option to move tempdb’s internal metadata — the catalog of all the temp objects — into memory-optimized (Hekaton) tables. Huge boost for workloads that create and drop temp tables rapidly.

ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED TEMPDB_METADATA = ON;
-- Requires a service restart.

After the restart, some of the contention you’d previously see on metadata operations (system tables in tempdb) vanishes.

If your workload creates thousands of #temp tables per minute (SSIS packages are notorious), this is a huge win. If not, it doesn’t hurt.

tempdb sizing

Start larger than you think. A production tempdb of 8 files × 4 GB = 32 GB is reasonable for a mid-sized OLTP server. Autogrowth kicks in only if the baseline is too small.

Signs tempdb is undersized:

Frequent autogrowth events (visible in default trace / XE).
Jobs suddenly slow when another job runs simultaneously (contention on tempdb space).
PAGEIOLATCH_* waits on tempdb files (slow disk for what tempdb demands).

Runehold’s prod tempdb setup:

Dedicated fast SSD for tempdb (separate from data files and log).
8 data files × 16 GB each = 128 GB.
Autogrowth 1 GB per file, same on every file.
Memory-optimized metadata enabled.

On that config, tempdb rarely misbehaves.

Version store: RCSI and snapshot pressure

When RCSI is on (lesson 19), every modified row’s previous version is kept in the version store in tempdb until every reader that started before the change has finished.

Pressure:

Long-running reader transactions extend version store retention.
Heavy write load stacks many versions fast.
Old uncommitted transactions block version cleanup indefinitely.

Monitor:

SELECT SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_mb
FROM sys.dm_db_file_space_usage;

SELECT
    s.session_id, s.login_name, s.status,
    at.transaction_begin_time,
    DATEDIFF(SECOND, at.transaction_begin_time, GETDATE()) AS seconds_open
FROM sys.dm_tran_active_snapshot_database_transactions AS sast
JOIN sys.dm_exec_sessions                               AS s
    ON s.session_id = sast.session_id
JOIN sys.dm_tran_active_transactions                     AS at
    ON at.transaction_id = sast.transaction_id
ORDER BY at.transaction_begin_time;

If you see a transaction open for an hour, kill it (or ask the owner). One stale session can pin the version store and let tempdb blow up.

Common tempdb mistakes

Using a table variable for huge intermediate results. Table variables live in tempdb but have poor statistics. For large datasets, a #temp table (temporary table) has real statistics and the optimizer can work with it. Table variables are fine for small sets (< 100 rows); beyond that, switch to temp tables.

Not cleaning up temp tables. They auto-drop when the session ends, but if a session is long-lived (a connection pool), old temp tables accumulate. Explicitly drop or use CREATE TABLE #temp inside procs where they auto-drop on proc exit.

Heavy use of MAX datatype columns in temp tables. NVARCHAR(MAX) forces LOB page allocation, which is slower. Size columns properly, even in temp tables.

Using tempdb for permanent-looking stuff. Some codebases create semi-permanent tables in tempdb for ETL. On restart, they disappear. This is always a bug discovered the next morning.

The “who’s using tempdb” query

SELECT
    s.session_id,
    s.login_name,
    s.status,
    su.user_objects_alloc_page_count * 8 / 1024 AS user_objects_mb,
    su.internal_objects_alloc_page_count * 8 / 1024 AS internal_objects_mb,
    (su.user_objects_alloc_page_count + su.internal_objects_alloc_page_count) * 8 / 1024 AS total_mb
FROM sys.dm_db_session_space_usage AS su
JOIN sys.dm_exec_sessions          AS s ON s.session_id = su.session_id
WHERE su.user_objects_alloc_page_count > 0
   OR su.internal_objects_alloc_page_count > 0
ORDER BY total_mb DESC;

One row per session using tempdb. The ones at the top are filling it. Perfect forensic query when tempdb grows unexpectedly.

Run this on your own machine

-- 1. Check current tempdb configuration
SELECT name, size * 8 / 1024 AS size_mb, growth, is_percent_growth, type_desc
FROM tempdb.sys.database_files;

-- 2. Look for page-latch waits on tempdb (contention indicator)
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms,
    wait_time_ms - signal_wait_time_ms AS resource_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGELATCH%'
ORDER BY wait_time_ms DESC;

-- 3. Who's using tempdb right now
SELECT
    s.session_id,
    s.login_name,
    (su.user_objects_alloc_page_count + su.internal_objects_alloc_page_count) * 8 / 1024 AS tempdb_mb
FROM sys.dm_db_session_space_usage AS su
JOIN sys.dm_exec_sessions          AS s ON s.session_id = su.session_id
WHERE (su.user_objects_alloc_page_count + su.internal_objects_alloc_page_count) > 0
ORDER BY tempdb_mb DESC;

-- 4. Version store size (only meaningful if RCSI/SNAPSHOT is enabled)
SELECT SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_mb
FROM sys.dm_db_file_space_usage;

-- 5. Current tempdb free space
SELECT SUM(unallocated_extent_page_count) * 8 / 1024 AS free_mb
FROM tempdb.sys.dm_db_file_space_usage;

tempdb is the part of SQL Server most likely to surprise you at 3am. The time to tune it is before the outage.

Next lesson: maintenance plans vs Ola Hallengren — why the built-in maintenance plan is bad and why everyone eventually switches to Ola’s scripts.