SQL Server, from the ground up Lesson 16 / 40

Common Table Expressions for real: recursive CTEs

Org charts, folder trees, bill-of-materials, and the one-line date-dimension generator. How to think about recursive CTEs without getting lost.

In lesson 11 we saw a recursive CTE example and moved on. Today we spend the whole lesson on it, because recursion in SQL is one of those features that once it clicks, you stop dreading hierarchy queries forever.

Runehold has at least three genuinely recursive datasets:

  1. Org chart. Every employee has a manager, who has a manager, up to the CEO.
  2. Product categories. “Home decor > Lighting > Floor lamps > Penguin-shaped floor lamps.”
  3. Bill of materials. The “Enchanted Starter Kit” bundle contains a spellbook, a travel mug, and a “mildly cursed” tea towel — each of which might itself be composed of simpler SKUs.

Any time the answer to “what about the thing below this?” is itself the same shape as the question — you have a recursive problem. Recursive CTEs are the cleanest way to express them.

The three-part structure

Every recursive CTE has the same shape:

WITH my_cte AS (
    -- Part 1: the ANCHOR — where do we start?
    SELECT ...
    FROM base_table
    WHERE seed_condition

    UNION ALL            -- mandatory, not UNION

    -- Part 2: the RECURSIVE MEMBER — how do we step further?
    SELECT ...
    FROM base_table AS t
    JOIN my_cte AS r ON r.col = t.col
)
-- Part 3: the OUTER QUERY that reads the CTE
SELECT * FROM my_cte ORDER BY ...;

Three parts. In order. UNION ALL required; UNION would silently dedupe intermediate results and make the recursion wrong. No exceptions.

SQL Server executes it like this:

  1. Run the anchor. Put the result into a virtual “current” set.
  2. Run the recursive member, joining against “current.” The result is the new “current.”
  3. Repeat step 2 until the recursive member returns zero rows.
  4. The final result is the union of all “current”s produced.

Two rules that come for free:

  • The recursive member must reference the CTE name (otherwise it’s not recursive).
  • There must be a termination condition — otherwise you recurse forever. SQL Server caps at 100 iterations by default; override with OPTION (MAXRECURSION n), 0 means unlimited (dangerous).

Example 1: the org chart

WITH org AS (
    -- Anchor: the CEO (no manager)
    SELECT EmployeeId,
           FullName,
           ManagerId,
           Team,
           0               AS level,
           CAST(FullName AS NVARCHAR(4000)) AS chain
    FROM HR.Employee
    WHERE ManagerId IS NULL

    UNION ALL

    -- Recursive: employees whose manager is already in the CTE
    SELECT e.EmployeeId,
           e.FullName,
           e.ManagerId,
           e.Team,
           o.level + 1,
           CAST(o.chain + N' > ' + e.FullName AS NVARCHAR(4000))
    FROM HR.Employee AS e
    JOIN org         AS o ON o.EmployeeId = e.ManagerId
)
SELECT level, REPLICATE(N'  ', level) + FullName AS indented_name, Team, chain
FROM org
ORDER BY chain;

The output is an indented org chart:

CEO - Ilse Jansen
  VP Engineering - Tomasz Kowalski
    Eng Manager - Franz Hofmann
      Dev - Ada Lovelace
      Dev - Grace Hopper
    Eng Manager - Sofia Bianchi
  VP Marketing - Djenna Akkad
    ...

chain accumulates the path from the root. You can use that for filtering (“everyone under VP Engineering”) or for display.

Be careful with chain length. Cast to a big NVARCHAR(4000) or the string might truncate to the length of the anchor’s first value. Classic recursive CTE gotcha.

Example 2: product categories

Same pattern, different data:

-- Given Catalog.Category (CategoryId, ParentCategoryId, Name)
WITH tree AS (
    SELECT CategoryId, ParentCategoryId, Name, 0 AS depth,
           CAST(Name AS NVARCHAR(4000)) AS path
    FROM Catalog.Category
    WHERE ParentCategoryId IS NULL       -- root categories

    UNION ALL

    SELECT c.CategoryId, c.ParentCategoryId, c.Name, t.depth + 1,
           CAST(t.path + N' > ' + c.Name AS NVARCHAR(4000))
    FROM Catalog.Category AS c
    JOIN tree               AS t ON t.CategoryId = c.ParentCategoryId
)
SELECT depth, REPLICATE(N'    ', depth) + Name AS indented, path
FROM tree
ORDER BY path;

Gives you the full category tree, sorted so parents come before their children in the output. Marketing loves this for generating navigation menus; ops loves it for building category-based reports.

Example 3: bill of materials

Products composed of other products:

-- Given Catalog.Bom (BundleId, ComponentProductId, Quantity)
WITH bom AS (
    -- Anchor: the top-level bundle
    SELECT BundleId AS product_id,
           ComponentProductId,
           Quantity,
           1 AS depth
    FROM Catalog.Bom
    WHERE BundleId = 12345           -- the "Enchanted Starter Kit"

    UNION ALL

    -- Recursive: components of components
    SELECT b.BundleId, b.ComponentProductId, b.Quantity * p.Quantity, p.depth + 1
    FROM Catalog.Bom AS b
    JOIN bom         AS p ON p.ComponentProductId = b.BundleId
)
SELECT ComponentProductId, SUM(Quantity) AS total_qty
FROM bom
GROUP BY ComponentProductId;

Expands a bundle all the way down and multiplies quantities along the way. “Starter Kit contains 1 Spellbook, and a Spellbook contains 3 Magical Bookmarks, so the Kit effectively contains 3 Bookmarks at the leaf level.”

Example 4: generating date ranges

No table needed. Generate dates on the fly:

-- Every day in Q1 2026
WITH days AS (
    SELECT CAST('2026-01-01' AS DATE) AS d
    UNION ALL
    SELECT DATEADD(DAY, 1, d) FROM days WHERE d < '2026-03-31'
)
SELECT d, DATENAME(WEEKDAY, d) AS dow
FROM days
OPTION (MAXRECURSION 500);

This pattern is gold for gap-filling reports. Report wants “revenue per day for the last 30 days, showing zero for days with no orders”? LEFT JOIN the generated date series against your aggregated data:

WITH days AS (
    SELECT CAST(DATEADD(DAY, -29, GETDATE()) AS DATE) AS d
    UNION ALL
    SELECT DATEADD(DAY, 1, d) FROM days WHERE d < CAST(GETDATE() AS DATE)
),
daily_rev AS (
    SELECT CAST(OrderDate AS DATE) AS d, SUM(Total) AS rev
    FROM Sales.Orders
    WHERE OrderDate >= DATEADD(DAY, -30, GETDATE())
    GROUP BY CAST(OrderDate AS DATE)
)
SELECT d.d, COALESCE(r.rev, 0) AS revenue_eur
FROM days AS d
LEFT JOIN daily_rev AS r ON r.d = d.d
ORDER BY d.d;

Thirty rows, one per day, zero where no orders happened. Your dashboard doesn’t skip days. Marketing cannot claim “but we had a campaign on the 14th” and be met with “there’s no row for the 14th in the report.”

The MAXRECURSION gotcha

SQL Server defaults to 100 iterations. For a deep org chart, that’s usually enough. For date generation or graph traversal, it’s not.

OPTION (MAXRECURSION 5000)   -- allow up to 5000 iterations
OPTION (MAXRECURSION 0)      -- unlimited (use carefully)

0 is “no limit” — dangerous with an unintentionally unbounded recursion. Prefer explicit numbers; pick one comfortably above your real maximum and let it fail loudly if something is wrong.

The OPTION hint goes at the very end of the outermost SELECT, not inside the CTE.

Breadth-first vs depth-first traversal

SQL Server’s recursive CTE is fundamentally breadth-first: it produces all level-1 rows before any level-2 rows. If you need depth-first (all descendants before siblings), you have to simulate it with the ordering. The classic trick: build a sortable path as you go, then ORDER BY path:

WITH tree AS (
    SELECT CategoryId, ParentCategoryId, Name, 0 AS depth,
           CAST(RIGHT(REPLICATE('0', 10) + CAST(CategoryId AS VARCHAR(10)), 10) AS VARCHAR(4000)) AS sort_path
    FROM Catalog.Category
    WHERE ParentCategoryId IS NULL

    UNION ALL

    SELECT c.CategoryId, c.ParentCategoryId, c.Name, t.depth + 1,
           CAST(t.sort_path + '/' + RIGHT(REPLICATE('0', 10) + CAST(c.CategoryId AS VARCHAR(10)), 10) AS VARCHAR(4000))
    FROM Catalog.Category AS c
    JOIN tree             AS t ON t.CategoryId = c.ParentCategoryId
)
SELECT depth, REPLICATE('  ', depth) + Name AS indented
FROM tree
ORDER BY sort_path;

Zero-padded IDs in the sort path ensure alphabetic sorting matches numeric order. ORDER BY sort_path gives you a depth-first traversal.

Performance notes

  • Recursive CTEs are usually fast when the graph is small. Org charts of 200 employees? Trivial. Product-category trees with 500 nodes? Instant.
  • Recursive CTEs can be slow on big graphs. A recursive walk of a million-node graph will hurt. For truly large graph problems, SQL Server’s Graph tables (introduced in 2017) or a dedicated graph database are better.
  • The join between the recursive member and the CTE runs once per iteration. If the underlying table has no index on the join column, each iteration scans. For “employees joined on ManagerId” — make sure there’s an index on ManagerId.
  • For deep date generation, a number-table or dedicated date-dimension table is faster than recursion. But for ad-hoc reports, the recursive date generator is perfect — no schema changes needed.

Run this on your own machine

USE Runehold;
GO

-- 1. Date generator + gap-fill report
WITH days AS (
    SELECT CAST('2026-03-01' AS DATE) AS d
    UNION ALL
    SELECT DATEADD(DAY, 1, d) FROM days WHERE d < '2026-04-30'
),
daily AS (
    SELECT CAST(OrderDate AS DATE) AS d, SUM(Total) AS rev
    FROM Sales.Orders
    GROUP BY CAST(OrderDate AS DATE)
)
SELECT d.d,
       COALESCE(daily.rev, 0) AS revenue_eur,
       DATENAME(WEEKDAY, d.d) AS dow
FROM days AS d
LEFT JOIN daily ON daily.d = d.d
ORDER BY d.d
OPTION (MAXRECURSION 500);

-- 2. Build a small org chart for the demo
IF OBJECT_ID('HR.Employee', 'U') IS NULL
BEGIN
    IF SCHEMA_ID('HR') IS NULL EXEC('CREATE SCHEMA HR AUTHORIZATION dbo');
    CREATE TABLE HR.Employee (
        EmployeeId INT IDENTITY(1,1) PRIMARY KEY,
        FullName   NVARCHAR(200) NOT NULL,
        Team       NVARCHAR(50)  NOT NULL,
        ManagerId  INT NULL,
        FOREIGN KEY (ManagerId) REFERENCES HR.Employee(EmployeeId)
    );

    INSERT INTO HR.Employee (FullName, Team, ManagerId) VALUES
    ('Ilse Jansen',     'Executive',  NULL);       -- 1, CEO
    INSERT INTO HR.Employee (FullName, Team, ManagerId) VALUES
    ('Tomasz Kowalski', 'Executive',  1),            -- 2, VP Eng
    ('Djenna Akkad',    'Executive',  1),            -- 3, VP Mktg
    ('Franz Hofmann',   'Engineering',2),            -- 4
    ('Sofia Bianchi',   'Engineering',2),            -- 5
    ('Ada Lovelace',    'Engineering',4),            -- 6
    ('Grace Hopper',    'Engineering',4),            -- 7
    ('Piotr Nowak',     'Marketing',  3);            -- 8
END;

-- 3. The org chart
WITH org AS (
    SELECT EmployeeId, FullName, ManagerId, Team, 0 AS lvl,
           CAST(FullName AS NVARCHAR(4000)) AS chain
    FROM HR.Employee WHERE ManagerId IS NULL

    UNION ALL

    SELECT e.EmployeeId, e.FullName, e.ManagerId, e.Team, o.lvl + 1,
           CAST(o.chain + N' > ' + e.FullName AS NVARCHAR(4000))
    FROM HR.Employee AS e
    JOIN org         AS o ON o.EmployeeId = e.ManagerId
)
SELECT lvl, REPLICATE(N'    ', lvl) + FullName AS indented, Team, chain
FROM org
ORDER BY chain;

-- 4. Everyone under VP Engineering (filter by chain)
WITH org AS (
    SELECT EmployeeId, FullName, ManagerId, 0 AS lvl
    FROM HR.Employee WHERE EmployeeId = 2   -- VP Eng

    UNION ALL

    SELECT e.EmployeeId, e.FullName, e.ManagerId, o.lvl + 1
    FROM HR.Employee AS e
    JOIN org         AS o ON o.EmployeeId = e.ManagerId
)
SELECT * FROM org WHERE lvl > 0;
-- Excludes the VP themselves; starts from their direct reports.

Module 2 is done. That’s querying covered end-to-end: SELECT, WHERE, NULL, ORDER BY, JOINs, GROUP BY, subqueries, CTEs, windows, UNION, strings, dates, recursion. With this toolkit you can answer 95% of business questions that reach a data engineer’s desk at any EU e-commerce company.

Next module: DML and transactions. Lesson 17 starts with the complicated love story of INSERT, UPDATE, DELETE, and why MERGE is dangerous.

Search