SELECT: asking questions without crying

Today is the first lesson that’s actually about writing T-SQL. Everything before this was setup. From here on, every lesson starts with a question Runehold’s business actually asks — “which products sold the most in Germany last week?”, “how many orders did we ship on time in March?”, “what’s the churn rate on Runeboxes in Italy?” — and works out the SQL to answer it.

The statement we’ll spend most of our lives writing is SELECT. Looks simple. Has at least nine keywords that can go in it. Has a logical execution order that doesn’t match the order you type it in. Has one keyword (*) that ruins queries in ways that are invisible until they aren’t. We’ll cover all of that today.

The six parts of a SELECT

Every SELECT has up to six major clauses, in this order when you write them:

SELECT     <columns>
FROM       <table>
WHERE      <row filter>
GROUP BY   <grouping>
HAVING     <group filter>
ORDER BY   <sorting>

Plus a handful of optional modifiers (TOP, DISTINCT, OFFSET ... FETCH, joins, window functions) that attach to one of those clauses.

The simplest legal SELECT has just two clauses:

SELECT CustomerId, Email FROM Sales.Customer;

The simplest legal SELECT without a table has one:

SELECT GETDATE();
-- Returns the current time. No FROM needed.

Most production queries use all six.

The logical order is not the written order

Here is the single most important fact about SELECT that nobody ever tells you explicitly.

The order you WRITE the clauses in is not the order SQL Server EXECUTES them in.

SQL Server’s logical processing order goes roughly like this:

1.  FROM         (pick the table or tables)
2.  ON           (apply any JOIN condition)
3.  JOIN         (combine rows from joined tables)
4.  WHERE        (filter rows)
5.  GROUP BY     (collapse rows into groups)
6.  HAVING       (filter groups)
7.  SELECT       (project the columns)
8.  DISTINCT     (deduplicate)
9.  ORDER BY     (sort)
10. TOP / OFFSET (limit)

Everything starts with FROM. The SELECT list — the columns you’re projecting — is almost the last thing evaluated. This feels weird because we write SELECT first. But it explains a dozen beginner mistakes in one sentence:

Why you can’t use a column alias in the WHERE clause (the alias is created at step 7; WHERE runs at step 4)
Why you can use a column alias in ORDER BY (step 9 is after step 7)
Why GROUP BY requires the un-aliased name
Why a WHERE filter on an aggregate doesn’t work — you have to use HAVING

Make a mental note of this order. It’ll save you from ten hours of “why doesn’t this work” in the first year.

Projection: picking columns

In real queries, never SELECT *. I’ll explain why in a minute. Instead, always list the columns you want:

USE Runehold;

SELECT CustomerId,
       Name,
       Country,
       CreatedAt
FROM Sales.Customer;

Each row that matches the query produces one row in the output, with the requested columns in the requested order.

You can also compute new columns on the fly, called expressions:

SELECT CustomerId,
       Name,
       UPPER(Country) AS CountryCode,          -- case transform
       DATEDIFF(DAY, CreatedAt, GETDATE()) AS DaysSinceSignup,
       CreatedAt
FROM Sales.Customer;

AS <alias> gives the expression a column name. Without an alias, computed columns show up in results as (No column name) — ugly and unusable for application code. Always alias.

`SELECT *` is a professional smell

SELECT * means “every column in the table, in the order the table defines them.” Every SQL tutorial starts with it because it’s easy to type. Every SQL production guide bans it.

Reasons:

Schema changes break your code. Somebody adds a column; your INSERT INTO ... SELECT * now has the wrong column count. Your application’s strongly-typed result mapping breaks. Now you’ve got a 500 on Monday morning.
Performance. SELECT * forces SQL Server to read every column, which rules out covering non-clustered indexes. A query that could seek an index and return in 2ms now scans the whole table.
Readability. SELECT Name, Email, Country tells the reader what this query is about. SELECT * tells them nothing.
Network cost. Returning columns the app doesn’t use wastes bandwidth between the database and the app server.

Three exceptions where SELECT * is acceptable:

Ad-hoc exploration in SSMS (“what’s in this table?”). Throwaway.
EXISTS subqueries: WHERE EXISTS (SELECT * FROM ... WHERE ...) — the * is never materialized, it’s a convention.
COUNT(*) — not actually a SELECT *, it’s a special form meaning “count rows regardless of NULL.”

Always list columns. Your future self will thank you.

Aliases: the quality-of-life feature

Column aliases rename computed or verbose columns:

SELECT o.OrderId                 AS Id,
       o.Total                   AS AmountEUR,
       o.CountryCode              AS Country,
       o.Total * o.VatRate       AS VatAmount,
       o.Total - (o.Total * o.VatRate / (1 + o.VatRate)) AS NetAmount
FROM Sales.Orders AS o;

Two styles: o.OrderId AS Id (with AS) and o.OrderId Id (space instead of AS). Both work. Always use AS — it’s explicit and visually unambiguous. The space form has bitten me when I accidentally wrote SELECT OrderId Total FROM ... and SQL Server interpreted that as “give me OrderId aliased as Total,” not “give me OrderId and Total.” Confusing, silent, annoying.

Table aliases rename the table in a query, usually to something short:

SELECT o.OrderId, c.Name, c.Country
FROM Sales.Orders AS o
JOIN Sales.Customer AS c ON c.CustomerId = o.CustomerId;

o for Orders, c for Customer. The aliases let you write o.OrderId instead of Sales.Orders.OrderId. Always alias tables in multi-table queries. Always write o.OrderId instead of just OrderId when more than one table is in play, even if the column name is unambiguous today. Tomorrow someone adds an OrderId column to Customer and your query stops compiling.

`TOP`, `OFFSET ... FETCH`, and `DISTINCT`

TOP (n) — return the first n rows. Without an ORDER BY it’s a coin flip which n rows you get:

SELECT TOP (10) OrderId, Total FROM Sales.Orders ORDER BY OrderDate DESC;
-- The 10 most recent orders

OFFSET ... FETCH — the SQL-standard paging form. Requires ORDER BY:

SELECT OrderId, Total
FROM Sales.Orders
ORDER BY OrderDate DESC
OFFSET 20 ROWS
FETCH NEXT 10 ROWS ONLY;
-- Skip 20, take 10 — "page 3 of 10-per-page"

DISTINCT — deduplicate rows by all projected columns:

SELECT DISTINCT Country FROM Sales.Customer;
-- Each country appears once

DISTINCT is expensive (SQL Server has to sort or hash to dedupe). If you find yourself writing it often, usually the real fix is GROUP BY or a different join pattern.

The full-column-list habit

Let’s take a realistic Runehold query: “list the ten biggest orders from Italian customers this quarter, newest first, showing order ID, customer name, total in EUR, and the date.”

SELECT TOP (10)
       o.OrderId,
       c.Name                                    AS CustomerName,
       o.Total                                   AS TotalEUR,
       CAST(o.OrderDate AS DATE)                 AS OrderDay
FROM Sales.Orders AS o
JOIN Sales.Customer AS c
    ON c.CustomerId = o.CustomerId
WHERE o.CountryCode = 'IT'
  AND o.OrderDate  >= '2026-01-01'
  AND o.OrderDate  <  '2026-04-01'
ORDER BY o.Total DESC;

Every column is named. Every table is aliased. The filter on OrderDate uses a half-open interval (>= on start, < on end), which is the right way to filter on a datetime column — more on why in lesson 15, but it’s the only form that’s both correct and index-friendly. The ORDER BY is explicit; without it TOP (10) would be a random sample of 10 rows.

This is the shape of a production query. Get comfortable with it.

Comments: leave breadcrumbs

T-SQL has two comment styles:

-- Single-line comment, like this

/*
  Multi-line comment.
  Useful for long explanations or
  temporarily disabling a block of code.
*/

Write comments for the business context, not for what the SQL obviously does:

-- Bad comment: tells you what you can already read
-- Select customers
SELECT Name FROM Sales.Customer;

-- Good comment: tells you why
-- Italian customers, used by the marketing team's quarterly
-- "famiglie italiane" mailing campaign. Updated 2026-03-15.
SELECT Name FROM Sales.Customer WHERE Country = 'IT';

A query that will be read by a dozen people over five years needs comments. A throwaway exploration doesn’t. Use judgement.

Common beginner mistakes, with fixes

Mistake 1: “Invalid column name” on a valid-looking alias.

-- ERROR: 'Country' is invalid in WHERE
SELECT UPPER(CountryCode) AS Country FROM Sales.Orders WHERE Country = 'IT';

WHERE runs before SELECT in the logical order. The alias doesn’t exist yet at step 4. Fix:

SELECT UPPER(CountryCode) AS Country FROM Sales.Orders WHERE UPPER(CountryCode) = 'IT';
-- or better:
SELECT UPPER(CountryCode) AS Country FROM Sales.Orders WHERE CountryCode = 'IT';

The second form keeps the index on CountryCode usable (no function on the column).

Mistake 2: Sorting by a column not in the SELECT list.

Actually this works fine:

SELECT OrderId, Total FROM Sales.Orders ORDER BY OrderDate DESC;

ORDER BY runs at step 9 and has access to everything FROM produced, not just what SELECT chose. The output only shows the columns you projected, but you can sort by anything from the source tables. Useful and often forgotten.

Mistake 3: DISTINCT as a band-aid for duplicates.

If your query returns duplicate rows you didn’t expect, the usual cause is a JOIN that’s producing more rows than you thought. Adding DISTINCT hides the symptom without fixing the cause. Fix the join.

Run this on your own machine

We’ll finally put some data into Runehold and query it. Paste the whole block in one go — it builds on lesson 3’s database.

USE Runehold;
GO

-- Create the Orders table if it doesn't exist
IF OBJECT_ID(N'Sales.Orders', N'U') IS NULL
BEGIN
    CREATE TABLE Sales.Orders (
        OrderId      BIGINT IDENTITY(1,1) NOT NULL,
        CustomerId   INT                  NOT NULL,
        OrderDate    DATETIME2(0)         NOT NULL DEFAULT SYSUTCDATETIME(),
        Total        DECIMAL(19, 4)       NOT NULL,
        Currency     CHAR(3)              NOT NULL DEFAULT 'EUR',
        CountryCode  CHAR(2)              NOT NULL,
        VatRate      DECIMAL(5, 4)        NOT NULL,
        CONSTRAINT pk_Orders PRIMARY KEY CLUSTERED (OrderId),
        CONSTRAINT fk_Orders_Customer FOREIGN KEY (CustomerId)
            REFERENCES Sales.Customer (CustomerId)
    );
END;

-- Insert a few realistic orders (CustomerIds 1-4 from lesson 3)
INSERT INTO Sales.Orders (CustomerId, OrderDate, Total, CountryCode, VatRate)
VALUES (1, '2026-03-05 10:32:00',  59.00, 'NL', 0.2100),
       (1, '2026-03-18 14:22:00',  29.00, 'NL', 0.2100),
       (2, '2026-02-15 09:15:00', 149.00, 'IT', 0.2200),
       (2, '2026-03-22 11:40:00',  89.50, 'IT', 0.2200),
       (3, '2026-03-10 16:05:00', 199.00, 'DE', 0.1900),
       (4, '2026-03-28 08:12:00',  42.42, 'RO', 0.1900);

-- Query 1: simple SELECT with aliases
SELECT OrderId,
       CustomerId,
       Total             AS TotalEUR,
       CountryCode       AS Country
FROM Sales.Orders
ORDER BY Total DESC;

-- Query 2: compute a VAT breakdown on the fly
SELECT OrderId,
       Total                                                     AS GrossEUR,
       Total / (1 + VatRate)                                     AS NetEUR,
       Total - (Total / (1 + VatRate))                           AS VatEUR,
       CAST(VatRate * 100 AS DECIMAL(5,2))                       AS VatPercent
FROM Sales.Orders
ORDER BY OrderDate;

-- Query 3: the top-3 biggest orders in Italy (joins across tables)
SELECT TOP (3)
       o.OrderId,
       c.Name            AS CustomerName,
       o.Total           AS TotalEUR,
       CAST(o.OrderDate AS DATE) AS OrderDay
FROM Sales.Orders AS o
JOIN Sales.Customer AS c ON c.CustomerId = o.CustomerId
WHERE o.CountryCode = 'IT'
ORDER BY o.Total DESC;

-- Query 4: notice the logical order — alias works in ORDER BY, not in WHERE
SELECT o.OrderId,
       o.Total * 100 AS TotalInCents
FROM Sales.Orders AS o
WHERE o.Total > 50         -- filter uses original column
ORDER BY TotalInCents DESC; -- ORDER BY can use the alias

Read the results. Type a few queries of your own. Break them on purpose to see the errors.

Next lesson: WHERE, NULL, and why your filters occasionally lie to you. It’s shorter than this one, and it has the one T-SQL fact that trips up every person who’s ever learned SQL.