SQL Server, from the ground up Lesson 3 / 40

Databases, schemas, tables: the nesting dolls

The three-level namespace that SQL Server uses to organize everything, what each level is for, and why dbo is not actually your friend.

Let’s say you just got a fresh SQL Server from lesson 2, and you’re staring at the Object Explorer tree in SSMS. You see “Databases,” and under it “System Databases,” and the system databases are called master, model, msdb, and tempdb. You haven’t even created anything yet and there are already four databases.

If you’re new to SQL Server, this part of the world is deeply confusing. It gets worse: inside a database, there’s a thing called a schema, which is not the same as what “schema” means in Postgres, or in ORM tools, or in the general English language. And inside schemas there are tables, views, stored procedures, and functions.

Today’s lesson is about untangling the nesting. By the end you’ll know exactly what each level means, what it’s for, and why “just put it in dbo” is the single most common mistake people make when they start out.

The nesting, from outside in

SQL Server has a three-level namespace for most objects:

Instance → Database → Schema → Object

Every table, view, procedure, and function has a full name with three dots:

DatabaseName.SchemaName.ObjectName

For example: AdventureWorks2022.Sales.SalesOrderHeader. The instance name is implied (it’s wherever you’re currently connected). Some objects like logins, server roles, and linked servers live at the instance level, not inside a database — but that’s the exception. For the day-to-day stuff you’ll write queries against, the three-part name is the law.

Let’s go level by level.

The instance: what you connect to

An instance is one running SQL Server process. One server machine can have multiple instances (the default one, plus named ones like SERVERNAME\SQL2019 and SERVERNAME\SQL2022), each with its own port, its own configuration, its own databases, and its own memory. This is less common than it used to be — in the cloud era, “one instance per machine” is the norm — but you will meet old on-prem boxes running three instances on one server, each listening on a different port, because someone in 2010 thought it was a good idea.

For this course, you have one instance, either localhost (if you installed locally) or localhost,1433 (if you’re using Docker). That’s it. Everything else nests inside it.

The instance is where logins live. A login is how you authenticate to the server. Logins map to users inside databases, but we’ll get to that in lesson 33 on security.

The databases

Inside an instance, you have databases. Each database has its own storage (its own .mdf and .ldf files), its own backup and restore lifecycle, and its own users. When you CREATE DATABASE MyApp;, SQL Server spins up two files — a data file and a log file — and registers the new database in master.

Every instance comes with four system databases you should learn to recognize:

  • master — the instance’s catalog. It tracks which databases exist, which logins exist, which linked servers are configured, and general server-level settings. If master dies, the whole instance dies. Back it up. Treat it with respect.
  • model — the template. Every time you run CREATE DATABASE, SQL Server copies model and uses that as the starting point. If you want every new database to start with a default set of tables, users, or extended properties, you put them in model. 95% of the time you leave model alone.
  • msdb — where SQL Server Agent lives. Scheduled jobs, job history, backup history, database mail, maintenance plans. If you’ve ever asked “when did my backup last run?” the answer is in msdb.
  • tempdb — a scratch space shared by every session on the instance. Temporary tables, table variables, sort spills, version store data for snapshot isolation, and many other things land here. tempdb resets every time SQL Server restarts. It’s also the single most common source of performance surprises on a busy server, and we’ll spend a whole lesson on it later (lesson 34).

Then you have user databases — the ones you create for your applications. Each has its own name, its own settings, its own users. A SQL Server instance can host thousands of databases; in practice, shops keep it to dozens.

Switching between databases inside SSMS is done with the dropdown at the top of the query window, or by running USE DatabaseName; at the top of a script. USE is a T-SQL statement that changes which database your session is pointed at.

USE master;
SELECT name, database_id, create_date, state_desc
FROM sys.databases
ORDER BY database_id;

Run that and you’ll see your system databases plus any user databases you’ve created. The database_id for master is always 1, tempdb is 2, model is 3, msdb is 4. Your user databases start at 5.

The schemas, and why they matter

Inside a database, objects are organized into schemas. A schema is a logical namespace. It exists so you can say Sales.Customer and HR.Customer without those being the same object.

Schemas are a feature most people ignore, which is a shame. In AdventureWorks you’ll see Person.*, Sales.*, HumanResources.*, Production.*, Purchasing.*. Each schema groups related tables: Person.Address, Person.EmailAddress, Person.PersonPhone. That grouping is the whole point.

The default schema in SQL Server is called dbo (short for “database owner”). When you run CREATE TABLE Foo; without a schema prefix, SQL Server creates dbo.Foo. This is fine for a tiny demo database. It is a disaster for any real application.

Why dbo is not actually your friend:

  • Every object you don’t explicitly place in a schema ends up in dbo. Six months in, you have 340 tables in dbo with no discoverable structure.
  • Schema-based permissions let you say “all users in the Reporting role can SELECT on every table in the Reporting schema.” Granting per-table is a nightmare.
  • Different application modules (Sales, Billing, HR, Ops) bump into each other when they all share dbo. Schema segregation lets modules evolve independently.
  • When you have to move a table later, cross-schema moves are cheap and well-supported. Renaming half your dbo tables is not.

The rule I give juniors on my team: never create anything in dbo. Even a throwaway proof-of-concept gets its own schema named after what it does. Staging.CustomerImport, not dbo.Stage_Customer_Import_Temp.

Creating a schema is one line:

CREATE SCHEMA Sales AUTHORIZATION dbo;

“AUTHORIZATION dbo” means the schema is owned by the dbo user. You can change ownership later. Creating an object in a specific schema is just prefixing the name:

CREATE TABLE Sales.Customer (
    CustomerId INT NOT NULL PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL
);

And querying uses the two-part name:

SELECT CustomerId, Name FROM Sales.Customer;

You can get away with SELECT * FROM Customer because SQL Server looks in your default schema first (usually dbo), but writing fully-qualified names is a habit worth building. It removes ambiguity, it’s easier to read six months later, and it makes your grep game stronger when hunting for references.

Tables: the thing you came for

Inside a schema lives the object everyone actually cares about: tables. A table is a rectangle of data: fixed columns with names and data types, and zero or more rows. Each row is an instance of the entity the table represents. A Sales.Customer table has rows where each row is a single customer.

Tables are created with CREATE TABLE:

CREATE TABLE Sales.Customer (
    CustomerId      INT           NOT NULL IDENTITY(1,1),
    Name            NVARCHAR(100) NOT NULL,
    Email           NVARCHAR(200) NOT NULL,
    CreatedAt       DATETIME2(0)  NOT NULL DEFAULT SYSUTCDATETIME(),
    IsActive        BIT           NOT NULL DEFAULT 1,
    CONSTRAINT pk_Customer PRIMARY KEY CLUSTERED (CustomerId)
);

A few things to notice, because they’re fundamental:

  • Every column has a data type. INT, NVARCHAR(100), DATETIME2(0), BIT. We’ll spend a whole lesson on data types next time, because getting them wrong costs real money. For now, know that SQL Server is strongly typed: '42' and 42 are not the same thing, and mixing them causes implicit conversions you don’t want.
  • Every column can be NOT NULL or implicitly nullable. Always be explicit. Writing NOT NULL on columns that shouldn’t accept NULL saves you from entire categories of bugs.
  • IDENTITY(1,1) makes SQL Server auto-generate sequential integer values. Seed 1, increment 1. Equivalent to AUTO_INCREMENT in MySQL or SERIAL in Postgres.
  • DEFAULT gives a column a value if the INSERT doesn’t specify one. SYSUTCDATETIME() is the built-in function for “now, in UTC.” Use UTC, always. We’ll cover date-handling pain in lesson 15.
  • The CONSTRAINT pk_Customer PRIMARY KEY CLUSTERED (CustomerId) line creates both a primary key and a clustered index on CustomerId. In SQL Server, a primary key gets a clustered index by default unless you say otherwise. See lesson 21 for the full story on clustered indexes — they’re arguably the most important idea in the whole course.

Other things that live in a schema

Schemas hold more than tables. They also hold:

  • Views — saved queries that look like tables.
  • Stored procedures — saved T-SQL that you can call with EXEC.
  • Functions — scalar or table-valued; called like a function inside a query.
  • Triggers — T-SQL that runs automatically on INSERT, UPDATE, DELETE, or DDL.
  • User-defined types — custom data types built on top of the built-in ones.
  • Synonyms — local aliases for objects in other databases.

All of these are namespaced by schema. Sales.GetActiveCustomers is a procedure. Reporting.CustomerSummary is a view. Staging.LoadOrders is a procedure in a staging schema you use for ETL. Schema boundaries are a gift. Accept the gift.

Listing what’s in a database

Two queries to memorize. First, “what tables exist in the current database and which schema are they in?”

SELECT
    SCHEMA_NAME(t.schema_id) AS schema_name,
    t.name                   AS table_name,
    p.rows                   AS approx_rows
FROM sys.tables AS t
JOIN sys.partitions AS p
    ON p.object_id = t.object_id
    AND p.index_id IN (0, 1)
ORDER BY schema_name, table_name;

Second, “what schemas exist in this database?”

SELECT name AS schema_name,
       SCHEMA_OWNER = USER_NAME(principal_id) AS owner
FROM sys.schemas
ORDER BY name;

These two will become reflexive in a month. Every time you join a new team or open an unfamiliar database, run them first to get the lay of the land.

Run this on your own machine

A small exercise to make the three levels tangible. We’ll build a tiny version of the Runehold schema we’ll extend throughout the course. Paste this into a query window and run the whole thing:

-- 1. Create the Runehold database
IF DB_ID('Runehold') IS NULL
    CREATE DATABASE Runehold;
GO

USE Runehold;
GO

-- 2. Create two schemas that belong to two different teams
CREATE SCHEMA Sales AUTHORIZATION dbo;
GO
CREATE SCHEMA HR AUTHORIZATION dbo;
GO

-- 3. Each team has a "Customer" concept with a different meaning.
--    Sales.Customer = people who buy enchanted goods from us.
--    HR.Customer    = the internal name HR uses for "employees" (weird but
--                     this happens at real companies all the time).
CREATE TABLE Sales.Customer (
    CustomerId INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL,
    Country    CHAR(2)       NOT NULL     -- ISO 3166-1 alpha-2: NL, DE, IT...
);

CREATE TABLE HR.Customer (
    EmployeeId INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
    FullName   NVARCHAR(200) NOT NULL,
    Team       NVARCHAR(50)  NOT NULL
);

-- 4. Insert realistic-looking data
INSERT INTO Sales.Customer (Name, Country)
VALUES ('Anne de Vries',    'NL'),
       ('Marco Rossi',       'IT'),
       ('Franz Hofmann',     'DE'),
       ('Ioana Popescu',     'RO');

INSERT INTO HR.Customer (FullName, Team)
VALUES ('Ada Lovelace',  'Engineering'),
       ('Grace Hopper',  'Engineering'),
       ('Djenna Akkad',  'Marketing'),
       ('Piotr Nowak',   'Warehouse');

-- 5. Query both, fully qualified
SELECT 'Sales' AS src, CustomerId AS id, Name    AS name, Country AS extra
FROM Sales.Customer
UNION ALL
SELECT 'HR',           EmployeeId,       FullName,        Team
FROM HR.Customer
ORDER BY src, id;

-- 6. List what you just built
SELECT SCHEMA_NAME(schema_id) AS schema_name, name AS table_name
FROM sys.tables
ORDER BY schema_name, table_name;

Two tables called Customer, in two schemas, inside one database, on one instance. Fully named: LessonsPlayground.Sales.Customer and LessonsPlayground.HR.Customer. They can coexist forever because the schemas keep them apart.

Leave the Runehold database in place — we’ll add to it in the next lesson and every lesson after. If you ever want to start fresh:

USE master;
DROP DATABASE Runehold;

Next lesson: data types. We’ll cover the difference between VARCHAR(50) and VARCHAR(MAX), why DATETIME2 beats DATETIME, and the one time I saw a table designed entirely with NVARCHAR(MAX) columns. It was a bad day for everyone.

Search