The medallion architecture: bronze, silver, gold

We are now in the middle of Module 5, the batch module. The previous lessons built up the components: the lake, the warehouse, the orchestrator, the file formats. This lesson zooms back out and asks a simpler question. Once you have all that machinery, how do you organise the data flowing through it? What do the layers look like, and what does each one promise?

The most common answer in 2026 is some flavour of the medallion architecture: data flows through three layers, each more curated than the last, and each named after a metal. Bronze, silver, gold. The naming is Databricks-coined and the marketing makes it sound newer than it is, but the pattern itself predates the name by decades. Older shops called it “raw, staging, curated”, or “L0, L1, L2”, or “raw, clean, serving”. The discipline is the same: explicit layers with explicit responsibilities, and a strong norm against letting one layer’s concerns leak into another.

This lesson is the careful walk through what each layer actually contains, why the boundaries are where they are, and when this whole model is overkill.

The pattern

Three layers, in flow order:

Bronze: every event, every change, every record from every source, stored exactly as it arrived. No transformation. No deduplication. No quality filtering. Append-only and immutable.
Silver: the cleaned, conformed, joinable layer. Records are deduplicated, types are correct, schemas are applied, bad records are filtered out. Still source-shaped (one silver table per source entity, roughly), but trustable.
Gold: the analytics-ready layer. Denormalised, aggregated, business-shaped tables and feature stores. The marts that BI dashboards, ML pipelines, and downstream consumers read from.

Data flows in one direction, bronze toward gold. Each layer is built from the layer below by a transformation pipeline. If you find a bug in the silver-to-gold logic, you re-run from silver. If you find a bug in the bronze-to-silver logic, you re-run from bronze. Bronze itself is never re-derived: it is the system of record for what arrived, and replaying upstream sources is usually expensive or impossible.

flowchart LR
    S[Sources: apps, APIs, databases, events]
    B[(Bronze: raw, immutable)]
    SI[(Silver: cleaned, conformed)]
    G[(Gold: business-aggregate, denormalised)]
    C[Consumers: BI, ML, exports]
    S --> B --> SI --> G --> C

That diagram is the entire architecture. The interesting parts are the rules that govern what belongs in each box.

Bronze: the unmodified record

The bronze layer’s whole job is to preserve, faithfully and forever, the data that came in. The contract is: “if you point at bronze, you see exactly what the source sent us, with timestamps for when we received each record”.

In practice this means:

Append-only: bronze tables grow. Existing rows are not updated. If a record changes upstream, the new version arrives as a new row, not as an update to the old one.
Schema-on-read where possible: the source’s payload is stored as JSON or as the source’s native types, and richer parsing is left to silver. The reason is that source schemas drift, and bronze should not break when a new field appears.
Cheap storage: bronze is the largest of the three layers and the least frequently queried. It lives on object storage (S3, GCS, ADLS) with the cheapest tier appropriate for occasional access. Compression is high.
Partitioned by ingestion time: typically ingest_date=2026-01-16, sometimes also by source. Time-partitioning lets you reason about what arrived when, and lets you drop or archive old partitions cleanly.

The reason for the strict “no transformation” rule is replayability. If the silver-to-gold logic has a bug that has been corrupting your dashboards for three months, you fix the logic and re-run silver and gold from bronze, and you get the corrected three months back. If bronze has been quietly transforming data, you cannot do that: the bug’s input has been lost. Bronze is the only layer that stores history that cannot be reconstructed, and so it has to store everything, even the records you suspect are wrong.

The other reason is auditability. When a regulator, a customer-support investigation, or a security review asks “what did the source actually send us on the morning of the 3rd”, bronze is the answer. Silver and gold are not, because they have already filtered and reshaped the data.

Silver: the trustable layer

Silver is the layer that says “you can join these tables and not worry”. Concretely, that means several things have happened:

Deduplication. The same source record may have arrived twice in bronze (because at-least-once delivery, lesson 16). Silver collapses duplicates by key.
Type casting. The string "2026-01-16T12:00:00Z" becomes a timestamp. The string "42" becomes an integer. The JSON blob’s customer_id becomes a typed column.
Schema enforcement. Silver tables have a declared schema. Records that do not match are sent to a quarantine table for inspection, not silently dropped.
Quality filtering. Records with null primary keys, impossible timestamps, or values outside agreed-on ranges are removed and logged.
Conforming. Country codes are normalised to ISO-3166. Currency codes are normalised to ISO-4217. Customer IDs from different sources are mapped to the canonical customer ID where possible. The same business entity referred to two ways becomes the same value in silver.

Silver is still mostly source-shaped. If your sources are an orders service and a customers service, you still have a silver.orders and a silver.customers, not yet a denormalised gold.customer_360. The shape changes from raw payloads to clean tables, but the entity boundaries stay roughly where the sources put them.

Silver is the layer that downstream data scientists and ad-hoc analysts should be allowed to query. They get clean, joinable tables with documented columns and predictable types. They do not get the surprises of bronze (raw JSON, duplicate rows, schema drift) and they do not yet have the strong opinions of gold (a particular aggregation, a particular grain, a particular set of business rules baked in).

Gold: the business shape

Gold is where the business logic lives. A gold table answers a specific question that someone in the business actually asks. “How many orders did we ship this week, by region, by product category?” “What is each customer’s lifetime value, refreshed nightly?” “What is the feature vector for this user, ready for the recommendation model?”

Gold tables are:

Denormalised. The relational instinct (one entity per table, joined on demand) is wrong here. A gold table is one wide table per question, joined and aggregated ahead of time.
Aggregated to the right grain. A daily-orders-by-region table is at one row per (date, region), not at one row per order. A user-features table is at one row per user, with whatever rolling-window aggregates the model needs.
Optimised for the query pattern. Partitioned, clustered, sorted for the dashboards or models that read them. If a dashboard is going to scan by event_date and group by region, the gold table is partitioned by event_date and clustered by region.
Versioned and contract-stable. Downstream consumers (BI tools, ML pipelines, exports) take dependencies on gold table shapes, and changing those shapes is a breaking change. Gold tables get the same API-stability discipline that public APIs do.

There is usually no single gold schema. Instead there is gold_finance, gold_marketing, gold_ml_features, each owned by the team that consumes it, each with its own tables tuned to that team’s questions. Gold is plural.

Why three layers, not two or four

You could have just bronze and gold, and skip the middle. People do, in small shops. The problem is that gold then has to do everything: cleaning, conforming, deduplicating, and aggregating, all in one transformation. The pipeline becomes a tangle. Two teams’ gold tables redo the same cleaning work in slightly different ways and quietly disagree. A bug in the cleaning step shows up in some gold tables and not others. The cleaning logic and the aggregation logic are mixed together in the same SQL, and untangling them later is painful.

Silver solves this by being the shared cleaning surface. Cleaning happens once, in silver, and every gold table builds on that. The boundary is the answer to “where do I trust the data”. The contract on silver is “joinable and clean”, and gold can rely on that without re-doing it.

You could also have four or five layers: a raw landing zone before bronze, a “platinum” layer above gold, a separate “feature store” parallel to gold. People do this and it can work. The cost is that more layers mean more places for the same logic to live in two of them, and more disagreements about which layer owns which transformation. Three is the sweet spot for most teams. Add layers when the pain of not having them is real, not because the symmetry of the diagram is appealing.

The discipline: don’t let layers leak

The medallion pattern is easy to draw and hard to enforce. The two failure modes I see most often:

Business logic creeping into bronze. Someone writes an ingestion job that “cleans up” the source’s weird timestamp format on the way in, “because it’s easier here”. Now bronze is no longer a faithful record of what arrived. Six months later, when the timestamp parser turns out to have a bug, the original strings are gone. Resist this. Bronze is the museum exhibit: look but do not touch.

Raw records leaking into gold. Someone needs a quick dashboard, queries directly off bronze, and ships it. Now there is a gold-shaped consumer reading from a layer that was never meant to be consumer-facing, with no quality guarantees, with schema drift that will silently break the dashboard the next time the source changes. Build a silver representation of what the dashboard needs and have it read from there. The shortcut compounds.

The discipline is usually the lakehouse table format (Delta, Iceberg, Hudi) plus access control: only the silver pipeline can write to silver tables, only the gold pipeline can write to gold tables, and bronze is read-only after ingestion. The format and the permissions enforce what the diagram only suggests.

When not to medallion

Medallion is appropriate when you have a real lake or lakehouse, multiple sources, multiple consumers, and a non-trivial amount of data. If you do not, the layers are overhead that buys you nothing.

Cases where one layer is enough:

Very small datasets. If your whole “warehouse” is a few hundred megabytes of data refreshed nightly into Postgres, just have a staging schema and a prod schema. The bronze layer adds storage cost and pipeline complexity for no benefit.
Pure OLTP. If you do not have a lake at all, and your analytics consist of a handful of materialised views over the transactional database, you do not need medallion. You need good views.
Streaming-first systems. Real-time pipelines, where data has to flow from source to consumer in seconds, do not have time for three batch hops. The Kappa architecture (Module 6, lesson 47) is a different shape entirely: a streaming source of truth and materialisations off it. Medallion is the batch shape; Kappa is the streaming shape; the lambda architecture is the unhappy compromise.

The trigger for adopting medallion is usually the second consumer. As soon as two teams want to query the same data and one of them is doing cleanup that the other should also be doing, you have the silver-shaped problem. As soon as three teams want different aggregations of the same clean data, you have the gold-shaped problem. Build the layer that solves the pain you actually have, not the one the diagram says you should have.

The next lesson is about the table formats that make this whole architecture possible at lakehouse scale: Delta Lake, Apache Iceberg, and Apache Hudi.