There’s a moment in every engineer’s career, usually somewhere around the third year, where the difference between “people who write code” and “people who own things” becomes visible. It’s not about title. It’s not even mostly about skill. It’s about whether, when something you built breaks at 11pm on a Sunday, your first instinct is “let me look” or “not my problem”.
I’ve worked with both kinds. The second kind ships more code per quarter on paper. The first kind ends up running the team five years later. There’s a reason for that, and it’s not seniority by attrition. It’s that the people who own their work end up understanding the system better than anyone else, because they live with the consequences.
This post is about that gap. The gap between “merged to main” and “live and useful”.
Merging is the easy part
The dirty secret of software engineering is that writing the code is roughly the first third of the job. The rest is what happens after the PR turns green.
Did the deploy actually go out? Did the dashboard tick up? Did anybody notice the warning log spam in the new endpoint? Did the dbt model land in production with the right column types, or did it silently fall back to varchar(255) because somebody’s macro had a bug? Is the Airflow DAG that depends on it still passing, or has it been red for three days and nobody told you?
I once shipped what I thought was a perfectly clean change to a Postgres ingestion job. The PR was small, the tests passed, the review was friendly. I merged on a Thursday and went on holiday on Friday. When I came back a week and a half later, my manager pulled me aside and said, gently, “the job has been failing every night you were away, and we’ve been running it manually”. The change had a subtle issue with how it handled an empty source file, which only happened on weekends. Nobody else on the team knew the job well enough to fix it. The system had been on life support for ten days.
The code was fine. The handoff was the problem. I had treated “merged” as “done”, and the system had taught me, in the polite way teams teach you these things, that those are not the same word.
Watching your thing in production
The cheapest, highest-leverage habit I’ve ever picked up is this: after I deploy something, I watch it. Not for an hour. For a few days. I don’t sit and stare at Grafana like it’s the World Cup. I just keep an eye on it. I check the dashboard the next morning. I scan the error rate before standup. I look at the volume on the new Kafka topic on day three. I peek at the dbt run logs to see if anything new is taking longer than usual.
Most of the time, nothing is wrong. The point isn’t to find bugs. The point is that I learn what “normal” looks like for the thing I just built. The shape of the traffic curve. Which time zones spike. Which downstream models depend on it. Which alert is loud and which is quiet. By the time the first real incident hits, weeks or months later, I already know what the system looks like when it’s healthy. That intuition is impossible to bootstrap during an outage at 2am.
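One way to make that sense of “normal” concrete is to compare today’s number against the median of the last few days. This is only a minimal sketch: the function name `looks_unusual`, the `tolerance` parameter, and the sample error counts are all made up for illustration, and a real check would read from whatever metrics store you already use.

```python
from statistics import median


def looks_unusual(history, today, tolerance=0.5):
    """Return True if today's value is far from the recent median.

    `tolerance` is a fraction of the baseline: 0.5 means "more than
    50% away from the median of `history`". Both are illustrative choices.
    """
    if not history:
        return False  # no baseline yet, so nothing to compare against
    baseline = median(history)
    if baseline == 0:
        return today != 0  # any activity is unusual if normal is zero
    return abs(today - baseline) / baseline > tolerance


# Hypothetical daily error counts from the days after a deploy.
error_counts = [12, 9, 14, 11, 10]

print(looks_unusual(error_counts, 13))  # False: close to the median of 11
print(looks_unusual(error_counts, 40))  # True: several times the median
```

The median is used rather than the mean so that one bad day in the history does not shift the baseline much. The code matters less than the habit of writing down what healthy looks like before you need it.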
A colleague of mine taught me a stronger version of this. Every Friday afternoon, before logging off, he’d open the dashboards for everything he owned and just look. Five minutes. No agenda. He’d often catch something subtle: a slow drift, a weekly seasonality that was new, a queue that was creeping up. Nine times out of ten, it was nothing. The tenth time saved the team a Monday morning fire. Over a year, that’s a lot of Monday mornings.
On-call is part of the job, not a punishment
A lot of engineers treat on-call as a tax. Something to do to qualify for the bonus. They show up, acknowledge the page, escalate to the most senior person they can find, and go back to bed.
On-call is genuinely tiring. But the page that wakes you up is often the most direct feedback loop you’ll ever get on the quality of the system you built. The thing is talking to you. It’s saying “I broke, and here’s how”. If you outsource that conversation, you outsource the part of the job that teaches you to be senior.
The engineers I’ve seen grow the fastest are the ones who treat their on-call shift as a learning week. They keep a small notes file. Every page that comes in, they jot down what alerted, what they did, what the actual cause was, and whether the runbook was useful. At the end of the week they spend an hour cleaning that up and turning it into runbook updates, alert tuning, or follow-up Jira tickets. They do not just close the page and move on.
Two effects compound from this. One, the system gets better, because somebody is actively maintaining the institutional memory of how it fails. Two, that engineer becomes the person other people ask when something breaks, because they’ve literally written the field guide. Promotion follows from there. It’s not magic, it’s bookkeeping.
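The notes file described above does not need to be fancy. As a rough sketch, assuming you keep one record per page with the four things mentioned (what alerted, what you did, the actual cause, whether the runbook helped), the end-of-week cleanup could start from something like this. The names `PageNote` and `weekly_review` and the sample alerts are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class PageNote:
    """One entry in the on-call notes file (field names are illustrative)."""
    alert: str            # what alerted
    action_taken: str     # what I did
    actual_cause: str     # what the cause turned out to be
    runbook_useful: bool  # whether the runbook helped


def weekly_review(notes):
    """Group a week's notes into candidate follow-ups."""
    pages_per_alert = Counter(note.alert for note in notes)
    return {
        # Alerts where the runbook did not help: candidates for a runbook update.
        "runbook_gaps": sorted({n.alert for n in notes if not n.runbook_useful}),
        # Alerts that fired more than once: candidates for tuning or a ticket.
        "repeat_alerts": sorted(
            alert for alert, count in pages_per_alert.items() if count > 1
        ),
    }


# Hypothetical week of pages.
notes = [
    PageNote("ingest_lag_high", "restarted consumer", "stuck partition", True),
    PageNote("ingest_lag_high", "restarted consumer", "stuck partition", True),
    PageNote("disk_usage_warn", "cleared temp files", "log rotation disabled", False),
]

review = weekly_review(notes)
print(review["runbook_gaps"])   # ['disk_usage_warn']
print(review["repeat_alerts"])  # ['ingest_lag_high']
```

A text file works just as well. The structure only helps you ask the same two questions every week: where did the runbook fail me, and what keeps paging?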
“I built it, I support it”
There’s a pattern I think about a lot, which I’ll call “I built it, I support it”. It says: the person who shipped a thing is, by default, the person on the hook for it until they explicitly hand it over. Not the platform team. Not the on-call rota. Not “ops”. Them.
This sounds heavy. In practice it’s lighter than the alternative. Here’s why.
When a system has a clear owner, things get fixed properly. The owner has the context. They wrote the code, they remember the trade-offs, they know which warning logs are real and which are noise. A bug that would take a stranger half a day takes the owner forty minutes, because they recognise the symptom on sight.
When a system has no owner, every problem gets fixed half-properly. Whoever is unlucky enough to get the page slaps a band-aid on it, ships a “TODO: figure out why this happens” comment, and goes back to their actual project. The next person who gets paged does the same thing six weeks later. Over a year, the system accumulates a sediment of half-fixes that nobody understands, and eventually it falls over in a way that takes a week to clean up.
I once worked on a team that had a beautiful streaming pipeline written by a brilliant engineer who left for another company two months after shipping it. For a year that pipeline fell over every week or two. We rotated through “whoever was free” to fix it. Each fix was reasonable in isolation. The pipeline as a whole became a horror. Then a new hire said, in a planning meeting, “I’ll take this on. I want to actually understand it.” Within three months it was rock solid. Same code, same Kafka, same Airflow. The thing it had been missing was a person whose name was on it.
When to hand off, and how
“I built it, I support it” doesn’t mean forever. It means until you’ve handed it off cleanly, which is a verb, not a wish.
A clean handoff has, in my experience, three things. A runbook that someone who has never seen the system can use to triage the most common alerts. A diagram, however ugly, of the data flow and the dependencies. And a real conversation with the new owner, ideally a paired on-call shift, where you walk them through a real incident or a recent near-miss. Anything less than this and you have not handed it off. You have abandoned it and asked someone else to take the blame.
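If it helps to make those three things explicit, they fit in a tiny checklist. This is an illustrative sketch, not a process recommendation: the `Handoff` class and its field names are invented here, and the only rule it encodes is that a handoff is not clean while any of the three items is missing.

```python
from dataclasses import dataclass


@dataclass
class Handoff:
    """Checklist for handing a system to a new owner (names are illustrative)."""
    system: str
    has_runbook: bool = False       # usable by someone who has never seen the system
    has_diagram: bool = False       # data flow and dependencies, however ugly
    walkthrough_done: bool = False  # real conversation, ideally a paired on-call shift

    def missing(self):
        """Return the items that still block a clean handoff."""
        items = []
        if not self.has_runbook:
            items.append("runbook for the most common alerts")
        if not self.has_diagram:
            items.append("diagram of data flow and dependencies")
        if not self.walkthrough_done:
            items.append("walkthrough with the new owner")
        return items

    def is_clean(self):
        return not self.missing()


handoff = Handoff("search indexer", has_runbook=True)
print(handoff.is_clean())  # False
for item in handoff.missing():
    print("still needed:", item)
```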
If your manager is asking you to move on to a new project but the old system has no clean owner, that’s a conversation, not a fait accompli. “Happy to start the new thing on Monday. Who’s picking up the search indexer? Can we do a handoff session this week?” That is a senior question. Asking it is part of the job.
The anti-pattern: the over-the-wall ship
The opposite of ownership is the over-the-wall ship. Build the thing, merge the PR, drop a “shipped!” message in Slack with a screenshot of green tests, and never look at it again. Bonus points for tagging the data team or the ops team and writing “let me know if you need anything”.
I have done this. Most engineers have. It always looks like productivity in the moment and feels like betrayal three months later, when the person who actually has to live with your system finds the bug you didn’t see. They will remember. The team will remember. Your reputation will quietly settle into “fine but ships and forgets”.
The fix is not heroics. The fix is small. A two-day post-deploy watch. A check-in with the team that depends on your output one week after launch. An updated runbook the first time you find something the runbook didn’t cover. A short message in the channel saying “hey, I noticed dashboard X looked weird, I dug in, here’s what I found”. None of that is glamorous. All of it compounds.
Takeaway
Ownership is not a personality trait. It’s a small set of habits: watching your thing in production for a few days after you ship it, treating on-call as feedback rather than tax, keeping the runbook honest, and refusing to call something “done” until somebody other than you can actually run it without you in the room. The engineers who do this don’t work harder than everyone else. They just work the full length of the problem, instead of stopping when the green check appears. Over a few years, that gap becomes a career.