Engineering Process Audit: How to Assess Your Team's Health

May 25, 2026 / Mikael Danielian

Every six to twelve months, every engineering team should sit still and ask honestly: are we working well? Most teams never do this. They keep shipping, keep firefighting, keep adding people, and one day they look up and realise that things have gotten quietly worse.

An engineering process audit is the structured way to ask that question. Done well, it is one of the highest-leverage things a leader can do — three or four weeks of work that can change the shape of the next year. Done badly, it is a slide deck nobody reads.

This article describes the audit I run myself and have helped teams run on their own. It is opinionated. It is practical. It works.

What an Engineering Process Audit Actually Is

An engineering process audit is a structured review of how your engineering team works — not what they're building, not the codebase, but the practices, rituals, and decisions that shape day-to-day work.

It looks at:

  • How work gets defined, planned, and delivered
  • How code gets written, reviewed, and shipped
  • How the team handles incidents and on-call
  • How decisions get made and communicated
  • How people are hired, onboarded, supported, and developed
  • How the team measures and improves itself

It does not look at architecture in detail. It does not review the codebase line by line. It does not evaluate individual people. Those are different exercises.

A good audit takes three to four weeks of focused effort and produces three things: a clear-eyed assessment of where the team is, a small number of high-leverage things to change, and a plan for measuring whether the changes work.

When to Run One

A few situations almost always call for an audit.

You are between engineering leaders. A new VP of Engineering or CTO is starting. The audit gives them a structured way to understand the team and a credible artifact to anchor their first 90 days against.

You have grown fast. Your team has gone from 10 to 30 in the last year. Practices that worked at 10 are probably broken at 30, but no one has had time to look up and notice.

Something is wrong but you can't name it. Velocity feels slow. Morale feels off. The roadmap keeps slipping. You know something is wrong but you don't have language for it. An audit forces that language out.

You are preparing for due diligence. Pre-acquisition, pre-investment, pre-partnership. Outside parties are about to evaluate your team. Better to evaluate yourself first and address the obvious gaps.

It has been more than a year since the last serious look. Even healthy teams drift. An annual audit catches the drift before it becomes a problem.

The Six Areas to Look At

A good audit covers six areas. I will go through each with what to look at and what good and bad look like.

1. Planning and Delivery

This is how work gets from "idea" to "in production."

What to look at:

  • How are roadmaps built? Top-down, bottom-up, or some mix?
  • How is a quarter planned? A sprint? A week?
  • How often does the team actually deliver what they said they would?
  • How are priorities changed mid-cycle, and how often?
  • How does the team know if something they shipped worked?

What good looks like: The team can describe how the next 90 days are planned. Most engineers can explain why they are working on what they are working on. When priorities change, the change is communicated clearly with a reason. The team has a way to know if shipped work is actually working — usage data, customer feedback, metrics.

What bad looks like: Engineers can't explain why they are working on their current task. The roadmap exists somewhere but isn't referenced. "Urgent" requests interrupt sprints constantly. Work ships and disappears — nobody knows whether it worked.
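One way to put a number on "how often does the team deliver what they said they would" is to compare what was committed at the start of a cycle with what actually shipped. The sketch below is a minimal illustration, not a standard formula: the function names and the idea of identifying work items by string IDs are assumptions, and the sample IDs are made up.

```python
def delivery_rate(committed: set[str], delivered: set[str]) -> float:
    """Share of committed items that were actually delivered in the cycle."""
    if not committed:
        raise ValueError("nothing was committed for this cycle")
    return len(committed & delivered) / len(committed)


def unplanned_share(committed: set[str], delivered: set[str]) -> float:
    """Share of delivered items that were never in the original plan."""
    if not delivered:
        return 0.0
    return len(delivered - committed) / len(delivered)


# Hypothetical sprint: four items planned, three shipped, one of them unplanned.
committed = {"A", "B", "C", "D"}
delivered = {"A", "B", "E"}

rate = delivery_rate(committed, delivered)
unplanned = unplanned_share(committed, delivered)
print(f"delivered {rate:.0%} of the plan; {unplanned:.0%} of shipped work was unplanned")
```

A low delivery rate together with a high unplanned share usually points at mid-cycle interruptions rather than slow engineers, which is a question to take back to the interviews.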

2. Code Quality and Shipping

This is the basic mechanics of how code gets into production.

What to look at:

  • How long does a typical change take to go from "code written" to "in production"?
  • What is the deploy frequency? Daily, weekly, monthly?
  • How often does a deploy fail? What happens when one does?
  • What does code review look like? How long does it usually take? How thorough is it?
  • Are there automated tests? What is the coverage roughly? Are they trusted?

What good looks like: Deploys happen daily or several times a day. Lead time from PR open to merge is under a day for most changes. Code review is real but not slow: most PRs are reviewed within a few hours. Tests are trusted enough that a green build means it's probably safe to ship.

What bad looks like: Deploys take days. Releases are events that require manual coordination. PRs sit for days waiting for review. Tests are flaky enough that people retry the build instead of investigating failures. Production incidents are common and surprising.

The DORA metrics give a useful structured frame for this area — deploy frequency, lead time for changes, change failure rate, mean time to restore. I went deep on those in DORA Metrics: The Engineering Leader's Guide.
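If the deploy log can be exported, the four numbers can be approximated with very little code. The sketch below assumes a hypothetical `Deploy` record with a commit time, a production time, a failure flag, and a restore time; none of these field names come from a real tool, and lead time is measured here from commit to production, which may differ from how your tooling defines it.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deploy:
    # Hypothetical record shape; adapt to whatever your deploy log holds.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did this deploy cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_summary(deploys: list[Deploy], period_days: int) -> dict[str, float]:
    """Approximate the four DORA metrics for one observation period."""
    if not deploys:
        raise ValueError("need at least one deploy")
    failures = [d for d in deploys if d.failed]
    restores = [hours(d.restored_at - d.deployed_at)
                for d in failures if d.restored_at is not None]
    return {
        "deploys_per_week": len(deploys) / (period_days / 7),
        "median_lead_time_hours": median(hours(d.deployed_at - d.committed_at)
                                         for d in deploys),
        "change_failure_rate": len(failures) / len(deploys),
        # NaN when there is nothing to average, rather than a misleading zero.
        "mean_time_to_restore_hours": (sum(restores) / len(restores)
                                       if restores else math.nan),
    }


# Made-up sample: four deploys over two weeks, one of which failed.
t0 = datetime(2026, 1, 5, 9, 0)
sample = [
    Deploy(t0, t0 + timedelta(hours=2)),
    Deploy(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=4)),
    Deploy(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=6),
           failed=True, restored_at=t0 + timedelta(days=2, hours=7)),
    Deploy(t0 + timedelta(days=3), t0 + timedelta(days=3, hours=8)),
]
summary = dora_summary(sample, period_days=14)
print(summary)
```

The median is used for lead time because a handful of long-running changes would otherwise dominate the average. For an audit, the trend across a few periods matters more than any single value.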

3. Incidents and Operations

This is how the team handles things going wrong.

What to look at:

  • Is there an on-call rotation? Who is on it? How is it organised?
  • How often do people get paged outside business hours?
  • What happens during an incident? Who runs it? Who decides when it's over?
  • Are there post-mortems? Are they blameless? Do action items actually get done?
  • What is mean time to recovery for typical incidents?

What good looks like: Clear on-call rotation with named owners. Pages are rare and meaningful — when one comes in, it's a real problem. Incidents have clear leadership during and clear post-mortems after. Action items from post-mortems actually get prioritised and shipped.

What bad looks like: "On-call" is whoever was last on Slack. People get paged constantly for noise. Incidents are chaotic — nobody is sure who is in charge or when the incident is over. Post-mortems happen sometimes, generate action items, and then the action items quietly die.
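The paging questions can be answered from an export of the alerting history. The sketch below is illustrative only: the `Page` record, the `actionable` flag, and the 09:00 to 18:00 weekday business hours are all assumptions to replace with your own data and working hours.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Page:
    # Hypothetical record shape for one page from the alerting tool.
    fired_at: datetime  # local time the page went out
    actionable: bool    # did the page point at a real problem?


def is_out_of_hours(ts: datetime, start_hour: int = 9, end_hour: int = 18) -> bool:
    """True on weekends, or on weekdays outside start_hour..end_hour."""
    is_weekend = ts.weekday() >= 5  # Monday is 0, Saturday is 5
    return is_weekend or not (start_hour <= ts.hour < end_hour)


def paging_summary(pages: list[Page]) -> dict[str, float]:
    """Share of pages outside business hours, and share that were noise."""
    if not pages:
        raise ValueError("no pages in this period")
    out_of_hours = sum(is_out_of_hours(p.fired_at) for p in pages)
    noise = sum(not p.actionable for p in pages)
    return {
        "out_of_hours_share": out_of_hours / len(pages),
        "noise_share": noise / len(pages),
    }


# Made-up sample week.
pages = [
    Page(datetime(2026, 1, 5, 10, 0), actionable=True),    # Monday morning
    Page(datetime(2026, 1, 5, 23, 30), actionable=False),  # Monday late night
    Page(datetime(2026, 1, 6, 3, 0), actionable=True),     # Tuesday 03:00
    Page(datetime(2026, 1, 10, 14, 0), actionable=False),  # Saturday afternoon
]
paging = paging_summary(pages)
print(paging)
```

A high noise share is usually the cheaper problem to fix first, since every noisy alert removed also reduces out-of-hours pages.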

4. Decisions and Communication

This is how the team decides things and how those decisions reach the people who need to know.

What to look at:

  • How does a non-trivial technical decision get made? Who is involved?
  • Are decisions written down? Where? Are they discoverable later?
  • How are decisions communicated to the rest of the team and the company?
  • How are disagreements resolved when senior people disagree?
  • How does information flow between teams that need to coordinate?

What good looks like: Important decisions are written down — even briefly. The team has some lightweight format (RFC, ADR, Notion doc) for proposing and recording decisions. Disagreements get resolved either through clear ownership or through structured discussion, not by the person who pushes hardest.

What bad looks like: Decisions get made in DMs and meetings and are impossible to find later. The same questions get re-litigated every few months because nobody remembers the original reasoning. Cross-team work stalls because nobody knows who is supposed to decide what.

5. People — Hiring, Onboarding, Development

This is the human side of the system, and the area most audits underweight.

What to look at:

  • How long does a typical hire take from posting to starting?
  • What is the offer acceptance rate? Where do candidates drop out?
  • How long does a new engineer take to become productive?
  • Do engineers have growth plans? Are they real or theatre?
  • What is regrettable attrition over the last year? What were the reasons?
  • Do managers have one-on-ones with their reports? How often?

What good looks like: Hiring is taken as seriously as building. Onboarding is structured and gets new engineers shipping real work within two to four weeks. Engineers can describe what they are working towards in their career. Managers meet with their reports weekly and the meetings are useful.

What bad looks like: Hiring is ad-hoc and slow. New engineers spend weeks figuring out the environment because there is no onboarding. Growth conversations happen only at review time, if at all. Managers cancel one-on-ones routinely. Strong engineers leave and you find out the real reasons only later.
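Two of the people numbers are simple ratios, and it helps to agree on the definitions before comparing them year to year. The sketch below uses one common convention, dividing regrettable departures by the average of starting and ending headcount; that convention, the function names, and the sample figures are assumptions rather than a fixed standard.

```python
def offer_acceptance_rate(offers_made: int, offers_accepted: int) -> float:
    """Share of extended offers that candidates accepted."""
    if offers_made <= 0:
        raise ValueError("no offers were made in this period")
    return offers_accepted / offers_made


def regrettable_attrition_rate(regrettable_departures: int,
                               headcount_start: int,
                               headcount_end: int) -> float:
    """Regrettable departures divided by average headcount over the period."""
    average_headcount = (headcount_start + headcount_end) / 2
    if average_headcount <= 0:
        raise ValueError("headcount must be positive")
    return regrettable_departures / average_headcount


# Made-up year: 10 offers, 7 accepted; team grew from 20 to 30; 3 regrettable exits.
acceptance = offer_acceptance_rate(offers_made=10, offers_accepted=7)
attrition = regrettable_attrition_rate(3, headcount_start=20, headcount_end=30)
print(f"offer acceptance {acceptance:.0%}, regrettable attrition {attrition:.0%}")
```

On a small team these ratios swing a lot from a single hire or departure, so read them alongside the reasons people gave rather than on their own.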

6. Measurement and Improvement

This is whether the team has any way to know if it is getting better or worse.

What to look at:

  • What metrics does the team actually look at? How often?
  • Are retrospectives happening? Are they useful?
  • When something improves, can anyone point to why?
  • When something gets worse, does anyone notice?
  • Is there any structured way the team is trying to improve, or is it all reactive?

What good looks like: A small number of metrics that the team genuinely looks at and discusses. Retros that produce real changes, not just venting. A clear sense of "we are trying to get better at X right now." Improvement is visible to the team.

What bad looks like: Lots of dashboards nobody looks at. Retros that produce a list of "things to think about" that nobody owns. Improvement happens by accident or not at all.

How to Actually Run the Audit

A four-week structure that works.

Week 1: Listen

One-on-one conversations with as many people on the team as possible. Engineers, managers, product partners, leadership. The same handful of open questions for everyone:

  • What is going well?
  • What is the most frustrating thing about how the team works?
  • If you could change one thing, what would it be?
  • What is the team avoiding talking about?

Take notes. Do not promise anything. Do not start fixing things yet.

Week 2: Read

Pull the artifacts. Roadmaps, sprint history, incident reports, post-mortems, deploy logs, retro notes, recent hires and departures. Look at the actual work, not the descriptions of the work.

You are looking for the gap between what people told you in week 1 and what the data shows. The gap is where the real findings live.

Week 3: Synthesise

Take everything you have and write three documents:

  1. The honest assessment. For each of the six areas: what good and bad look like, and where this team actually sits. Be specific. "Planning is poor" is not useful. "Engineers can't connect their work to company priorities: 7 of the 12 we asked could not explain why their current task matters" is useful.

  2. The two or three things to actually change. Not a long list. Two or three. The ones that will most improve the team in the next 90 days. For each: what to change, who owns it, what success looks like.

  3. What to explicitly leave alone for now. Just as important. The things you noticed but are not going to address in this round. This stops the audit from becoming a 30-item to-do list that overwhelms the team.

Week 4: Share and Decide

Walk through the documents with leadership first. Get alignment on the assessment and the changes. Adjust based on what they know that you don't.

Then share with the team — the assessment in full, the changes clearly, and the things being left alone with reasons. The transparency matters. Teams know when something is wrong. Naming it openly creates more trust than tip-toeing around it.

End with a clear commitment: who owns each change, what success looks like, and when you will look at progress (usually 60 to 90 days later).

Common Mistakes

A few specific things kill audits.

Trying to find everything wrong. The point is not to produce the longest list. It is to find the two or three things that, if fixed, would most improve the team. A great audit is short and brutal, not long and exhaustive.

Confusing symptoms with causes. "Velocity is down" is a symptom. "Engineers are spending 40% of their time in meetings that don't apply to them" is a cause. Always push past symptoms to causes. The fix is at the cause level.

Producing recommendations without owners. Every change must have a named owner and a clear definition of done. Recommendations without owners die immediately.

Doing the audit and not the follow-up. The audit is 20% of the work. The other 80% is the next 90 days of actually making changes. Audits that end at the report stage are worse than no audit at all: they signal that the team's frustrations were heard, named, and then ignored.

Using the audit to attack individuals. Audits are about systems, not people. The moment an audit becomes a vehicle for "this person is the problem," the rest of the team stops being honest with you. Even if there are real people problems, address those separately, not through the audit.

What Good Looks Like 90 Days Later

If the audit worked, 90 days after the share-out you should be able to point at three things:

  1. At least two of the named changes have visibly landed. Not "in progress" — landed. The team can feel them.
  2. One of the underlying metrics has moved in the right direction. Deploy frequency up, lead time down, attrition down, time-to-onboard down — whichever was the most relevant to your changes.
  3. The team feels heard. Specifically: when you ask people now what is frustrating about working here, the answers are different than they were 90 days ago.

If you can point at all three, the audit was worth doing. If you can't, the audit was a report-writing exercise.

A Closing Note

Most engineering teams never do a real audit. They run forever on inherited practices, gradually accumulating weight, until something dramatic forces a reckoning. The audit is a way to skip the dramatic part — to look honestly at what is working and what isn't, before the cost of not looking gets too high.

It is also, in my experience, one of the most rewarding pieces of work an engineering leader can do. Teams almost always know what is wrong. They are mostly just waiting for someone to name it and do something about it.

If you want help running an audit at your company — or if you want a second set of eyes on one you've already done — I'm happy to have that conversation. And if you are also thinking about scaling or restructuring your team, How to Scale Your Engineering Team from 10 to 100 is a useful next read.
