Git metrics that actually tell you something useful

Most teams collecting git metrics end up measuring the wrong things. Here's how to cut through the noise and focus on signals that actually reflect engineering health.

BuildPulse Team

June 1, 2026

The problem with "we track git metrics"

I've watched a team celebrate their commit frequency dashboard for two quarters while their deploys took three weeks and their engineers were quietly burning out. The commits were flying. The velocity readout looked great. Nothing actually got shipped faster.

That's the trap with git metrics. They're easy to collect — GitHub, GitLab, and Bitbucket all expose rich APIs — so teams collect everything and call it done. But raw activity counts (commits per day, lines changed, number of PRs opened) are proxies for effort, not outcomes. And effort is the last thing you should be optimizing for.

This post is about which git metrics are actually worth tracking, what they reveal, and how to avoid the classic mistake of turning them into performance scorecards that your engineers will game in about two weeks.

The metrics worth your attention

Cycle time

Cycle time measures how long it takes a code change to go from first commit to production. It's the git-derived metric that best summarizes delivery speed for an engineering leader, and it's almost never what teams think it is.

Most teams estimate their cycle time at "a few days." When they actually measure it, it's closer to two or three weeks — because nobody was accounting for the days a PR sat waiting for review, or the three-day queue to get into the release branch.

A simple way to break this down:

Cycle time = coding time + review wait + review duration + merge-to-deploy lag

Coding time: first commit on the branch → PR opened
Review wait: PR opened → first review comment or approval
Review duration: first review → PR merged
Merge-to-deploy lag: merge → live in production

You can pull the first three from your git host's API. The fourth requires connecting your deployment events (which most CI/CD tools emit). Once you have all four, you'll immediately see where the time is actually going. In my experience, it's most often review wait: PRs sitting untouched for days.
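To make the breakdown concrete, here's a minimal Python sketch that splits one change into the four stages. The `ChangeTimeline` class, its field names, and the sample timestamps are all made up for illustration; in practice you'd fill them from your git host's API and your deployment events.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ChangeTimeline:
    """Timestamps for one change. Field names are illustrative, not an API."""
    first_commit_at: datetime
    pr_opened_at: datetime
    first_review_at: datetime
    merged_at: datetime
    deployed_at: datetime


def cycle_time_breakdown(t: ChangeTimeline) -> dict[str, timedelta]:
    """Split total cycle time into the four stages."""
    stages = {
        "coding_time": t.pr_opened_at - t.first_commit_at,
        "review_wait": t.first_review_at - t.pr_opened_at,
        "review_duration": t.merged_at - t.first_review_at,
        "merge_to_deploy_lag": t.deployed_at - t.merged_at,
    }
    stages["cycle_time"] = t.deployed_at - t.first_commit_at
    return stages


def slowest_stage(stages: dict[str, timedelta]) -> str:
    """Name of the stage that consumed the most time (total excluded)."""
    parts = {name: d for name, d in stages.items() if name != "cycle_time"}
    return max(parts, key=parts.get)


# Hypothetical sample change
timeline = ChangeTimeline(
    first_commit_at=datetime(2026, 5, 4, 10, 0),
    pr_opened_at=datetime(2026, 5, 5, 15, 0),
    first_review_at=datetime(2026, 5, 8, 11, 0),
    merged_at=datetime(2026, 5, 9, 9, 0),
    deployed_at=datetime(2026, 5, 11, 9, 0),
)
stages = cycle_time_breakdown(timeline)
for name, duration in stages.items():
    print(f"{name}: {duration}")
print("slowest stage:", slowest_stage(stages))
```

In this sample, review wait is the largest slice at 68 of 167 hours. Run it over every merged PR and look at the medians per stage rather than any single change.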

PR size

This one correlates with almost everything else you care about. Large PRs take longer to review, generate more review comments, sit in the queue longer, introduce more bugs, and get merged with shallower scrutiny because reviewers give up.

A rough heuristic I've found useful: PRs under 400 lines changed (excluding generated files and lock files) review fast and merge clean. Over 800 lines, you're rolling the dice.

Tracking median PR size over time tells you something meaningful: are your engineers breaking work into small, reviewable chunks, or are they doing big batch work that creates risk and slows everything down?

Here's a quick GitHub Actions job that logs PR size on every pull request event, so you can feed it to whatever metrics store you're using:

on: pull_request

jobs:
  pr-size:
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    steps:
      - name: Compute PR size
        id: size  # lets later steps read steps.size.outputs.pr_size
        run: |
          # additions/deletions come from the pull_request event payload;
          # they count every file, including lock files and generated code
          ADDITIONS=$(jq -r '.pull_request.additions' "$GITHUB_EVENT_PATH")
          DELETIONS=$(jq -r '.pull_request.deletions' "$GITHUB_EVENT_PATH")
          TOTAL=$((ADDITIONS + DELETIONS))
          echo "PR size (lines changed): $TOTAL"
          # Emit as a step output or push to your metrics sink
          echo "pr_size=$TOTAL" >> "$GITHUB_OUTPUT"

This is basic, but it's a starting point. From here you can push to Datadog, write to a database, or emit a custom metric to whatever dashboard you actually look at.
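If you want the number to match the 400/800 heuristic, you'll need to leave out lock files and generated code. Here's a rough Python sketch of that filtering, assuming you've already fetched the per-file stats for a PR (GitHub's "list pull request files" endpoint returns `filename`, `additions`, and `deletions` per file). The ignore patterns, bucket names, and sample data are my own placeholders, so adjust them to your repo.

```python
from fnmatch import fnmatchcase
from statistics import median

# Hypothetical ignore list: tune this to your own repository layout.
IGNORED_PATTERNS = ["*.lock", "package-lock.json", "*.min.js", "*/generated/*"]


def is_ignored(path: str) -> bool:
    """True if the full path or the bare file name matches an ignore pattern."""
    basename = path.rsplit("/", 1)[-1]
    return any(
        fnmatchcase(path, pattern) or fnmatchcase(basename, pattern)
        for pattern in IGNORED_PATTERNS
    )


def reviewable_lines(files: list[dict]) -> int:
    """Lines added plus deleted, skipping lock files and generated code."""
    return sum(
        f["additions"] + f["deletions"]
        for f in files
        if not is_ignored(f["filename"])
    )


def size_bucket(lines: int) -> str:
    """Bucket a PR using the rough 400/800 thresholds from this post."""
    if lines < 400:
        return "small"
    if lines <= 800:
        return "medium"
    return "large"


# Made-up file stats for one PR
files = [
    {"filename": "src/billing/invoice.py", "additions": 120, "deletions": 30},
    {"filename": "web/package-lock.json", "additions": 2400, "deletions": 1900},
    {"filename": "src/generated/api_client.py", "additions": 900, "deletions": 0},
]
size = reviewable_lines(files)
print(size, size_bucket(size))

# Median across several PRs is the number worth trending over time
print(median([size, 420, 95]))
```

The raw diff for this sample PR is over 5,000 lines, but only 150 of them need a human reviewer. That gap is why the unfiltered count can be misleading.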

Review lag (time to first review)

This is the single metric I'd pick if I could only track one. It measures how long after a PR is opened a teammate leaves the first substantive review (not just an LGTM rubber stamp, but a real comment or approval).

When review lag is high — more than a day on average — engineers lose flow. They context-switch to something else, the mental model for the change fades, and re-engaging becomes expensive. High review lag also signals organizational problems: team silos, unclear ownership, or a review culture where nobody feels responsible for keeping the queue moving.

Target under four hours during business hours for most teams. Some high-performing teams run well under two.
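Measuring against a business-hours target means you can't just subtract two timestamps, or every PR opened on Friday afternoon will blow the budget. Here's one way to count only working hours, sketched in Python. The 09:00 to 17:00, Monday to Friday window is an assumption, and the timestamps are treated as naive local times in the team's timezone; holidays and distributed teams would need more care.

```python
from datetime import datetime, time, timedelta

# Assumed working window; change to match your team.
WORK_START = time(9, 0)
WORK_END = time(17, 0)


def business_hours_between(start: datetime, end: datetime) -> float:
    """Hours between start and end that fall within Mon-Fri working hours."""
    if end <= start:
        return 0.0
    total = timedelta()
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday=0 .. Friday=4
            window_open = datetime.combine(day, WORK_START)
            window_close = datetime.combine(day, WORK_END)
            # Clip the [start, end] interval to this day's working window
            overlap_start = max(start, window_open)
            overlap_end = min(end, window_close)
            if overlap_end > overlap_start:
                total += overlap_end - overlap_start
        day += timedelta(days=1)
    return total.total_seconds() / 3600


opened = datetime(2026, 5, 29, 16, 0)       # Friday afternoon
first_review = datetime(2026, 6, 1, 10, 0)  # Monday morning
print(business_hours_between(opened, first_review))
```

For this example the wall-clock gap is 66 hours, but the business-hours lag is 2.0, comfortably inside a four-hour target.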

Merge frequency

How often are changes being merged to your main branch? This is a better signal than commit frequency because merges represent completed, reviewed work.

Teams that merge frequently (multiple times per day across the team) tend to have smaller PRs, faster cycle times, and fewer merge conflicts. Teams that merge infrequently tend to have the opposite — big batches, painful integration days, and the fun of resolving conflicts across two weeks of diverged history.

Merge frequency is also a leading indicator for deployment frequency, which is one of the four DORA metrics. If you're not deploying often, there's a good chance it starts here.
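Counting merges is about as simple as these metrics get. A minimal Python sketch, assuming you already have the `merged_at` timestamps for PRs merged into main (the function name and sample dates are illustrative):

```python
from collections import Counter
from datetime import datetime


def merges_per_week(merged_at: list[datetime]) -> dict[tuple[int, int], int]:
    """Count merges per ISO (year, week), sorted chronologically."""
    counts: Counter = Counter()
    for ts in merged_at:
        iso = ts.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


# Made-up merge timestamps spanning two weeks
merged = [
    datetime(2026, 5, 25, 11, 0),
    datetime(2026, 5, 27, 15, 30),
    datetime(2026, 5, 29, 9, 15),
    datetime(2026, 6, 1, 10, 0),
    datetime(2026, 6, 2, 16, 45),
]
weekly = merges_per_week(merged)
for (year, week), count in weekly.items():
    print(f"{year}-W{week:02d}: {count} merges")
```

Grouping by ISO week keeps the buckets aligned to Monday starts. Divide by team size if you want to compare teams of different sizes.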

Rework rate

This one takes a bit more setup, but it's worth it. Rework rate measures what percentage of recently merged code gets modified again in a short window — typically 2-4 weeks. High rework suggests defects caught late, unclear requirements, or code that's hard to change.

You can approximate this by tracking whether files touched in a merged PR are also touched in subsequent PRs within some time window. It's not perfect, but it separates "we shipped it and moved on" from "we shipped it and spent the next two weeks patching it."
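Here's a rough Python sketch of that file-overlap approximation. The `MergedPR` class, the 21-day default window, and the sample PRs are assumptions for illustration. Note that it works at file granularity, so it overcounts in hot files that many unrelated changes touch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class MergedPR:
    """One merged PR. Field names are illustrative."""
    number: int
    merged_at: datetime
    files: frozenset[str]


def rework_rate(prs: list[MergedPR], window: timedelta = timedelta(days=21)) -> float:
    """Fraction of merged PRs with a file touched again by a later PR in the window."""
    ordered = sorted(prs, key=lambda p: p.merged_at)
    if not ordered:
        return 0.0
    reworked = 0
    for i, pr in enumerate(ordered):
        for later in ordered[i + 1:]:
            if later.merged_at - pr.merged_at > window:
                break  # sorted by time, so everything after is out of range too
            if pr.files & later.files:
                reworked += 1
                break
    return reworked / len(ordered)


# Made-up history: only PR 1 is revisited within three weeks
prs = [
    MergedPR(1, datetime(2026, 5, 1), frozenset({"api/orders.py", "api/cart.py"})),
    MergedPR(2, datetime(2026, 5, 10), frozenset({"api/cart.py"})),
    MergedPR(3, datetime(2026, 5, 12), frozenset({"docs/setup.md"})),
    MergedPR(4, datetime(2026, 6, 20), frozenset({"docs/setup.md"})),
]
print(rework_rate(prs))
```

One caveat: PRs merged within the last window haven't had a full chance to be reworked yet, so exclude the most recent few weeks when you trend this.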

What these metrics can't tell you

Here's where I'll save you from a meeting you don't want to have.

Git metrics are team health signals, not performance rankings. The moment you use them to evaluate individual engineers, two things happen: the metrics stop reflecting reality (people game them), and you lose the trust that makes the data useful in the first place.

Commit count per engineer is almost always misleading. A senior engineer who spends a week doing a careful architectural refactor that nobody else could have done might have ten commits. A junior engineer building a simple feature might have sixty. Commit count tells you nothing about impact.

Same with lines of code. The best PR I've seen in a while was a 47-line deletion that eliminated an entire class of production incidents. The commit with the most lines that week was an autogenerated API client update.

Use these metrics to understand system-level patterns and remove bottlenecks. Not to rank people.

A practical starting stack

If you're starting from scratch, here's the order I'd build this in:

Review lag first. It's the highest-leverage thing to improve, it's fast to measure, and improvements are immediately visible to the team.
PR size second. Set a soft target (say, 400 lines) and review it with the team in your next retro. No enforcement, just visibility.
Cycle time third. This requires connecting your git data to your deployment events, which takes more plumbing, but it's the number that will make the case to stakeholders most clearly.
Merge frequency last. Once the first three are in decent shape, merge frequency usually follows on its own.

For the actual data collection, your git host's API is the primary source. GitHub's REST API exposes PR creation times, review times, and merge times directly. From there, you can push into a time-series database (Timescale, InfluxDB, or even a Postgres table with a created_at index works fine), or use a purpose-built platform if you'd rather not wire this yourself. BuildPulse's engineering metrics product pulls this data automatically and surfaces cycle time, PR size distributions, and review lag without custom instrumentation — useful if you want the dashboards without building the pipeline.
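If you go the plain-table route, the storage side can stay very small. The sketch below uses Python's built-in `sqlite3` as a stand-in for Postgres so it runs anywhere; the `pr_metrics` table, its columns, and the sample rows are all hypothetical, and the date functions in the query are SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a real Postgres database
conn.executescript("""
    CREATE TABLE pr_metrics (
        repo            TEXT NOT NULL,
        pr_number       INTEGER NOT NULL,
        created_at      TEXT NOT NULL,   -- ISO 8601 timestamps
        first_review_at TEXT,
        merged_at       TEXT,
        lines_changed   INTEGER NOT NULL,
        PRIMARY KEY (repo, pr_number)
    );
    CREATE INDEX idx_pr_metrics_created_at ON pr_metrics (created_at);
""")

# Made-up rows; in practice these come from your git host's API
rows = [
    ("payments", 101, "2026-05-25T09:00:00", "2026-05-25T12:00:00", "2026-05-26T10:00:00", 180),
    ("payments", 102, "2026-05-26T14:00:00", "2026-05-27T09:00:00", "2026-05-28T16:00:00", 950),
    ("payments", 103, "2026-06-01T10:00:00", "2026-06-01T11:30:00", "2026-06-01T15:00:00", 60),
]
conn.executemany("INSERT INTO pr_metrics VALUES (?, ?, ?, ?, ?, ?)", rows)

# Average wall-clock review lag in hours, grouped by the week the PR was opened
weekly = conn.execute("""
    SELECT strftime('%Y-%W', created_at) AS week,
           COUNT(*) AS prs,
           ROUND(AVG((julianday(first_review_at) - julianday(created_at)) * 24), 1)
               AS avg_review_lag_hours
    FROM pr_metrics
    WHERE first_review_at IS NOT NULL
    GROUP BY week
    ORDER BY week
""").fetchall()

for week, pr_count, lag_hours in weekly:
    print(week, pr_count, lag_hours)
```

One row per PR with raw timestamps is deliberate: you can recompute any of the metrics later without refetching from the API.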

The goal isn't metrics, it's conversations

The teams that get the most value from git metrics aren't the ones with the fanciest dashboards. They're the ones where the data prompts a specific conversation: "Our review lag spiked last month — what happened?" or "PRs to the payments service are 3x larger than everywhere else — is that deliberate?"

Good metrics narrow the space of things you need to talk about. They surface patterns that would be invisible in the daily noise of shipping software. But they're a starting point for investigation, not a conclusion.

If your cycle time is 18 days and you want it to be 5, that number won't fix itself. But the stage-by-stage breakdown behind it will tell you which part of the process to dig into first.
