How engineering leaders maintain sync with stakeholders

Engineering Metrics

Oct 18, 2023

In the dynamic realm of software engineering, the bridge between engineering leaders and stakeholders is crucial. This bridge is built on trust, transparency, and most importantly, metrics. As the saying goes, "What gets measured gets managed." But how do engineering leaders ensure that they're not just measuring, but also communicating these metrics effectively to stakeholders?

The Power of Metrics

Metrics are the lifeblood of an engineering organization. They provide a quantifiable measure of team performance, the health of the development process, and the overall productivity of the engineering team. Metrics like cycle time, lead time, and lines of code offer insights into how efficiently the team is working. More importantly, they help engineering leaders identify bottlenecks, areas of optimization, and opportunities for continuous improvement.

Cycle Time and Its Significance

Cycle time, a critical engineering metric, represents the time taken from when a task starts to when it's completed. It's a clear indicator of the team's efficiency. A shorter cycle time means tasks are being completed faster, leading to quicker software development and releases. Conversely, a prolonged cycle time might indicate potential bottlenecks or areas that need optimization.

Investment Profile: Joining Project Management Metrics with Code Metrics

To present a holistic picture to stakeholders, engineering leaders often merge project management metrics with code metrics. This combined view offers a comprehensive understanding of both the planning and execution phases. For instance, while code review metrics might indicate the quality of code being produced, project management metrics can shed light on the team's adherence to timelines and their alignment with business goals - for example, the KTLO vs. non-KTLO breakdown can shed light on the team's focus and alignment with strategic objectives.

The Machine Behind the Machine

When stakeholders understand the intricacies of coding time, pickup time, review time, and deploy time, they get a glimpse into the "machine that makes the machine." These metrics, especially when tracked over time, paint a picture of the engineering team's health and efficiency. For instance, if the review time starts increasing consistently, it might indicate that the code quality is declining, necessitating longer review processes.

The Art of Communication

Engineering leaders play a pivotal role in ensuring that stakeholders are always in the loop. This involves:

  1. Transparency: Being open about the metrics, even if they aren't favorable. This builds trust and ensures stakeholders are never caught off guard.

  2. Regular Updates: Instead of waiting for a quarterly review, engineering leaders can provide regular updates, ensuring stakeholders are always informed.

  3. Using Data: As Catherine Miller aptly put it, "Planning conversations are about feelings, and execution conversations are about data." Leveraging data during discussions eliminates ambiguity and drives objective decision-making.

Course Correction: When Things Go Off-Track

Even with the best plans in place, things can go awry. Engineering leaders need to be adept at course correction. This involves identifying the root cause, be it tech debt, resource constraints, or external factors, and then implementing the necessary changes.

BuildPulse: A Partner in Effective Communication

For engineering leaders looking to streamline their communication with stakeholders, BuildPulse Engineering Metrics emerges as an invaluable ally. Not only does it provide comprehensive metrics, but its developer copilot feature also ensures timely reviews and automates routine tasks. With tools like BuildPulse, engineering teams can focus on what they do best: building quality software.

Conclusion

In the ever-evolving world of software engineering, maintaining sync with stakeholders is both an art and a science. It's about measuring the right metrics, communicating them effectively, and ensuring that the entire team is aligned with the organization's broader goals. After all, as Juan Pablo Buriticá succinctly put it, "We're a good engineering team if we're good at reacting quickly."

FAQ

What is the difference between a flaky test and a false positive?

A false positive is a test failure in your test suite due to an actual error in the code being executed, or a mismatch in what the test expects from the code.

A flaky test is when you have conflicting test results for the same code. For example, while running tests if you see that a test fails and passes, but the code hasn’t changed, then it’s a flaky test. There’s many causes of flakiness.

What is an example of a flaky test?

An example can be seen in growing test suites - when pull request builds fail for changes you haven’t made. Put differently, when you see a test pass and fail without any code change. These failed tests are flaky tests.

What are common causes of flakiness?

Broken assumptions in test automation and development process can introduce flaky tests - for example, if test data is shared between different tests whether asynchronous, high concurrency, or sequential, the results of one test can affect another. 

Poorly written test code can also be a factor. Improper polling, race conditions, improper event dependency handling, shared test data, or timeout handling for network requests or page loads. Any of these can lead to flaky test failures and test flakiness.

End-to-end tests that rely on internal API uptime can cause test flakiness and test failures.

What's the impact of flaky tests?

Flaky tests can wreck havoc on the development process - from wasted developer time from test retries, to creating bugs and product instability and missed releases, time-consuming flaky tests can grind your development process to a halt.

What is the best way to resolve or fix flaky tests?

Devops, software engineering, and software development teams will often need to compare code changes, logs, and other context across test environments from before the test instability started, and after - adding retries or reruns can also help with debugging. Test detection and test execution tooling can help automate this process as well. 

BuildPulse enables you to find, assess impact metrics, quarantine, and fix flaky tests.

What are some strategies for preventing flaky tests?

Paying attention and prioritizing flaky tests as they come up can be a good way to prevent them from becoming an issue. This is where a testing culture is important - if a flaky test case is spotted by an engineer, it should be logged right away. This, however, takes a certain level of hygiene - BuildPulse can provide monitoring so flaky tests are caught right away.

What type of tests have flaky tests?

Flaky tests can be seen across the testing process - unit tests, integration tests, end-to-end tests, UI tests, acceptance tests.

What if I don't have that many flaky tests?

Flaky tests can be stealthy - often ignored by engineers and test runs are retried, they build up until they can’t be ignored anymore. These automated tests slow down developer productivity, impact functionality, and reduce confidence in test results and test suites. Better to get ahead while it’s easy and invest in test management.

It’s also important to prevent regressions to catch flakiness early while it’s manageable.

What languages and continuous integration providers does BuildPulse work with?

BuildPulse integrates with all continuous integration providers (including GitHub Actions, BitBucket Pipelines, and more), test frameworks, and workflows.

Combat non-determinism, drive test confidence, and provide the best experience you can to your developers!

How long does implementation/integration with BuildPulse take?

Implementation/integration takes 5 minutes!

What is the difference between a flaky test and a false positive?

A false positive is a test failure in your test suite due to an actual error in the code being executed, or a mismatch in what the test expects from the code.

A flaky test is when you have conflicting test results for the same code. For example, while running tests if you see that a test fails and passes, but the code hasn’t changed, then it’s a flaky test. There’s many causes of flakiness.

What is an example of a flaky test?

An example can be seen in growing test suites - when pull request builds fail for changes you haven’t made. Put differently, when you see a test pass and fail without any code change. These failed tests are flaky tests.

What are common causes of flakiness?

Broken assumptions in test automation and development process can introduce flaky tests - for example, if test data is shared between different tests whether asynchronous, high concurrency, or sequential, the results of one test can affect another. 

Poorly written test code can also be a factor. Improper polling, race conditions, improper event dependency handling, shared test data, or timeout handling for network requests or page loads. Any of these can lead to flaky test failures and test flakiness.

End-to-end tests that rely on internal API uptime can cause test flakiness and test failures.

What's the impact of flaky tests?

Flaky tests can wreck havoc on the development process - from wasted developer time from test retries, to creating bugs and product instability and missed releases, time-consuming flaky tests can grind your development process to a halt.

What is the best way to resolve or fix flaky tests?

Devops, software engineering, and software development teams will often need to compare code changes, logs, and other context across test environments from before the test instability started, and after - adding retries or reruns can also help with debugging. Test detection and test execution tooling can help automate this process as well. 

BuildPulse enables you to find, assess impact metrics, quarantine, and fix flaky tests.

What are some strategies for preventing flaky tests?

Paying attention and prioritizing flaky tests as they come up can be a good way to prevent them from becoming an issue. This is where a testing culture is important - if a flaky test case is spotted by an engineer, it should be logged right away. This, however, takes a certain level of hygiene - BuildPulse can provide monitoring so flaky tests are caught right away.

What type of tests have flaky tests?

Flaky tests can be seen across the testing process - unit tests, integration tests, end-to-end tests, UI tests, acceptance tests.

What if I don't have that many flaky tests?

Flaky tests can be stealthy - often ignored by engineers and test runs are retried, they build up until they can’t be ignored anymore. These automated tests slow down developer productivity, impact functionality, and reduce confidence in test results and test suites. Better to get ahead while it’s easy and invest in test management.

It’s also important to prevent regressions to catch flakiness early while it’s manageable.

What languages and continuous integration providers does BuildPulse work with?

BuildPulse integrates with all continuous integration providers (including GitHub Actions, BitBucket Pipelines, and more), test frameworks, and workflows.

Combat non-determinism, drive test confidence, and provide the best experience you can to your developers!

How long does implementation/integration with BuildPulse take?

Implementation/integration takes 5 minutes!

What is the difference between a flaky test and a false positive?

A false positive is a test failure in your test suite due to an actual error in the code being executed, or a mismatch in what the test expects from the code.

A flaky test is when you have conflicting test results for the same code. For example, while running tests if you see that a test fails and passes, but the code hasn’t changed, then it’s a flaky test. There’s many causes of flakiness.

What is an example of a flaky test?

An example can be seen in growing test suites - when pull request builds fail for changes you haven’t made. Put differently, when you see a test pass and fail without any code change. These failed tests are flaky tests.

What are common causes of flakiness?

Broken assumptions in test automation and development process can introduce flaky tests - for example, if test data is shared between different tests whether asynchronous, high concurrency, or sequential, the results of one test can affect another. 

Poorly written test code can also be a factor. Improper polling, race conditions, improper event dependency handling, shared test data, or timeout handling for network requests or page loads. Any of these can lead to flaky test failures and test flakiness.

End-to-end tests that rely on internal API uptime can cause test flakiness and test failures.

What's the impact of flaky tests?

Flaky tests can wreck havoc on the development process - from wasted developer time from test retries, to creating bugs and product instability and missed releases, time-consuming flaky tests can grind your development process to a halt.

What is the best way to resolve or fix flaky tests?

Devops, software engineering, and software development teams will often need to compare code changes, logs, and other context across test environments from before the test instability started, and after - adding retries or reruns can also help with debugging. Test detection and test execution tooling can help automate this process as well. 

BuildPulse enables you to find, assess impact metrics, quarantine, and fix flaky tests.

What are some strategies for preventing flaky tests?

Paying attention and prioritizing flaky tests as they come up can be a good way to prevent them from becoming an issue. This is where a testing culture is important - if a flaky test case is spotted by an engineer, it should be logged right away. This, however, takes a certain level of hygiene - BuildPulse can provide monitoring so flaky tests are caught right away.

What type of tests have flaky tests?

Flaky tests can be seen across the testing process - unit tests, integration tests, end-to-end tests, UI tests, acceptance tests.

What if I don't have that many flaky tests?

Flaky tests can be stealthy - often ignored by engineers and test runs are retried, they build up until they can’t be ignored anymore. These automated tests slow down developer productivity, impact functionality, and reduce confidence in test results and test suites. Better to get ahead while it’s easy and invest in test management.

It’s also important to prevent regressions to catch flakiness early while it’s manageable.

What languages and continuous integration providers does BuildPulse work with?

BuildPulse integrates with all continuous integration providers (including GitHub Actions, BitBucket Pipelines, and more), test frameworks, and workflows.

Combat non-determinism, drive test confidence, and provide the best experience you can to your developers!

How long does implementation/integration with BuildPulse take?

Implementation/integration takes 5 minutes!

Ready for Takeoff?

Ready for Takeoff?

Ready for Takeoff?