AuditBoard Case Study
Flaky Tests
May 14, 2024
AuditBoard’s mission is to cut down administrative tasks for your audit, risk, sustainability, and compliance teams by automating enterprise compliance risk management.
Pain Points
When developers pushed code, builds consistently failed because of flaky tests, often in unrelated areas of the codebase, so every code change cost hours of troubleshooting. Stopgap strategies such as rerunning and retrying builds to triangulate flaky tests compounded the problem over time. The situation became untenable as the team grew and led to a substantial decrease in the engineering team’s velocity.
Solution
The team turned to BuildPulse, confident that it would not only automatically track flaky tests and their impact, but also provide workflows to automatically quarantine those tests, reducing the developer and CI time spent addressing flakiness.
Key Points
AuditBoard’s engineering team faced a substantial decrease in velocity because code merges were impacted by flaky tests.
With BuildPulse, the quarantine process is blazingly fast, flaky tests are trackable, and their numbers are decreasing. AuditBoard is now achieving cleaner results every night without additional noise.
By expanding BuildPulse to other parts of the software stack, AuditBoard not only manages flaky tests, but also addresses broken and failing tests effectively.
BuildPulse helps increase engineering velocity, streamlines communication between team members, and automates test processes.
AuditBoard reduced the developer and CI time spent on flaky tests by integrating with BuildPulse.
Flaky Tests: Build vs. Buy
At first, the internal solution consisted of rerunning each test 10 times; code could only be merged if the tests passed. From there, failing tests were manually tracked across various test frameworks, and issues were created to assign ownership. Custom scripts were then written to skip the captured failing tests. Because flaky tests fail intermittently, identifying and cataloging them in a centralized location through manual triage became unmanageable as the number of tests and developers grew.
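To make the original in-house approach concrete, here is a minimal sketch of a rerun gate with a skip list. It is an illustration only, not AuditBoard's actual scripts: the names `passes_every_rerun`, `merge_allowed`, and `skip_list` are hypothetical, and tests are modeled as plain callables that raise on failure.

```python
from typing import Callable, Dict, Iterable

RERUNS = 10  # the rerun count described above


def passes_every_rerun(test: Callable[[], None], reruns: int = RERUNS) -> bool:
    """Return True only if the test passes on every one of `reruns` attempts."""
    for _ in range(reruns):
        try:
            test()
        except Exception:
            # A single failure is enough to block the merge.
            return False
    return True


def merge_allowed(
    tests: Dict[str, Callable[[], None]],
    skip_list: Iterable[str] = (),
) -> bool:
    """Gate a merge: every test not on the skip list must pass all reruns.

    `skip_list` stands in for the hand-maintained list of known flaky tests
    (hypothetical; the source only says custom scripts skipped them).
    """
    skipped = set(skip_list)
    return all(
        passes_every_rerun(fn)
        for name, fn in tests.items()
        if name not in skipped
    )
```

The sketch also shows why this approach scales poorly: each test costs up to ten executions per build, and the skip list has to be kept current by hand.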
Without impact metrics, it was difficult to determine the level of internal resources and workflow changes required to resolve the problem.
All of this adds up in development costs and in support costs for incidents in production. By using BuildPulse, AuditBoard was able to reduce these costs and to save the developer hours previously spent on triage and rerunning builds.
The key success criteria were:
Automatically identify and centrally catalog flaky tests
Metrics on flakiness and time consumed for prioritization
Test quarantining to mitigate impact until tests are fixed
Speed of implementation
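As a rough illustration of the first two criteria, the sketch below flags a test as flaky when it both passes and fails on the same commit, then tallies how often that happened and how much run time the failures consumed. This is a common working definition, not a description of how BuildPulse detects flakiness; `RunRecord` and `flaky_report` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class RunRecord:
    test_name: str
    commit: str
    passed: bool
    duration_s: float


def flaky_report(runs: Iterable[RunRecord]) -> Dict[str, Dict[str, float]]:
    """Summarize tests that both passed and failed on the same commit."""
    by_test_commit = defaultdict(list)
    for run in runs:
        by_test_commit[(run.test_name, run.commit)].append(run)

    report: Dict[str, Dict[str, float]] = {}
    for (name, _commit), group in by_test_commit.items():
        outcomes = {r.passed for r in group}
        if outcomes == {True, False}:  # same code, different results
            entry = report.setdefault(
                name, {"flaky_failures": 0, "seconds_lost": 0.0}
            )
            failures = [r for r in group if not r.passed]
            entry["flaky_failures"] += len(failures)
            entry["seconds_lost"] += sum(r.duration_s for r in failures)
    return report
```

Sorting the report by `seconds_lost` gives a simple way to prioritize which flaky tests to fix first.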
“You need a tool to track flakiness, or you will spend a lot of time doing analytics and quarantining tests”
Dickson Wu, Director of Quality Engineering
Implementation
BuildPulse was up and running with test quarantining within a handful of days, and the first set of flaky tests was captured immediately. The process consists of:
Monitor and send results to BuildPulse
Implement skip logic for quarantined tests in the testing framework
Trial it during releases, with manual Jira ticket creation and manual quarantining
Turn on automated Jira ticket creation and test quarantining, and start using BuildPulse at 100% capacity
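The skip logic in the second step could look roughly like the sketch below, written against Python's standard `unittest` module. It is an assumption-laden illustration: the source does not say which frameworks AuditBoard uses or how the quarantine list reaches the test run, so `QUARANTINED`, `skip_if_quarantined`, and the test ID shown are all hypothetical.

```python
import functools
import unittest

# Hypothetical quarantine list. In practice it would be loaded from wherever
# the team stores it; the source does not describe that mechanism.
QUARANTINED = {"tests.test_reports.test_export_pdf"}


def skip_if_quarantined(test_id, quarantined=QUARANTINED):
    """Decorator that skips a test at run time if its ID is quarantined."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if test_id in quarantined:
                # unittest reports this as a skip rather than a failure.
                raise unittest.SkipTest(f"{test_id} is quarantined")
            return fn(*args, **kwargs)

        return wrapper

    return decorator
```

Checking the list inside the wrapper, rather than when the decorator is applied, means a test is re-enabled as soon as its ID leaves the list, with no change to the test code.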
The following highlights AuditBoard’s new flakiness workflow:
Tests are run on PR builds, nightly builds, as well as before releases.
BuildPulse skips quarantined tests, runs the remaining tests, and automatically detects new flaky tests.
Manual quarantining is also done for broken tests pre-release.
For each quarantined test, BuildPulse automatically creates and assigns a Jira ticket.
Each ticket triggers an investigation, which ends in either opening a bug or fixing the test if it is broken.
After the fix is merged and the Jira ticket moves to ‘done’, BuildPulse automatically removes the test from quarantine, and the test is re-enabled on the next build.
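The lifecycle above can be modeled as a small state tracker: a flaky test enters quarantine with a ticket, and closing the ticket releases it. The sketch below is a simplified in-memory model for illustration, not BuildPulse's or Jira's API; `QuarantineBoard`, its methods, and the `QA-` ticket prefix are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class QuarantineBoard:
    # Maps test name -> ticket key for tests currently in quarantine.
    quarantined: Dict[str, str] = field(default_factory=dict)
    _next_ticket: int = 1

    def flag_flaky(self, test_name: str) -> str:
        """Quarantine a test and open a ticket; repeat flags reuse the ticket."""
        if test_name not in self.quarantined:
            key = f"QA-{self._next_ticket}"  # stand-in for a real ticket key
            self._next_ticket += 1
            self.quarantined[test_name] = key
        return self.quarantined[test_name]

    def ticket_done(self, ticket_key: str) -> None:
        """Release whichever test was tied to the ticket that moved to done."""
        for name, key in list(self.quarantined.items()):
            if key == ticket_key:
                del self.quarantined[name]

    def tests_to_run(self, all_tests: List[str]) -> List[str]:
        """Tests for the next build: everything not currently quarantined."""
        return [t for t in all_tests if t not in self.quarantined]
```

For example, after `flag_flaky("test_export")` the next build omits `test_export`; once `ticket_done` is called with the returned key, the test runs again.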
Outcome
BuildPulse has helped AuditBoard minimize the time spent collecting and triaging flaky tests, streamline communication between team members, and automate the testing processes around resolution. Beyond flakiness, BuildPulse has also helped address broken and failing tests effectively.