Replit Case Study

Flaky Tests

Jun 18, 2024

Replit’s mission is to bring the next billion software creators online. Their vision is that widespread code literacy will make the world a better place on many axes – education, wealth equality, and power distribution.

Pain Points

Before BuildPulse, it was normal to run CI on a commit or pull request two, three, or more times. This eroded the testing culture: engineers began to treat CI failures as flakes by default, because it took 20 to 30 minutes of CI time to tell a flake from a real failure. For large stacks of pull requests, engineers had to babysit the stack and shepherd it into production, and each pull request in the stack amplified the flake problem, considerably slowing development velocity.

Solution

BuildPulse’s impact on CI reliability and on the developer experience of working with CI was night and day. Replit can now triage flakes over time, distinguish them from true failures, and delegate each one to the correct owner, so the number of flaky tests has been dropping week over week. Another outcome was better hygiene throughout their tests and test harness, which improved confidence in the testing infrastructure.

Key Points

  • Replit’s engineering team faced a substantial decrease in velocity from test flakiness in pull requests.

  • With BuildPulse, flakiness became trackable and addressable, increasing build stability. Replit is achieving cleaner results every night without additional noise and can distinguish flakiness from true failures.

  • BuildPulse helps streamline and spread awareness of flakiness, making it easier to delegate flaky tests throughout the team.

Replit reduced the developer and CI time spent on flaky tests by integrating with BuildPulse.

Flaky Tests: Build vs. Buy

CI for the main web repository used to be extremely unreliable and frustrating: engineers used to have to re-run CI on most branches before they could confidently determine whether they'd broken something with their changes.

Replit leveraged multiple languages across multiple repos, and the team had limited capacity to implement an internal solution.

The key success criteria were:

  • Automatically identify and centrally catalog flaky tests

  • Metrics on flakiness and time consumed for prioritization

  • Surface flakiness to the broader team, and distinguish it from true failures

  • Speed of implementation
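The first two criteria can be illustrated with a minimal sketch. It is not BuildPulse's detection logic, which the case study does not describe; it assumes one common heuristic (a test is flaky if it both passed and failed on the same commit) and a hypothetical `CiResult` record whose fields are invented for illustration. Flaky tests are ranked by the CI minutes spent on their failing runs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CiResult:
    """One test outcome from one CI run (hypothetical record shape)."""
    test_id: str       # identifier of the test
    commit: str        # commit SHA the CI run executed against
    passed: bool
    ci_minutes: float  # CI time consumed by the run that produced this result


def find_flaky_tests(results):
    """Return {test_id: wasted_minutes}, most CI minutes lost first.

    Assumed heuristic: a test is flaky when it has both a pass and a
    failure on the same commit, since the code did not change between runs.
    """
    outcomes = defaultdict(set)          # (test_id, commit) -> set of pass/fail outcomes
    failed_minutes = defaultdict(float)  # (test_id, commit) -> minutes spent on failing runs

    for r in results:
        key = (r.test_id, r.commit)
        outcomes[key].add(r.passed)
        if not r.passed:
            failed_minutes[key] += r.ci_minutes

    wasted = defaultdict(float)
    for (test_id, commit), seen in outcomes.items():
        if seen == {True, False}:
            wasted[test_id] += failed_minutes[(test_id, commit)]

    # Sort so the most disruptive tests come first, for prioritization.
    return dict(sorted(wasted.items(), key=lambda kv: kv[1], reverse=True))
```

A test that fails on every run of a commit is left out of the result, because under this heuristic it looks like a true failure rather than a flake.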

“Thanks to BuildPulse, we were able to methodically enumerate flaky tests, prioritize them in terms of disruptive potential, drive them to 0, and keep them at 0 thanks to BuildPulse's actionable daily reports.”

Implementation

BuildPulse was up and running seamlessly, and the first set of flaky tests was captured immediately. The process consisted of:

  • Monitoring and sending test results to BuildPulse

  • On-call aggressively triaging every new source of flake, creating a ticket, and assigning deflake responsibilities based on ownership

  • Parsing the backlog to triage historical flakiness over time
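The ownership-based triage step could look roughly like the sketch below. It is an illustration only: the function name, the path-prefix ownership map, and the `"on-call"` fallback are all assumptions, not part of Replit's or BuildPulse's tooling.

```python
def assign_new_flakes(detected, already_ticketed, owners, default_owner="on-call"):
    """Map each newly detected flaky test to an owning team.

    `owners` maps a test-path prefix to a team name (hypothetical scheme);
    the longest matching prefix wins. Tests that already have a ticket
    are skipped so on-call only sees new sources of flake.
    """
    assignments = {}
    for test_id in sorted(set(detected) - set(already_ticketed)):
        matches = [prefix for prefix in owners if test_id.startswith(prefix)]
        best = max(matches, key=len, default=None)
        assignments[test_id] = owners[best] if best is not None else default_owner
    return assignments
```

Each entry in the returned mapping corresponds to one ticket to create; tests with no matching owner fall back to the on-call engineer.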

Outcome

BuildPulse has helped Replit minimize time spent collecting and triaging flaky tests, surface critical information for identifying root causes, spread awareness among the team, and improve confidence in testing infrastructure. BuildPulse will be a key piece in tracking efforts and measuring success.

FAQ

Does BuildPulse replace my current CI system?

No.

We use the self-hosted functionality of GitHub Actions, CircleCI, and Semaphore CI to run your builds on our infrastructure.

Other than faster builds, there are no changes to your developers' workflows - you can continue using your CI system as-is.

How is BuildPulse faster than GitHub Actions hosted runners?

We use GitHub’s self-hosted functionality to run your builds on our infrastructure, which uses latest-generation CPUs with high single-core performance, further optimized for CI-type workloads. We’ve also tuned our VMs and block storage devices, increasing baseline performance while cutting costs in half.

We also provide a toolkit to further speed up your pipelines, which includes ultra-fast remote Docker builders, Docker layer caching, dependency caching, and more. With all of these improvements, we’ve seen build times improve by 2x or more.

Can I use BuildPulse with CI providers other than GitHub Actions?

Yes! BuildPulse Runners run jobs for CircleCI and Semaphore CI, with GitLab coming soon.

We aim to support all popular CI systems. If you're using one that's not listed, please contact support@buildpulse.io!

Is there a free trial available?

Yes, you can book a meeting here!

How do you secure my builds?

BuildPulse runs each job in a network- and compute-isolated environment, using ephemeral VMs that leave behind a clean state after every run.

Do you support Mac and Windows runners?

This is on our roadmap! Email us at hello@buildpulse.io, or book a demo here!

Is BuildPulse SOC 2 compliant?

Yes, BuildPulse is SOC 2 Type 2 compliant.

Contact us at hello@buildpulse.io for more information.

How are BuildPulse Runners priced?

BuildPulse Runners are charged on a per-second basis, at a rate that depends on the runner type used. See our pricing page for more details.

How long does implementation/integration with BuildPulse take?

The minimum implementation involves 2 steps: signing up for BuildPulse, and changing 1 line in your GitHub Actions YAML file.

If you're using Semaphore CI or CircleCI, it's a 4-line change. See our Getting Started guide for more details.

Ready for Takeoff?
