We Switched from Maven to Bazel and Builds Got 10x Faster

Migrating was harder than we expected, but the performance and reliability gains were worth it.

Jason Lunz
Code Red
Dec 12, 2017


On Thursday, November 2, Redfin hosted a tech talk titled Lessons Learned Moving from Maven to Bazel at our San Francisco office. In the talk, I covered the material in this blog post in more detail. Slides and audio are available.

Discuss on Hacker News

A year ago, Redfin used Maven to prepare its source code for deployment to our production website. Despite years of effort, there were ongoing problems with this, and we concluded they could only be fixed by switching build tools completely. After reviewing several alternatives, we settled on Bazel. Converting to Bazel was a big success, despite the conversion process being much more difficult than anticipated.

Here are pass/fail logs for our CI build immediately before and after the transition:

[Figure: CI pass/fail history. Before: Maven. After: Bazel.]

In looking for a Maven replacement, we had several requirements. We wanted something that would build our code fast, both by using all available hardware resources and by avoiding unnecessary work. And we wanted something that would produce correct output, 100% of the time, even when incrementally building a changed source tree. As the Bazel website puts it: “{Fast, Correct} — Choose two”.

What Motivated the Change?

The Redfin source tree is hosted in a single git repository of about 300 Maven projects and 55 npm modules, totaling 56,000 files (including 19,000 Java and 9,000 JavaScript). Maven’s build algorithm is to visit every one of these projects in dependency order and apply the same rigid set of steps to build each one. As a result, it’s difficult to ask Maven to do only what’s required for a specific action and nothing more. For example, the Maven equivalent of bazel test redfin.stingray:all is something like mvn -am -pl redfin.stingray -DskipTests=true -Dfindbugs.skip=true -Dcobertura.skip -Dcheckstyle.skip && mvn test -pl redfin.stingray, and even then, the Maven version does more work than necessary.

In each Maven project, by convention, output files are collected in a “target” directory. As a consequence, incremental builds can’t be guaranteed correct, because nothing prevents outdated outputs remaining in “target” alongside newly built ones. The only reliable way to avoid this was to remove all target directories and build from scratch.

Other reliability problems were the result of building in parallel. By default, Maven doesn’t enable build parallelism; a single-threaded build is prohibitively slow for a repository our size. With parallelism enabled, however, builds sometimes failed due to missing Java dependencies, despite their being clearly expressed in the Maven configuration. To mitigate this, we ran production builds with less parallelism than we would have liked, sacrificing speed for marginally more reliability.

Redfin’s Migration Process

We migrated our build process incrementally. Rather than converting everything at once, we created tools to generate a Bazel WORKSPACE file and BUILD files from the Maven pom.xml files, so that both build tools worked on the same source tree simultaneously. At first, we converted just the Java part of the build because this was easiest to do using Bazel’s built-in Java support. We added support for JavaScript and other miscellaneous Maven targets later.

Generating the Workspace

Bazel has a feature that makes it easy to use a generated workspace. If tools/bazel is an existing executable in the workspace, Bazel’s launcher invokes it automatically rather than running Bazel itself directly. We use a tools/bazel wrapper that transparently checks whether the generated Bazel build files need to be updated before running Bazel. That way, there’s no separate build-generation step for users to think about.
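
A minimal sketch of such a wrapper follows. It is an illustration, not Redfin’s actual script: the stamp file name and the idea of fingerprinting every pom.xml are assumptions made for this example, and the real generator step is left as a placeholder. It relies on Bazel’s launcher setting the BAZEL_REAL environment variable to the path of the real binary when it runs tools/bazel.

```python
#!/usr/bin/env python3
"""Hypothetical tools/bazel wrapper: regenerate build files when a pom.xml changes."""
import hashlib
import os
import sys
from pathlib import Path

STAMP_NAME = ".bazel-generated.stamp"  # hypothetical stamp file name


def pom_fingerprint(root: Path) -> str:
    """Hash the relative path and contents of every pom.xml under root."""
    digest = hashlib.sha256()
    for pom in sorted(root.rglob("pom.xml")):
        digest.update(str(pom.relative_to(root)).encode())
        digest.update(pom.read_bytes())
    return digest.hexdigest()


def needs_regeneration(root: Path) -> bool:
    """True if no stamp exists or the pom.xml files changed since the stamp was written."""
    stamp = root / STAMP_NAME
    return not stamp.exists() or stamp.read_text() != pom_fingerprint(root)


def record_generation(root: Path) -> None:
    """Remember the current pom.xml fingerprint after a successful regeneration."""
    (root / STAMP_NAME).write_text(pom_fingerprint(root))


def main(argv):
    # tools/bazel sits one directory below the workspace root.
    root = Path(__file__).resolve().parent.parent
    if needs_regeneration(root):
        # Placeholder: invoke the real build-file generator here.
        record_generation(root)
    real_bazel = os.environ["BAZEL_REAL"]
    # Replace this process with the real Bazel, passing the arguments through.
    os.execv(real_bazel, [real_bazel] + argv[1:])


# Only act when started by Bazel's launcher, which sets BAZEL_REAL.
if __name__ == "__main__" and "BAZEL_REAL" in os.environ:
    main(sys.argv)
```

Hashing file contents rather than comparing timestamps keeps the check stable across git checkouts, at the cost of reading every pom.xml on each invocation.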

To generate the Java portion of the build, we wrote a Maven plugin called bazel-generator-plugin. It exports Maven’s internal project information as raw JSON files, and we then process these files in a separate Python script. This allows us to analyze the entire dependency graph at once. Doing so let us greatly reduce the size of generated BUILD files by eliminating redundant declarations.
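
To give a flavor of the Python side, here is a rough sketch of turning one exported project record into a java_library rule. The JSON field names ("name", "deps") and the source glob are hypothetical, since the plugin’s real output format isn’t shown in this post, and removing duplicate labels is only the simplest kind of redundancy one could eliminate.

```python
import json


def render_java_library(project: dict) -> str:
    """Render one java_library rule from a (hypothetical) exported project record."""
    # Drop duplicate dependency labels and keep the output order stable.
    deps = sorted(set(project.get("deps", [])))
    lines = [
        "java_library(",
        f'    name = "{project["name"]}",',
        '    srcs = glob(["src/main/java/**/*.java"]),',
    ]
    if deps:
        lines.append("    deps = [")
        lines.extend(f'        "{dep}",' for dep in deps)
        lines.append("    ],")
    lines.append(")")
    return "\n".join(lines) + "\n"


# Example record, as it might appear in one of the exported JSON files.
record = json.loads(
    '{"name": "stingray", "deps": ["//common:util", "//common:util", "//auth:core"]}'
)
rule_text = render_java_library(record)
```

Printing rule_text yields a single java_library stanza in which //common:util appears only once.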

Obstacles

Migrating Java projects wasn’t entirely trouble-free. Until we added Bazel support for Checkstyle, we had problems with code passing the Bazel build but failing Checkstyle on the Maven side. Transitive dependencies have different semantics in Bazel and Maven, leading to problems where Bazel includes several versions of a jar in cases where Maven would select a single version. We stopped using FindBugs in favor of Bazel’s built-in Error Prone static analysis. And Bazel integration with the IntelliJ IDE is still not at 1:1 feature parity with IntelliJ’s Maven plugin.
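
One way to spot the multiple-versions problem is to scan the full list of resolved jars for artifacts that appear with more than one version. The sketch below is illustrative only: it assumes plain three-part group:artifact:version coordinates, and the function name is made up for this example.

```python
from collections import defaultdict


def find_version_conflicts(coordinates):
    """Map 'group:artifact' to its sorted versions, for artifacts seen with several versions."""
    versions = defaultdict(set)
    for coord in coordinates:
        # Assumes exactly three colon-separated parts (no packaging or classifier).
        group, artifact, version = coord.split(":")
        versions[f"{group}:{artifact}"].add(version)
    return {key: sorted(found) for key, found in versions.items() if len(found) > 1}


conflicts = find_version_conflicts([
    "com.google.guava:guava:18.0",
    "com.google.guava:guava:23.0",
    "junit:junit:4.12",
])
```

Here conflicts contains only the guava entry, because junit appears with a single version.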

Another blocker to leaving Maven entirely was simply finding all the ways we were using it! At first our goal was just to get CI migrated, but after that was done, there remained dozens of undocumented, ad-hoc uses of Maven by various teams to do things like building specific jobs, running specialized tests, or performing code coverage runs. Finding and converting these was a significant effort.

One major lesson of the migration process is that technical debt in the dependency graph eventually has to be paid down. We had to roll up our sleeves and dig into the sprawling tangle of pom.xml files, identifying outdated or unnecessary dependencies and simplifying them. This is hard but satisfying work, and we employed a number of approaches, but that’s a topic for another blog post.

Building JavaScript

To build the JavaScript portion of our site, we use standard tools like Yarn, gulp, and webpack. Running npm install under Bazel proved troublesome, so we reimplemented it completely in Python! The reimplementation worked well in CI on Linux, but we had to abandon it because it was prohibitively slow on macOS.

Our second attempt was to convert dependency management for our internal node modules to lerna. It was faster on developer laptops and also worked well on Linux. We took a relatively simplistic approach of using lerna to install all npm module dependencies in one giant build rule, then gathered up the result in a tarball to reuse in subsequent rules. While crude, this approach has been effective. On Linux, building on tmpfs using Bazel’s --sandbox_tmpfs_path option sped things up substantially.

Life in the Promised Land

With the conversion mostly behind us, things are greatly improved! Our CI builds are faster (way faster: they used to take 40–90 minutes, and now dev builds average 5–6 minutes). Reliability is far higher, too. This is harder to quantify, but the shift from unexplained build failures being something that “just happens” to being viewed as real problems to be solved has put us on a virtuous cycle of ever-increasing reliability.

Another improvement is the ability to build or test specific things without doing any additional work. This is a core feature of Bazel — with a complete and accurate dependency graph, it has the information needed to schedule precisely the build tasks required.
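
As a toy illustration of why an accurate graph makes this possible (it is not Bazel’s actual scheduling algorithm, and the target names are invented), the set of things needed for one target is simply that target plus everything reachable from it:

```python
def required_targets(graph, target):
    """Return the target plus everything it transitively depends on."""
    needed = set()
    stack = [target]
    while stack:
        current = stack.pop()
        if current in needed:
            continue  # already visited via another path
        needed.add(current)
        stack.extend(graph.get(current, []))
    return needed


# Hypothetical dependency graph: each target maps to its direct dependencies.
graph = {
    "//stingray:tests": ["//stingray:lib"],
    "//stingray:lib": ["//common:util"],
    "//search:lib": ["//common:util"],
    "//common:util": [],
}
needed = required_targets(graph, "//stingray:tests")
```

In this example //search:lib is not in needed, so testing //stingray:tests never has to build it. Bazel exposes the same kind of question directly through bazel query with the deps() function.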

If you’re running a medium-to-large monorepo in Maven, consider switching to Bazel. Your mileage may vary, but for us, it’s been a huge success.
