JEPSEN

Distributed Systems Safety Research

About Jepsen

Jepsen is an effort to improve the safety of distributed databases, queues, consensus systems, etc. We maintain an open source software library for systems testing, as well as blog posts and conference talks exploring particular systems’ failure modes. In each analysis we explore whether the system lives up to its documentation’s claims, file new bugs, and suggest recommendations for operators.

Jepsen pushes vendors to make accurate claims and test their software rigorously, helps users choose databases and queues that fit their needs, and teaches engineers how to evaluate distributed systems correctness for themselves.

In addition to public analyses, Jepsen offers technical talks, training classes, and a variety of consulting services.

News

Recent research, analyses, and announcements.

Kyle Kingsbury spoke at Systems Distributed 2024 on RavenDB, MariaDB/MySQL, and Datomic.

jetcd 0.8.2

2024-08-07

Jepsen traced lost update, circular information flow, and aborted reads in etcd tests to an improper retry mechanism in jetcd 0.8.2. That mechanism allowed transactions to be submitted multiple times, and caused committed transactions to appear as if they had failed.
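To make this failure mode concrete, here is a minimal sketch of how a blind retry can go wrong when a write commits but its response is lost. This is a toy model, not jetcd's or etcd's actual code: `FakeStore`, `increment`, and `naive_retry_increment` are hypothetical names invented for illustration.

```python
class FakeStore:
    """Hypothetical in-memory store; it does not model etcd or jetcd."""

    def __init__(self, drop_responses):
        self.value = 0
        self.applied = 0
        # One flag per call: True means the reply to that call is lost.
        self._drop = list(drop_responses)

    def increment(self):
        # The write commits first...
        self.value += 1
        self.applied += 1
        # ...and only then may the reply be lost on the way back.
        if self._drop and self._drop.pop(0):
            raise TimeoutError("response lost after commit")
        return self.value


def naive_retry_increment(store, attempts=3):
    """Blindly resubmit on timeout (an assumed, unsafe retry policy)."""
    for _ in range(attempts):
        try:
            return ("ok", store.increment())
        except TimeoutError:
            continue  # the earlier attempt may already have committed
    return ("fail", None)


# One logical increment is applied twice because the first reply was lost.
store = FakeStore([True, False])
double = naive_retry_increment(store)

# Every attempt commits, yet the caller is told the operation failed.
lossy = FakeStore([True, True, True])
reported = naive_retry_increment(lossy)
```

In the first run the caller asked for one increment and the store applied two. In the second run the caller sees a failure even though the value changed. Retrying safely generally requires the operation to be idempotent, or the client to report an indeterminate result instead of a definite failure.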

Kyle Kingsbury will speak on performance techniques in Jepsen at GOTO Chicago, October 21 & 22, 2024. The talk will touch on a mix of high-level and low-level performance optimizations to make checking large histories tractable, including parallelism, pure functions, immutable data structures, and deforestation; bitsets, avoiding sharing between threads, packing structures into mutable arrays, dynamic compilation of primitive boxes, and macro iteration magic.

Early bird tickets are on sale now.

We’ve made some small changes to the Jepsen ethics policy.

The policy used to promise that Jepsen could veto publication if Jepsen and a client could not agree on the content of an analysis. However, this veto has never been used. In fact, Jepsen’s contracts have given Jepsen final approval over the content of analyses since 2016. We replace the promise of a veto with a stronger promise of editorial control.

In light of Jepsen’s multiple authors, we also shift to an organizational third person voice. Finally, we’ve streamlined some language.

In collaboration with Nubank, we analyzed Datomic Pro 1.0.7075 and found that its inter-transaction safety properties appeared stronger than claimed. Datomic Pro appeared to offer Strong Session Serializable isolation, and Strong Serializable for histories restricted to update transactions. However, Datomic defines unusual intra-transaction semantics in which operations are applied logically concurrent with one another, rather than sequentially. While consistent with Datomic’s documentation, this could cause invariants preserved by individual transaction functions to be broken when those same functions are applied within a single transaction.
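The difference between sequential and logically concurrent application can be sketched with a toy model. This is not Datomic's API: the dictionary "database", `go_off_call`, `apply_sequential`, and `apply_concurrent` are hypothetical names, and the on-call invariant ("at least one doctor stays on call") is an assumed example.

```python
def go_off_call(doctor, other):
    """Build a function that takes `doctor` off call only if `other` is on call."""
    def fn(db):
        if db[other]:
            return {doctor: False}
        return {}  # refuse: going off call would leave nobody on call
    return fn


def apply_sequential(db, fns):
    """Each function sees the writes of the functions before it."""
    db = dict(db)
    for fn in fns:
        db.update(fn(db))
    return db


def apply_concurrent(db, fns):
    """Every function sees the same starting state; writes are merged afterwards."""
    before = dict(db)
    writes = {}
    for fn in fns:
        writes.update(fn(before))
    return {**before, **writes}


start = {"alice": True, "bob": True}
fns = [go_off_call("alice", "bob"), go_off_call("bob", "alice")]

sequential = apply_sequential(start, fns)
concurrent = apply_concurrent(start, fns)
```

Applied one after another, the second function sees that Alice has already gone off call and declines, so the invariant holds. Applied against the same starting state, both functions see two doctors on call and both proceed, leaving nobody on call. Each function preserves the invariant on its own, but the pair does not when evaluated concurrently within one transaction.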
