Transaction loss in the underlying ledger system was definitely caused by the core ledger opting not to wait for data to be flushed to disk before saying "Got it!". We never figured out #8, but I wasn't able to reproduce it in 655dad3.

Ah--a small clarification here! We found two issues in the raw Radix DLT ledger API. #8 involved transactions appearing in the raw txn log, then being immediately replaced by different txns. #10 involved nodes confirming then forgetting txns when many nodes crashed at once.
RT @rad_guild
@JackRNewhouse @a_vaunt The transaction logging was problematic but the actual ledger was and is fine AFAICT.

For Radix DLT folks concerned about 16 vs. 50 transactions per second--I wouldn't worry about that at all. In benchmark terms those are basically the same number. As mentioned in the report, we're talking about different hardware, a different network, and different workloads. :-)

RDX Works (the makers of Radix DLT) states that all safety issues as well as the problem with indefinite transactions have been resolved as of Radix DLT 1.1.0.

Jepsen's latest distributed systems safety report covers Radix DLT 1.0-beta.35.1 through 1.0.2. We found stale, aborted, and intermediate reads, as well as the partial or total loss of committed transactions. Transactions could also hang indefinitely.

If the idea of writing a page cache for a pathologically lazy filesystem that loses unsynced data on command sounds like fun, I'd like to hear from you!

RT @asatarin
Great talk by @vanlightly @hydraconference

He compares modeling and verification of a simple distributed log with and @jepsen_io Maelstrom

- overview of both
- pros and cons of those approaches
- how to move from model to implementation

Jack covers a lot in the talk

If you hate databases and long load times, I've got great news. Jepsen 0.2.6 is now available, and includes a new binary file format which makes working with test data much faster.

RT @pfeodrippe
Hi, released an overhauled version of Recife, now using TLC :)

- Use Clojure and the REPL;
- Visualize the trace:
- Hillel's examples from his book:
- Also an example using @jepsen_io's Elle lib.

Made a small video about it at

A small change to the ethics policy: for environmental reasons, Jepsen no longer performs analyses of proof-of-work or proof-of-space systems.

Jepsen 0.2.4 is now available. New SSH backend options, performance improvements, and a local filesystem cache to speed up DB setup. Happy testing!

"We might get outputs that are arbitrarily wrong, up to and including breaking program invariants"

"The outputs might never converge to correctness"

Giving a short talk on Jepsen at LADIS this afternoon! Our session starts at 14:45 US Eastern.

Live stream:

New release! Maelstrom 0.2.0 is a workbench for learning distributed systems by writing your own, in any language. Comes with a six-chapter tutorial in writing your own toy echo, gossip, CRDT, Datomic, and Raft systems. Powered by Jepsen and Elle!

New Jepsen release: 0.2.3. Super small, just a few bugfixes and performance improvements. Enjoy!

Howdy guys, gals, and non-binary pals! Jepsen 0.2.2 is now available, with a slew of minor bugfixes, performance improvements, and new utility functions. Look out for a change to multi-fault nemesis scheduling:

RT @asatarin
"Cobra: Making Transactional Key-Value Stores Verifiably Serializable"

Given black box trace of KV transactions determine if observed behaviors are serializable at scale (10K transactions)

New report: we worked with @ScyllaDB to identify and fix cases of LWT split-brain in healthy clusters due to improper hashing and membership changes, as well as documentation improvements. Notably, Scylla no longer claims non-LWT isolation!

A single-user Mastodon instance for Jepsen announcements & discussion.