One of the things that surprised me following last week's Jepsen report on Radix DLT (https://jepsen.io/analyses/radix-dlt-1.0-beta.35.1) was seeing both blockchain/DLT people *and* the database community go "Hang on, 16 transactions per second can't be right"--and expecting wildly different figures.
Also, like... if y'all have questions about these results, ask! Here to help.
#8 wasn't easy to find in the first place, which means it's hard to be *super* confident in it being fixed without a clear explanation for its cause. BDB durability might have been related, but sometimes errors like this are just masked, rather than fixed, by timing changes.
Transaction loss in the underlying ledger system was definitely caused by https://github.com/radixdlt/radixdlt/commit/704ee58fe9fefa92a2324a40d21483e96f5f4658#diff-391d3f54fc0db9b0261f199542d27abb88142acca928da013946f868c321fa1a: basically the core ledger opted not to wait for data to be flushed to disk before saying "Got it!". We never figured out #8, but I wasn't able to reproduce it in 655dad3.
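For anyone wondering what "wait for the flush before saying Got it" looks like in practice, here's a minimal sketch. This is not Radix's code or BDB's API: `append_durably` and the newline-delimited log file are made up for illustration. The point is only that the acknowledgement comes after `fsync`, not before.

```python
import os


def append_durably(path: str, record: bytes) -> str:
    """Append a record to a log file and only acknowledge once it is on disk.

    Hypothetical helper for illustration; a real ledger would also fsync the
    parent directory when the file is first created, and would batch syncs.
    """
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()              # push Python's userspace buffer to the OS
        os.fsync(f.fileno())   # ask the OS to flush its page cache to disk
    # Only now is it safe to tell the client the write is durable.
    return "ack"
```

Skipping the `os.fsync` call makes this faster, but a crash after the "ack" can then lose a write the client was told had been stored.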
Ah--a small clarification here! We found two issues in the raw Radix DLT ledger API. #8 involved transactions appearing in the raw txn log, then being immediately replaced by different txns. #10 involved nodes confirming then forgetting txns when many nodes crashed at once.
@JackRNewhouse @a_vaunt The transaction logging was problematic but the actual ledger was and is fine AFAICT.
For Radix DLT folks concerned about 16 vs. 50 transactions per second--I wouldn't worry about that at all. In benchmark terms those are basically the same number. As mentioned in the report, we're talking about different hardware, a different network, and different workloads. :-)
RDX Works (the makers of Radix DLT) states that all safety issues as well as the problem with indefinite transactions have been resolved as of Radix DLT 1.1.0.
Jepsen's latest distributed systems safety report covers Radix DLT 1.0-beta.35.1 through 1.0.2. We found stale, aborted, and intermediate reads, as well as the partial or total loss of committed transactions. Transactions could also hang indefinitely.
If the idea of writing a page cache for a pathologically lazy filesystem that loses unsynced data on command sounds like fun, I'd like to hear from you!
Great talk by @vanlightly @hydraconference
He compares modeling and verification of a simple distributed log with #tlaplus and @jepsen_io Maelstrom
- overview of both
- pros and cons of those approaches
- how to move from model to implementation
Jack covers a lot in the talk https://twitter.com/lemmster/status/1471241135959859204
If you hate databases and long load times, I've got great news. Jepsen 0.2.6 is now available, and includes a new binary file format which makes working with test data much faster. https://github.com/jepsen-io/jepsen/releases/tag/v0.2.6
A small change to the ethics policy (https://jepsen.io/ethics): for environmental reasons, Jepsen no longer performs analyses of proof-of-work or proof-of-space systems.
Jepsen 0.2.4 is now available. New SSH backend options, performance improvements, and a local filesystem cache to speed up DB setup. Happy testing! https://github.com/jepsen-io/jepsen/releases/tag/0.2.4
"We might get outputs that are arbitrarily wrong, up to and including breaking program invariants"
"The outputs might never converge to correctness"
New release! Maelstrom 0.2.0 is a workbench for learning distributed systems by writing your own, in any language. Comes with a six-chapter tutorial in writing your own toy echo, gossip, CRDT, Datomic, and Raft systems. Powered by Jepsen and Elle! https://github.com/jepsen-io/maelstrom
New Jepsen release: 0.2.3. Super small, just a few bugfixes and performance improvements. Enjoy! https://github.com/jepsen-io/jepsen/releases/tag/0.2.3
Howdy guys, gals, and non-binary pals! Jepsen 0.2.2 is now available, with a slew of minor bugfixes, performance improvements, and new utility functions. Look out for a change to multi-fault nemesis scheduling: https://github.com/jepsen-io/jepsen/releases/tag/0.2.2