One of the things that surprised me following last week's Jepsen report on Radix DLT (https://jepsen.io/analyses/radix-dlt-1.0-beta.35.1) was seeing both blockchain/DLT people *and* the database community go "Hang on, 16 transactions per second can't be right"--and expecting wildly different figures.
Is it that DLTs are doing *byzantine* consensus? Etcd uses Raft (https://raft.github.io/), which is not Byzantine fault-tolerant. Takes 2 network hops plus a disk sync on a majority of nodes to commit. ~2n messages/txn. Throughput bounded by the single, totally-ordered Raft log.
Radix is based on Hotstuff (https://arxiv.org/abs/1803.05069), which is Byzantine fault-tolerant, three-phase consen. ~6n (I think?) messages/txn.
And like, Hotstuff *itself* can go fast. The paper reports c5.4xlarge clusters pushing ~120K ops/sec (1KB/op, batches of 400 ops per round).
As the crypto maxim goes: DYOR!
Here's a YourKit snapshot from one of those Radix nodes pushing ~12 txns/sec. Some of it's crypto (BouncyCastle), but it looks like it's burning a ton of time in BerkeleyDB IO. Roughly 1/3rd waiting for fsync.
Rather a *lot* of fsyncs, as it turns out. Roughly 11 calls per txn on each node, at least in this particular run.
Etcd does way more per second (!?) but, like most DBs, batches. At ~2700 txns/sec, etcd gets away with only ~0.27 syncs/txn in this run.
Zooming out: Some of these costs can probably be optimized away in time. I suspect permissionless DLTs are always going to be at a latency and throughput disadvantage though. For starters, Lamport 2002 puts a two msg-delay lower bound on async consensus: https://lamport.azurewebsites.net/pubs/lower-bound.pdf
Does this *matter*? I honestly don't know.
ACH latencies are multiple days. OTOH, some trading systems start issuing order requests while the packet with offers is still coming over the wire. Transaction value can just barely beat--or far outweigh--processing & storage costs.
A single-user Mastodon instance for Jepsen announcements & discussion.