Every broker benchmark you will read in 2026 is correct and useless at the same time.
Correct, because the numbers are real. Useless, because nobody runs a broker on a clean cluster with warm caches and a friendly network.
The real question is not “which broker pushes more messages per second.” It is “which broker’s failure mode can my team actually survive at 3 a.m. with one person on call and a partial network partition.”
That reframe changes the decision entirely. Throughput becomes a tiebreaker, not a driver. Operational cost, recovery behavior, and replay semantics become the main axes.
1. Benchmarks lie because they measure the wrong thing
Most broker benchmarks measure one broker, one topic, one producer, one consumer, one region, one network. In production you have none of those things.
What actually decides whether a broker works for you:
- how it behaves when a broker goes down mid-write
- how it behaves when a consumer is slower than the producer for hours
- how it behaves when the disk fills to 90 percent
- how long it takes to recover a lost partition leader
- what an upgrade costs you in downtime and ops time
- what a cross-region replication lag spike looks like to consumers
None of that shows up on a throughput chart. All of it shows up in your incident channel.
I pick brokers by what happens when things break, not by how fast they run when they don’t.
2. What Kafka is actually good at
Kafka is a partitioned, ordered, durable log. That is the whole product. Everything else is ecosystem.
Its strengths are specific and hard to replicate:
- strict per-partition ordering
- cheap long retention on disk
- replay from arbitrary offsets
- mature consumer group semantics
- the largest ecosystem in messaging (Connect, Streams, Flink, Debezium, ksqlDB, a dozen cloud managed options)
- well-understood operational model after a decade of war stories
If your workload is “an ordered log of facts that many consumers read at different speeds and occasionally rewind,” Kafka is still the default answer.
A minimal producer config looks friendly:
```yaml
bootstrap.servers: kafka-0:9092,kafka-1:9092,kafka-2:9092
acks: all
enable.idempotence: true
max.in.flight.requests.per.connection: 5
compression.type: zstd
linger.ms: 10
```

That config is not where the cost hides.
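The model behind those strengths is small enough to sketch. The following is a toy in-memory version of Kafka's core abstraction, not the Kafka client API: an append-only log where consumers track their own offsets, so replay is just "start reading earlier".

```javascript
// Toy sketch of Kafka's core abstraction: an append-only log where
// consumers track their own offsets and can rewind at will.
// Illustrative only; this is not the Kafka client API.
class PartitionLog {
  constructor() {
    this.records = [];
  }
  append(record) {
    this.records.push(record);
    return this.records.length - 1; // offset of the new record
  }
  // Read from an arbitrary offset: replay is just "start earlier".
  readFrom(offset, max = 10) {
    return this.records.slice(offset, offset + max);
  }
}

const log = new PartitionLog();
log.append("order-1");
log.append("order-2");
log.append("order-3");

// A consumer at offset 0 and a caught-up consumer at offset 2 read
// the same immutable records; neither affects the other.
console.log(log.readFrom(0)); // ["order-1", "order-2", "order-3"]
console.log(log.readFrom(2)); // ["order-3"]
```

Everything in the strengths list above, from consumer groups to replay, is machinery layered on this one idea.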
3. The Kafka tax
Kafka’s tax is operational, not conceptual.
- KRaft has replaced ZooKeeper for most teams, but KRaft still has its own tuning surface, controller quorum concerns, and a migration story if you are on an older cluster
- broker JVM tuning is real work: page cache behavior, GC tuning on large heaps, file descriptor and mmap limits, segment sizing
- rebalance storms during consumer group changes can freeze consumption for seconds or minutes depending on group size and partition count
- partition count is a capacity-planning decision that is hard to reverse; too few and you cannot scale consumers, too many and recovery and metadata become painful
- cross-AZ replication costs real money, and cross-region MirrorMaker 2 or managed replication is its own project
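The partition-count point deserves one concrete illustration. Producers route by `hash(key) % partitionCount`, so changing the count remaps keys to different partitions and breaks per-key ordering across the resize. The hash below is illustrative (the real Kafka default partitioner uses murmur2), but the consequence is the same for any modulo-based scheme:

```javascript
// Why partition count is hard to reverse: producers route a key with
// hash(key) % partitionCount, so changing the count remaps keys.
// (Illustrative hash; Kafka's default partitioner uses murmur2, but
// the consequence is identical for any modulo-based partitioner.)
function hashKey(key) {
  let h = 0;
  for (const ch of key) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h;
}

function partitionFor(key, partitionCount) {
  return hashKey(key) % partitionCount;
}

const before = partitionFor("customer-42", 12);
const after = partitionFor("customer-42", 24);
// With 12 partitions the key lands on one partition; with 24 it may
// land on another, so per-key ordering across the resize is lost.
console.log({ before, after });
```

That is why partition count is a day-one capacity decision rather than a knob you turn later.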
Kafka is wonderful when you have a platform team that treats it as a first-class product. It is painful when one backend team inherits it as a side concern.
The honest test: if nobody on the team can describe what happens during a controller failover, you are not ready to self-host Kafka. Use a managed service or pick a simpler broker.
4. What Pulsar is actually good at
Pulsar’s core differentiator is the separation of compute from storage. Brokers are stateless. Storage lives in Apache BookKeeper. That split gives you properties Kafka has to work much harder to match.
Where Pulsar genuinely wins:
- tiered storage is first-class; old segments offload to object storage automatically
- geo-replication is a built-in, per-topic feature, not a bolt-on
- multi-tenancy is real: tenants, namespaces, quotas, and isolation are part of the model, not patterns layered on top
- stateless brokers mean faster broker recovery and easier horizontal scaling of the serving layer
- both queue and streaming semantics on the same system, with shared, failover, key-shared, and exclusive subscription modes
For platform teams running messaging as a service for many internal teams, Pulsar’s tenancy model is the cleanest on the market.
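The key-shared subscription mode from the list above is worth a sketch, because it is the mode Kafka users miss most. Messages with the same key are pinned to one consumer, preserving per-key order, while different keys spread across the group. Pulsar actually assigns hash ranges per consumer; the modulo version below is a simplification that makes the same point:

```javascript
// Toy sketch of Pulsar's key-shared subscription: each key is pinned
// to one consumer so per-key order survives, while different keys
// spread across the group. (Pulsar really assigns hash ranges per
// consumer; a simple modulo illustrates the same property.)
function keySharedDispatch(messages, consumerCount) {
  const assignments = Array.from({ length: consumerCount }, () => []);
  for (const msg of messages) {
    let h = 0;
    for (const ch of msg.key) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
    assignments[h % consumerCount].push(msg);
  }
  return assignments;
}

const messages = [
  { key: "order-1", seq: 1 },
  { key: "order-2", seq: 2 },
  { key: "order-1", seq: 3 },
];
const byConsumer = keySharedDispatch(messages, 2);
// Both "order-1" messages land on the same consumer, in order.
console.log(byConsumer);
```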
A Pulsar topic policy makes the tenancy story concrete:
```yaml
tenant: payments
namespace: payments/prod
policies:
  retention:
    time: 7d
    size: 100G
  backlog_quota:
    limit: 50G
    policy: producer_request_hold
  replication_clusters: [ap-south-1, ap-southeast-1]
  offload:
    threshold: 20G
    driver: aws-s3
```

That single block is something you would be stitching together with tooling on Kafka.
5. The Pulsar tax
Pulsar’s tax is architectural complexity.
- you are operating two distributed systems, not one: brokers and BookKeeper bookies, plus a metadata store
- BookKeeper’s ledger model is powerful but has its own failure shapes: ensemble size, write quorum, ack quorum, and auto-recovery behavior are concepts the on-call needs to hold in their head
- the ecosystem, while growing, is still smaller than Kafka’s; fewer connectors, fewer third-party tools, fewer Stack Overflow answers at 2 a.m.
- client library maturity varies by language; the Java client is excellent, others trail
- managed Pulsar options exist but the market is thinner than managed Kafka
Pulsar rewards teams that invest in it. It punishes teams that expected a drop-in Kafka replacement.
If the primary selling point for your team is “we want geo-replication and true multi-tenancy,” Pulsar earns its complexity. If it is “we heard it is faster,” you will regret it.
6. What NATS JetStream is actually good at
JetStream is the one I reach for when operational simplicity is the dominant constraint.
- a single Go binary, static, no JVM, no external metadata store in the common case
- clustering is built on Raft and is far less ceremony than Kafka or Pulsar
- subject-based routing is genuinely expressive; wildcards like `orders.*.created` work naturally
- request/reply, pub/sub, and durable streaming live in one system with one mental model
- built-in key-value and object store APIs on top of streams for simple stateful use cases
- edge-friendly: leaf nodes, super-clusters, and deployment topologies that Kafka and Pulsar do not really address
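The subject expressiveness is easy to demonstrate. In NATS, `*` matches exactly one token and `>` matches one or more trailing tokens; the matcher below is a simplified sketch of that rule, ignoring the validation the real server does:

```javascript
// Simplified sketch of NATS subject matching: "*" matches exactly one
// token, ">" matches one or more trailing tokens. (The real server
// does more validation; the matching rule itself is this simple.)
function subjectMatches(pattern, subject) {
  const p = pattern.split(".");
  const s = subject.split(".");
  for (let i = 0; i < p.length; i++) {
    if (p[i] === ">") return s.length > i; // swallow the rest
    if (i >= s.length) return false;
    if (p[i] !== "*" && p[i] !== s[i]) return false;
  }
  return p.length === s.length;
}

console.log(subjectMatches("orders.*.created", "orders.eu.created")); // true
console.log(subjectMatches("orders.>", "orders.eu.created.v2"));      // true
console.log(subjectMatches("orders.*.created", "orders.created"));    // false
```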
A JetStream stream definition reads like something a backend engineer can own end to end:
```json
{
  "name": "ORDERS",
  "subjects": ["orders.>"],
  "retention": "limits",
  "max_age": 604800000000000,
  "max_bytes": 107374182400,
  "storage": "file",
  "replicas": 3,
  "discard": "old",
  "duplicate_window": 120000000000
}
```

That is the full configuration. No tuning essay required. (The durations are nanoseconds: `max_age` is 7 days, `duplicate_window` is 2 minutes.)
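The `duplicate_window` field is the quiet star of that definition: JetStream deduplicates publishes by message ID inside the window, which is how you get idempotent producers without application-level bookkeeping. A toy sketch of the idea, not the implementation:

```javascript
// Toy sketch of what "duplicate_window" buys you: the server remembers
// message IDs for a window and drops re-publishes of the same ID.
// (JetStream keys this on a message-ID header; this is the idea,
// not the server's implementation.)
class DedupWindow {
  constructor(windowMs) {
    this.windowMs = windowMs;
    this.seen = new Map(); // msgId -> timestamp of first publish
  }
  publish(msgId, now = Date.now()) {
    const first = this.seen.get(msgId);
    if (first !== undefined && now - first < this.windowMs) {
      return false; // duplicate inside the window: dropped
    }
    this.seen.set(msgId, now);
    return true; // stored
  }
}

const stream = new DedupWindow(120_000);
console.log(stream.publish("evt-1", 0));       // true: stored
console.log(stream.publish("evt-1", 5_000));   // false: duplicate
console.log(stream.publish("evt-1", 130_000)); // true: window expired
```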
7. The JetStream limits
JetStream is not magic.
- at true multi-terabyte, multi-month retention with heavy replay, Kafka and Pulsar are still more battle-tested
- ecosystem depth is thinner; if you need Debezium, Flink, ksqlDB, or dozens of off-the-shelf connectors, JetStream will feel sparse
- per-subject ordering and consumer semantics are different from Kafka partitions; teams coming from Kafka often misuse it at first
- very large fan-out with strict ordering across millions of subjects is not where JetStream shines
- the tooling and UI story is improving but lags the older systems
JetStream wins when “small team, big reliability need, modest-to-large scale” describes you. It loses when “data platform with hundreds of producers and consumers and years of replay” describes you.
8. A comparison that is honest about tradeoffs
| Dimension | Kafka | Pulsar | NATS JetStream |
|---|---|---|---|
| Ordering | strict per-partition | per-partition (shared mode relaxes) | per-subject, per-stream |
| Retention | cheap, long, disk-bound | cheap and tiered to object storage | disk-bound, tiered still maturing |
| Replay | offset-based, excellent | cursor-based, excellent | sequence-based, good |
| Multi-tenancy | patterns on top | first-class (tenants, namespaces) | accounts and isolation, good |
| Geo-replication | MirrorMaker 2 or managed | built-in per topic | super-clusters and mirrors |
| Ops surface | brokers + KRaft + JVM | brokers + BookKeeper + metadata | single binary + Raft |
| Ecosystem | largest | mid | smaller |
| Consumer models | groups | exclusive, shared, failover, key-shared | push, pull, queue, ordered |
| Typical recovery story | partition leader election, rebalance | broker recycle (stateless) + bookie recovery | Raft leader election |
| Sweet spot scale | huge | huge, especially multi-region | medium to large, simple ops |
The table is useful as a shape, not a verdict. The verdict comes from matching the row that hurts most to the broker that handles it best.
9. Pick by failure mode, not throughput
Here is the framework I actually use. For each likely failure, ask which broker recovers in a way you can live with.
Partition or broker leader failover
- Kafka: controller elects a new leader per partition. Fast if partition count is sane, painful if it is extreme. Producer `acks=all` and `enable.idempotence=true` matter.
- Pulsar: brokers are stateless; a new broker picks up the topic. BookKeeper continues serving storage. Usually the fastest recovery story of the three.
- JetStream: Raft leader election per stream. Fast, bounded, predictable.
Slow consumer while producer keeps pushing
- Kafka: backlog grows, retention eventually drops old data on the floor. You need monitoring on consumer lag and disk.
- Pulsar: backlog quota kicks in and you can choose to throttle the producer or drop. Cleanest policy model of the three.
- JetStream: `max_bytes`/`max_age` with a `discard` policy. Simple, explicit.
Disk fills
- Kafka: brokers start refusing writes. Painful if you did not size retention and did not offload.
- Pulsar: tiered storage moves cold segments to object storage before disk pressure becomes terminal.
- JetStream: you hit `max_bytes` and the discard policy decides. Put monitoring on it early.
Cross-region replication lag spike
- Kafka: MirrorMaker 2 or managed replication; lag is visible but recovery is manual tuning.
- Pulsar: per-topic geo-replication with explicit monitoring. Built for this.
- JetStream: super-clusters and mirrors; works well for moderate scale, less battle-tested for heavy global traffic.
Upgrade window
- Kafka: rolling broker restart, controller moves, consumer groups may rebalance. Needs care.
- Pulsar: stateless brokers recycle quickly; bookies need their own rolling plan.
- JetStream: rolling restart of a Go binary. The simplest of the three, by a lot.
Poisoned message or bad deployment
- Kafka: replay from offset. The canonical use case.
- Pulsar: reset cursor or replay by time. Excellent.
- JetStream: redeliver by sequence or time. Good, with a smaller tooling story.
Pick the row that scares you most. The broker that handles that row gracefully is your answer, even if its throughput chart is not the tallest.
10. When none of them is the right answer
There is a fourth option that quietly wins more often than people admit: no broker at all.
If the real need is “when I write a row, I also want to publish an event reliably,” the right pattern is a transactional outbox in your primary database, plus a relay process that ships rows to a broker or directly to consumers.
```javascript
// pseudo-outbox write, same transaction as the business state
await db.tx(async (tx) => {
  await tx.orders.insert(order);
  await tx.outbox.insert({
    id: crypto.randomUUID(),
    type: "order.created",
    payload: order,
    createdAt: new Date(),
  });
});
```

A separate relay reads unsent outbox rows and publishes them downstream with idempotency keys. The database is the source of truth for “did this event happen.” The broker becomes a delivery mechanism, not a commit log.
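The relay half can be sketched too. The stand-ins below (the row shape, the `broker.publish` signature) are illustrative, and a real relay would be async and poll in a loop; the point is that the `sentAt` column is the cursor and the row id doubles as the idempotency key:

```javascript
// Sketch of the relay half of the outbox pattern, with in-memory
// stand-ins for the database and the broker. The row shape and
// broker API here are illustrative; a real relay would be async
// and poll in a loop.
function relayOnce(outbox, broker) {
  const unsent = outbox.rows.filter((r) => !r.sentAt);
  for (const row of unsent) {
    // The outbox row id doubles as an idempotency key downstream, so
    // a crash between publish and the sentAt update produces at worst
    // a duplicate the consumer can detect, never a lost event.
    broker.publish({ idempotencyKey: row.id, type: row.type, payload: row.payload });
    row.sentAt = new Date();
  }
  return unsent.length;
}

const outbox = {
  rows: [
    { id: "e1", type: "order.created", payload: { orderId: 1 } },
    { id: "e2", type: "order.created", payload: { orderId: 2 }, sentAt: new Date() },
  ],
};
const broker = { published: [], publish(evt) { this.published.push(evt); } };

const shipped = relayOnce(outbox, broker);
console.log(shipped, broker.published.length); // 1 1
// Running the relay again ships nothing: the sentAt column is the cursor.
console.log(relayOnce(outbox, broker)); // 0
```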
Why this is often better than reaching for a broker first:
- the atomic boundary matches the business transaction exactly
- replay is a SQL query, not a broker operation
- the team already knows how to operate the database
- you can introduce a broker later without changing producer semantics
Use a broker when you genuinely need fan-out, retention, or decoupled consumption at scale. Do not use one because the architecture diagram looked empty without one.
11. Managed vs self-hosted changes the answer
The broker you pick and the decision to self-host are two decisions, not one.
- managed Kafka is a strong default for most teams; it removes the ops tax that makes Kafka regret-inducing
- managed Pulsar exists but the market is thinner; self-hosting Pulsar without a platform team is a hard path
- managed JetStream offerings are growing, and self-hosting it is genuinely feasible for a small team because the ops surface is small
If you are self-hosting, weight ops cost heavily. If you are on a managed service, weight ecosystem and semantics more. The same broker is a different product depending on who runs it.
12. How I would actually choose in 2026
My rough decision tree, after years of picking these systems wrong at least once each:
- I need an ordered log, long retention, many consumers at different speeds, and a deep ecosystem of connectors and stream processors: Kafka, on a managed service unless there is a strong reason not to.
- I am a platform team serving many internal tenants, need real multi-tenancy and geo-replication, and have the headcount to operate it: Pulsar.
- I am a product team, I want messaging that one backend engineer can own end to end, I value operational simplicity over ecosystem breadth: NATS JetStream.
- I need reliable event publication tied to a database write, and I am not yet dealing with broker-scale fan-out: database outbox first, broker later.
There is nothing wrong with mixing. A product team can run JetStream for internal flows and publish a curated subset to a platform Kafka for the data teams downstream. The two systems handle different failure modes on purpose.
13. The anti-patterns worth naming
Anti-pattern 1: picking the broker by throughput
The team benchmarks three brokers on a clean cluster, picks the fastest, and inherits the worst operational model for their skill set. Six months later they are rewriting it.
Anti-pattern 2: self-hosting because it felt cheap
Self-hosting any of these is a platform commitment. If there is no platform team, the licensing savings evaporate in incident time within the first year.
Anti-pattern 3: using a broker as a database
Storing the truth in the broker and deriving state only from consumers. Works until you need a new consumer, a schema change, or a correctness audit. The broker is a transport, not a source of truth, unless you are explicitly designing an event-sourced system with the discipline to match.
Anti-pattern 4: one broker to rule them all
Forcing request/reply, durable streaming, low-latency pub/sub, and long retention through one system because “we already have it.” Sometimes the right answer is two brokers with two roles.
14. The pragmatic takeaway
Throughput benchmarks are the least interesting thing about a message broker.
Pick the broker whose failure mode you can operate. Pick the one whose recovery story you can explain to a new engineer in an afternoon. Pick the one whose ecosystem matches the integrations you will actually build, not the ones on the brochure.
Kafka rewards teams with platform maturity and a love of the log model. Pulsar rewards teams that need true multi-tenancy and geo-replication and are willing to operate two systems to get them. JetStream rewards teams that want strong guarantees with minimum ceremony.
And sometimes the honest answer is that a well-designed outbox in your primary database would have been enough for another two years, and the broker discussion was premature.
The best broker is the one that makes your worst day shorter. That is the number worth optimizing for.