Every broker benchmark you will read in 2026 is correct and useless at the same time.
Correct, because the numbers are real. Useless, because nobody runs a broker on a clean cluster with warm caches and a friendly network.
The real question is not “which broker pushes more messages per second.” It is “which broker’s failure mode can my team actually survive at 3 a.m. with one person on call and a partial network partition.”
That reframe changes the decision entirely. Throughput becomes a tiebreaker, not a driver. Operational cost, recovery behavior, and replay semantics become the main axes.
1. Benchmarks lie because they measure the wrong thing
Most broker benchmarks measure one broker, one topic, one producer, one consumer, one region, one network. In production you have none of those things.
What actually decides whether a broker works for you:
- how it behaves when a broker goes down mid-write
- how it behaves when a consumer is slower than the producer for hours
- how it behaves when the disk fills to 90 percent
- how long it takes to recover a lost partition leader
- what an upgrade costs you in downtime and ops time
- what a cross-region replication lag spike looks like to consumers
None of that shows up on a throughput chart. All of it shows up in your incident channel.
I pick brokers by what happens when things break, not by how fast they run when they don’t.
2. What Kafka is actually good at
Kafka is a partitioned, ordered, durable log. That is the whole product. Everything else is ecosystem.
Its strengths are specific and hard to replicate:
- strict per-partition ordering
- cheap long retention on disk
- replay from arbitrary offsets
- mature consumer group semantics
- the largest ecosystem in messaging (Connect, Streams, Flink, Debezium, ksqlDB, a dozen cloud managed options)
- well-understood operational model after a decade of war stories
If your workload is “an ordered log of facts that many consumers read at different speeds and occasionally rewind,” Kafka is still the default answer.
A minimal producer config looks friendly:
```yaml
bootstrap.servers: kafka-0:9092,kafka-1:9092,kafka-2:9092
acks: all
enable.idempotence: true
max.in.flight.requests.per.connection: 5
compression.type: zstd
linger.ms: 10
```

That config is not where the cost hides.
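The model behind those strengths is small enough to sketch. The following is a toy in-memory version of Kafka's core abstraction, not the Kafka client API: an append-only log where consumers track their own offsets, so replay is just "start reading earlier".

```javascript
// Toy sketch of Kafka's core abstraction: an append-only log where
// consumers track their own offsets and can rewind at will.
// Illustrative only; this is not the Kafka client API.
class PartitionLog {
  constructor() {
    this.records = [];
  }
  append(record) {
    this.records.push(record);
    return this.records.length - 1; // offset of the new record
  }
  // Read from an arbitrary offset: replay is just "start earlier".
  readFrom(offset, max = 10) {
    return this.records.slice(offset, offset + max);
  }
}

const log = new PartitionLog();
log.append("order-1");
log.append("order-2");
log.append("order-3");

// A consumer at offset 0 and a caught-up consumer at offset 2 read
// the same immutable records; neither affects the other.
console.log(log.readFrom(0)); // ["order-1", "order-2", "order-3"]
console.log(log.readFrom(2)); // ["order-3"]
```

Everything in the strengths list above, from consumer groups to replay, is machinery layered on this one idea.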
3. The Kafka tax
Kafka’s tax is operational, not conceptual.
- KRaft has replaced ZooKeeper for most teams, but KRaft still has its own tuning surface, controller quorum concerns, and a migration story if you are on an older cluster
- broker JVM tuning is real work: page cache behavior, GC tuning on large heaps, file descriptor and mmap limits, segment sizing
- rebalance storms during consumer group changes can freeze consumption for seconds or minutes depending on group size and partition count
- partition count is a capacity-planning decision that is hard to reverse; too few and you cannot scale consumers, too many and recovery and metadata become painful
- cross-AZ replication costs real money, and cross-region MirrorMaker 2 or managed replication is its own project
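The partition-count point deserves one concrete illustration. Producers route by `hash(key) % partitionCount`, so changing the count remaps keys to different partitions and breaks per-key ordering across the resize. The hash below is illustrative (the real Kafka default partitioner uses murmur2), but the consequence is the same for any modulo-based scheme:

```javascript
// Why partition count is hard to reverse: producers route a key with
// hash(key) % partitionCount, so changing the count remaps keys.
// (Illustrative hash; Kafka's default partitioner uses murmur2, but
// the consequence is identical for any modulo-based partitioner.)
function hashKey(key) {
  let h = 0;
  for (const ch of key) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h;
}

function partitionFor(key, partitionCount) {
  return hashKey(key) % partitionCount;
}

const before = partitionFor("customer-42", 12);
const after = partitionFor("customer-42", 24);
// With 12 partitions the key lands on one partition; with 24 it may
// land on another, so per-key ordering across the resize is lost.
console.log({ before, after });
```

That is why partition count is a day-one capacity decision rather than a knob you turn later.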
Kafka is wonderful when you have a platform team that treats it as a first-class product. It is painful when one backend team inherits it as a side concern.
The honest test: if nobody on the team can describe what happens during a controller failover, you are not ready to self-host Kafka. Use a managed service or pick a simpler broker.
4. What Pulsar is actually good at
Pulsar’s core differentiator is the separation of compute from storage. Brokers are stateless. Storage lives in Apache BookKeeper. That split gives you properties Kafka has to work much harder to match.
Where Pulsar genuinely wins:
- tiered storage is first-class; old segments offload to object storage automatically
- geo-replication is a built-in, per-topic feature, not a bolt-on
- multi-tenancy is real: tenants, namespaces, quotas, and isolation are part of the model, not patterns layered on top
- stateless brokers mean faster broker recovery and easier horizontal scaling of the serving layer
- both queue and streaming semantics on the same system, with shared, failover, key-shared, and exclusive subscription modes
For platform teams running messaging as a service for many internal teams, Pulsar’s tenancy model is the cleanest on the market.
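The key-shared subscription mode from the list above is worth a sketch, because it is the mode Kafka users miss most. Messages with the same key are pinned to one consumer, preserving per-key order, while different keys spread across the group. Pulsar actually assigns hash ranges per consumer; the modulo version below is a simplification that makes the same point:

```javascript
// Toy sketch of Pulsar's key-shared subscription: each key is pinned
// to one consumer so per-key order survives, while different keys
// spread across the group. (Pulsar really assigns hash ranges per
// consumer; a simple modulo illustrates the same property.)
function keySharedDispatch(messages, consumerCount) {
  const assignments = Array.from({ length: consumerCount }, () => []);
  for (const msg of messages) {
    let h = 0;
    for (const ch of msg.key) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
    assignments[h % consumerCount].push(msg);
  }
  return assignments;
}

const messages = [
  { key: "order-1", seq: 1 },
  { key: "order-2", seq: 2 },
  { key: "order-1", seq: 3 },
];
const byConsumer = keySharedDispatch(messages, 2);
// Both "order-1" messages land on the same consumer, in order.
console.log(byConsumer);
```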
A Pulsar topic policy makes the tenancy story concrete:
```yaml
tenant: payments
namespace: payments/prod
policies:
  retention:
    time: 7d
    size: 100G
  backlog_quota:
    limit: 50G
    policy: producer_request_hold
  replication_clusters: [ap-south-1, ap-southeast-1]
  offload:
    threshold: 20G
    driver: aws-s3
```

That single block is something you would be stitching together with tooling on Kafka.
5. The Pulsar tax
Pulsar’s tax is architectural complexity.
- you are operating two distributed systems, not one: brokers and BookKeeper bookies, plus a metadata store
- BookKeeper’s ledger model is powerful but has its own failure shapes: ensemble size, write quorum, ack quorum, and auto-recovery behavior are concepts the on-call needs to hold in their head
- the ecosystem, while growing, is still smaller than Kafka’s; fewer connectors, fewer third-party tools, fewer Stack Overflow answers at 2 a.m.
- client library maturity varies by language; the Java client is excellent, others trail
- managed Pulsar options exist but the market is thinner than managed Kafka
Pulsar rewards teams that invest in it. It punishes teams that expected a drop-in Kafka replacement.
If the primary selling point for your team is “we want geo-replication and true multi-tenancy,” Pulsar earns its complexity. If it is “we heard it is faster,” you will regret it.
6. What NATS JetStream is actually good at
JetStream is the one I reach for when operational simplicity is the dominant constraint.
- a single Go binary, static, no JVM, no external metadata store in the common case
- clustering is built on Raft and is far less ceremony than Kafka or Pulsar
- subject-based routing is genuinely expressive; wildcards like `orders.*.created` work naturally
- request/reply, pub/sub, and durable streaming live in one system with one mental model
- built-in key-value and object store APIs on top of streams for simple stateful use cases
- edge-friendly: leaf nodes, super-clusters, and deployment topologies that Kafka and Pulsar do not really address
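The subject expressiveness is easy to demonstrate. In NATS, `*` matches exactly one token and `>` matches one or more trailing tokens; the matcher below is a simplified sketch of that rule, ignoring the validation the real server does:

```javascript
// Simplified sketch of NATS subject matching: "*" matches exactly one
// token, ">" matches one or more trailing tokens. (The real server
// does more validation; the matching rule itself is this simple.)
function subjectMatches(pattern, subject) {
  const p = pattern.split(".");
  const s = subject.split(".");
  for (let i = 0; i < p.length; i++) {
    if (p[i] === ">") return s.length > i; // swallow the rest
    if (i >= s.length) return false;
    if (p[i] !== "*" && p[i] !== s[i]) return false;
  }
  return p.length === s.length;
}

console.log(subjectMatches("orders.*.created", "orders.eu.created")); // true
console.log(subjectMatches("orders.>", "orders.eu.created.v2"));      // true
console.log(subjectMatches("orders.*.created", "orders.created"));    // false
```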
A JetStream stream definition reads like something a backend engineer can own end to end:
```json
{
  "name": "ORDERS",
  "subjects": ["orders.>"],
  "retention": "limits",
  "max_age": 604800000000000,
  "max_bytes": 107374182400,
  "storage": "file",
  "replicas": 3,
  "discard": "old",
  "duplicate_window": 120000000000
}
```

That is the full configuration. No tuning essay required. (The durations are nanoseconds: `max_age` is 7 days, `duplicate_window` is 2 minutes.)
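The `duplicate_window` field is the quiet star of that definition: JetStream deduplicates publishes by message ID inside the window, which is how you get idempotent producers without application-level bookkeeping. A toy sketch of the idea, not the implementation:

```javascript
// Toy sketch of what "duplicate_window" buys you: the server remembers
// message IDs for a window and drops re-publishes of the same ID.
// (JetStream keys this on a message-ID header; this is the idea,
// not the server's implementation.)
class DedupWindow {
  constructor(windowMs) {
    this.windowMs = windowMs;
    this.seen = new Map(); // msgId -> timestamp of first publish
  }
  publish(msgId, now = Date.now()) {
    const first = this.seen.get(msgId);
    if (first !== undefined && now - first < this.windowMs) {
      return false; // duplicate inside the window: dropped
    }
    this.seen.set(msgId, now);
    return true; // stored
  }
}

const stream = new DedupWindow(120_000);
console.log(stream.publish("evt-1", 0));       // true: stored
console.log(stream.publish("evt-1", 5_000));   // false: duplicate
console.log(stream.publish("evt-1", 130_000)); // true: window expired
```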
7. The JetStream limits
JetStream is not magic.
- at true multi-terabyte, multi-month retention with heavy replay, Kafka and Pulsar are still more battle-tested
- ecosystem depth is thinner; if you need Debezium, Flink, ksqlDB, or dozens of off-the-shelf connectors, JetStream will feel sparse
- per-subject ordering and consumer semantics are different from Kafka partitions; teams coming from Kafka often misuse it at first
- very large fan-out with strict ordering across millions of subjects is not where JetStream shines
- the tooling and UI story is improving but lags the older systems
JetStream wins when “small team, big reliability need, modest-to-large scale” describes you. It loses when “data platform with hundreds of producers and consumers and years of replay” describes you.
8. A comparison that is honest about tradeoffs
| Dimension | Kafka | Pulsar | NATS JetStream |
|---|---|---|---|
| Ordering | strict per-partition | per-partition (shared mode relaxes) | per-subject, per-stream |
| Retention | cheap, long, disk-bound | cheap and tiered to object storage | disk-bound, tiered still maturing |
| Replay | offset-based, excellent | cursor-based, excellent | sequence-based, good |
| Multi-tenancy | patterns on top | first-class (tenants, namespaces) | accounts and isolation, good |
| Geo-replication | MirrorMaker 2 or managed | built-in per topic | super-clusters and mirrors |
| Ops surface | brokers + KRaft + JVM | brokers + BookKeeper + metadata | single binary + Raft |
| Ecosystem | largest | mid | smaller |
| Consumer models | groups | exclusive, shared, failover, key-shared | push, pull, queue, ordered |
| Typical recovery story | partition leader election, rebalance | broker recycle (stateless) + bookie recovery | Raft leader election |
| Sweet spot scale | huge | huge, especially multi-region | medium to large, simple ops |
The table is useful as a shape, not a verdict. The verdict comes from matching the row that hurts most to the broker that handles it best.
9. Pick by failure mode, not throughput
Here is the framework I actually use. For each likely failure, ask which broker recovers in a way you can live with.
Partition or broker leader failover
- Kafka: controller elects a new leader per partition. Fast if partition count is sane, painful if it is extreme. Producer `acks=all` and `enable.idempotence=true` matter.
- Pulsar: brokers are stateless; a new broker picks up the topic. BookKeeper continues serving storage. Usually the fastest recovery story of the three.
- JetStream: Raft leader election per stream. Fast, bounded, predictable.
Slow consumer while producer keeps pushing
- Kafka: backlog grows, retention eventually drops old data on the floor. You need monitoring on consumer lag and disk.
- Pulsar: backlog quota kicks in and you can choose to throttle the producer or drop. Cleanest policy model of the three.
- JetStream: `max_bytes`/`max_age` with a `discard` policy. Simple, explicit.
Disk fills
- Kafka: brokers start refusing writes. Painful if you did not size retention and did not offload.
- Pulsar: tiered storage moves cold segments to object storage before disk pressure becomes terminal.
- JetStream: you hit `max_bytes` and the discard policy decides. Put monitoring on it early.
Cross-region replication lag spike
- Kafka: MirrorMaker 2 or managed replication; lag is visible but recovery is manual tuning.
- Pulsar: per-topic geo-replication with explicit monitoring. Built for this.
- JetStream: super-clusters and mirrors; works well for moderate scale, less battle-tested for heavy global traffic.
Upgrade window
- Kafka: rolling broker restart, controller moves, consumer groups may rebalance. Needs care.
- Pulsar: stateless brokers recycle quickly; bookies need their own rolling plan.
- JetStream: rolling restart of a Go binary. The simplest of the three, by a lot.
Poisoned message or bad deployment
- Kafka: replay from offset. The canonical use case.
- Pulsar: reset cursor or replay by time. Excellent.
- JetStream: redeliver by sequence or time. Good, with a smaller tooling story.
Pick the row that scares you most. The broker that handles that row gracefully is your answer, even if its throughput chart is not the tallest.
10. When none of them is the right answer
There is a fourth option that quietly wins more often than people admit: no broker at all.
If the real need is “when I write a row, I also want to publish an event reliably,” the right pattern is a transactional outbox in your primary database, plus a relay process that ships rows to a broker or directly to consumers.
```javascript
// pseudo-outbox write, same transaction as the business state
await db.tx(async (tx) => {
  await tx.orders.insert(order);
  await tx.outbox.insert({
    id: crypto.randomUUID(),
    type: "order.created",
    payload: order,
    createdAt: new Date(),
  });
});
```

A separate relay reads unsent outbox rows and publishes them downstream with idempotency keys. The database is the source of truth for “did this event happen.” The broker becomes a delivery mechanism, not a commit log.
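The relay half can be sketched too. The stand-ins below (the row shape, the `broker.publish` signature) are illustrative, and a real relay would be async and poll in a loop; the point is that the `sentAt` column is the cursor and the row id doubles as the idempotency key:

```javascript
// Sketch of the relay half of the outbox pattern, with in-memory
// stand-ins for the database and the broker. The row shape and
// broker API here are illustrative; a real relay would be async
// and poll in a loop.
function relayOnce(outbox, broker) {
  const unsent = outbox.rows.filter((r) => !r.sentAt);
  for (const row of unsent) {
    // The outbox row id doubles as an idempotency key downstream, so
    // a crash between publish and the sentAt update produces at worst
    // a duplicate the consumer can detect, never a lost event.
    broker.publish({ idempotencyKey: row.id, type: row.type, payload: row.payload });
    row.sentAt = new Date();
  }
  return unsent.length;
}

const outbox = {
  rows: [
    { id: "e1", type: "order.created", payload: { orderId: 1 } },
    { id: "e2", type: "order.created", payload: { orderId: 2 }, sentAt: new Date() },
  ],
};
const broker = { published: [], publish(evt) { this.published.push(evt); } };

const shipped = relayOnce(outbox, broker);
console.log(shipped, broker.published.length); // 1 1
// Running the relay again ships nothing: the sentAt column is the cursor.
console.log(relayOnce(outbox, broker)); // 0
```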
Why this is often better than reaching for a broker first:
- the atomic boundary matches the business transaction exactly
- replay is a SQL query, not a broker operation
- the team already knows how to operate the database
- you can introduce a broker later without changing producer semantics
Use a broker when you genuinely need fan-out, retention, or decoupled consumption at scale. Do not use one because the architecture diagram looked empty without one.
11. Managed vs self-hosted changes the answer
The broker you pick and the decision to self-host are two decisions, not one.
- managed Kafka is a strong default for most teams; it removes the ops tax that makes Kafka regret-inducing
- managed Pulsar exists but the market is thinner; self-hosting Pulsar without a platform team is a hard path
- managed JetStream offerings are growing, and self-hosting it is genuinely feasible for a small team because the ops surface is small
If you are self-hosting, weight ops cost heavily. If you are on a managed service, weight ecosystem and semantics more. The same broker is a different product depending on who runs it.
12. How I would actually choose in 2026
My rough decision tree, after years of picking these systems wrong at least once each:
- I need an ordered log, long retention, many consumers at different speeds, and a deep ecosystem of connectors and stream processors: Kafka, on a managed service unless there is a strong reason not to.
- I am a platform team serving many internal tenants, need real multi-tenancy and geo-replication, and have the headcount to operate it: Pulsar.
- I am a product team, I want messaging that one backend engineer can own end to end, I value operational simplicity over ecosystem breadth: NATS JetStream.
- I need reliable event publication tied to a database write, and I am not yet dealing with broker-scale fan-out: database outbox first, broker later.
There is nothing wrong with mixing. A product team can run JetStream for internal flows and publish a curated subset to a platform Kafka for the data teams downstream. The two systems handle different failure modes on purpose.
13. The anti-patterns worth naming
Anti-pattern 1: picking the broker by throughput
The team benchmarks three brokers on a clean cluster, picks the fastest, and inherits the worst operational model for their skill set. Six months later they are rewriting it.
Anti-pattern 2: self-hosting because it felt cheap
Self-hosting any of these is a platform commitment. If there is no platform team, the licensing savings evaporate in incident time within the first year.
Anti-pattern 3: using a broker as a database
Storing the truth in the broker and deriving state only from consumers. Works until you need a new consumer, a schema change, or a correctness audit. The broker is a transport, not a source of truth, unless you are explicitly designing an event-sourced system with the discipline to match.
Anti-pattern 4: one broker to rule them all
Forcing request/reply, durable streaming, low-latency pub/sub, and long retention through one system because “we already have it.” Sometimes the right answer is two brokers with two roles.
14. The pragmatic takeaway
Throughput benchmarks are the least interesting thing about a message broker.
Pick the broker whose failure mode you can operate. Pick the one whose recovery story you can explain to a new engineer in an afternoon. Pick the one whose ecosystem matches the integrations you will actually build, not the ones on the brochure.
Kafka rewards teams with platform maturity and a love of the log model. Pulsar rewards teams that need true multi-tenancy and geo-replication and are willing to operate two systems to get them. JetStream rewards teams that want strong guarantees with minimum ceremony.
And sometimes the honest answer is that a well-designed outbox in your primary database would have been enough for another two years, and the broker discussion was premature.
The best broker is the one that makes your worst day shorter. That is the number worth optimizing for.