9 min read · postgres · databases · architecture

The Postgres-First Architecture: When One Database Replaces Four Services

pgvector, logical replication, partitioning, LISTEN/NOTIFY, FDWs. Before you add another service to the stack, check whether Postgres alone covers the workload. Often it does.

Rahul Gupta
Senior Software Engineer

Most architecture diagrams I review have the same silhouette. A transactional database. A Redis cluster. A search cluster. A vector database. A queue. Sometimes a separate event bus. Sometimes a columnar store off on the side.

Then I ask what each one carries in production, and the answer is usually small. A few hundred megabytes of cache entries. One index of a few million documents. A handful of consumer groups reacting to state changes that already live in Postgres.

My thesis is simple. A surprising amount of the “modern stack” is Postgres with the right extension turned on. Before you commit to four specialized systems, find out how far one database takes you. For most greenfield teams it is further than they expect, and the ones that skip this step end up paying for complexity they never needed.

I would default to Postgres-first and only peel services out when a specific, measurable limit is hit.

1. The real cost of a polyglot stack

Every additional data system costs more than the line item on the cloud bill.

You pay in:

  • one more failure domain during incidents
  • one more backup and restore story
  • one more access-control surface
  • one more client library to upgrade
  • one more consistency boundary to reason about
  • one more on-call rotation to keep honest

Teams underestimate this every time. The vector database is not just a vector database. It is a service, a schema, a deploy pipeline, a monitoring stack, and a new way your data can go out of sync with the source of truth.

Postgres is already in the stack. Running it harder is almost always cheaper than running it alongside three cousins.

2. pgvector covers most RAG workloads teams will ever build

The first thing I stop teams from buying is a dedicated vector database. If the corpus is under a few tens of millions of vectors and the query volume is reasonable, pgvector is enough.

SQL
create extension if not exists vector;
 
create table documents (
  id          bigserial primary key,
  tenant_id   uuid not null,
  chunk_text  text not null,
  embedding   vector(1536) not null,
  metadata    jsonb not null default '{}'::jsonb,
  created_at  timestamptz not null default now()
);
 
create index on documents
  using hnsw (embedding vector_cosine_ops)
  with (m = 16, ef_construction = 64);
 
create index on documents (tenant_id);
create index on documents using gin (metadata);

Querying the top K neighbors with tenant scoping and metadata filtering:

SQL
select id, chunk_text, metadata,
       1 - (embedding <=> $1) as similarity
from documents
where tenant_id = $2
  and metadata @> '{"source": "handbook"}'::jsonb
order by embedding <=> $1
limit 10;

What you get for free by staying in Postgres:

  • transactional writes alongside the chunk metadata
  • tenant isolation using the same row-level filter you already use
  • joins against business tables (user, document, permission)
  • a single backup, a single restore, a single audit trail

You absolutely will outgrow this at some scale. The question is whether your product is ever going to hit that scale. Most will not. The ones that do usually earn their way into a specialized system by measuring the specific queries that stopped fitting, not by following a reference architecture.
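The `1 - (embedding <=> $1)` expression in the query converts pgvector's cosine distance into a similarity score. A minimal client-side sketch of the same math, handy for sanity-checking results in tests (the function names here are mine, not part of pgvector):

```typescript
// Cosine similarity between two embedding vectors, computed the same way
// vector_cosine_ops does: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// The <=> operator returns cosine *distance*; the similarity the SQL
// selects is simply 1 - distance.
function distanceToSimilarity(distance: number): number {
  return 1 - distance;
}
```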

3. LISTEN/NOTIFY as lightweight pub/sub

LISTEN/NOTIFY is the feature teams forget exists, then rediscover the week they were about to deploy Redis for pub/sub.

SQL
create or replace function notify_order_change()
returns trigger as $$
begin
  perform pg_notify(
    'order_events',
    json_build_object(
      'op', tg_op,
      'id', new.id,
      'tenant_id', new.tenant_id,
      'status', new.status
    )::text
  );
  return new;
end;
$$ language plpgsql;
 
create trigger order_change_notify
after insert or update on orders
for each row execute function notify_order_change();

A worker subscribes once, and reacts as state changes:

TypeScript
import { Client } from "pg";
 
const client = new Client({ connectionString: process.env.DATABASE_URL });
await client.connect();
await client.query("LISTEN order_events");
 
client.on("notification", (msg) => {
  const payload = JSON.parse(msg.payload ?? "{}");
  handleOrderEvent(payload);
});

Where this works well:

  • background workers reacting to row changes
  • cache invalidation signals
  • fan-out to a small number of in-cluster consumers
  • local dev environments that need event-like behavior without a broker

Where it stops:

  • payload size is capped at 8000 bytes
  • delivery is best-effort in-memory, not durable
  • if the listener is offline during the notify, the event is gone
  • high-fanout and cross-region are not its job

I treat LISTEN/NOTIFY as a nudge, not a contract. If the product truly needs durability and replay, I pair it with a proper outbox. If it needs durable cross-service distribution, I reach for a broker. Until then, this is free infrastructure that already ships with the database you are running.
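Because of the payload cap, a pattern I like is to notify with a small pointer and let the consumer re-read the row. A sketch, assuming the trigger payload shape above (the 8000-byte constant is Postgres's documented limit; the fallback shape is my own convention):

```typescript
// NOTIFY payloads are capped at 8000 bytes. Rather than risk a failed
// notify on a large row, fall back to a pointer and let the consumer
// re-read the row, which also avoids acting on a stale inline payload.
const NOTIFY_MAX_BYTES = 8000;

function buildNotifyPayload(row: { id: number; [key: string]: unknown }): string {
  const full = JSON.stringify(row);
  if (new TextEncoder().encode(full).length < NOTIFY_MAX_BYTES) {
    return full; // small enough: ship the row inline
  }
  // Pointer only; the consumer does `select * from orders where id = $1`.
  return JSON.stringify({ id: row.id, truncated: true });
}
```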

4. Logical replication is your CDC, your outbox, and your read replica

Once you turn on logical replication, a lot of services quietly become unnecessary.

SQL
alter system set wal_level = 'logical';
-- restart required
 
create publication app_pub for table orders, payments, users;
 
select pg_create_logical_replication_slot('app_slot', 'pgoutput');

With that one publication you can drive:

  • read replicas without app changes
  • a CDC pipeline into a warehouse or search index
  • an outbox that is the database itself rather than a separate table you have to maintain
  • fan-out into whatever downstream system eventually earns its place

A consumer reads the stream once and decides what to do. The source of truth never forks. You stop writing double-entry bookkeeping code where the app must dual-write to the database and a queue and hope neither step fails.

A common shape I reach for:

  1. The service writes to Postgres in a single transaction.
  2. Logical replication emits the change.
  3. A consumer transforms the change into a domain event and publishes it where it needs to go.
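Step 3 is ordinary application code. A sketch of the transform, assuming a decoded change record shaped roughly like the trigger payload earlier (real pgoutput decoding comes from a library or a tool like Debezium; the event names are illustrative):

```typescript
// A decoded change from the replication stream (shape is illustrative).
interface Change {
  op: "INSERT" | "UPDATE" | "DELETE";
  table: string;
  row: Record<string, unknown>;
}

// The domain event downstream systems consume.
interface DomainEvent {
  type: string;
  payload: Record<string, unknown>;
}

// Step 3: turn low-level row changes into named business events.
// Changes nobody downstream cares about are dropped here, not at the source.
function toDomainEvent(change: Change): DomainEvent | null {
  if (change.table === "orders" && change.op === "UPDATE" && change.row.status === "paid") {
    return {
      type: "order.paid",
      payload: { id: change.row.id, tenantId: change.row.tenant_id },
    };
  }
  return null;
}
```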

That is effectively a transactional outbox without a hand-rolled outbox table. You trade a little operational discipline around replication slots for a lot of correctness. Slots can also be a liability. An abandoned slot pins WAL and fills the disk. Monitor pg_replication_slots.active and slot lag, and alert on both.
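One query covers both alerts, using the standard catalog columns:

```sql
select slot_name,
       active,
       pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) as retained_wal
from pg_replication_slots;
```

An inactive slot whose retained_wal keeps growing is the disk-filling failure mode. Alert on it before it pages you.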

5. Declarative partitioning solves the time-series problem most apps actually have

Most teams do not have a time-series problem. They have a “table that got big because it records one row per event forever” problem. Declarative partitioning is the right tool for that, and it has been production-ready for years.

SQL
create table events (
  id          bigserial,
  tenant_id   uuid not null,
  occurred_at timestamptz not null,
  kind        text not null,
  payload     jsonb not null
) partition by range (occurred_at);
 
create table events_2026_04 partition of events
  for values from ('2026-04-01') to ('2026-05-01');
 
create table events_2026_05 partition of events
  for values from ('2026-05-01') to ('2026-06-01');
 
create index on events_2026_04 (tenant_id, occurred_at desc);
create index on events_2026_05 (tenant_id, occurred_at desc);

Retention becomes a detach, not a delete. Dropping a month of data is instant:

SQL
alter table events detach partition events_2026_04 concurrently;
drop table events_2026_04;
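Retention handles the old end; the new end needs next month's partition created before traffic arrives. That is worth automating from a scheduled job. A sketch of a helper that generates the DDL, matching the events_YYYY_MM naming above (the helper itself is mine, not a Postgres feature):

```typescript
// Generates DDL for one monthly range partition of `table`, following the
// events_YYYY_MM naming used above. Run from a cron-style job that always
// keeps next month's partition ready.
function monthlyPartitionDdl(table: string, year: number, month: number): string {
  const pad = (n: number) => String(n).padStart(2, "0");
  const from = `${year}-${pad(month)}-01`;
  // Roll December over into January of the next year.
  const [nextYear, nextMonth] = month === 12 ? [year + 1, 1] : [year, month + 1];
  const to = `${nextYear}-${pad(nextMonth)}-01`;
  const name = `${table}_${year}_${pad(month)}`;
  return (
    `create table if not exists ${name} partition of ${table}\n` +
    `  for values from ('${from}') to ('${to}');\n` +
    `create index if not exists ${name}_tenant_idx on ${name} (tenant_id, occurred_at desc);`
  );
}
```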

What this replaces:

  • a separate time-series store for moderate volumes
  • a “cold storage” service added because deleting from one giant table took hours
  • expensive background jobs that chew through WAL trying to reclaim space

If your write volume eventually outgrows one Postgres node, you will move. Until then, partitioning plus a sensible retention policy carries a lot of workloads people assume need Cassandra or ClickHouse.

6. JSONB with GIN is the schemaless escape hatch

Half the time a team asks for a document database, what they actually want is schema flexibility for a small slice of their data. Config. Metadata. Provider-specific fields that change as integrations come and go.

jsonb plus a GIN index handles that without splitting the stack.

SQL
alter table integrations
  add column config jsonb not null default '{}'::jsonb;
 
create index on integrations using gin (config jsonb_path_ops);
 
-- queries
select id, name
from integrations
where config @> '{"provider":"stripe","mode":"live"}';
 
select id, config->'limits'->>'monthly_cap' as cap
from integrations
where config ? 'limits';

The rule I keep in my head: structured fields go into columns, flexible fields go into one well-named jsonb column per table. Do not throw everything into a single data blob because you did not want to write a migration. That habit is how you end up with the worst parts of both worlds.

JSONB will not replace a search engine. Full-text relevance, tokenizers, synonyms, language stemming across dozens of locales, that is a specialist problem. For tag filters, attribute lookups, and “find me the row where config matches this”, Postgres is enough.
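One discipline that keeps the jsonb column honest is validating its shape at the application boundary, so “flexible” never drifts into “unknown”. A minimal sketch against the example `config` used in the queries above (the shape is an example, not a prescription):

```typescript
// The flexible jsonb column still deserves a checked shape at the app
// boundary. This validates the example `config` used in the queries above.
interface IntegrationConfig {
  provider: string;
  mode: "live" | "test";
  limits?: { monthly_cap?: string };
}

function parseConfig(raw: unknown): IntegrationConfig {
  if (typeof raw !== "object" || raw === null) throw new Error("config must be an object");
  const c = raw as Record<string, unknown>;
  if (typeof c.provider !== "string") throw new Error("config.provider is required");
  if (c.mode !== "live" && c.mode !== "test") throw new Error("config.mode must be live|test");
  return c as unknown as IntegrationConfig;
}
```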

7. Foreign Data Wrappers keep integrations inside SQL

postgres_fdw and friends let you query external systems as if they were local tables. You do not build another ingestion pipeline for a lookup table that changes once a week. You read it through.

SQL
create extension if not exists postgres_fdw;
 
create server analytics_db
  foreign data wrapper postgres_fdw
  options (host 'analytics.internal', port '5432', dbname 'warehouse');
 
create user mapping for current_user
  server analytics_db
  options (user 'readonly', password 'redacted');
 
import foreign schema public
  limit to (customer_ltv_daily)
  from server analytics_db into reports;
 
select o.id, o.total, l.ltv
from orders o
join reports.customer_ltv_daily l on l.customer_id = o.customer_id
where o.created_at > now() - interval '7 days';

Use this for:

  • read-through access to a warehouse or a legacy system
  • stitching tenant data that lives in a sibling database
  • small reference tables that do not deserve their own pipeline

Do not use FDWs as a hot path for user-facing traffic. They are a way to avoid building a pipeline, not a way to avoid network latency. Treat them as “the query plan crosses a WAN, price accordingly” and they earn their place.

8. Full-text search that ships in the box

Before you deploy a search cluster, check whether Postgres full-text search handles the workload.

SQL
alter table articles
  add column search_tsv tsvector
  generated always as (
    to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))
  ) stored;
 
create index on articles using gin (search_tsv);
 
select id, title,
       ts_rank(search_tsv, plainto_tsquery('english', $1)) as rank
from articles
where search_tsv @@ plainto_tsquery('english', $1)
order by rank desc
limit 20;

For admin panels, help centers, internal tools, and a lot of product search, that is the entire feature. Relevance tuning, multilingual corpora, typo tolerance, learning-to-rank, faceted aggregations at tens of millions of documents, those are reasons to leave. “We might want search someday” is not.
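One wrinkle for typeahead: plainto_tsquery matches whole lexemes only. A common workaround is to build a prefix-matching string for to_tsquery from the user's input. A hand-rolled sketch, deliberately ASCII-only so the output is always valid tsquery syntax (the function is mine, not a Postgres API):

```typescript
// Builds a to_tsquery() argument that prefix-matches each word the user
// typed, e.g. "postgres part" -> "postgres:* & part:*". Anything that is
// not alphanumeric is stripped so the string cannot break tsquery syntax.
function toPrefixTsquery(input: string): string {
  return input
    .split(/\s+/)
    .map((w) => w.replace(/[^a-zA-Z0-9]/g, ""))
    .filter((w) => w.length > 0)
    .map((w) => `${w}:*`)
    .join(" & ");
}
```

Pass the result as `to_tsquery('english', $1)` in place of `plainto_tsquery` in the query above when you want search-as-you-type behavior.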

9. Where Postgres honestly stops

Being Postgres-first is not the same as being Postgres-only. There are places where it genuinely is the wrong tool.

  • Sub-millisecond cache lookups at very high QPS. A single-digit-millisecond Postgres query is great. If your product needs a p99 of 500 microseconds for hot reads at hundreds of thousands of requests per second, a dedicated cache is the right answer.
  • High-fanout durable streaming. Many consumer groups, durable replay over days, backpressure-aware partitioning, ordering guarantees per key at scale. That is broker territory. Do not pretend LISTEN/NOTIFY replaces Kafka or Pulsar past a certain load.
  • Cross-region active-active writes. Postgres logical replication is great for follower reads and CDC. It is not a multi-master story. If you genuinely need writes in multiple regions with low latency for each, pick a database designed for it, do not bolt it onto Postgres.
  • Very large vector corpora with heavy filtering. pgvector with HNSW is excellent up to a point. If you need billion-scale vectors with complex pre-filters and strict latency SLAs, a dedicated vector engine will outperform it.
  • Analytical workloads over wide columnar scans. Postgres can run ad-hoc analytics. It is not a columnar engine. If analysts are scanning terabytes for dashboards, push that to a warehouse and keep the transactional database for transactions.

The good news is that you can usually tell when you are approaching these limits, because a specific query or workload starts dominating your pain. That is a much better reason to adopt a specialized system than a reference architecture slide.

10. The operational side is where Postgres-first earns its keep

One database to back up. One database to restore. One set of metrics to watch. One connection pool to tune. One upgrade cycle. One security posture.

Teams underestimate how much senior engineering time goes to paying operational taxes. Every extra service multiplies them. Keeping the stack small is not laziness, it is focus. You get to spend your attention on the product, not on why the queue and the search index disagree about reality at 2 a.m.

The things worth investing in early if you commit to this approach:

  • a realistic connection pooler (PgBouncer in transaction mode for most apps)
  • alerting on replication slot lag and unused slots
  • autovacuum tuning for the big tables, not just defaults
  • partition management automation so nobody forgets to create next month’s partition
  • index bloat monitoring, not because it is frequent, but because it is invisible until it hurts

None of these are exotic. They are the boring work that makes one database do the job of four.

11. A decision rule for greenfield teams

Here is the rule I hand to teams starting fresh:

  1. Start with Postgres as the only data system in the stack.
  2. Use jsonb for flexible fields, partitioning for big tables, and full-text search where it fits.
  3. Turn on logical replication the day you need any kind of “something else needs to see this change” pipeline.
  4. Add pgvector the day you need embeddings. Do not evaluate vector databases on day one.
  5. Introduce a cache only when you have measured a specific hot-read workload that Postgres cannot serve cheaply.
  6. Introduce a broker only when you have more than one durable consumer, or replay over time actually matters.
  7. Introduce a search cluster only when Postgres full-text search can no longer meet the relevance or scale requirement you can describe concretely.
  8. Introduce a warehouse only when analysts, not app traffic, are the bottleneck.

Every one of those steps asks for evidence before buying a new system. That is the whole trick.

12. The takeaway

Most new services I see added to a stack could have been one more extension, one more index, or one more carefully designed Postgres table. The polyglot architecture is not automatically more scalable. It is automatically more expensive, more fragile, and more complex to operate.

If you are deciding how to build the next system, write down every data responsibility you think you need. Next to each one, write which Postgres feature already covers it. The ones that remain are the ones worth buying specialized systems for. Everything else is just one database, doing its job, quietly, for far longer than the reference architectures on the internet suggest.
