Severity: high | Category: PostgreSQL Replication

FATAL: could not write to file "pg_wal/..." — disk full (no Debezium error, PostgreSQL-side symptom)

Root cause

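When the Debezium connector stops or lags, it no longer confirms its position, so the restart_lsn of its PostgreSQL replication slot stops advancing. PostgreSQL must keep every WAL segment from restart_lsn onward, so pg_wal grows with each write. On high-write databases, 20–50 GB/h of WAL can accumulate silently until the disk is full. Debezium itself reports no error; the failure surfaces only on the PostgreSQL side.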
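
To confirm this diagnosis on a server that is still running, one option is to compare the on-disk size of pg_wal with the state of each slot. The following is a minimal sketch, not an official procedure. It assumes PostgreSQL 10 or later, a connection to the primary, and a role with superuser or pg_monitor privileges (needed for pg_ls_waldir()).

```sql
-- Total size of the WAL directory as reported by PostgreSQL.
SELECT pg_size_pretty(sum(size)) AS wal_dir_size
FROM pg_ls_waldir();

-- Slots ordered by how much WAL they pin. An inactive slot whose
-- restart_lsn does not move between two runs is the likely culprit.
SELECT slot_name,
       active,
       restart_lsn,
       confirmed_flush_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST;
```

If wal_dir_size is close to the retained_wal of a single slot, that slot is what prevents WAL cleanup.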

How to fix

  1. Monitor pg_replication_slots:
    SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS lag FROM pg_replication_slots;
  2. Set max_slot_wal_keep_size = 50GB in postgresql.conf as a safety cap (PG 13+). The default, -1, means no limit. Once a slot retains more WAL than the cap, PostgreSQL invalidates the slot rather than filling the disk.
  3. Set heartbeat.interval.ms in the Debezium connector configuration so that the connector keeps confirming its position and the slot advances even when the captured tables are idle.
  4. Alert when the WAL retained by a slot (the lag value from step 1) exceeds 1 GB. If the slot has already been invalidated, drop it and re-snapshot.
⚠ A frozen slot on a busy PostgreSQL server can fill the disk in hours, causing a full outage. Set max_slot_wal_keep_size and monitor slot lag.
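
Steps 2 and 4 can be sketched in SQL as shown here. This is an illustration under stated assumptions, not a definitive runbook. It assumes PostgreSQL 13 or later (for max_slot_wal_keep_size, wal_status, and safe_wal_size) and superuser access. The slot name my_debezium_slot is a placeholder, not a value taken from this page.

```sql
-- Step 2: cap the WAL that slots may retain. ALTER SYSTEM writes the
-- value to postgresql.auto.conf; a reload is enough to apply it.
ALTER SYSTEM SET max_slot_wal_keep_size = '50GB';
SELECT pg_reload_conf();

-- Step 4: slots that retain more than 1 GB of WAL or are already at risk.
-- wal_status is one of: reserved, extended, unreserved, lost.
SELECT slot_name,
       active,
       wal_status,
       pg_size_pretty(safe_wal_size) AS safe_wal_size,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE wal_status IN ('unreserved', 'lost')
   OR pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) > 1024 * 1024 * 1024;

-- Recovery, only for a slot whose wal_status is 'lost'
-- (my_debezium_slot is a hypothetical name):
-- SELECT pg_drop_replication_slot('my_debezium_slot');
```

The drop statement is left commented out because it cannot be undone. A slot must be inactive before it can be dropped, and the connector then needs a fresh snapshot, as step 4 notes.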