Root cause
Kafka Connect worker JVM ran out of heap. Common triggers: large initial snapshot (millions of rows buffered), very large individual transactions, or the wal2json plugin (PostgreSQL) which serializes an entire transaction into memory before streaming it.
How to fix
- Increase the Kafka Connect worker heap, for example: export KAFKA_HEAP_OPTS="-Xms1g -Xmx4g"
- PostgreSQL: use plugin.name=pgoutput (recommended; built into PostgreSQL 10+ and streams changes row by row) instead of wal2json, which buffers the whole transaction in memory before emitting it.
- Reduce the number of rows buffered per fetch during snapshots with snapshot.fetch.size.
- For large tables, use incremental snapshots instead of full snapshots.
- Limit captured tables with table.include.list to reduce scope; a configuration sketch combining these settings follows below.
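As a rough illustration of how these connector-level settings fit together, here is a minimal sketch of a Debezium PostgreSQL connector configuration. The connector name, database coordinates, table names, signaling table, and numeric values are placeholders, not recommendations; adjust them to your environment and Debezium version.

```properties
# Debezium PostgreSQL connector properties (sketch; all names/values are placeholders)
name=inventory-connector
connector.class=io.debezium.connector.postgresql.PostgresConnector

database.hostname=db.example.com
database.port=5432
database.user=debezium
database.password=********
database.dbname=inventory

# Topic prefix (Debezium 2.x; older versions use database.server.name instead)
topic.prefix=inventory

# Stream row by row with the built-in pgoutput plugin instead of wal2json
plugin.name=pgoutput

# Capture only the tables you actually need
table.include.list=public.orders,public.customers

# Fewer rows buffered per fetch during the initial snapshot
snapshot.fetch.size=1024

# Enable ad-hoc incremental snapshots via a signaling table
signal.data.collection=public.debezium_signal
incremental.snapshot.chunk.size=1024
```

With signal.data.collection configured, an incremental snapshot of a large table can later be triggered by inserting an execute-snapshot signal row into the signaling table instead of re-running a full snapshot; the exact signal payload is described in the Debezium documentation.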
See the official Debezium documentation for details.