# Ensure Kafka Consumer Exactly-Once Delivery
Exactly-once consumption with manual commits and read-committed isolation.
Recommended starting points curated by Conduktor. Always benchmark with your workload. Some broker configs are not available on managed services (AWS MSK, Confluent Cloud) — check your provider's documentation.
| Config | Change | Why |
|---|---|---|
| **Consumer Group** | | |
| `enable.auto.commit` (Kafka 0.9.0+) | `true` → `false` | Auto-commit is fundamentally incompatible with strong durability: it commits offsets on a timer regardless of whether records have been successfully processed. If the consumer crashes between auto-commit and processing completion, those records are silently skipped. Manual `commitSync()` after confirmed processing is the only safe pattern for durable consumers. • `commitSync()` blocks the processing thread until the coordinator acknowledges the commit; under coordinator unavailability this adds seconds of stall to the processing loop. Use `commitAsync()` with a retry callback for production systems. |
| `isolation.level` *(caution)* (Kafka 0.11.0+) | `read_uncommitted` → `read_committed` | `read_committed` ensures the consumer only receives records that belong to committed transactions. Aborted transactional records are filtered at the broker and never delivered to the consumer. This is the only safe mode when consuming from producers that use Kafka transactions, preventing processing of data the producer intended to roll back. • Adds read latency proportional to the producer's transaction commit delay (typically 10-100 ms). The consumer will not advance past an open transaction boundary, so a slow or hung transactional producer stalls consumption of *all* records past that offset, including non-transactional ones. |
| `auto.offset.reset` *(caution)* (Kafka 0.9.0+) | `latest` → `earliest` | For durable consumers, offset loss (e.g., after consumer group deletion or offset retention expiry) must result in reprocessing from the beginning, not skipping forward to the latest offset. `earliest` guarantees no records are missed on first startup or offset reset. Monitor for this case explicitly via consumer group offset metrics. • If offsets are genuinely lost after the topic's retention period has passed, `earliest` replays from the oldest retained record, not from the beginning of all history. Design for idempotent processing to handle replays safely. |
| `group.instance.id` *(caution)* (Kafka 2.3+) | `null` → a stable per-instance ID | Static membership eliminates the rebalance triggered by consumer restarts. When a durable consumer restarts (e.g., after a processing error), the broker recognizes its stable ID and does not trigger a full group rebalance within `session.timeout.ms`. This prevents the duplicate-processing burst that rebalances cause in at-least-once consumers. • If the same instance ID reconnects from a different host (e.g., after a container is rescheduled to another node), the broker fences the previous session. Ensure your orchestration system either maintains stable IDs per pod or handles `FencedInstanceIdException` gracefully. |
| **Fetching** | | |
| `max.poll.records` (Kafka 0.10.0+) | `500` → `100` | Smaller batches reduce the reprocessing blast radius: if processing fails midway through a 500-record batch, all 500 records must be reprocessed; with 100-record batches, at most 100 are. This directly limits the duplicates-per-failure window in at-least-once systems. • Requires 5× more `poll()` calls and 5× more offset commits to process the same volume of records, increasing overhead and commit traffic to the group coordinator. |
| **Partitioning** | | |
| `partition.assignment.strategy` *(caution)* (Kafka 2.4+) | `org.apache.kafka.clients.consumer.RangeAssignor, org.apache.kafka.clients.consumer.CooperativeStickyAssignor` → `org.apache.kafka.clients.consumer.CooperativeStickyAssignor` | `CooperativeStickyAssignor` performs incremental rebalancing: only partitions that must move are revoked; all others keep processing without interruption. For durable consumers this eliminates the "stop-the-world" rebalance gap where no consumer commits, which is the primary window for duplicate records in at-least-once systems. • All group members must use the same assignor; a mixed-strategy group during a rolling upgrade reverts to eager rebalancing, so a coordinated rollout is required. Kafka 2.4+ required. |
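Taken together, the rows above amount to a consumer configuration like the following sketch. It uses plain `java.util.Properties` with string keys so it compiles without the Kafka client on the classpath; the group ID and instance ID shown in the usage example are placeholder values you would supply per deployment:

```java
import java.util.Properties;

public class ExactlyOnceConsumerConfig {
    // Builds the consumer settings recommended in the table above.
    // instanceId must be stable and unique per consumer instance
    // for static membership to work as intended.
    public static Properties build(String groupId, String instanceId) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        // Commit offsets manually, only after processing succeeds.
        props.setProperty("enable.auto.commit", "false");
        // Never deliver records from aborted transactions.
        props.setProperty("isolation.level", "read_committed");
        // On offset loss, replay from the oldest retained record.
        props.setProperty("auto.offset.reset", "earliest");
        // Static membership: restarts do not trigger a full rebalance.
        props.setProperty("group.instance.id", instanceId);
        // Cap the reprocessing blast radius per failed batch.
        props.setProperty("max.poll.records", "100");
        // Incremental (cooperative) rebalancing.
        props.setProperty("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical group/instance names for illustration only.
        Properties p = build("orders-durable", "orders-durable-0");
        System.out.println(p.getProperty("enable.auto.commit"));
        System.out.println(p.getProperty("isolation.level"));
    }
}
```

These `Properties` can be passed directly to a `KafkaConsumer` constructor; serializer/deserializer and bootstrap settings are omitted because they are workload-specific.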
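One easy mistake when moving from auto-commit to manual `commitSync()`: Kafka expects the committed offset to be the offset of the *next* record to consume, i.e. last processed offset + 1. Committing the last processed offset itself causes one duplicate per partition on every restart. A minimal sketch of that bookkeeping, using plain maps (partition number → offset) in place of the client's `TopicPartition`/`OffsetAndMetadata` types:

```java
import java.util.HashMap;
import java.util.Map;

public class CommitOffsets {
    // Given the last successfully processed offset per partition,
    // return the offsets to pass to commitSync(): each value is +1
    // because Kafka commits point at the NEXT record to read.
    public static Map<Integer, Long> toCommit(Map<Integer, Long> lastProcessed) {
        Map<Integer, Long> commits = new HashMap<>();
        for (Map.Entry<Integer, Long> e : lastProcessed.entrySet()) {
            commits.put(e.getKey(), e.getValue() + 1);
        }
        return commits;
    }

    public static void main(String[] args) {
        Map<Integer, Long> processed = new HashMap<>();
        processed.put(0, 41L); // partition 0: last record processed at offset 41
        System.out.println(toCommit(processed).get(0)); // 42
    }
}
```

In a real consumer you would build this map inside the poll loop as records complete processing, convert it to `Map<TopicPartition, OffsetAndMetadata>`, and pass it to `commitSync(offsets)` before the next `poll()`.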