| KIP | Title | Domain | Status | Author | Created | Updated | JIRA | Kafka | Protocol APIs |
|---|---|---|---|---|---|---|---|---|---|
| 1297 | Role-Aware Metric Tags for NodeToControllerChannelManagerImpl TLDR: Adds a role tag (broker, controller, or broker+controller) to the metric labels emitted by NodeToControllerChannelManagerImpl so that SelectorMetrics can be disambiguated by node role in KRaft mode. The component currently tags metrics only with broker-id, producing duplicate or misleading metrics on controller-only or combined nodes where brokerId is not meaningful. | Metrics KRaft | Discussion | Nilesh Kumar | 2026-03-19 | 2026-03-20 | KAFKA-20199 | | |
| 1294 | Metadata-Version-Aware Configuration Constraints TLDR: Introduces metadata-version-aware configuration validation, so that broker config constraints can be tied to the cluster's active MetadataVersion and enforced dynamically as the cluster upgrades. Current ConfigDef validation is version-agnostic and runs statically at startup, preventing safe introduction of configs that are only valid above a certain MetadataVersion. | KRaft Admin | Discussion | TaiJuWu | 2025-12-28 | 2026-03-18 | KAFKA-20294 | | BrokerRegistration |
| 1292 | New MirrorMaker2 Connector for syncing consumer offsets TLDR: Introduces a dedicated MirrorMaker 2 connector for consumer group offset synchronization that is separate from the existing MirrorCheckpointConnector, providing real-time offset sync via the consumer offsets topic rather than periodic checkpoint emission. MirrorCheckpointConnector conflates checkpoint generation with offset sync and commits offsets only periodically, creating a lag window during which a failover to the target cluster results in reprocessing. | MirrorMaker Consumer | Discussion | Gantigmaa Selenge | 2026-02-25 | 2026-03-13 | | | |
| 1290 | Rack-Aware Minimum In-Sync Replicas TLDR: Introduces rack-aware min.insync.replicas enforcement at produce time, rejecting writes unless ISR acknowledgments come from replicas spanning a configurable minimum number of distinct racks. The existing min.insync.replicas counts total ISR replicas without regard to rack placement, meaning all acknowledging replicas could reside in the same availability zone, providing no durability guarantee against a single-AZ failure. | Broker | Discussion | Karl Sorensen | 2026-02-25 | 2026-02-26 | KAFKA-20231 | | |
| 1289 | Support Transactional Acknowledgments for Share Groups TLDR: Adds transactional acknowledgment support for share group consumers, ensuring that a record's side-effect writes and its share group acknowledgment are committed atomically within a single Kafka transaction. Without this, a framework processing share group records can produce output and then fail before acknowledging, causing re-delivery and double-processing with no EOS guarantee. | Consumer Transactions | Discussion | Shekhar Rajak | 2026-02-24 | 2026-03-24 | KAFKA-19883 | | Produce OffsetCommit AddOffsetsToTxn EndTxn TxnOffsetCommit ShareAcknowledge |
| 1288 | SSL Hot Reload for Kafka Clients TLDR: Adds opt-in SSL hot reload to Kafka clients by monitoring keystore and truststore files for changes and automatically reconfiguring the SSL context without restarting the client. Currently, SSL credentials are loaded once at startup and never refreshed, so certificate rotation or expiry requires restarting every client—a significant operational burden in environments using short-lived certificates. | Security Client | Discussion | Skander Soltane | 2026-02-21 | 2026-03-16 | KAFKA-10731 | | |
| 1287 | Disallow sendOffsetsToTransaction from committing offsets for unassigned partitions TLDR: Enforces partition assignment validation in the broker for KafkaProducer.sendOffsetsToTransaction(), rejecting transactional offset commits for partitions not assigned to the requesting consumer group member. The broker currently verifies the member's epoch but does not check whether the committed partitions are actually assigned to that member, allowing a valid-epoch member to maliciously or accidentally overwrite offsets for partitions it does not own. | Transactions Broker | Discussion | Ken Huang | 2026-02-20 | 2026-03-20 | KAFKA-20191 | | OffsetCommit TxnOffsetCommit |
| 1285 | DSL Opt-in Support for Headers-Aware State Stores TLDR: Wires KIP-1271's headers-aware state store implementations into the Kafka Streams DSL via an explicit opt-in mechanism, allowing operators like table(), aggregate(), and join() to use header-preserving stores. Without this DSL integration, developers using the high-level API had no path to adopt header-aware stores even after KIP-1271 provided the underlying interfaces. | Streams | Accepted | Alieh Saeedi | 2026-02-18 | 2026-03-26 | KAFKA-20194 | 4.3 | |
| 1284 | Introduce CloseOptions.DEFAULT for Kafka Streams TLDR: Introduces CloseOptions.DEFAULT as an explicit enum value for KafkaStreams.close() that selects the correct shutdown behavior based on the active rebalance protocol (LEAVE_GROUP for the Streams Protocol, REMAIN_IN_GROUP for Classic). Without this, KafkaStreams.close() with no CloseOptions implicitly uses REMAIN_IN_GROUP, which is semantically wrong for the KIP-1071 Streams Protocol where dynamic membership makes remaining in the group after shutdown counterproductive. | Streams | Discussion | Ken Huang | 2026-02-18 | 2026-03-09 | KAFKA-20167 | | |
| 1283 | Clarify KafkaStreams cleanUp semantics to preserve process metadata and state directory lock file TLDR: Updates the public contract of KafkaStreams#cleanUp to explicitly preserve the process identity metadata file (kafka-streams-process-metadata) and the application directory lock file (.lock), clarifying that the method clears local state but may retain the application directory. The existing behavior incorrectly attempted to delete the application directory even when valid metadata files were present, generating misleading warning logs. | Streams | Discussion | sanghyeok an | 2026-02-17 | 2026-02-17 | KAFKA-17251 | | |
| 1282 | Prevent data loss during partition expansion for dynamically added partitions TLDR: Introduces a new auto.offset.reset policy called partition-start that resets consumers to the beginning of only newly discovered partitions while continuing from committed offsets on existing ones. The existing earliest policy forces full historical replay and latest silently drops messages produced between partition discovery and consumer assignment—neither is safe during partition expansion or initial consumer startup. | Consumer Broker | Discussion | Ken Huang | 2026-01-24 | 2026-03-27 | KAFKA-20035 | | ListOffsets Heartbeat ConsumerGroupHeartbeat |
| 1280 | Update MirrorMaker to use KIP-877 to emit metrics TLDR: Updates MirrorSourceConnector and MirrorCheckpointConnector to register their metrics via the KIP-877 connector metrics API instead of creating private Metrics instances with their own MetricsReporter instances. The previous workaround—each connector instantiating its own Metrics object—bypassed standard metrics plumbing and was never available for MirrorHeartbeatConnector at all. | MirrorMaker Metrics | Accepted | Mickael Maison | 2026-02-13 | 2026-02-26 | KAFKA-19149 | 4.3 | |
| 1279 | Cluster Mirroring TLDR: Cluster Mirroring introduces native broker-side cross-cluster replication as a built-in Kafka feature, replacing the need for external MirrorMaker 2 (MM2) Connect workers. MM2's standalone deployment model creates significant operational burden—separate provisioning, independent lifecycle management, and redundant compression/decompression overhead—which this KIP eliminates by integrating replication directly into brokers. | MirrorMaker Broker | Discussion | Luke Chen | 2026-02-13 | 2026-03-27 | KAFKA-18723 | 3.9.1 | Metadata FindCoordinator ApiVersions |
| 1278 | Configurable Retry Mechanism for KafkaStatusBackingStore TLDR: Adds configurable retry parameters (max retries, backoff, timeout) to KafkaStatusBackingStore in Kafka Connect for handling failures when writing connector and task status to the internal status topic. Currently, some write paths block indefinitely on broker unavailability while others silently drop status updates, making connector state persistence unreliable and unobservable during broker outages. | Connect | Discussion | Said BOUDJELDA | 2026-02-02 | 2026-02-03 | KAFKA-20113 | 4.3 | |
| 1277 | Support Delayed Message in Kafka TLDR: Introduces native delayed message delivery to Kafka, allowing producers to set a future delivery timestamp on records so they are held by the broker and only made visible to consumers at the specified time. Kafka has no built-in scheduling mechanism; teams implementing delayed retry logic for share groups (or any queue workload) must build external scheduling infrastructure or poll-and-republish loops. | Broker Producer | Discussion | Henry Cai | 2026-01-30 | 2026-02-11 | | | Fetch |
| 1274 | Deprecate and remove support for Classic rebalance protocol in KafkaConsumer TLDR: Deprecates the Classic (pre-KIP-848) consumer rebalance protocol in KafkaConsumer with removal planned for a future major version, making the new incremental rebalance protocol (KIP-848) the only supported protocol. The Classic protocol's eager stop-the-world rebalances and thick client-side assignor logic are known operational pain points; the new protocol has been production-ready since Kafka 4.0 and the Classic implementation carries ongoing maintenance cost with diminishing benefit. | Consumer | Accepted | Lianet Magrans | 2026-01-22 | 2026-03-09 | KAFKA-20282 | 4.3 | |
| 1273 | Improve Connect configurable components discoverability TLDR: Introduces a shared Configurable interface with a config() method across all Kafka Connect pluggable component types (Connector, Converter, Transformation, Predicate), and adds a REST endpoint to expose component configuration metadata. Each component type currently defines config() independently with no common supertype, making it impossible to generically discover and document configuration for all plugin types. | Connect | Accepted | Mario Fiore Vitale | 2026-01-19 | 2026-02-25 | KAFKA-20079 | 4.3 | |
| 1272 | Support Compacted Topic in Tiered Storage TLDR: Extends tiered storage (KIP-405) to support compacted topics by handling the semantics of compaction in remote object storage, including uploading compacted segments, tracking which keys have been compacted away, and serving offset lookups over compacted remote segments. KIP-405 only supports append-only (delete-retention) log segments; compacted topics are a fundamentally different retention model that requires tracking the latest value per key across both local and remote tiers. | Tiered Storage Broker | Discussion | Henry Cai | 2026-01-12 | 2026-01-12 | | | |
| 1271 | Allow to Store Headers in State Stores TLDR: Extends Kafka Streams state stores to persist record headers alongside key and value bytes, introducing interfaces like TimestampedKeyValueStoreWithHeaders and VersionedKeyValueStoreWithHeaders. Currently, headers are stripped when records enter stateful operators, breaking header-dependent serdes (e.g., Schema Registry header-based schema-id format), tracing propagation, and any other header-carried semantics. | Streams | Accepted | Alieh Saeedi | 2026-01-09 | 2026-03-26 | KAFKA-20056 | 4.3 | |
| 1270 | Introduce ProcessExceptionalHandler for Global Thread TLDR: Extends the ProcessingExceptionHandler to cover GlobalStreamThread processors (GlobalKTable), with a new processing.exception.handler.global.enabled boolean config to opt in. Currently any exception in GlobalKTable record processing kills the GlobalStreamThread and shuts down the entire Streams application, unlike regular KStream/KTable processing where the handler can recover gracefully. | Streams | Accepted | Arpit Goyal | 2026-01-03 | 2026-02-24 | KAFKA-19939 | 4.3 | |
| 1269 | Configurable number of batches to retain in broker TLDR: Introduces a configurable broker-side limit on the number of producer batch sequence numbers retained for idempotent deduplication, replacing the hardcoded constant ProducerStateEntry#NUM_BATCHES_TO_RETAIN (currently 5). This hardcoded value forces max.in.flight.requests.per.connection to be ≤5 for idempotent producers, artificially limiting producer throughput. | Broker Producer | Discussion | PoAn Yang | 2026-01-08 | 2026-03-20 | KAFKA-17967 | | Produce InitProducerId |
| 1268 | Prevent Dynamic Partition Expansion of Internal Topics TLDR: Prevents partition count expansion for internal Kafka topics (__transaction_state, __consumer_offsets, __share_group_state) by blocking AdminClient alter-partition-count operations on them. These topics use hash-based routing of transaction IDs or group IDs to specific partitions; changing partition count breaks the routing invariant and causes coordinators to lose track of existing state, leading to data loss or correctness failures. | Broker | Discussion | sanghyeok an | 2026-01-07 | 2026-02-17 | KAFKA-20029 | | CreatePartitions |
| 1267 | Tiered Storage Cost Attribution Metrics TLDR: Adds per-topic and per-consumer-group tiered storage cost attribution metrics (remote bytes fetched, remote API call counts) to enable operators to charge back object storage costs to individual tenants. With tiered storage, remote fetch operations (S3 GetObject calls) incur variable per-request costs, but Kafka currently exposes no metrics to attribute those costs to specific topics or consumer groups. | Tiered Storage Metrics | Discussion | ViquarKhan | 2026-01-07 | 2026-02-14 | KAFKA-20047 | | Fetch |
| 1266 | Bounding The Number Of RemoteLogMetadata Messages via Compacted RemoteLogMetadata Topic TLDR: Converts the __remote_log_metadata internal topic to a compacted topic so that only the latest event per remote segment key is retained, bounding the topic's size and reducing broker startup time for rebuilding the in-memory RemoteLogMetadataCache. The topic currently grows unbounded as an append-only log of all remote segment lifecycle events, causing slow startup (full topic replay required) and high broker memory usage for clusters with large tiered storage histories. | Tiered Storage | Discussion | Lijun Tong | 2026-01-05 | 2026-03-06 | KAFKA-19265 | 4.3 | |
| 1264 | Configurable TTL for Tiered Storage Index Cache TLDR: Fixes the RemoteIndexCache TTL eviction logic so that stale and expired tiered storage index entries (offset, time, transaction, producer snapshot indexes) are actually evicted from the cache. A bug in the current implementation prevents TTL-based eviction from triggering correctly, causing the cache to grow unbounded with stale entries and eventually consuming excessive heap or causing repeated unnecessary remote fetches. | Tiered Storage | Discussion | Nandini Singhal | 2026-01-03 | 2026-01-05 | KAFKA-19970 | 4.3 | |
| 1263 | Group Coordinator Assignment Batching and Offload TLDR: Moves consumer group partition assignment calculation from a synchronous, per-heartbeat computation to a batched, asynchronous offload executed by a dedicated thread pool in the group coordinator. The KIP-848 server-side assignor runs assignment on the broker's main thread during heartbeat processing, which for large groups (thousands of members) can block for hundreds of milliseconds and degrade broker latency. | Consumer Broker | Accepted | Sean Quah | 2025-12-26 | 2026-03-01 | KAFKA-20209 | 4.3 | |
| 1262 | Enable auto-formatting directories TLDR: Enables KRaft brokers to auto-format their log directories on first startup, eliminating the mandatory pre-start kafka-storage format step and automatically generating or joining a cluster ID. The current requirement forces operators to run a separate formatting command and supply a --cluster-id before any node can start, complicating automated and containerized deployments. | KRaft Broker | Discussion | Kevin Wu | 2026-01-01 | 2026-03-25 | KAFKA-20174 | | Fetch |
| 1259 | Add configuration to wipe Kafka Streams local state on startup TLDR: Adds a startup configuration flag (state.dir.wipe.on.startup or equivalent) that causes Kafka Streams to delete all local state store data and checkpoint files before restoring from changelogs. A zombie-data edge case occurs when an instance restarts with stale local state after the changelog's delete.retention.ms has expired: the local checkpoint offset still exists on the broker but the corresponding changelog records have been deleted, causing silent state corruption on restore. | Streams | Accepted | Uladzislau Blok | 2025-12-21 | 2026-03-01 | KAFKA-19943 | 4.3 | |
| 1258 | Add Support for OAuth Client Assertion to client_credentials Grant Type TLDR: Adds support for OAuth 2.0 client assertion (JWT Bearer, per RFC 7521/7523) as an authentication method for the client_credentials grant in Kafka's OAUTHBEARER SASL implementation. The current implementation only supports client_secret via HTTP Basic authentication (KIP-768), which requires sharing a long-lived secret—a security liability in zero-trust, short-lived-credential environments. | Security | Accepted | Prabhash Kumar | 2025-12-17 | 2026-03-05 | KAFKA-18608 | | |
| 1257 | Partition Size Percentage Metrics for Storage Monitoring TLDR: Adds partition-level storage utilization percentage metrics that express current log size as a fraction of configured retention limits (both size-based and time-based local retention, and remote retention for tiered storage). Operators must currently compute these percentages manually by correlating size metrics with retention configs, and the calculation is especially complex for tiered topics with separate local/remote retention constraints. | Metrics Broker | Accepted | Manan Gupta | 2025-12-16 | 2026-02-09 | KAFKA-20157 | 4.3 | |
| 1256 | Align broker and controller behavior for the Admin.incrementalAlterConfigs API TLDR: Aligns the behavior of Admin.incrementalAlterConfigs between broker-routed and controller-routed requests for null value handling, validation, and idempotency. When connecting via bootstrap.servers, brokers reject null values on non-DELETE operations; when connecting via bootstrap.controllers, controllers accept them silently—creating inconsistent semantics depending on the client's connection target. | Admin KRaft | Discussion | Ken Huang | 2025-12-13 | 2026-03-16 | KAFKA-19931 | | BrokerRegistration |
| 1255 | Remote Read Replicas for Kafka Tiered Storage TLDR: Proposes a Remote Read Replica (RRR) broker role that serves historical reads exclusively from tiered remote storage, decoupling cold-read traffic from hot-path leader/ISR brokers and enabling per-AZ RRR deployments to reduce cross-AZ latency and cost. Leader replicas currently handle all reads regardless of data age, causing I/O contention, elevated local storage requirements, and cross-AZ fetch costs for consumers reading historical offsets from tiered storage. | Tiered Storage Broker | Discussion | Manan Gupta | 2025-12-11 | 2025-12-15 | | | |
| 1254 | Kafka Consumer Support for Remote Tiered Storage Fetch TLDR: Describes the client-side protocol changes required for Kafka consumers to fetch data directly from remote tiered storage, complementing the broker-side protocol introduced in KIP-1248. When brokers redirect consumers to tiered storage, consumers must resolve the remote segment location, authenticate to the object store, parse Kafka log segment format from raw object storage bytes, and handle transactions correctly—none of which the current consumer client supports. | Tiered Storage Consumer | Discussion | Tom Thornton | 2025-12-09 | 2026-01-08 | | | Fetch |
| 1253 | Add TopologyValidator Utility for Kafka Streams Topology Compatibility TLDR: Proposes a TopologyValidator utility for Kafka Streams that detects breaking topology changes between deployments — such as internal topic renames, state store name shifts from added source nodes, or changed source/sink topics — before they cause data loss or silent state reset. Kafka Streams auto-generates incremental identifiers for unnamed nodes and stores, so inserting a new source node early in the topology silently shifts all downstream node names, reassigning changelog and repartition topics. | Streams Testing | Discussion | sanghyeok an | 2025-12-08 | 2025-12-09 | KAFKA-19935 | | |
| 1251 | Assignment epochs for consumer groups TLDR: Adds an assignment epoch to the new consumer group protocol (KIP-848) so that OffsetCommit and AddOffsetsToTransaction requests can be validated against the member's current assignment, fencing zombie commits from members whose partitions have been reassigned. The KIP-848 protocol uses member epochs for heartbeat fencing but did not carry epoch information in offset commit or transactional offset commit paths, leaving a window for stale members to overwrite offsets of their successors. | Consumer Protocol | Accepted | Lucas Brutschy | 2025-12-04 | 2026-02-16 | KAFKA-20066 | 4.3 | |
| 1250 | Add metric to track size of in-memory state stores TLDR: Adds a num-keys metric to in-memory Kafka Streams state stores (InMemoryKeyValueStore, InMemorySessionStore, etc.) tracking the current number of entries. RocksDB stores expose an estimated key count metric, but in-memory stores emit no equivalent, leaving operators without visibility into memory pressure from state store growth. | Streams Metrics | Accepted | Evan Zhou | 2025-12-03 | 2026-02-13 | KAFKA-17895 | 4.3 | |
| 1249 | Better offset reset for the new consumer group rebalance protocol TLDR: Enables consumer group offset resets while the group is actively consuming by introducing a new ResetOffset admin API that the group coordinator processes atomically with assignment reconciliation. Currently, offset resets require the consumer group to be fully stopped, forcing production services offline or requiring complex custom control planes to coordinate safe online resets. | Consumer | Discussion | Levani Kokhreidze | 2025-12-02 | 2026-02-23 | KAFKA-20195 | | |
| 1247 | Make Bytes utils class part of the public API TLDR: Promotes the Bytes class from org.apache.kafka.common.utils (internal package) to the public Kafka API surface. Bytes is already referenced in public Kafka Streams interfaces (e.g., KeyValueStore<Bytes, byte[]>), meaning users can depend on it through public APIs but cannot import it without using an internal package, creating a usability gap. | Client | Accepted | Siddhartha Devineni | 2025-12-01 | 2026-02-24 | KAFKA-17939 | 4.3 | |
| 1246 | Deprecate backdoor that allows any client to produce to internal topics TLDR: Deprecates the backdoor mechanism that allows any client with clientId __admin_client to produce to internal topics (e.g. __consumer_offsets, __transaction_state) without authorization checks, with removal planned for Kafka 5.0. This bypass poses a security and operational risk: malicious or misconfigured clients can corrupt internal topic state with no differentiation from legitimate admin tooling. | Security Broker | Discussion | TaiJuWu | 2025-11-28 | 2025-12-07 | KAFKA-5246 | | |
| 1245 | Enforce 'application.server' <server>:<port> format at config level TLDR: Moves validation of the application.server config (must be <host>:<port> format) from KafkaStreams construction time to StreamsConfig construction time, so that invalid values fail fast. The check currently happens at client construction, which is too late for frameworks that create StreamsConfig independently and want to surface configuration errors before instantiating the streams instance. | Streams | Discussion | sanghyeok an | 2025-11-26 | 2026-02-24 | KAFKA-17164 | | |
| 1242 | Detection and handling of misrouted connections TLDR: Introduces misrouted-connection detection by having the broker send a DisconnectResponse (or equivalent signal) when a client connects to the wrong broker, enabling the client to refresh metadata and reconnect correctly. In Kubernetes rolling restarts, a broker can change its advertised hostname and a client using stale bootstrap metadata silently connects to the wrong broker, leading to confusing failures that require a full client restart to resolve. | Broker Protocol | Accepted | Andrew Schofield | 2025-11-20 | 2026-03-14 | KAFKA-20246 | | |
| 1241 | Reduce tiered storage redundancy with delayed upload TLDR: Adds a configurable delayed upload option for tiered storage, deferring segment upload to remote storage until the segment ages beyond a threshold relative to the local retention period. The current implementation eagerly uploads all closed segments to remote storage immediately, storing the same data on both local disk and remote object storage simultaneously and incurring unnecessary storage costs when data is still within its local retention window. | Tiered Storage | Discussion | fu.jian | 2025-11-19 | 2026-03-18 | KAFKA-19893 | | |
| 1240 | Additional group configurations for share groups TLDR: Introduces per-group dynamic configuration for share groups, including max delivery attempts and delivery timeout, superseding the global broker-level config for these parameters. KIP-932 fixed share group behavior at the broker level without group-level overrides, making it impossible to tune error tolerance independently for different share groups on the same cluster. | Consumer | Accepted | Andrew Schofield | 2025-11-17 | 2026-02-09 | KAFKA-20037 | 4.3 | |
| 1238 | Multipartition for TopologyTestDriver in Kafka Streams TLDR: Extends TopologyTestDriver to support multi-partition topics, enabling unit tests that exercise repartitioning, co-partitioned joins, and other partition-dependent stream processing logic. The driver currently simulates all topics as single-partition, making it impossible to test topologies that rely on repartition semantics without spinning up a full EmbeddedKafkaCluster. | Streams Testing | Discussion | Marie-Laure Momplot | 2025-10-10 | 2026-03-09 | KAFKA-19871 | | |
| 1236 | Adjust quorum-related config lower bounds TLDR: Enforces a minimum lower bound of 1000ms for controller.quorum.fetch.timeout.ms to preserve the timing invariant that fetch timeout must be at least 2× the Raft maximum fetch wait time (500ms). Operators setting an unusually small fetch timeout could violate the invariant, causing the KRaft follower to prematurely transition to Prospective state and trigger spurious elections even when the leader is healthy. | KRaft | Discussion | TaiJuWu | 2025-11-05 | 2025-12-10 | KAFKA-19847 | 5.0 | |
| 1235 | Correct the default min.insync.replicas to 2 for the __remote_log_metadata topic TLDR: Adds a new broker config remote.log.metadata.topic.min.isr (default 2) that is applied as min.insync.replicas when creating the __remote_log_metadata topic. Without an explicit min.isr, the topic defaulted to 1, meaning a single-broker failure in a 3-replica setup could cause tiered storage metadata loss and render remote data unreachable even though it still exists in object storage. | Tiered Storage Broker | Discussion | fu.jian | 2025-11-05 | 2025-12-24 | KAFKA-19858 | 4.3 | |
| 1234 | Move and Add Arguments to version-mapping Commands TLDR: Moves --bootstrap-server/--bootstrap-controller from a globally required argument group in kafka-feature.sh to per-subcommand required args, and adds a --unstable-feature-versions flag to the version-mapping subcommand. Offline subcommands like version-mapping were blocked from running without unnecessary cluster connection arguments, and unstable metadata versions were not queryable from the CLI. | KRaft Admin | Discussion | Chang-Yu Huang | 2025-10-27 | 2025-12-23 | KAFKA-20018 | | |
| 1233 | Maximum lengths for resource names and IDs TLDR: Proposes enforcing maximum length limits on Kafka resource names and identifiers (topic names, consumer group IDs, transactional IDs, client IDs, etc.). Without length limits, extremely long identifiers can inflate group coordinator data structures, cause unreadable CLI output, degrade broker performance, or be exploited for denial-of-service attacks. | Admin Broker | Discussion | Andrew Schofield | 2025-10-26 | 2025-12-02 | | | |
| 1230 | Add config for file system permission TLDR: Adds a Kafka Streams config allow.os.group.write.access (default false) that grants Unix group write permission on the state directory when enabled. Streams hard-coded state directory permissions to owner-only write, blocking multi-user or group-based deployments where shared write access is a legitimate operational requirement. | Streams | Accepted | Matthias J. Sax | 2025-10-16 | 2025-10-24 | KAFKA-19803 | 4.2 | |
| 1228 | Add Transaction Version to WriteTxnMarkersRequest TLDR: Adds a transactionVersion field to WriteTxnMarkersRequest so that partition leaders can validate Transaction Version 2 (TV2) end markers by requiring producerEpoch > currentEpoch instead of accepting equality. Without this, leaders could not distinguish late/duplicate TV1 markers from active TV2 transaction markers, creating a correctness gap in exactly-once semantics even after KIP-890 introduced epoch-bump semantics. | Transactions Protocol | Accepted | Ritika Reddy | 2025-10-13 | 2025-10-30 | KAFKA-19446 | 4.2 | WriteTxnMarkers v2 ApiVersions EndTxn |
| 1227 | Expose Rack ID in MemberDescription and ShareMemberDescription TLDR: Surfaces the rackId field already present in ConsumerGroupDescribeResponse and ShareGroupDescribeResponse into the MemberDescription and ShareMemberDescription objects returned by AdminClient. The rack ID was silently discarded during response parsing despite being available in the wire protocol, making it impossible to verify rack-aware assignment via the Admin API. | Consumer Admin | Accepted | fu.jian | 2025-10-13 | 2025-10-22 | KAFKA-19784 | 4.2 | ConsumerGroupDescribe |
| 1226 | Introducing Share Partition Lag Persistence and Retrieval TLDR: Introduces share partition lag computation and persistence so that operators can query per-partition consumption progress for share groups via DescribeShareGroupOffsets and kafka-share-groups.sh. Share groups previously had no per-partition lag metric because their non-sequential, multi-consumer consumption model makes lag more complex than a simple end offset minus committed offset. | Consumer | Accepted | Chirag Wadhwa | 2025-10-08 | 2025-10-27 | KAFKA-19778 | 4.2 | WriteShareGroupState v1 ReadShareGroupStateSummary v1 DescribeShareGroupOffsets v1 |
| 1224 | Adaptive append.linger.ms for the group coordinator and share coordinator TLDR: Introduces an adaptive mode for group.coordinator.append.linger.ms (value -1) that dynamically adjusts the batch linger time, dropping toward 0ms under low load and rising under high load to reduce the latency impact of batching. The fixed 5ms default linger introduced a minimum latency floor for all group coordinator and share coordinator writes under KIP-848's new coordinator runtime, causing latency regressions for low-throughput workloads after upgrading to Kafka 4.0. | Consumer Broker | Accepted | Sean Quah | 2025-10-05 | 2025-10-22 | KAFKA-19764 | 4.2 | |
| 1223 | Add user tag to DeprecatedRequestsMetric TLDR: Adds an optional user (principal name) tag to the DeprecatedRequestsPerSec JMX metric, gated behind a new config kafka.metrics.deprecated.requests.tag.user.enabled. The existing metric from KIP-896 could not identify which principal was sending deprecated API requests in multi-tenant clusters where multiple users share the same client. | Metrics Security | Discussion | Gaurav Narula | 2025-10-03 | 2025-10-21 | | | |
| 1222 | Acquisition lock timeout renewal in share consumer explicit mode TLDR: Adds a lock renewal RPC for share consumers in explicit acknowledgement mode so that a consumer processing a long-running record can extend its acquisition lock timeout without changing the record's delivery state. Without renewal, a record under active processing would have its acquisition lock expire and become re-deliverable to another consumer if processing exceeded the lock timeout. | Consumer | Accepted | Sushant Mahajan | 2025-09-29 | 2025-11-17 | KAFKA-19742 | 4.2 | ShareFetch v2 ShareAcknowledge v2 Fetch |
| 1221 | Add application-id tag to Kafka Streams state metric TLDR: Adds an application-id tag to the Kafka Streams client-state JMX metric so that multiple instances of the same logical Streams application can be grouped together in monitoring systems. The existing metric distinguished physical instances via a UUID processId but had no application-level grouping, making it impossible to correlate separate physical instances into a single logical application in dashboards. |
Streams Metrics | Accepted | Bill Bejeck | 2025-09-24 | 2025-10-07 | KAFKA-19734 | 4.2 | |
| 1220 | kafka-broker-api-versions tool support bootstrap controllers TLDR: Extends kafka-broker-api-versions.sh to support the --bootstrap-controller flag, enabling operators to query the ApiVersions of KRaft controllers directly. Controllers have their own set of supported RPCs (e.g., metadata quorum RPCs) that are distinct from broker RPCs, but there was no tooling to inspect controller API capabilities independently of broker capabilities. |
KRaft Admin | Discussion | TaiJuWu | 2025-09-23 | 2026-01-09 | KAFKA-19663 | ApiVersions DescribeCluster | |
| 1219 | Configurations for KRaft Fetch and FetchSnapshot Byte Size TLDR: Introduces two new KRaft configs — controller.quorum.fetch.max.bytes and controller.quorum.snapshot.max.bytes — to control the maximum bytes per Fetch and FetchSnapshot request respectively. New controllers joining a cluster with slow network links could get stuck cycling between Unattached/Follower/Prospective because the 8MiB default snapshot byte limit prevented completing a FetchSnapshot within fetch.timeout.ms, repeatedly triggering unnecessary elections. |
KRaft Protocol | Accepted | Jonah Hooper | 2025-09-15 | 2025-12-01 | KAFKA-19541 | 4.3 | |
| 1217 | Include push interval in ClientTelemetryReceiver context TLDR: Updates the ClientTelemetryReceiver interface (introducing ClientTelemetryExporter and ClientTelemetryContext) to include the push interval in milliseconds alongside each client telemetry push. Without the push interval, MetricsReporter implementations receiving client telemetry could not determine when a client had disconnected versus simply not yet sent its next push, leaving stale metrics exposed indefinitely. |
Metrics | Accepted | Mickael Maison | 2025-09-09 | 2025-10-08 | KAFKA-19773 | 4.2 | PushTelemetry |
| 1216 | Add rebalance listener metrics for Kafka Streams TLDR: Adds tasks-revoked-latency, tasks-assigned-latency, and tasks-lost-latency (avg + max) metrics for the Kafka Streams rebalance listener callbacks under the new KIP-848 streams rebalance protocol. Streams migrated from the consumer rebalance listener to a streams-specific implementation, dropping the latency metrics that previously existed for revoked/assigned/lost partition callbacks. |
Streams Metrics | Accepted | Travis Zhang | 2025-09-08 | 2025-10-14 | KAFKA-19691 | 4.2 | |
| 1214 | Change log.segment.bytes configuration type from int to long to support segments larger than 2GB TLDR: Changes the type of `log.segment.bytes` from `int` to `long` to allow log segments larger than 2 GB. Modern high-capacity servers (e.g., EPYC nodes with 15 TB drives per disk) create disproportionately large numbers of segment files at the 2 GB limit, wasting file descriptors and increasing I/O overhead from frequent segment rotations. |
Broker | Discussion | Mikhail Fesenko | 2025-09-03 | 2025-09-03 | KAFKA-19603 | ||
| 1213 | Deprecated the log.dir configuration TLDR: Deprecates the log.dir configuration (singular) in favor of the already-existing log.dirs (plural), since KIP-1161 made log.dir a LIST-type config accepting comma-separated paths, making both settings functionally identical. Having two configurations with the same behavior causes confusion for operators and pollutes the configuration surface. |
Broker | Discussion | Ken Huang | 2025-09-02 | 2026-02-18 | KAFKA-19639 | ||
| 1209 | Add configuration to control internal topic creation in Kafka Connect TLDR: Adds a configuration flag to Kafka Connect workers to control whether internal topics (config, offsets, status) are automatically created if missing, defaulting to the current auto-create behavior. Auto-creation can corrupt an existing Connect cluster if a misconfigured worker starts against a wrong cluster or if partially deleted topics get silently recreated with wrong partition counts or replication factors. |
Connect | Accepted | Anton Liauchuk | 2025-08-26 | 2026-03-02 | |||
| 1208 | Add prefix to TopicBasedRemoteLogMetadataManagerConfig to enable setting admin configs TLDR: Adds a remote.log.metadata.admin. config prefix for TopicBasedRemoteLogMetadataManager (TRLMM) to pass admin client-specific configurations separately from producer (remote.log.metadata.producer.) and consumer (remote.log.metadata.consumer.) configs. Without this prefix, admin-specific settings (e.g., admin.request.timeout.ms) must be set via the common remote.log.metadata.common.client. prefix, which applies to all three client types and can cause unintended interactions. |
Tiered Storage | Accepted | Lan Ding | 2025-08-21 | 2026-01-10 | KAFKA-19590 | 4.3 | |
| 1207 | Fix anomaly of JMX metrics RequestHandlerAvgIdlePercent in kraft combined mode TLDR: Fixes the RequestHandlerAvgIdlePercent JMX metric in KRaft combined mode (broker+controller co-located) where the value incorrectly reported ~2.0 instead of 0–1, and adds separate BrokerRequestHandlerAvgIdlePercent and ControllerRequestHandlerAvgIdlePercent metrics. In combined mode, two separate KafkaRequestHandlerPool objects both divided their idle time by numIoThreads before reporting to a shared Meter, effectively halving the denominator and doubling the reported ratio. |
KRaft Metrics | Accepted | tony tang | 2025-08-19 | 2025-10-16 | KAFKA-19606 | 4.2 | |
| 1206 | Strict max fetch records in share fetch TLDR: Changes the maxFetchRecords limit in ShareFetch from a soft limit (batch-boundary aligned) to a strict record-count limit, ensuring the number of records returned never exceeds the configured value. The soft limit caused unpredictable batch sizes for share consumers with long per-record processing times, leading to frequent acquisition lock timeouts and uncontrolled redelivery. |
Consumer | Accepted | Jimmy Wang | 2025-08-18 | 2025-11-26 | KAFKA-19020 | 4.2 | ShareFetch v2 Fetch |
| 1204 | MetadataQuorumCommand describe to include CommittedVoters TLDR: Extends the kafka-metadata-quorum describe command to display CommittedVoters—the set of controller voters that has been durably committed to the Raft log—separately from the in-memory VoterSet which may include uncommitted membership changes. During a controller membership change (add/remove voter), the in-memory and committed voter sets diverge; without surfacing CommittedVoters, operators cannot determine whether a membership change has been successfully committed to quorum. |
KRaft Admin | Discussion | TaiJuWu | 2025-08-12 | 2025-12-09 | KAFKA-17243 | DescribeQuorum | |
| 1203 | Allow to configure custom `ReplicaPlacer` implementation TLDR: Allows operators to plug in a custom ReplicaPlacer implementation via a new unstable broker config, bypassing the default StripedReplicaPlacer round-robin algorithm. The default round-robin placement cannot be overridden without code changes, but some advanced use cases such as internal testing or specialized AZ-aware scenarios require custom placement strategies. |
Broker | Discussion | Ken Huang | 2025-08-04 | 2025-08-23 | KAFKA-19577 | ||
| 1201 | Add disk threshold strategy to prevent disk full failure TLDR: Introduces configurable disk usage threshold configs (disk.max.used.percent, disk.warning.used.percent) so that brokers proactively reject new produce requests when disk usage exceeds the threshold, while remaining operational for reads, log cleanup, and admin operations. A disk-full condition currently causes a java.io.IOException that fails the log directory, offlines all partitions, and ultimately triggers a broker shutdown, requiring manual intervention to recover. |
Broker | Discussion | mapan | 2025-07-25 | 2025-07-31 | KAFKA-19568 | ||
| 1198 | implement a ConfigKey.Builder class TLDR: Introduces a ConfigKey.Builder class with a fluent API as an alternative to the overloaded ConfigDef.define() methods for constructing ConfigKey instances. The growing number of parameters in ConfigDef.define() exceeded the practical limit where a builder pattern is recommended, and the lack of extensibility in ConfigKey made adding new functionality require widespread changes. |
Admin Client | Discussion | Claude Warren | 2025-07-29 | 2025-09-03 | |||
| 1197 | Introduce new method to improve the TopicBasedRemoteLogMetadataManager's initialization TLDR: Replaces the polling/retry initialization loop in TopicBasedRemoteLogMetadataManager with a direct dependency on broker readiness, adding a new initialization hook to avoid retrying before the broker can handle __remote_log_metadata requests. The default 2-minute retry timeout was insufficient for clusters that take more than 2 minutes to start, causing tiered storage initialization failures that left local log segments undeleted and produced excessive WARN logs. |
Tiered Storage | Discussion | fu.jian | 2025-07-25 | 2025-11-01 | KAFKA-19426 | 4.2 | |
| 1196 | Introduce group.coordinator.append.max.buffer.size config TLDR: Introduces a group.coordinator.append.max.buffer.size config to cap the group coordinator's append cache buffer size independently of the broker's message.max.bytes setting. The current implementation derives the buffer ceiling from message.max.bytes, so deployments with large max message sizes (common in event pipelines) create excessive memory pressure on the group coordinator and risk OutOfMemoryError. |
Consumer Broker | Accepted | Lan Ding | 2025-07-22 | 2025-11-04 | KAFKA-19519 | 4.3 | |
| 1195 | deprecate and remove org.apache.kafka.streams.errors.BrokerNotFoundException TLDR: Marks org.apache.kafka.streams.errors.BrokerNotFoundException as deprecated in Kafka 4.2 for removal in 5.0. The exception has been unused by Kafka Streams internals since version 2.8, leaving dead code that confuses users and complicates the exception class hierarchy. |
Streams | Accepted | Rajani Karuturi | 2025-07-16 | 2025-07-29 | KAFKA-12281 | 4.2 | |
| 1194 | Optimize Replica Assignment for Broker Load Balance in Uneven Rack Configurations TLDR: Changes the replica assignment algorithm to balance replica counts across individual brokers when rack configurations are uneven (i.e., racks have different numbers of brokers). The existing strategy prioritizes rack balance, which causes individual brokers in smaller racks to receive disproportionately more replicas and become hot spots. |
Broker | Discussion | pengjialun | 2025-07-15 | 2025-07-15 | |||
| 1193 | Deprecate MX4j support TLDR: Deprecates MX4J support (the kafka_mx4jenable JVM flag) in Kafka 4.2 and removes it in 5.0. MX4J is a library for exposing JMX over HTTP that was last released in 2006, is completely undocumented in Kafka's official docs, and has had no reported users since a 2016 mailing list query, yet it adds maintenance burden and potential security exposure. |
Metrics | Accepted | Federico Valeri | 2025-07-14 | 2025-07-21 | KAFKA-19503 | 4.2 | |
| 1192 | Add include argument to ConsumerPerformance tool TLDR: Adds an --include argument to kafka-consumer-perf-test.sh (mutually exclusive with --topic) that accepts a regex pattern for multi-topic pattern subscription. The ConsumerPerformance tool only supported single-topic subscription while ConsoleConsumer already had --include, making it impossible to benchmark consumer performance across multiple topics or dynamic topic sets. |
Consumer Testing | Accepted | Federico Valeri | 2025-07-13 | 2025-07-22 | KAFKA-19498 | 4.2 | |
| 1191 | Dead-letter queues for share groups TLDR: Adds dead-letter queue (DLQ) support for share groups, automatically routing records that exceed their max delivery count or are explicitly rejected (ARCHIVED acknowledgment type) to a configurable dead-letter topic. KIP-932's share groups track delivery counts and enforce limits, but provided no automated mechanism to handle poison messages—they simply accumulated redeliveries until manually addressed. |
Consumer | Accepted | Andrew Schofield | 2025-07-04 | 2026-03-05 | KAFKA-19469 | Produce | |
| 1190 | Add a metric for controller thread idleness TLDR: Adds an AvgIdleRatio metric to the ControllerEventManager showing the fraction of time the controller event loop thread is idle (0=always busy, 1=always idle). No metric existed for controller thread idleness, making it impossible to detect saturation, anticipate bottlenecks, or distinguish whether slow operations were caused by overloaded processing or simply rare events. |
KRaft Metrics | Accepted | Mahsa Seifikar | 2025-07-03 | 2025-10-02 | KAFKA-19467 | 4.2 | |
| 1189 | Allow custom topic configurations for RemoteStorageManager TLDR: Enables RemoteStorageManager (RSM) plugins to declare custom per-topic configuration properties and have the broker forward those topic-level configs to the RSM at runtime. The broker currently forwards only broker-level configs prefixed with rsm.config to the RSM, with no mechanism for RSM plugins to express or receive topic-level configuration overrides (e.g., different remote storage buckets per topic). |
Tiered Storage | Discussion | Ivan Yurchenko | 2025-06-30 | 2025-12-04 | KAFKA-19448 | ||
| 1188 | New ConnectorClientConfigOverridePolicy with allowlist of configurations TLDR: Introduces a new built-in ConnectorClientConfigOverridePolicy named AllowlistConnectorClientConfigOverridePolicy with a configurable connector.client.config.override.policy.allowlist that restricts connector config overrides to an explicit set. Multiple CVEs (including RCE via SASL JAAS JndiLoginModule) exploited the default All policy; the None policy is too restrictive while the Principal policy proved unsafe. |
Connect | Accepted | Mickael Maison | 2025-06-20 | 2025-10-24 | KAFKA-19824 | 4.2 | |
| 1187 | Support to retrieve remote log size via DescribeLogDirs RPC TLDR: Extends DescribeLogDirs RPC and the kafka-log-dirs CLI to report remote (tiered) log size per partition in addition to local log size. KIP-405 introduced tiered storage, but remote segment metadata (size, offsets) is only used internally by the broker—administrators have no CLI tool to observe how much data is stored remotely, complicating capacity planning and tiered storage management. |
Tiered Storage Admin | Discussion | PoAn Yang | 2025-06-15 | 2026-03-22 | KAFKA-19368 | DescribeLogDirs | |
| 1186 | Update AddRaftVoterRequest RPC to support auto-join TLDR: Fixes a deadlock in AddRaftVoterRequest handling by allowing the in-flight voter-add RPC to be superseded by new Fetch RPCs when the adding controller is also needed to commit the voter set change. The existing implementation required the controller being added to send a Fetch to commit the new voter set, but that controller's single in-flight slot was occupied by the pending AddRaftVoterRequest, causing the quorum leader to resign via checkQuorumTimer expiry. |
KRaft | Accepted | Kevin Wu | 2025-06-11 | 2025-11-25 | KAFKA-19400 | 4.2 | Fetch AddRaftVoter RemoveRaftVoter |
| 1183 | Unified Shared Storage TLDR: Proposes a unified log layer for Apache Kafka that supports both local replication-based storage and cloud shared storage (block, file, object) simultaneously, making Kafka stateless and removing the need for ISR replication when running on durable shared storage. Kafka's ISR replication was designed for local disks and duplicates durability guarantees already provided by cloud shared storage services, wasting resources and preventing stateless elasticity. |
Tiered Storage Broker | Discussion | Xinyu Zhou | 2025-05-13 | 2025-05-16 | |||
| 1182 | Quality of Service (QoS) Framework TLDR: Proposes a Quality of Service (QoS) framework for Kafka that allows producers, brokers, and consumers to declare desired performance characteristics (latency SLA, throughput targets) and for the cluster to expose its actual QoS capabilities and observations. Kafka and its compatible implementations vary significantly in performance and reliability characteristics with no standard mechanism for clients to negotiate or discover these constraints. |
Broker Admin | Discussion | Peter Corless | 2025-05-09 | 2025-05-09 | |||
| 1180 | Add generic feature level metrics TLDR: Adds generic feature-level JMX metrics (FinalizedLevel, minimum-supported-level, maximum-supported-level) tagged by feature name for each production feature (metadata.version, kraft.version, transaction.version, group.version, etc.). Only metadata.version had a dedicated metric; as new features were added (ELR version, share version, streams version), there was no way to monitor their finalized or supported levels for safe upgrade/downgrade planning. |
Metrics KRaft | Accepted | Kevin Wu | 2025-05-07 | 2025-07-08 | KAFKA-19254 | 4.2 | |
| 1178 | Introduce remote.max.partition.fetch.bytes.config in Consumer TLDR: Introduces a new consumer config remote.max.partition.fetch.bytes that controls the maximum bytes per partition returned from tiered remote storage in a Fetch request, propagated to the broker via a new FetchRequest v18 field. The existing max.partition.fetch.bytes applied to both local and remote reads uniformly, but remote storage plugins are often tuned for larger read chunks (e.g., 4MB), and reading only 1MB per remote fetch increased round-trip overhead without benefiting local read performance. |
Consumer Tiered Storage | Discussion | Kamal Chandraprakash | 2025-05-07 | 2025-05-08 | KAFKA-15777 | Fetch | |
| 1176 | Tiered Storage for Active Log Segment TLDR: Proposed extending tiered storage to upload the active (open) log segment to object storage so that follower replicas could replicate from object storage rather than from the leader broker. This KIP was withdrawn; the use case is covered by the Diskless Topics design (KIP-1150) which takes a more fundamental approach to eliminating local disk replication. |
Tiered Storage | Discarded | Henry Cai | 2025-05-01 | 2025-12-18 | KAFKA-19225 | Produce Fetch | |
| 1175 | Fix the typo `PARTITIONER_ADPATIVE_PARTITIONING_ENABLE` in ProducerConfig TLDR: Deprecates the misspelled ProducerConfig constant PARTITIONER_ADPATIVE_PARTITIONING_ENABLE_CONFIG and introduces the correctly-spelled PARTITIONER_ADAPTIVE_PARTITIONING_ENABLE_CONFIG alias for removal in the next major version. The typo in the Java constant name made documentation, code search, and autocomplete unreliable for users configuring adaptive partitioning. |
Producer | Accepted | Ming-Yen Chung | 2025-04-30 | 2025-06-12 | KAFKA-18068 | 4.2 | |
| 1173 | Connect Storage Topics Sharing Across Clusters TLDR: Proposes allowing multiple Kafka Connect clusters to share a single set of internal storage topics (offset, status, config) rather than requiring three dedicated topics per cluster. Each new Connect cluster requires three compacted topics, leading to topic proliferation, cross-team provisioning bottlenecks, and excessive compaction overhead as deployment scale grows. |
Connect | Discussion | Pritam Kumar Mishra | 2025-04-27 | 2025-04-28 | KAFKA-19211 | ||
| 1166 | Improve high-watermark replication TLDR: Propagates the follower's current high-watermark (HWM) in the KRaft Fetch request so the leader can immediately complete parked Fetch requests when only the HWM has advanced (without new records). KRaft could not distinguish whether a follower was behind on records versus behind on HWM, so it unconditionally parked Fetch requests for up to 500ms even when only the HWM needed propagation, causing admin operation metadata updates to appear delayed by up to 500ms. |
Broker | Accepted | José Armando García Sancio | 2025-04-21 | 2025-08-27 | KAFKA-19223 | 4.1 | Fetch v18 |
| 1164 | Diskless Coordinator TLDR: Describes the Diskless Coordinator, a new component that manages per-partition metadata (batch locations, WAL file references, producer state, transaction state) for Diskless Topics on behalf of brokers. Because Diskless Topics bypass local disk and direct replication, brokers need an authoritative external metadata service to answer offset queries, commit batch writes, and coordinate truncation. |
Tiered Storage Broker | Discussion | Giuseppe Lillo | 2025-04-16 | 2026-02-27 | KAFKA-19161 | ||
| 1163 | Diskless Core TLDR: Specifies the produce and consume paths for Diskless Topics (introduced by KIP-1150), where records are written to shared object storage via a Write-Ahead Log (WAL) without appending to local block devices or performing direct ISR replication. The existing classic topic architecture requires durable local block storage and peer-to-peer replication, making it unsuitable for cloud environments where object storage is cheaper and more elastic. |
Tiered Storage Broker | Discussion | Ivan Yurchenko | 2025-04-16 | 2026-03-09 | KAFKA-19161 | Fetch Metadata | |
| 1161 | Unifying LIST-Type Configuration Validation and Default Values TLDR: Standardizes LIST-type configuration validation across Kafka to disallow empty lists and duplicate values in configs where such inputs cause misconfiguration or silent semantic redundancy. Previously, LIST-type configs accepted empty or null values without error and allowed duplicates without warning, misleading operators who expected validation at configuration load time. |
Admin Client | Accepted | Ken Huang | 2025-04-12 | 2026-03-11 | KAFKA-19112 | 4.2 | |
| 1160 | Enable returning supported features from a specific broker TLDR: Adds an optional nodeId parameter to the DescribeFeaturesOptions in AdminClient and a --node-id argument to the kafka-features.sh describe command so that supported feature versions can be queried from a specific node. Supported feature levels depend on the unstable.api.versions.enable flag which can differ per node, but AdminClient previously sent the request to an arbitrary node, making it impossible to check a specific node's supported range. |
Admin Protocol | Accepted | PoAn Yang | 2025-04-11 | 2025-08-04 | KAFKA-18786 | 4.2 | |
| 1159 | Large message reference based Serializer TLDR: Proposes a reference-based Serializer/Deserializer that transparently stores large payloads in external object storage (e.g., S3) and embeds only the storage reference in the Kafka message. Kafka's hard limit on message.max.bytes cannot be raised indefinitely without causing broker memory, network, and stability issues, and existing workarounds (chunking, manual reference patterns) require complex application-level logic. |
Client Producer | Discussion | Omnia Ibrahim | 2025-04-10 | 2025-08-13 | KAFKA-19125 | ||
| 1157 | Enforce KafkaPrincipalSerde Implementation for KafkaPrincipalBuilder TLDR: Makes KafkaPrincipalBuilder extend KafkaPrincipalSerde, converting the need for custom principals to implement serialization into a compile-time requirement rather than a runtime failure. KRaft requires brokers to forward requests to controllers with serialized principal objects; without KafkaPrincipalSerde, custom KafkaPrincipalBuilder implementations silently failed at runtime during request forwarding. |
Security | Accepted | Szu-Yung Wang | 2025-04-08 | 2025-06-13 | KAFKA-18926 | 4.2 | |
| 1155 | Metadata Version Downgrades TLDR: Introduces metadata version downgrade support, allowing a KRaft cluster to roll back its MetadataVersion to an earlier value after a failed or undesirable upgrade. Currently, MetadataVersion is a one-way ratchet—clusters can upgrade but never downgrade—meaning operators who encounter bugs in a new metadata version have no recovery path short of a full cluster rebuild. |
KRaft | Discussion | Colin McCabe | 2025-04-07 | 2025-09-22 | KAFKA-19104 | ||
| 1154 | Extending support for Microsecond Precision for Kafka Connect TLDR: Extends Kafka Connect's timestamp logical type support to include microsecond (`timestamp-micros`) and nanosecond (`timestamp-nanos`) precision alongside the existing millisecond-only `Timestamp` type. Source systems and formats such as Avro and Parquet support sub-millisecond timestamp precision; Connect's millisecond truncation silently loses precision when ingesting or egressing high-resolution timestamp data. |
Connect | Discussion | Pritam Kumar Mishra | 2025-04-05 | 2025-04-17 | KAFKA-19086 | ||
| 1153 | Refactor Kafka Streams CloseOptions to Fluent API Style TLDR: Deprecates KafkaStreams.CloseOptions (inner class with public constructor) and replaces it with a new top-level CloseOptions class using a fluent builder API, aligned with KIP-1092's CloseOptions for the consumer. The existing CloseOptions exposed a public constructor rather than following the fluent API style used consistently throughout Kafka Streams configuration objects. |
Streams | Accepted | Ken Huang | 2025-04-04 | 2025-10-07 | KAFKA-18193 | 4.2 | |
| 1152 | Add transactional ID pattern filter to ListTransactions API TLDR: Adds a `TransactionalIdPattern` (RE2J regex) filter to the `ListTransactions` admin API so clients can filter transactions by transactional ID pattern server-side alongside the existing state, producer ID, and duration filters. On large clusters with millions of transactional IDs, fetching all transactions and filtering client-side wastes network bandwidth and broker CPU. |
Transactions Admin | Accepted | Calvin Liu | 2025-04-01 | 2025-04-28 | KAFKA-19073 | 4.1 | ListTransactions v2 |
| 1151 | Minimal movement replica balancing algorithm for reassignment TLDR: Proposes a minimal-movement replica balancing algorithm for `kafka-reassign-partitions.sh` that computes an assignment minimizing the number of replica relocations while still achieving even distribution across brokers. The current `kafka-reassign-partitions.sh --generate` algorithm does not consider the existing replica placement, producing assignments that move far more replicas than necessary and causing excessive bandwidth consumption during rebalancing. |
Broker Admin | Discussion | pengjialun | 2025-03-27 | 2025-04-22 | |||
| 1150 | Diskless Topics TLDR: Introduces Diskless Topics, a new topic type that stores all data in shared object storage (e.g., S3) without writing to local broker disk or performing ISR-based replication. Classic Kafka topics require expensive replicated block storage for active segments even in cloud environments where high-durability object storage is far more cost-effective, blocking truly elastic, serverless Kafka deployments. |
Tiered Storage Broker | Accepted | Josep Prat | 2025-03-24 | 2026-03-02 | KAFKA-19161 | ||
| 1149 | Helm Chart for Apache Kafka TLDR: Proposes an official Apache Kafka Helm chart for Kubernetes deployments using the Apache-published Docker image, with all broker properties templated via Helm values and stored as ConfigMaps. Existing Kubernetes deployment options (Bitnami Helm chart, Confluent/Strimzi operators) use proprietary container images and require additional managed-service components, while no official lightweight chart existed using the upstream Apache image. |
Broker | Discussion | Steve A | 2025-03-24 | 2025-03-24 | KAFKA-6416 | ||
| 1148 | Remove log.cleaner.enable and set lower bound 1 to log.cleaner.threads TLDR: Deprecates `log.cleaner.enable` and removes it in Kafka 5.0, and sets a lower bound of 1 on `log.cleaner.threads`. The configuration is effectively a no-op since any compacted topic requires an active log cleaner; keeping it creates misleading silent failures (compaction silently stops) and unnecessary null-check complexity throughout the `LogCleaner`/`LogManager` code paths. |
Broker | Accepted | TengYao Chi | 2025-03-24 | 2025-05-20 | KAFKA-13610 | 4.1 | |
| 1147 | Improve consistency of command-line arguments TLDR: Adds --bootstrap-server to kafka-producer-perf-test.sh, introduces --command-property as a consistent replacement for --producer-props, and adds --command-property to kafka-consumer-perf-test.sh and kafka-share-consumer-perf-test.sh. Different perf tools used different argument names for the same concepts (e.g., bootstrap.servers vs --bootstrap-server), creating unnecessary friction and inconsistency for operators running comparative benchmarks. |
Admin | Accepted | Andrew Schofield | 2025-03-16 | 2025-09-19 | KAFKA-19487 | 4.2 | |
| 1146 | Anchored punctuation TLDR: Adds anchored (wall-clock-aligned) punctuation to Kafka Streams so that periodic callbacks can be triggered at fixed calendar intervals (e.g., every hour at HH:00:00) regardless of when the punctuation was registered. Existing wall-clock punctuation starts its interval from registration time, producing non-deterministic trigger offsets relative to wall clock, which caused complex workarounds for time-aligned processing in sectors like energy grid balancing. |
Streams | Accepted | Herman Kolstad Jakobsen | 2025-03-15 | 2025-07-01 | KAFKA-7699 | 4.2 | |
| 1144 | Exposing a new public REST API for MirrorMaker2 TLDR: Introduces a dedicated REST API layer for MirrorMaker2 that abstracts over the Connect REST API, exposing replication-flow–centric endpoints (flows, connectors, tasks, replicated topics) rather than generic connector endpoints. Operators currently must use low-level Connect REST endpoints or parse logs to inspect MirrorMaker2 state; the lack of a purpose-built API makes automation and monitoring difficult. |
MirrorMaker Admin | Discussion | Bertlan Kondrat | 2025-03-12 | 2025-03-14 | KAFKA-18985 | ||
| 1142 | Allow to list non-existent group which has dynamic config TLDR: Extends `kafka-configs.sh --describe --all-groups` to include groups that have dynamic configuration but no active members or committed offsets. Currently, listing all groups' dynamic configs is impossible because `--all-groups` only surfaces groups known to the group coordinator, excluding groups that exist solely via config records. |
Admin | Accepted | PoAn Yang | 2025-03-11 | 2025-06-02 | KAFKA-18904 | 4.1 | ListConfigResources v1 |
| 1140 | Avoid to return null value in Map from public api of consumer TLDR: Changes `KafkaConsumer.offsetsForTimes()` to return `Optional<OffsetAndTimestamp>` instead of `OffsetAndTimestamp` (nullable) for partitions with no matching timestamp, removing null values from the returned map. Returning null values in a `Map` violates the principle of least surprise and causes `NullPointerException` when callers iterate the map without null checks. |
Consumer | Discussion | Chia-Chuan Yu | 2024-10-31 | 2025-05-04 | KAFKA-17826 | ||
| 1139 | Add support for OAuth jwt-bearer grant type TLDR: Adds support for the OAuth 2.0 `urn:ietf:params:oauth:grant-type:jwt-bearer` grant type (RFC 7523) in Kafka clients alongside the existing `client_credentials` grant, allowing authentication via a signed JWT assertion instead of a plaintext secret. The `client_credentials` grant requires embedding plaintext secrets in client configuration, which many organizations and cloud providers prohibit for security compliance. |
Security | Accepted | Kirk True | 2025-02-07 | 2025-05-13 | KAFKA-18573 | 4.1 | |
| 1138 | Clean up TopologyConfig and API for supplying configs needed by the topology TLDR: Cleans up the Kafka Streams configuration API by deprecating TopologyConfig and requiring topology-specific configs (topology.optimization, processor.wrapper.class, etc.) to be passed via StreamsConfig directly to both KafkaStreams and the topology builder. Topology-specific configs passed only to KafkaStreams were silently ignored at the topology level, and multiple partially-overlapping config APIs (TopologyConfig, StreamsBuilder constructor, Topology constructor) caused fragmentation and misconfiguration. |
Streams | Accepted | Sébastien Viale | 2025-02-26 | 2025-09-03 | KAFKA-18053 | ||
| 1137 | Standardize Configuration Precedence for Tool Scripts TLDR: Establishes a uniform configuration precedence order (CLI args > CLI properties > config files > defaults) across all Kafka tool scripts, with a new --modern flag for scripts that don't yet follow it. Inconsistent precedence across tools caused user confusion and made configurations from files or properties silently ignored when multiple sources were present. |
Admin | Discussion | Jhen-Yung Hsu | 2025-02-23 | 2025-11-06 | KAFKA-10043 | 2.6 | |
| 1136 | Make ConsumerGroupMetadata an interface TLDR: Makes ConsumerGroupMetadata an interface instead of a concrete class so that frameworks (e.g., Kafka Streams with EOS) can provide custom implementations without depending on the internal constructor. The class is passed to KafkaProducer.sendOffsetsToTransaction() to carry group metadata for transactional offset commits; making it an interface enables mock implementations in tests and custom metadata carriers in frameworks. |
Transactions Consumer | Accepted | Paweł Szymczyk | 2025-02-21 | 2025-10-20 | KAFKA-18836 | 4.2 | |
| 1134 | Multi-tenancy in Kafka: Virtual Clusters TLDR: Introduces Virtual Clusters as a multi-tenancy layer on top of a shared Kafka cluster, providing each tenant with isolated topic namespaces, quotas, and ACLs through a proxy or broker-side virtualization layer. Operators running multi-tenant Kafka must today either provision separate physical clusters per tenant (expensive, wasteful) or manually partition namespaces with naming conventions and ACLs (error-prone, no true isolation). |
Broker Security | Discussion | Viktor Somogyi | 2025-02-13 | 2025-12-18 | KAFKA-18793 | ||
| 1133 | AK Documentation and Website in Markdown TLDR: Migrates the Apache Kafka documentation website from raw HTML with Handlebars.js templates and server-side includes to Markdown with a Hugo/Docsy static site generator. The existing raw HTML documentation mixed content with styling, had inconsistent heading levels, required a web server for testing, and was a high barrier to contribution for developers unfamiliar with HTML/CSS. |
Admin | Accepted | Harish Vishwanath | 2025-02-02 | 2025-04-13 | KAFKA-14815 | ||
| 1132 | KRaft servers support invalid static SocketServer configurations TLDR: KIP-1132 fixes KRaft brokers to tolerate static `SocketServer` configurations that would be invalid in isolation (e.g., a listener referencing a certificate not yet present on disk) when a prior dynamic reconfiguration already makes the effective configuration valid. Brokers crashed on startup with an uncaught exception whenever the static config was syntactically invalid, even if the dynamic config override rendered it harmless. |
KRaft Broker | Discarded | Kevin Wu | 2025-01-29 | 2025-02-12 | KAFKA-17431 | 4.1 | |
| 1131 | Improved controller-side monitoring of broker states TLDR: Adds per-broker `BrokerRegistrationState` metrics and a `ControlledShutdownBrokerCount` metric on the KRaft controller to expose individual broker states without requiring log inspection. The existing `ActiveControllerCount` and fenced/unfenced broker count metrics provide only aggregate totals; diagnosing which specific broker entered a particular state requires manual parsing of the metadata log. |
KRaft Metrics | Accepted | Kevin Wu | 2025-01-24 | 2025-04-30 | KAFKA-18666 | 4.1 | |
| 1130 | Add metrics indicating the connection count exceeds TLDR: Adds two new broker metrics — `waiting-connection` (gauge of connections queued for an available slot) and `connection-latency` (histogram of connection wait time) — per listener at the Acceptor level. When clients hit broker connection limits or throttling, they receive only a timeout exception with no broker-side visibility into how many connections are backlogged or how long they wait. |
Broker Metrics | Discussion | TengYao Chi | 2025-01-27 | 2025-05-05 | KAFKA-18455 | ||
| 1128 | Replace KTable.transformValues with KTable.processValues and add new KStreams.process TLDR: Deprecates KTable.transformValues (still using the old Processor API) and replaces it with KTable.processValues using the new typed Processor API; also deprecates KStream.processValues to fix a backward-incompatibility bug introduced in its implementation. The old Processor API classes could not be fully removed in Kafka 4.0 because KTable.transformValues still referenced them, and a bug in KStream.processValues made migration via the new API unsafe. |
Streams | Discussion | Matthias J. Sax | 2025-01-24 | 2025-11-27 | KAFKA-17178 | ||
| 1127 | Flexible Windows for Late Arriving Data TLDR: Adds a new `FlexibleWindows` window type to Kafka Streams that retains open windows indefinitely until explicitly closed by a grace period expiry or new records, rather than discarding late-arriving records that fall outside a fixed window end time. Existing Kafka Streams window types (tumbling, hopping, sliding, session) all rely on event-time boundaries and discard truly late data, which breaks use cases such as micro-batching by offset order where no record should ever be dropped. |
Streams | Discussion | Almog Gavra | 2025-01-23 | 2025-02-24 | KAFKA-18626 | ||
| 1126 | Serialize changes to Kafka with a build queue TLDR: Adopts GitHub Merge Queues to serialize all changes to Kafka's mainline branches so each commit is validated against the current HEAD before merging. As the contributor rate increases, concurrent PR merges without serialization allow individual green PRs to have bad interactions that break the main branch when combined. |
Admin | Accepted | David Arthur | 2025-01-21 | 2025-02-13 | KAFKA-18789 | ||
| 1125 | Remove Invalid 'numberOfOpenFiles' Metric from RocksDB State Store TLDR: Removes the `number-open-files` metric from the RocksDB state store in Kafka Streams, which always returns -1 since RocksDB 8.7.3 removed the underlying `NO_FILE_CLOSES` internal counter. The stale metric misleads operators and can trigger false alarms in monitoring systems. |
Streams Metrics | Discussion | Swikar Patel | 2025-01-13 | 2025-01-23 | KAFKA-18495 | 5.0 | |
| 1123 | Rack-aware partitioning for Kafka Producer TLDR: Adds rack-aware (AZ-aware) partitioning to the Kafka producer: when client.rack is set, the producer preferentially routes to partitions whose leaders reside in the same AZ, reducing cross-AZ network traffic and latency. The default adaptive partitioner rotated across all partitions without rack awareness, causing producers to regularly send records cross-AZ and incur inter-AZ data transfer costs in cloud deployments. |
Producer | Accepted | Ivan Yurchenko | 2024-12-20 | 2025-05-19 | KAFKA-19193 | ||
| 1122 | Create a dedicated data module for Kafka Connect data classes TLDR: Extracts the Kafka Connect data format classes (Schema, Struct, SchemaBuilder, etc.) from the `connect-api` module into a standalone `connect-data` module. This removes the coupling that forces non-Connect projects to depend on the entire Connect API just to use Kafka's structured data model. |
Connect | Discussion | Mario Fiore Vitale | 2024-12-18 | 2025-01-03 | KAFKA-18299 | ||
| 1121 | Compression acceleration in Kafka TLDR: Proposes a hardware acceleration framework for Kafka's existing compression algorithms (gzip, zstd, snappy, lz4), allowing producers, brokers, and consumers to offload compression/decompression to hardware accelerators (e.g., Intel QAT, GPU) without changing the wire format. Compression improves throughput and reduces storage costs but adds CPU latency; hardware accelerators can reduce that latency without sacrificing compression ratio, but Kafka has no pluggable acceleration abstraction. |
Broker Producer | Discussion | Olasoji Denloye | 2024-12-17 | 2025-11-05 | |||
| 1120 | AppInfo metrics don't contain the client-id TLDR: Fixes AppInfo metrics registration for Kafka Connect workers and MirrorMaker 2 clients to include the client-id label, aligning them with the behavior of standalone producer, consumer, and admin clients. AppInfo metrics (start-time-ms, commit-id, version) lack client-id tags for worker and MM2 contexts, making it impossible to distinguish metrics from multiple workers or MM2 instances sharing the same JVM. |
Metrics Connect | Accepted | Ken Huang | 2024-12-13 | 2026-03-11 | KAFKA-15186 | 4.2 | |
| 1119 | Add support for SSL hot reload TLDR: Adds `ssl.auto.reload` configuration to Kafka clients (producers, consumers) and brokers so SSL/TLS certificates are automatically reloaded when updated on disk without requiring a restart. Currently, only brokers support dynamic SSL certificate rotation via dynamic config; producers and consumers require disruptive restarts when certificates are rotated by external agents. |
Security Client | Discussion | Moncef Abboud | 2024-12-02 | 2025-03-20 | KAFKA-10731 | ||
| 1118 | Add Deadlock Protection on Producer Network Thread TLDR: Adds deadlock detection to `KafkaProducer` so that calling `Producer#flush()` from within a `send()` callback throws `KafkaException` instead of blocking indefinitely. Invoking `flush()` from the I/O thread that executes send callbacks creates an irresolvable circular wait, causing application hangs that are difficult to diagnose. |
Producer | Accepted | TengYao Chi | 2024-11-30 | 2025-01-03 | KAFKA-10790 | 4.1 | |
| 1117 | Support keystore with multiple alias entries TLDR: Adds a new ssl.keystore.alias config that instructs the DefaultSslEngineFactory to select a specific key alias from a multi-entry keystore when constructing the SSLContext, rather than defaulting to the first entry. Keystores with multiple key entries caused SSL handshake failures or wrong-certificate authentication because Kafka had no mechanism to select a specific alias. |
Security | Discussion | Rahul Nirgude | 2024-11-27 | 2025-04-13 | |||
| 1116 | Adding new Principal Types on Standard ACL side for filtering KafkaPrincipal TLDR: KIP-1116 proposes adding new principal types beyond `User` to the standard ACL system so that ACL rules can match groups of `KafkaPrincipal` identities without losing the original client identity in logs. Current workarounds using principal mapping rules (to embed group membership in the principal name) discard the unique client identity, making audit log attribution impossible. |
Security | Discussion | Franck LEDAY | 2024-11-24 | 2024-11-24 | KAFKA-16707 | ||
| 1115 | Bazel Builds TLDR: Proposes migrating Apache Kafka's build system from Gradle to Bazel, leveraging Bazel's fine-grained build graph, aggressive caching, and hermetic toolchains. Gradle builds had non-deterministic dependency graphs (no lockfile), per-module dependency resolution (risking version conflicts), and CI times of ~2 hours average, while internal Confluent Bazel builds achieved ~30-minute average CI times. |
Broker | Discussion | Vince Rose | 2024-11-21 | 2024-11-21 | |||
| 1114 | Introducing Chunk in Partition TLDR: Proposes splitting a Kafka partition's data into immutable inactive chunks and a single active chunk, allowing inactive chunks to be reassigned to different brokers independently of the active chunk. The entire partition replica is managed as one unit for assignment and rebalancing, making partition moves expensive even when only the active (tail) data matters for most operations. |
Broker | Discussion | De Gao | 2024-11-19 | 2025-02-18 | Metadata | ||
| 1112 | allow custom processor wrapping TLDR: Adds a `ProcessorWrapper` interface to Kafka Streams (`StreamsConfig.PROCESSOR_WRAPPER_CLASS_CONFIG`) that allows users to wrap every processor in a topology with a common decorator (logging, tracing, error handling) without modifying each processor individually. Applying cross-cutting concerns to every processor currently requires manual wrapping at every PAPI processor and is completely impossible for DSL-generated processors. |
Streams | Accepted | A. Sophie Blee-Goldman | 2024-11-16 | 2024-11-21 | KAFKA-18026 | 4.0 | |
| 1111 | Enforcing Explicit Naming for Kafka Streams Internal Topics TLDR: Adds a validation mode to Kafka Streams that enforces explicit naming of all internal repartition and changelog topics, failing topology construction when auto-generated names are used. Auto-generated names embed topology node indices, so any topology restructuring silently renames internal topics and causes state loss or repartition mismatches. |
Streams | Accepted | Sébastien Viale | 2024-11-15 | 2024-12-16 | KAFKA-18023 | 4.1 | |
| 1109 | Unifying Kafka Consumer Topic Metrics TLDR: Standardizes Kafka consumer topic-level metric names to use the original topic name (preserving dots) instead of replacing dots with underscores, aligning consumer metric naming with the existing producer convention. The current asymmetry between producer metrics (dots preserved) and consumer metrics (dots replaced by underscores) causes confusion and breaks monitoring dashboards that use unified naming. |
Consumer Metrics | Accepted | Apoorv Mittal | 2024-11-12 | 2024-12-17 | KAFKA-12469 | 4.1 | |
| 1107 | topic-level acks and compressions for producer TLDR: Allows producers to override `acks` and `compression.type` on a per-topic basis rather than only at the producer level. Since `KafkaProducer` is designed to be shared across threads writing to different topics, there is no current mechanism to send to one topic with `acks=all` and another with `acks=1` from the same producer instance. |
Producer Broker | Discussion | TaiJuWu | 2024-11-04 | 2025-01-03 | KAFKA-17930 | Produce | |
| 1106 | Add duration based offset reset option for consumer clients TLDR: Adds a duration-based offset reset option (`by-duration`) to consumer clients and the `kafka-consumer-groups.sh` tool, allowing consumers to reset to the offset corresponding to `now - duration` rather than only to earliest/latest/specific timestamp. With tiered storage enabling long retention, operators need a human-friendly way to seek back by time interval without computing an absolute timestamp. |
Consumer Admin | Accepted | Manikumar Reddy O. | 2024-11-04 | 2024-12-12 | KAFKA-17934 | 4.0 | |
| 1105 | Make remote log manager thread-pool configs dynamic TLDR: Makes the `RemoteLogManager` thread pool sizes (`remote.log.manager.copier.thread.pool.size`, `remote.log.manager.expiration.thread.pool.size`, `remote.log.reader.threads`) dynamically reconfigurable without broker restart. Under variable tiered storage load, operators need to tune concurrency at runtime to avoid backlogs or over-provisioning. |
Tiered Storage | Accepted | Kamal Chandraprakash | 2024-11-03 | 2024-11-12 | KAFKA-17928 | 4.0 | |
| 1104 | Allow Foreign Key Extraction from Both Key and Value in KTable Joins TLDR: Adds a BiFunction<K, V, KO> foreignKeyExtractor overload to KTable.join() and related DSL methods so that the foreign key can be derived from both the record key and value, not just the value. The existing Function<V, KO> extractor forced users to duplicate key fields into the record value whenever the foreign key depended on the message key, creating data redundancy and potential inconsistency. |
Streams | Accepted | Chu Cheng Li | 2024-10-30 | 2025-03-27 | KAFKA-17893 | 4.1 | |
| 1103 | Additional metrics for cooperative consumption TLDR: Adds broker-side Share Group metrics: TotalShareFetchRequestsPerSec, FailedShareFetchRequestsPerSec, TotalShareAcknowledgementRequestsPerSec, and related per-topic and purgatory metrics. Share Group (KIP-932) cooperative consumption had no dedicated broker metrics for share fetch requests, acknowledgements, or Delayed Share Fetch purgatory behavior, preventing observability and performance diagnosis. |
Consumer Metrics | Accepted | Apoorv Mittal | 2024-10-29 | 2025-05-27 | KAFKA-17894 | 4.1 | Fetch ShareFetch |
| 1102 | Enable clients to rebootstrap based on timeout or error code TLDR: Extends client rebootstrap (KIP-899) to trigger not only when all metadata nodes are unreachable, but also after configurable timeouts or on specific broker error codes indicating stale metadata. This covers edge cases such as rolling broker replacements where old IPs become unresponsive but no connection error is immediately returned. |
Client | Accepted | Rajini Sivaram | 2024-10-28 | 2024-11-08 | KAFKA-17885 | 4.0 | Metadata |
| 1101 | Trigger rebalance on rack topology changes TLDR: Replaces per-partition rack topology data (the full set of replica racks) in the ConsumerGroup subscribed topic metadata with a compact hash, and triggers rebalances when the hash changes due to partition additions or rack changes. Storing the full rack set per partition for groups with thousands of members and thousands of partitions consumed disproportionate memory (up to 79% of ConsumerGroup object size in production cases) after the new consumer group protocol (KIP-848) was introduced. |
Consumer | Accepted | PoAn Yang | 2024-10-28 | 2025-04-17 | KAFKA-17578 | 4.0 | Heartbeat ConsumerGroupHeartbeat |
| 1100 | Rename org.apache.kafka.server:type=AssignmentsManager and org.apache.kafka.storage.internals.log.RemoteStorageThreadPool metrics TLDR: Renames the incorrectly prefixed metrics org.apache.kafka.server:type=AssignmentsManager and org.apache.kafka.storage.internals.log:type=RemoteStorageThreadPool back to the canonical kafka.server:type=AssignmentsManager and kafka.log.remote:type=RemoteStorageThreadPool naming convention. A class relocation and a patch introduced non-standard package-based JMX bean prefixes that diverged from Kafka's kafka.COMPONENT convention, breaking monitoring dashboards. |
Metrics Tiered Storage | Accepted | Ken Huang | 2024-10-26 | 2025-07-31 | KAFKA-17876 | 4.2 | |
| 1099 | Extend kafka-consumer-groups command line tool to support new consumer group TLDR: Extends ConsumerGroupDescription with optional groupEpoch and targetAssignmentEpoch fields, MemberDescription with memberEpoch and upgraded fields, and ShareGroupDescription with groupEpoch and targetAssignmentEpoch, surfacing these in kafka-consumer-groups.sh and kafka-share-groups.sh. The new consumer group protocol (KIP-848) introduced epoch-based assignment tracking but the admin tooling had no visibility into group or member epochs. |
Consumer Admin | Accepted | PoAn Yang | 2024-10-20 | 2024-12-06 | KAFKA-17750 | 4.0 | ConsumerGroupDescribe v1 |
| 1098 | Reverse Checkpointing in MirrorMaker TLDR: Extends MirrorMaker2 checkpointing to support reverse replication so that consumer group offsets made on the downstream cluster are checkpointed back upstream, enabling minimal reprocessing on failback after a failover. In bidirectional replication, the standard checkpoint flow only covers failover (upstream→downstream direction), leaving consumers that progressed on the downstream cluster unable to resume accurately when switching back. |
MirrorMaker Consumer | Discussion | Daniel Urban | 2024-10-18 | 2024-10-28 | KAFKA-17828 | ||
| 1097 | Add Kafka Connect exception handler TLDR: Proposes a pluggable ErrorHandler interface for Kafka Connect that allows connector developers to define custom per-record error handling logic (DROP, FAIL, ACK) during specific processing stages. Connect's existing error tolerance (none/all) and dead letter queue were coarse-grained, with no way to implement stage-specific or record-specific handling logic comparable to Kafka Streams' ProcessingExceptionHandler. |
Connect | Discussion | Anton Liauchuk | 2024-10-07 | 2025-06-25 | |||
| 1095 | Kafka Canary Isolation TLDR: Proposes a Kafka Canary service that sends small probe messages through the full produce/consume path in isolation from production traffic, enabling SLA monitoring and early detection of deployment-induced failures. Even with thorough testing, bad deployments (misconfiguration, untested scenarios) can cause silent data loss or latency regressions that only manifest in production, requiring an automated canary layer separate from business traffic. |
Metrics Broker | Discussion | Zhifeng Chen | 2024-09-20 | 2024-09-30 | |||
| 1094 | Add a new constructor with nextOffsets to ConsumerRecords TLDR: Adds a `nextOffsets` map to `ConsumerRecords` that exposes, per partition, the offset to be fetched in the next `poll()` call (i.e., `lastOffset + 1`), pre-wrapped in `OffsetAndMetadata`. This allows Kafka Streams and other frameworks to commit precise next-fetch offsets without re-computing them from the last received record. |
Consumer | Accepted | Alieh Saeedi | 2024-09-24 | 2024-10-14 | KAFKA-17600 | 4.0 | |
| 1093 | Add more observability for MirrorMaker2 TLDR: Adds high-level MirrorMaker2 metrics for `topic-count`, `partition-count` per topic, `consumer-group-count`, and `consumer-group-offset` to complement the existing record-level metrics. The current MM2 metrics cover data throughput and replication latency but provide no visibility into the aggregate number of topics and consumer groups being synchronized, making capacity planning and correctness verification opaque. |
MirrorMaker Metrics | Discussion | PoAn Yang | 2024-09-16 | 2024-09-23 | KAFKA-17220 | ||
| 1091 | Improved Kafka Streams operator metrics TLDR: Introduces numeric JMX metrics for StreamThread state (CREATED/RUNNING/DEAD etc.) and KafkaStreams client state, plus a metrics.recording.level metric, enabling operator monitoring systems that require numeric gauges rather than string values. Existing Kafka Streams state metrics were string-typed and incompatible with KIP-714/KIP-1076 telemetry pipelines, making automated health alerting difficult. |
Streams Metrics | Discussion | Bill Bejeck | 2024-09-16 | 2025-11-04 | KAFKA-17561 | 4.0 | |
| 1090 | Flaky Test Management TLDR: Introduces a formal flaky test management process including a `@Flaky` annotation to tag known-flaky tests, a threshold mechanism to quarantine tests exceeding a flakiness rate, and a dedicated re-run strategy for annotated tests. Unmanaged flaky tests erode CI confidence, mask real failures, and waste infrastructure by requiring full suite re-runs up to three times to achieve reliable results. |
Testing | Accepted | David Arthur | 2024-09-15 | 2024-10-02 | KAFKA-17629 | ||
| 1089 | Allow disabling heartbeats replication in MirrorSourceConnector TLDR: Adds a `replication.heartbeats.topic.enabled` configuration option to `MirrorSourceConnector` to allow users to opt out of replicating the heartbeat topic between clusters. When running multiple `MirrorSourceConnector` instances replicating different topic subsets between the same two clusters, heartbeat topic replication cannot be scoped to a single connector and creates unwanted topic duplication. |
MirrorMaker | Accepted | Daniel Urban | 2024-09-12 | 2024-10-08 | KAFKA-17534 | 4.0 | |
| 1088 | Replace KafkaClientSupplier with KafkaClientInterceptor TLDR: Replaces the `KafkaClientSupplier` interface in Kafka Streams with a new `KafkaClientInterceptor` that wraps clients after construction rather than supplying them whole, preserving the runtime's ability to build clients using internal APIs. As Kafka Streams moves toward using internal consumer primitives (KIP-1071) it can no longer treat client creation as a black-box user concern without losing the ability to co-develop clients and Streams internals. |
Streams | Discussion | Matthias J. Sax | 2024-09-04 | 2024-10-04 | KAFKA-17485 | ||
| 1087 | Removing intermediateTopicsOption from StreamsResetter TLDR: Deprecates and removes the `--intermediate-topics` option from the `StreamsResetter` tool in version 5.0. The `StreamsResetter`'s `intermediateTopicsOption` was only relevant for topologies using the `KStream#through()` method, which was deprecated and subsequently removed; the option is now a dead code path. |
Streams Admin | Accepted | Arnav Dadarya | 2024-09-01 | 2024-09-28 | KAFKA-17462 | 5.0 | |
| 1085 | Fix leaking *_DOC variables in StreamsConfig TLDR: Deprecates several `*_DOC` constants in `StreamsConfig` that have public visibility but are only used internally by `StreamsConfig` or `TopologyConfig`, with a plan to make them private in the next major release. Leaking internal documentation string constants as public API members unnecessarily enlarges the public surface area and risks accidental external usage that would prevent future refactoring. |
Streams | Accepted | Ken Huang | 2024-08-29 | 2024-09-12 | |||
| 1083 | Increase default value of task.shutdown.graceful.timeout.ms in Connect TLDR: KIP-1083 increases the default value of `task.shutdown.graceful.timeout.ms` in Kafka Connect from 5 seconds to a larger value. On large Connect workers running many tasks, 5 seconds is insufficient for all tasks to flush and commit their state before being forcibly terminated, leading to unnecessary reprocessing or data loss on restart. |
Connect | Discussion | Sagar Rao | 2024-08-14 | 2024-08-14 | KAFKA-15229 | ||
| 1082 | Require Client-Generated IDs over the ConsumerGroupHeartbeat RPC TLDR: Requires clients to generate and send member IDs on the initial `ConsumerGroupHeartbeat` RPC rather than letting the broker assign them, making the first heartbeat idempotent. Without client-generated IDs, a consumer that sends a join heartbeat but closes before receiving the response may issue a leave heartbeat with a null member ID, causing the broker to register an orphaned member that never expires. |
Consumer Protocol | Accepted | TengYao Chi | 2024-08-08 | 2024-10-04 | KAFKA-17116 | 4.0 | ConsumerGroupHeartbeat v1 Heartbeat |
| 1080 | Fix the typo: `maxlifeTimeMs` in CreateDelegationTokenOptions TLDR: Deprecates the misspelled `maxlifeTimeMs()` method in `CreateDelegationTokenOptions` and adds a correctly named `maxLifetimeMs()` replacement. The typo (`maxlifeTimeMs` vs. `maxLifetimeMs`) has been in the public API since delegation token creation was introduced, creating a poor developer experience and a breaking rename burden if left uncorrected past the next major release. |
Security Admin | Accepted | Ken Huang | 2024-08-12 | 2024-11-11 | KAFKA-17314 | 4.0 | |
| 1079 | Deprecate `delete-config` of TopicCommand TLDR: Deprecates the `--delete-config` option of `kafka-topics.sh` (`TopicCommand`) with a plan to remove it in Kafka 5.0. The `--delete-config` command is currently a no-op (it performs no actual deletion), so retaining it without deprecation misleads users into believing topic-level config keys can be removed via this path. |
Admin | Accepted | TengYao Chi | 2024-07-31 | 2024-09-18 | KAFKA-17087 | 4.0 | |
| 1078 | Remove Leaking Getter Methods in Joined Helper Class TLDR: Deprecates the gracePeriod(), keySerde(), valueSerde(), and otherValueSerde() getter methods on the public Joined helper class, since these methods expose internal concerns and are already available on the internal JoinedInternal class used by the DSL implementation. These getters leaked implementation details into the public API, and removing them prevents external code from depending on internals. |
Streams | Accepted | TengYao Chi | 2024-08-04 | 2024-08-29 | KAFKA-17253 | 4.0 | |
| 1077 | Deprecate `ForeachProcessor` and move to internal package TLDR: KIP-1077 deprecates `org.apache.kafka.streams.kstream.ForeachProcessor` and moves it to the internal package `org.apache.kafka.streams.kstream.internals`. The class was mistakenly placed in a public package despite being an internal implementation detail, preventing API evolution without going through the KIP process. |
Streams | Accepted | Kuan Po Tseng | 2024-08-04 | 2024-08-26 | KAFKA-17224 | 4.0 | |
| 1076 | Metrics for client applications KIP-714 extension TLDR: Extends KIP-714's client metrics telemetry framework to include application-level metrics from Kafka Streams instances, not just the embedded producer/consumer clients. Cluster operators can observe client-side metrics via the broker pull mechanism, but Kafka Streams applications emit additional metrics (task-level lag, state store sizes, etc.) that remain invisible without this extension. |
Metrics Client | Accepted | Bill Bejeck | 2024-08-02 | 2026-03-09 | KAFKA-17248 | 4.0 | |
| 1075 | Introduce delayed remote list offsets purgatory to make LIST_OFFSETS async TLDR: KIP-1075 introduces a delayed purgatory (`DelayedRemoteListOffsets`) and a new `TimeoutMs` field in `ListOffsetsRequest` so timestamp-based offset lookups that require fetching remote indexes are handled asynchronously by dedicated remote-log-reader threads rather than blocking request-handler threads. `ListOffsets` requests against remote-storage-enabled topics require non-deterministic latency to fetch remote offset and time indexes; when many such requests arrive concurrently, request-handler thread exhaustion starves higher-priority `FETCH` and `PRODUCE` requests. |
Tiered Storage Broker | Accepted | Kamal Chandraprakash | 2024-08-02 | 2024-08-29 | KAFKA-15859 | 4.0 | ListOffsets v10 |
| 1074 | Allow the replication of user internal topics TLDR: Adds a `replication.policy.internal.topic.separator.enabled` configuration (and related `ReplicationPolicy` changes) so MirrorMaker2 can replicate user-defined topics whose names end in `.internal` or `-internal`. The current `ReplicationPolicy.isInternalTopic()` logic unconditionally excludes any topic matching the internal naming pattern, silently dropping replication of legitimate business topics with those suffixes. |
MirrorMaker | Accepted | Patrik Márton | 2024-07-30 | 2024-11-18 | KAFKA-17200 | 4.0 | |
| 1073 | Return fenced brokers in DescribeCluster response TLDR: Extends `DescribeClusterRequest` (v2+) to optionally include fenced brokers in the response, adding an `includeFencedBrokers` option to `DescribeClusterOptions` and a `Fenced` field per broker in the response. When a KRaft broker node is removed without being unregistered, it stays fenced in the controller but is invisible to `DescribeCluster`, preventing operators from discovering the node ID needed to call the unregister API. |
KRaft Admin | Accepted | Gantigmaa Selenge | 2024-07-23 | 2024-11-19 | KAFKA-17094 | 4.0 | DescribeCluster v2 |
| 1071 | Streams Rebalance Protocol TLDR: Introduces a native Streams Rebalance Protocol that moves task assignment from a client-side custom PartitionAssignor piggyback to a server-side broker protocol built on the KIP-848 group coordinator. The existing approach serializes Streams task assignment state into the consumer group join/sync protocol payload, coupling Streams scheduling correctness to the slow, stop-the-world Classic rebalance and preventing incremental assignment. |
Streams Consumer | Accepted | Lucas Brutschy | 2024-07-12 | 2025-12-11 | KAFKA-17125 | 4.1 | StreamsGroupHeartbeat Fetch OffsetCommit OffsetFetch JoinGroup Heartbeat SyncGroup ListGroups ConsumerGroupHeartbeat StreamsGroupDescribe |
| 1070 | deprecate MockProcessorContext TLDR: Deprecates MockProcessorContext (old test API), Transformer, TransformerSupplier, ValueTransformer, and ValueTransformerSupplier to complete the removal of the old Kafka Streams Processor API. These interfaces were already functionally replaced by their new Processor API equivalents (api.MockProcessorContext, FixedKeyProcessor, etc.) but could not be removed without first being deprecated. |
Streams Testing | Accepted | Matthias J. Sax | 2024-07-11 | 2024-07-31 | KAFKA-9738 | 4.0 | |
| 1068 | New metrics for the new KafkaConsumer TLDR: Adds new metrics specific to the `AsyncKafkaConsumer` (CONSUMER rebalance protocol) including thread coordination metrics and event queue depth that are not applicable to the `LegacyKafkaConsumer` (CLASSIC protocol). The new consumer implementation introduced in KIP-848 uses a dual-thread model (API thread + network I/O background thread) whose operational characteristics cannot be observed with the existing single-threaded consumer metrics. |
Consumer Metrics | Accepted | Philip Nee | 2024-07-08 | 2024-12-11 | KAFKA-16143 | 4.0 | |
| 1066 | Mechanism to cordon brokers and log directories TLDR: Adds a cordoning mechanism for brokers and log directories that marks them as unavailable for new partition placement while still allowing existing partitions to remain. The round-robin partition placer had no concept of planned maintenance, making broker decommissioning or JBOD disk removal painful because newly created partitions would still be assigned to the target being drained. |
Broker Admin | Accepted | Mickael Maison | 2024-07-04 | 2025-10-08 | KAFKA-19774 | 4.3 | Heartbeat DescribeLogDirs BrokerRegistration BrokerHeartbeat |
| 1064 | Upgrade slf4j to 2.x TLDR: Upgrades Kafka's logging interface from SLF4J 1.7 to SLF4J 2.0 and updates the default log4j2 binding to log4j-slf4j2-impl, enabling the -Dslf4j.provider JVM property for logging backend selection. SLF4J 1.x required classpath ordering tricks or manual JAR removal to select a logging backend, and Kafka's dependency on it prevented users from easily switching to modern logging implementations. |
Broker | Discussion | Muralidhar Basani | 2024-06-27 | 2025-04-10 | KAFKA-16936 | ||
| 1062 | Introduce Pagination for some requests used by Admin API TLDR: Extends KIP-966's pagination pattern to additional admin protocol requests—OffsetFetchRequest and DescribeLogDirsRequest—by adding a max-partitions limit and a cursor field for continuation. On large clusters with thousands of partitions, these requests can return responses that exceed broker timeout limits, causing failures in admin operations. |
Admin Protocol | Discussion | Omnia Ibrahim | 2024-06-26 | 2024-08-19 | KAFKA-17041 | Fetch ListOffsets Metadata OffsetFetch DescribeGroups ListGroups DescribeAcls DescribeConfigs DescribeLogDirs ListPartitionReassignments DescribeProducers DescribeTransactions ListTransactions ConsumerGroupDescribe DescribeTopicPartitions | |
| 1061 | Allow exporting SCRAM credentials TLDR: KIP-1061 proposes adding a `--describe` export mode to `kafka-configs.sh` that outputs SCRAM credential fields (`salt`, `stored_key`, `server_key`) needed to replicate credentials to another cluster. KIP-554 stored SCRAM credentials in KRaft metadata but intentionally withheld these fields from the describe output; without them, migrating a cluster requires all users to reset their passwords in the new cluster. |
Security Admin | Discussion | Gaurav Narula | 2024-06-24 | 2024-07-02 | KAFKA-17063 | DescribeUserScramCredentials | |
| 1058 | Txn consumer exerts pressure on remote storage when reading non-txn topic TLDR: KIP-1058 adds a `txnIdxEmpty` tagged field to `RemoteLogSegmentMetadataRecord` and `RemoteLogSegmentMetadataSnapshot` so the Remote Log Manager can skip fetching remote transaction indexes for segments that have no aborted transactions. When `isolation.level=READ_COMMITTED` consumers fetch from topics with no aborted transactions, the broker currently scans all remote segment transaction indexes sequentially before reaching local segments, causing excessive remote storage API calls and increased fetch latency. |
Tiered Storage Transactions | Accepted | Kamal Chandraprakash | 2024-06-17 | 2024-11-06 | KAFKA-16780 | 4.0 | |
| 1057 | Add remote log metadata flag to the dump log tool TLDR: Adds a --remote-log-metadata-decoder flag to kafka-dump-log.sh that decodes RemoteLogSegmentMetadata records stored in the __remote_log_metadata internal topic using RemoteLogMetadataSerde. Debugging tiered storage issues required manual decoding of opaque binary records in this topic, since dump-log only supported decoders for __consumer_offsets, __transaction_state, and other internal topics. |
Tiered Storage Admin | Accepted | Federico Valeri | 2024-06-15 | 2024-08-28 | KAFKA-16228 | 3.9 | |
| 1056 | Remove `default.` prefix for exception handler StreamsConfig TLDR: Deprecates default.deserialization.exception.handler and default.production.exception.handler StreamsConfig keys and adds replacement names without the default. prefix (deserialization.exception.handler, production.exception.handler). The default. prefix implies that per-topic overrides are possible, but these handlers have no per-topic override mechanism, making the prefix semantically incorrect. |
Streams | Accepted | Muralidhar Basani | 2024-06-08 | 2024-09-08 | KAFKA-16863 | 4.0 | |
| 1055 | Introducing Round-Robin Assignment Strategy to ConnectorUtils TLDR: Adds a round-robin element distribution method to ConnectorUtils in Kafka Connect, assigning elements to groups in interleaved order (e.g., {a,d}, {b,e}, {c}) rather than the existing contiguous grouping ({a,b}, {c,d}, {e}). Connector workload distribution via contiguous grouping produces uneven load when groups must process each element's work serially, since the last group may receive fewer or lighter elements. |
Connect | Discussion | Fan Yang | 2024-06-08 | 2024-06-08 | KAFKA-16893 | ||
| 1054 | Support External schema in JSONConverter TLDR: Introduces an `external.schema` configuration option for `JsonConverter` in Kafka Connect that reads the schema from a file path rather than embedding it in every message payload. When connectors require a schema (e.g., JDBC connectors) but the message producer cannot embed a schema in each JSON record, the current approach either bloats every message with a full schema copy or requires a schema registry. |
Connect | Accepted | Priyanka K U | 2024-06-05 | 2025-03-20 | KAFKA-16913 | 4.2 | |
| 1053 | Align the naming convention for config and default variables in *Config classes TLDR: Proposes renaming constants in CommonClientConfigs and StreamsConfig from a DEFAULT_VARIABLE_NAME pattern to a VARIABLE_NAME_DEFAULT suffix pattern to align with standard Java naming conventions. The DEFAULT_ prefix convention was inconsistent with how most Java frameworks represent default values, making the API harder to navigate. |
Streams Admin | Discarded | Eric Lu | 2024-06-06 | 2024-06-07 | KAFKA-16638 | ||
| 1052 | Enable warmup in producer performance test TLDR: Adds --warmup-secs and --warmup-records options to kafka-producer-perf-test.sh to exclude the high-latency initialization phase from steady-state measurements. Without a warmup phase, performance test results included startup latency spikes from producer initialization and ISR establishment, requiring multi-hour test runs to dilute the startup noise below 1% of the measurement window. |
Producer Testing | Accepted | Matt Welch | 2024-06-06 | 2025-07-24 | KAFKA-17645 | 4.2 | |
| 1051 | Statically configured log replication throttling TLDR: Allows `leader.replication.throttled.rate` and `follower.replication.throttled.rate` to be configured statically in broker properties files in addition to the existing dynamic-only support. Dynamic-only throttle configuration cannot protect against unplanned replication spikes (e.g., a broker rejoining after extended downtime) since an operator must react after the spike begins rather than proactively setting a safe ceiling. |
Broker | Discussion | Harry Fallows | 2024-06-06 | 2024-12-26 | KAFKA-16910 | ||
| 1050 | Consistent error handling for Transactions TLDR: Defines a comprehensive error taxonomy for Kafka transactional and idempotent producer APIs — classifying each error as retriable, abortable, or fatal — and aligns error handling behavior across all client SDK implementations (Java, librdkafka, Go, .NET, Python). Inconsistent error handling across SDKs and incomplete error specifications in KAFKA-7787/KAFKA-14439 made it difficult for non-Java clients to implement correct transactional semantics. |
Transactions Client | Accepted | Kaushik Raina | 2024-06-06 | 2025-07-15 | KAFKA-16906 | 4.1 | |
| 1049 | Add config log.summary.interval.ms to Kafka Streams TLDR: KIP-1049 adds the `log.summary.interval.ms` configuration to Kafka Streams, controlling how frequently thread-level summary statistics are logged (default 2 minutes). The hardcoded 2-minute interval cannot be adjusted, causing log pollution in busy production environments and insufficient visibility in debugging scenarios. |
Streams | Accepted | Jian.DU | 2024-05-22 | 2024-07-23 | KAFKA-16584 | 3.9 | |
| 1048 | Improve kafka-consumer-perf-test to benchmark single partition TLDR: Extends kafka-consumer-perf-test to support consuming from a single specified partition via a new --partition flag, enabling partition-level throughput and latency benchmarking. The tool previously subscribed to all partitions, making it impossible to isolate per-broker or per-partition performance and diagnose hardware-level issues (e.g., a faulty switch affecting one broker). |
Consumer Testing | Discussion | Harsh Panchal | 2024-05-22 | 2024-05-22 | KAFKA-16810 | ||
| 1046 | Expose producer.id.expiration.check.interval.ms as dynamic broker configuration TLDR: Promotes `producer.id.expiration.check.interval.ms` from an internal static config to a dynamic public broker configuration. Without this, operators cannot tune the frequency of producer ID expiration checks at runtime, leaving `producer.id.expiration.ms` dynamic in name only since the check interval stays fixed. |
Broker Producer | Discussion | Jorge Esteban Quilcate Otoya | 2024-05-16 | 2024-05-16 | KAFKA-16264 | ||
| 1045 | Move MockAdminClient to public api TLDR: KIP-1045 proposes moving `MockAdminClient` from the internal test source tree to the public API alongside `MockConsumer` and `MockProducer`. Developers testing applications that use `AdminClient` have no supported mock implementation, forcing them to either stand up a real broker in unit tests or maintain their own incomplete mock. |
Admin Testing | Discussion | Muralidhar Basani | 2024-05-16 | 2024-06-06 | KAFKA-15258 | ||
| 1044 | A proposal to change idempotent producer -- server implementation TLDR: Proposes changes to the idempotent producer's server-side PID (producer ID) tracking to address OOM failures caused by an unbounded explosion in the number of tracked PID/epoch pairs and to reduce the window between when a PID is received and when it is durably persisted. The fixed-size circular buffer design in the original KIP-98 spec was not implemented as specified, leading to unbounded memory growth in clusters with many concurrent transactional producers. |
Producer Broker | Discussion | Claude Warren | 2024-05-15 | 2024-05-29 | |||
| 1043 | Administration of groups TLDR: Adds unified group administration APIs and tooling (AdminClient and CLI) that work consistently across all group types (classic consumer, new consumer, share, streams, Connect worker groups) using a single DescribeGroups-style RPC that identifies group type. The existing tools and APIs were group-type-specific: kafka-consumer-groups.sh returned misleading errors for share groups, and there was no unified way to list or describe groups across all types. |
Admin Consumer | Accepted | Andrew Schofield | 2024-05-08 | 2025-04-17 | KAFKA-16891 | 4.0 | DescribeGroups v6 |
| 1038 | Add Custom Error Handler to Producer TLDR: Adds a pluggable `ProducerErrorHandler` interface that users can configure on `KafkaProducer` to intercept and handle producer-level errors (serialization, send failures) with custom logic. Currently producers only support fixed retry/fail semantics with no user hook for custom error routing or dead-letter handling. |
Producer | Discussion | Alieh Saeedi | 2024-04-17 | 2024-05-15 | KAFKA-15309 | ||
| 1037 | Allow WriteTxnMarkers API with Alter Cluster Permission TLDR: Allows the WriteTxnMarkers API to be authorized with the Alter permission on the Cluster resource in addition to the existing ClusterAction permission, enabling non-broker admin clients to abort hanging transactions (as introduced by KIP-664) without needing ClusterAction. ClusterAction is reserved for inter-broker communication and granting it to operator tooling violates the principle of least privilege. |
Security Transactions | Accepted | Nikhil Ramakrishnan | 2024-04-11 | 2024-04-30 | KAFKA-16513 | 3.8 | |
| 1036 | Extend RecordDeserializationException exception TLDR: Extends `RecordDeserializationException` to carry the raw key bytes, value bytes, headers, timestamp, and timestamp type of the failed record, plus a `DeserializationExceptionOrigin` enum indicating whether the key or value caused the failure. KIP-334 added offset information to this exception for skip-and-continue semantics, but without access to the raw record bytes, implementing a dead-letter queue or diagnostic logging of poison-pill records requires a separate consumer re-fetch. |
Consumer | Accepted | Damien Gasparina | 2024-04-10 | 2024-05-22 | KAFKA-16507 | 3.8 | |
| 1035 | StateStore managed changelog offsets TLDR: Allows Kafka Streams StateStore implementations to manage their own changelog topic offsets rather than delegating that responsibility to the engine's per-task .checkpoint file. The current centralized checkpoint file creates a tight coupling between the Streams engine and every store, preventing stores (e.g., RocksDB with remote storage) from atomically persisting their data and the corresponding changelog offset together. |
Streams | Accepted | Nicholas Telford | 2024-04-07 | 2026-02-21 | KAFKA-17411 | 4.3 | |
| 1034 | Dead letter queue in Kafka Streams TLDR: Introduces a dead-letter queue (DLQ) mechanism for Kafka Streams, routing records that trigger processing exceptions to a configurable Kafka topic instead of failing the stream or silently dropping them. The existing exception handlers offer only two choices—fail the application or skip the record—neither of which is suitable for production workloads where faulty records should be isolated for inspection and reprocessing. |
Streams | Discussion | Damien Gasparina | 2024-04-04 | 2026-02-24 | KAFKA-16505 | 4.2 | |
| 1033 | Add Kafka Streams exception handler for exceptions occurring during processing TLDR: KIP-1033 adds a `ProcessingExceptionHandler` interface to Kafka Streams (configured via `processing.exception.handler`) that intercepts uncaught exceptions thrown during `process()` calls in DSL or Processor API topologies, with built-in `LogAndFail` (default) and `LogAndContinue` implementations. Previously, unhandled processing exceptions terminated the `StreamThread`, requiring developers to wrap all processing logic in try-catch blocks—an error-prone approach that does not scale across large topologies. |
Streams | Accepted | Damien Gasparina | 2024-03-29 | 2024-08-30 | KAFKA-16448 | 3.9 | |
| 1032 | Upgrade to Jakarta and JavaEE 10 in Kafka 4.0 TLDR: Upgrades Kafka's REST API layer (Connect, broker REST endpoints) from legacy javax.* (JavaEE) packages to jakarta.* (Jakarta EE 10), enabling compatibility with Jetty 12, Spring 6, and other modern frameworks that have dropped javax support. Kafka 4.0 is a major version bump and an appropriate milestone to make this breaking package namespace change; continuing to use javax blocked dependency upgrades that delivered security fixes. |
Connect Broker | Accepted | Christopher L. Shannon | 2024-03-27 | 2024-07-22 | KAFKA-16437 | 4.0 | |
| 1031 | Control offset translation in MirrorSourceConnector TLDR: Adds a configuration flag `emit.offset-syncs.enabled` to `MirrorSourceConnector` to let operators disable the writing of offset translation records to the `mm2-offset-syncs` internal topic during mirroring. This is useful when MirrorCheckpointConnector is not deployed and offset sync overhead is unnecessary. |
MirrorMaker | Accepted | Omnia Ibrahim | 2024-03-14 | 2024-04-24 | KAFKA-16254 | 3.9 | |
| 1030 | Change constraints and default values for various configurations TLDR: KIP-1030 revises the constraints and default values for multiple broker and topic configurations in the Kafka 4.0 and 5.0 releases, such as tightening the minimum value for `segment.ms` and updating defaults to reflect modern operational best practices. The 4.0 major release provides a compatibility break point to fix defaults and constraints that have been suboptimal but could not be changed in minor releases. |
Broker Admin | Accepted | Divij Vaidya | 2024-03-13 | 2025-02-05 | KAFKA-16368 | 4.0 | |
| 1028 | Docker Official Image for Apache Kafka TLDR: Publishes a Docker Official Image (DOI) for Apache Kafka under Docker Hub's `kafka` namespace, complementing the ASF-sponsored `apache/kafka` image introduced by KIP-975. This gives users a Docker-community-maintained, standards-compliant image alongside the existing ASF one. |
Broker | Accepted | Vedarth Sharma | 2024-03-12 | 2024-05-17 | KAFKA-16373 | 3.8 | |
| 1027 | Add MockFixedKeyProcessorContext and TestFixedKeyRecordFactory TLDR: Adds `MockFixedKeyProcessorContext` and `TestFixedKeyRecordFactory` to Kafka Streams test-utils to enable unit testing of `FixedKeyProcessor` topologies introduced in KIP-478/KIP-820. The existing `MockProcessorContext` only supports the general `Processor` API; `FixedKeyProcessor` uses a different context type (`FixedKeyProcessorContext`) which had no test-utils analog, making unit tests for fixed-key topologies impossible without an integration test setup. |
Streams Testing | Discussion | Matthias J. Sax | 2024-03-09 | 2024-07-07 | KAFKA-15143 | ||
| 1025 | Optionally URL-encode clientID and clientSecret in authorization header TLDR: Adds a sasl.oauthbearer.header.urlencode client configuration to opt into URL-encoding of clientID and clientSecret when constructing the Basic Authorization header for OIDC token endpoint requests. RFC 6749 §2.3.1 mandates URL-encoding these credentials in the Authorization header, but the KIP-768 implementation omitted this encoding, risking authentication failures with strict OIDC providers. |
Security | Accepted | Nelson B. | 2024-03-06 | 2024-07-01 | KAFKA-16345 | 3.9 | |
| 1023 | Follower fetch from tiered offset TLDR: Enables follower replicas to start fetching from the tiered storage offset (the remote log start offset) rather than always from local log start, when rebuilding auxiliary state (leader epoch cache, producer snapshots) from the leader. Without this, followers on tiered-storage-enabled topics redundantly fetch local segments that overlap with remote storage, slowing restoration. |
Tiered Storage Broker | Accepted | Abhijeet Kumar | 2024-02-26 | 2024-04-29 | KAFKA-15433 | 4.2 | ListOffsets v11 |
| 1022 | Formatting and Updating Features TLDR: Extends the kafka-storage format and kafka-features upgrade tools to accept feature-specific flags (--feature transaction.version=X, --feature group.version=X, etc.) and introduces Transaction Version and Group Version as named features with their own upgrade lifecycle. Feature initialization was limited to metadata version, making it impossible to set or upgrade other feature levels independently, blocking the phased rollout of EOS v2 (KIP-890) and new consumer group protocol (KIP-848). |
KRaft Admin | Accepted | Justine Olshan | 2024-02-26 | 2025-03-06 | KAFKA-16308 | 4.0 | ApiVersions UpdateFeatures BrokerRegistration |
| 1021 | Allow to get last stable offset (LSO) in kafka-get-offsets.sh TLDR: Adds an --isolation flag to kafka-get-offsets.sh and extends OffsetSpec with an LSOSpec so that the last stable offset (LSO) can be fetched instead of the high watermark. Without this, the tool always uses IsolationLevel.READ_UNCOMMITTED, making it impossible to observe the true consumer-visible offset boundary for transactional topics. |
Admin Consumer | Discussion | Ahmed Sobeh | 2024-02-24 | 2024-03-20 | KAFKA-16282 | ||
| 1019 | Expose method to determine Metric Measurability TLDR: Adds a public `isMeasurable()` method to the `KafkaMetric` interface so callers can check measurability without relying on exception handling or Java reflection. Existing code either caught `ClassCastException` from `metric.measurableValue()` or used reflection to inspect the internal value provider, both of which are fragile and expensive. |
Metrics | Accepted | Apoorv Mittal | 2024-02-14 | 2024-04-19 | KAFKA-16280 | 3.8 | |
| 1018 | Introduce max remote fetch timeout config for DelayedRemoteFetch requests TLDR: Adds a dedicated remote.fetch.max.wait.ms broker configuration for the DelayedRemoteFetchPurgatory timeout used when fetching from tiered (remote) storage, independent of the consumer's fetch.max.wait.ms. The shared fetch.max.wait.ms default (500ms) was too short for remote storage fetches during degraded conditions, causing excessive purgatory expiration and retry storms against the remote storage layer. |
Tiered Storage Broker | Accepted | Kamal Chandraprakash | 2024-01-30 | 2024-06-05 | KAFKA-15776 | 3.8 | |
| 1017 | Health check endpoint for Kafka Connect TLDR: KIP-1017 adds a `/health` REST endpoint to Kafka Connect workers that returns HTTP 200 only when the worker has fully completed startup (internal topic creation, catch-up read, cluster join) and is ready to handle requests. Kubernetes readiness probes and load balancers have no reliable way to determine whether a Connect worker has finished its multi-step startup sequence before routing traffic to it. |
Connect | Accepted | Chris Egerton | 2024-01-25 | 2024-06-20 | KAFKA-10816 | 3.9 | |
| 1016 | Make MM2 heartbeats topic name configurable TLDR: Makes the MirrorMaker2 heartbeats topic name configurable via a new default.replication.policy.heartbeats.topic.name MirrorConnectorConfig property, defaulting to the existing heartbeats value. The hard-coded heartbeats topic name in DefaultReplicationPolicy caused naming collisions when multiple MM2 instances or other systems used the same topic name in the same cluster. |
MirrorMaker | Discussion | Bertalan Kondrat | 2024-01-16 | 2024-07-31 | KAFKA-15992 | |
| 1015 | Limit number of ssl connections in brokers TLDR: Adds a new `max.connections.ssl` broker and listener configuration to cap the number of active TLS connections independently of plaintext connections. An SSL connection consumes ~100KB of memory vs ~250 bytes for plaintext, so the existing unified `max.connections` limit does not prevent OOM from TLS-heavy workloads. |
Security Broker | Discussion | Jimmy Wang | 2024-01-06 | 2024-01-08 | KAFKA-16081 | ||
| 1014 | Managing Unstable Features in Apache Kafka TLDR: KIP-1014 introduces an "unstable feature" framework for Apache Kafka where in-progress features are gated behind a feature flag and isolated from production traffic until fully complete and stabilized. Large features developed across many PRs produce a cluster in a partially implemented state where incompatible RPC or metadata record changes can break mixed-version clusters or expose unfinished functionality to users. |
Broker | Discussion | Proven Provenzano | 2024-01-04 | 2024-06-26 | KAFKA-15922 | 3.7 | Fetch |
| 1012 | The need for a Kafka 3.8 and 3.9 release TLDR: Proposes releasing Kafka 3.8 and 3.9 as minor releases before the 4.0 major release to allow time to complete KRaft feature parity (e.g., JBOD support) that the community depends on before removing ZooKeeper entirely. Without these intermediary releases, operators would be forced to upgrade to 4.0 before critical features are ready. |
Broker | Accepted | Josep Prat | 2023-12-22 | 2024-07-30 | |||
| 1011 | Use incrementalAlterConfigs when updating broker configs by kafka-configs.sh TLDR: Switches `kafka-configs.sh` to use `IncrementalAlterConfigs` instead of the deprecated `AlterConfigs` RPC when modifying broker configs, ensuring per-key updates rather than full config replacement. The old `AlterConfigs` semantics require sending all configs atomically, risking accidental resets of configs not included in the request. |
Admin Protocol | Accepted | Deng Ziming | 2023-12-18 | 2024-12-02 | KAFKA-16181 | 4.0 | |
| 1010 | Topic Partition Quota TLDR: Introduces per-topic-partition bandwidth quotas as a new quota entity type, complementing the existing per-user/per-client quotas from KIP-13. When multiple partitions of a high-throughput topic land on the same broker, per-client quotas cannot cap the aggregate write rate to that topic across clients, allowing a topic to saturate broker I/O regardless of how the quota is configured. |
Broker Admin | Discussion | Afshin Moazami | 2023-12-11 | 2023-12-21 | KAFKA-16042 | Produce Fetch | |
| 1009 | Add Broker-level Throttle Configurations TLDR: Adds broker-level throttle configurations (`replication.quota.throttled.rate`) that apply globally across all partitions undergoing replication catch-up or reassignment, without requiring per-topic quota configuration. Without global replication throttles, large-scale rebalances or broker recoveries saturate inter-broker bandwidth and spike producer latency. |
Broker Admin | Discussion | Ria Pradeep | 2023-11-21 | 2023-11-22 | KAFKA-7983 | ||
| 1008 | ParKa - the Marriage of Parquet and Kafka TLDR: Proposes a `ParKa` format that serializes Kafka record batches in Apache Parquet columnar format rather than row-by-row Avro/JSON/Protobuf. Columnar encoding yields dramatically better compression ratios for large batches and enables predicate pushdown for consumers reading from tiered storage. |
Protocol Broker | Discussion | Xinli Shang | 2023-11-21 | 2023-12-02 | |||
| 1006 | Remove SecurityManager Support TLDR: Removes Kafka's compile-time and runtime dependencies on the Java `SecurityManager` API (including `AccessController`) from clients, core, and connect-runtime modules. JEP-411 deprecated `SecurityManager` in Java 17 and Java 23 already throws `UnsupportedOperationException` for legacy APIs when no `SecurityManager` is configured, making these calls a forward-compatibility hazard. |
Broker Security | Discussion | Greg Harris | 2023-11-20 | 2024-09-26 | KAFKA-15862 | ||
| 1005 | Expose EarliestLocalOffset and TieredOffset TLDR: Exposes the `EARLIEST_LOCAL_OFFSET` (`-4`) and `TIERED_OFFSET` special offset values of the `ListOffsets` API in the `kafka-get-offsets` CLI tool, and makes `TieredOffset` an accessible public constant via `UnifiedLog`. KIP-405's tiered storage introduced a local log start offset exposed by the `ListOffsets` API, but the `kafka-get-offsets` tool had no way to request it, and the offset constant was not publicly accessible. |
Tiered Storage Admin | Accepted | Christo Lolov | 2023-11-20 | 2024-01-17 | KAFKA-15857 | 3.9 | ListOffsets v9 |
| 1004 | Enforce tasks.max property in Kafka Connect TLDR: Enforces the `tasks.max` connector configuration so that a connector can never run more tasks than the configured upper bound, even if its `taskConfigs()` method returns more entries than allowed. Kafka Connect previously passed all generated task configs to the connector regardless of `tasks.max`, allowing connectors to silently exceed their resource quota. |
Connect | Accepted | Chris Egerton | 2023-11-11 | 2024-01-11 | KAFKA-15575 | 3.8 | |
| 1003 | Signal next segment when remote fetching TLDR: Extends RemoteStorageManager#fetchLogSegment with an optional nextRemoteLogSegmentMetadata parameter, allowing RSM implementations to begin pre-fetching the next segment before the current one is fully consumed. Without segment boundary signaling, RSM pre-fetching was limited to within the current segment, causing latency spikes at segment transitions during sequential remote reads. |
Tiered Storage | Discussion | Jorge Esteban Quilcate Otoya | 2023-11-10 | 2023-11-10 | KAFKA-15806 | ||
| 1002 | Fetch remote segment indexes at once TLDR: Adds a `RemoteStorageManager.fetchLogSegmentData(metadata, indexes)` batch API to retrieve all segment indexes (offset, time, transaction, leader epoch) in a single remote storage call alongside the log data. Currently each index type is fetched with a separate `fetchIndex()` call, multiplying remote storage round-trips per consumer fetch. |
Tiered Storage | Discussion | Jorge Esteban Quilcate Otoya | 2023-11-10 | 2023-11-13 | KAFKA-15805 | ||
| 1001 | Add CurrentControllerId Metric TLDR: Adds a `CurrentControllerId` metric exposed on every broker and controller node that reports the node ID of the active KRaft (or ZooKeeper) controller, or -1 if unknown. Identifying the current active controller requires either checking each node's `ActiveControllerCount` metric individually or parsing metadata logs; there is no single metric queryable from any node that identifies which node is the controller. |
KRaft Metrics | Accepted | Colin McCabe | 2023-11-09 | 2023-12-06 | KAFKA-15980 | 3.7 | |
| 1000 | List Client Metrics Configuration Resources TLDR: Adds a `ListClientMetricsResources` admin API and corresponding `kafka-configs.sh` support to enumerate all client metrics subscription resources created by KIP-714. Client metrics resources are named configurations unassociated with any other Kafka entity, so without this there is no way to discover what subscriptions exist. |
Metrics Admin | Accepted | Andrew Schofield | 2023-11-06 | 2023-11-20 | KAFKA-15831 | 3.7 | |
| 999 | Server-side Consumer Lag Metrics TLDR: Proposes exposing server-side consumer lag metrics (committed offset vs. log end offset per group-partition) directly from brokers via the metrics framework. This eliminates the need for external lag monitoring tools that must independently query both consumer group offsets and partition end offsets. |
Consumer Metrics | Discussion | Qichao Chu | 2023-11-03 | 2023-11-03 | |||
| 997 | update WindowRangeQuery and unify WindowKeyQuery and WindowRangeQuery TLDR: Extends `WindowRangeQuery` in the Kafka Streams Interactive Queries API to support dual-key range queries (`keyFrom`, `keyTo`, `timeFrom`, `timeTo`), unifying it with `WindowKeyQuery`. The existing `WindowRangeQuery.fetchAll(timeFrom, timeTo)` fetches all keys in a time window but cannot restrict the key range, forcing clients to post-filter. |
Streams | Discussion | Pengcheng Zheng | 2023-11-01 | 2023-12-04 | KAFKA-15795 | ||
| 996 | Pre-Vote TLDR: Implements Pre-Vote in KRaft: a partitioned controller first canvasses the cluster to check whether it would win a majority vote before incrementing its epoch and sending a real VoteRequest. Without Pre-Vote, a follower partitioned from the leader continuously increments its epoch and sends disruptive VoteRequests upon reconnection, forcing unnecessary leader elections and causing flip-flopping leadership in certain network partition scenarios. |
KRaft | Discussion | Alyssa Huang | 2023-10-13 | 2025-01-02 | KAFKA-16164 | 4.0 | Vote |
| 995 | Allow users to specify initial offsets while creating connectors TLDR: Extends the POST /connectors REST endpoint with an optional initial_offsets field, allowing connector creation and offset initialization in a single atomic request. Previously, setting initial offsets required a three-step sequence: create in STOPPED state, PATCH offsets, then set RUNNING. |
Connect | Accepted | Ashwin Pankaj | 2023-10-05 | 2024-03-11 | KAFKA-15976 | ||
| 994 | Minor Enhancements to ListTransactions and DescribeTransactions APIs TLDR: Enhances `ListTransactions` and `DescribeTransactions` APIs with new filter options (filter by producer ID, duration threshold) and adds a `lastTransactionStartTimeMs` field to transaction metadata. The existing tooling introduced by KIP-664 lacked the ability to narrow responses by producer or flag suspiciously long-running transactions. |
Transactions Admin | Accepted | Raman Verma | 2023-10-31 | 2024-01-17 | KAFKA-15923 | 3.8 | ListTransactions v1 DescribeTransactions |
| 993 | Allow restricting files accessed by File and Directory ConfigProviders TLDR: Adds an allowed.paths configuration to FileConfigProvider and DirectoryConfigProvider, restricting which filesystem paths those providers may read. In security-sensitive Connect deployments, unrestricted file access allows any connector configuration to read arbitrary files on the worker host. |
Admin Security | Accepted | Gantigmaa Selenge | 2023-10-24 | 2023-12-19 | KAFKA-14822 | 3.8 | |
| 992 | Proposal to introduce IQv2 Query Types: TimestampedKeyQuery and TimestampedRangeQuery TLDR: Introduces TimestampedKeyQuery and TimestampedRangeQuery as explicit IQv2 query types for timestamped key-value stores, and fixes KeyQuery/RangeQuery to always return plain V from both kv-store and ts-kv-store. Previously, KeyQuery returned ValueAndTimestamp<V> against a ts-kv-store, creating type-safety issues and unintuitive API behavior. |
Streams | Discussion | Pengcheng Zheng | 2023-10-17 | 2023-12-07 | KAFKA-15629 | 3.7 | |
| 989 | Improved StateStore Iterator metrics for detecting leaks TLDR: Adds num-open-iterators and thread-open-iterators metrics per StateStore in Kafka Streams to expose the count of currently open (potentially leaked) store iterators. Iterator leaks on RocksDB stores pin in-memory blocks and cause memory growth, but the existing block-cache-pinned-usage metric reports shared block cache usage without identifying which store is responsible. |
Streams Metrics | Accepted | Nicholas Telford | 2023-10-04 | 2024-05-16 | KAFKA-15541 | 3.8 | |
| 988 | Streams Standby Task Update Listener TLDR: Adds a `StandbyUpdateListener` interface to Kafka Streams that fires callbacks when standby tasks begin, complete, or make progress restoring state from changelog topics. Currently, `StateRestoreListener` only covers active task restoration, leaving standby task state opaque to operators. |
Streams | Accepted | Colt McNealy | 2023-10-03 | 2023-12-04 | KAFKA-15448 | 3.7 | |
| 987 | Connect Static Assignments TLDR: Proposes adding static (manual) task assignment to Kafka Connect's distributed mode, allowing operators to pin connectors/tasks to specific worker nodes instead of relying on the existing rebalancing algorithms. The current eager and cooperative scheduling algorithms rebalance all tasks on any cluster membership change, causing unnecessary task restarts when operators only need to change a single task's placement. |
Connect | Discussion | Greg Harris | 2023-10-03 | 2023-10-18 | KAFKA-15559 | ||
| 986 | Cross-Cluster Replication TLDR: Proposes building cross-cluster replication natively inside the Kafka broker (as opposed to external MirrorMaker processes), using a new internal replication protocol and management APIs. MirrorMaker 1 and 2 run as separate processes that can independently fail, require separate deployment, and cannot leverage broker-internal optimizations (zero-copy transfers, direct partition assignment knowledge). |
MirrorMaker Broker | Discussion | Greg Harris | 2023-10-02 | 2024-03-14 | KAFKA-15528 | ||
| 985 | Add reverseRange and reverseAll query over kv-store in IQv2 TLDR: Adds withDescendingKeys() and withAscendingKeys() methods to RangeQuery and introduces the ResultOrder enum, enabling reverse-ordered range and all-keys queries against RocksDB and InMemoryKeyValueStore (both of which maintain sorted key order). IQv2 previously had no mechanism to retrieve results in descending key order. |
Streams | Accepted | Pengcheng Zheng | 2023-10-02 | 2023-12-07 | KAFKA-15527 | 3.7 | |
| 984 | Add pluggable compression interface to Kafka TLDR: Adds a pluggable compression interface to Kafka by reserving 4 additional bits in the record batch attributes for a compression plugin ID, allowing third-party compression codecs to be loaded at runtime without modifying or rebuilding Kafka. Currently adding a new compression algorithm requires forking Kafka source and recompiling the distribution. |
Broker Producer | Discussion | Assane Diop | 2023-10-02 | 2024-04-24 | |||
| 983 | Full speed async processing during rebalance TLDR: Proposes a new Java consumer API mode where partitions not being revoked during a cooperative rebalance continue processing at full speed while the application asynchronously drains revoked partitions before committing. The current asynchronous processing pattern blocks the entire poll loop during rebalance to prevent offset commit races, stalling throughput on all partitions, even those unaffected by the rebalance. |
Consumer | Discussion | Erik van Oosten | 2023-09-23 | 2023-10-14 | |||
| 982 | Access SslPrincipalMapper and kerberosShortNamer in Custom KafkaPrincipalBuilder TLDR: Injects `SslPrincipalMapper` and `KerberosShortNamer` into custom `KafkaPrincipalBuilder` implementations via a new `configure()` callback with broker security configs. Currently these mappers are null inside custom principal builders, preventing them from applying the SSL principal mapping rules configured on the broker. |
Security | Discussion | Raghu Baddam | 2023-09-18 | 2023-10-19 | KAFKA-15452 | ||
| 981 | Manage Connect topics with custom implementation of Admin TLDR: Extends Kafka Connect's topic management to accept a custom Admin implementation (via ForwardingAdmin), routing topic creation through organization-managed resource management systems instead of calling AdminClient directly. Organizations that enforce centralized capacity controls cannot allow Connect to bypass those systems by creating topics with direct AdminClient calls. |
Connect Admin | Discussion | Omnia Ibrahim | 2023-09-19 | 2023-09-19 | KAFKA-15478 | ||
| 980 | Allow creating connectors in a stopped state TLDR: Adds a `stopped` initial state option to the Kafka Connect `POST /connectors` REST endpoint so newly created connectors start in the `STOPPED` state rather than immediately running. This enables zero-downtime connector migration between Connect clusters by creating the connector on the target cluster before starting it. |
Connect | Accepted | Yash Mayya | 2023-09-15 | 2023-10-16 | KAFKA-15470 | 3.7 | |
| 979 | Allow independently stop KRaft processes TLDR: Adds separate PID files and stop scripts for KRaft controller and broker processes so they can be independently stopped when co-located on the same machine. Previously a single `kafka-server-stop.sh` script signaled all Kafka processes indiscriminately, making it impossible to stop just the controller without killing the co-located broker. |
KRaft Admin | Accepted | Hailey Ni | 2023-08-22 | 2023-11-17 | KAFKA-15471 | 3.7 | |
| 977 | Partition-level Throughput Metrics TLDR: Adds per-partition metrics for `MessagesInPerSec`, `BytesInPerSec`, and `BytesOutPerSec` alongside existing topic-level metrics. Topic-level throughput metrics cannot reveal partition hot-spots, making it impossible to detect key skew or diagnose which specific partition is causing ISR shrinkage or broker overload. |
Metrics Broker | Accepted | Qichao Chu | 2023-09-09 | 2024-01-14 | KAFKA-15447 | Produce Fetch | |
| 976 | Cluster-wide dynamic log adjustment for Kafka Connect TLDR: Adds a scope=cluster query parameter to PUT /admin/loggers/{logger} in Kafka Connect's REST API, broadcasting log-level changes to every worker in the cluster in a single request. The existing per-worker API required one request per worker, making cluster-wide log-level changes during debugging cumbersome and impossible through a load balancer. |
Connect Admin | Accepted | Chris Egerton | 2023-09-01 | 2023-10-12 | KAFKA-15428 | 3.7 | |
| 975 | Docker Image for Apache Kafka TLDR: Publishes an official Apache Kafka Docker image to Docker Hub under the ASF Docker profile, targeting JVM-based Kafka. Apache Kafka had no official Docker image, leaving users to rely on third-party images with inconsistent configurations and no ASF endorsement. |
Broker | Accepted | Krishna Agarwal | 2023-09-01 | 2024-05-13 | KAFKA-15445 | 3.7 | |
| 974 | Docker Image for GraalVM based Native Kafka Broker TLDR: Introduces an experimental Docker image that packages a GraalVM ahead-of-time compiled Kafka broker binary, achieving sub-second startup (~110–140 ms) and ~250–540 MB RAM usage versus ~1150 ms and ~1 GB for the JVM image. Long JVM startup times and memory overhead slow down developer iteration cycles involving ephemeral broker instances in unit tests. |
Broker | Accepted | Krishna Agarwal | 2023-08-31 | 2024-03-13 | KAFKA-15444 | 3.8 | |
| 973 | Expose per topic replication rate metrics TLDR: Corrects ReplicationBytesInPerSec and ReplicationBytesOutPerSec MBeans to report per-topic metrics (adding a topic tag) rather than only aggregate broker-level metrics, aligning implementation with documentation. Tools like Cruise Control assumed these metrics were per-topic but the implementation only tracked broker-level aggregates, producing incorrect workload models. |
Metrics Broker | Discussion | Nelson B. | 2023-08-30 | 2023-08-31 | |||
| 972 | Add the metric of the current running version of kafka TLDR: Adds version and commitId as tags on the existing kafka.server:type=app-info metric's start-time-ms gauge, so Prometheus and similar numeric-metric systems can extract the running broker version via label queries. The existing app-info metric exposed version as a string attribute, which Prometheus cannot scrape as a numeric time series. |
Metrics Broker | Discussion | hudeqi | 2023-08-28 | 2023-10-20 | KAFKA-15396 | ||
| 971 | Expose replication-record-lag MirrorMaker2 metric TLDR: Adds replication-record-lag metrics (max/min/avg) to MirrorMaker2's MirrorSourceTask, computed as the difference between the source partition's LEO and the last replicated offset (LRO). No built-in metric existed to directly quantify how many records remained to be replicated, forcing operators to infer lag from byte-lag proxies. |
MirrorMaker Metrics | Discussion | Elkhan Eminov | 2023-08-13 | 2024-02-20 | KAFKA-14112 | ||
| 970 | Deprecate and remove Connect's redundant task configurations endpoint TLDR: Deprecates GET /connectors/{connector}/tasks-config (added by KIP-661) in Kafka 3.7 and removes it in 4.0, as GET /connectors/{connector}/tasks already returns task configurations. Two endpoints exposing functionally identical data creates confusion and unnecessary API surface. |
Connect Protocol | Accepted | Yash Mayya | 2023-08-21 | 2023-09-08 | KAFKA-15387 | 4.0 | |
| 969 | Support range Interactive Queries (IQv2) for Versioned State Stores TLDR: Introduces MultiVersionedRangeQuery for IQv2, supporting key-range queries (bounded and unbounded) against versioned KeyValueStores with optional timestamp bounds. The original IQv2 range query types only targeted the latest value and could not retrieve historical versions within a key range from versioned stores. |
Streams | Discussion | Alieh Saeedi | 2023-08-15 | 2023-12-11 | KAFKA-15348 | ||
| 968 | Support single-key_multi-timestamp Interactive Queries (IQv2) for Versioned State Stores TLDR: KIP-968 adds a `MultiVersionedKeyQuery<K, V>` class and `VersionedRecordIterator<V>` interface to Kafka Streams IQv2, enabling single-key queries against versioned state stores that return all versions within a time range (with optional upper/lower bounds and result ordering). KIP-960 added single-timestamp point lookups; this KIP extends IQv2 to support range-over-time queries including tombstone representation via a `validTo` field on `VersionedRecord`. |
Streams | Accepted | Alieh Saeedi | 2023-08-15 | 2023-11-21 | KAFKA-15347 | 3.7 | |
| 966 | Eligible Leader Replicas TLDR: Introduces Eligible Leader Replicas (ELR), a per-partition set of replicas that lost data in an unclean shutdown but are still eligible for leader election under KRaft, allowing a partition to tolerate up to min.insync.replicas − 1 data-loss unclean shutdowns. The last-replica-standing scenario allowed a single replica that suffered an unclean shutdown and lost committed data to be re-elected as leader, causing surviving replicas to truncate their logs and produce cluster-wide committed data loss. |
Broker | Accepted | Calvin Liu | 2023-08-10 | 2025-08-01 | KAFKA-15332 | 3.7 | BrokerRegistration v3/v2 ElectLeaders DescribeTopicPartitions |
| 965 | Support disaster recovery between clusters by MirrorMaker TLDR: Adds sync.full.acl.enabled to MirrorMaker2 configuration, enabling full replication of topic read/write ACLs, consumer group ACLs, and user SCRAM credentials to the target cluster for disaster recovery standby scenarios. By default MM2 downgrades ALLOW ALL ACLs to ALLOW READ to prevent writes to the mirror, which breaks transparent failover when the target cluster needs to become the active cluster. |
MirrorMaker Security | Discussion | hudeqi | 2023-08-08 | 2023-08-09 | KAFKA-15172 | ||
| 963 | Additional metrics in Tiered Storage TLDR: Adds tiered storage operational metrics covering upload rate, deletion rate, segment copy latency, auxiliary state build time, and remote fetch latency to the existing `RemoteStorageMetrics`. Without these, operators cannot tell whether the remote log manager is keeping pace with log growth or falling behind. |
Tiered Storage Metrics | Accepted | Christo Lolov | 2023-08-03 | 2023-12-18 | KAFKA-15147 | 3.7 | |
| 962 | Relax non-null key requirement in Kafka Streams TLDR: Changes Kafka Streams left-join and outer-join operators (KStream-KStream, KStream-KTable, KStream-GlobalTable, KTable-KTable foreign-key) to no longer drop records with null keys, instead passing null to ValueJoiner for the absent side. The existing strict null-key rejection was inconsistent with the defined left/outer join semantics of retaining the record when no match is found. |
Streams | Accepted | Florin Akermann | 2023-07-29 | 2023-12-09 | KAFKA-12317 | 3.7 | |
| 960 | Support single-key_single-timestamp Interactive Queries (IQv2) for Versioned State Stores TLDR: KIP-960 adds a `VersionedKeyQuery<K, V>` class to Kafka Streams IQv2 supporting single-key lookups against versioned state stores (`VersionedKeyValueStore`) with an optional `asOf(Instant)` timestamp to retrieve the value valid at a specific point in time. Versioned state stores (introduced in KIP-889) expose time-varying key-value state but lacked an interactive query interface, making the stored temporal history inaccessible from outside the topology. |
Streams | Accepted | Alieh Saeedi | 2023-07-26 | 2023-10-24 | KAFKA-15346 | 3.7 | |
| 959 | Add BooleanConverter to Kafka Connect TLDR: Adds a `BooleanConverter` class to the `org.apache.kafka.connect.converters` package that serializes/deserializes `BOOLEAN` schema types for Kafka Connect. The Boolean SerDe added in KIP-907 was only available for raw Kafka clients, leaving Connect without a native boolean converter. |
Connect | Accepted | Hector Geraldino | 2023-07-25 | 2023-10-17 | KAFKA-15248 | 3.7 | |
| 956 | Tiered Storage Quotas TLDR: Introduces configurable bandwidth quotas for the Remote Log Manager (RLM) upload and fetch paths in Tiered Storage to bound the CPU and network impact of segment uploads. When Tiered Storage is enabled on existing large topics, the RLM schedules concurrent upload tasks for all eligible segments simultaneously, consuming significant CPU and degrading producer latencies. |
Tiered Storage | Accepted | Abhijeet Kumar | 2023-07-21 | 2024-08-11 | KAFKA-15265 | 3.9 | |
| 954 | expand default DSL store configuration to custom types TLDR: Extends the default.dsl.store StreamsConfig to cover custom state store types and stream-stream joins (both windowed and the internal KeyValueStore used for certain join types), which were previously hardcoded to RocksDB or in-memory stores. Without this, there was no way to use a fully custom store implementation across all DSL operators, especially the additional KeyValueStore in certain stream-stream join topologies. |
Streams | Accepted | A. Sophie Blee-Goldman | 2023-07-19 | 2023-12-22 | KAFKA-14976 | 3.7 | |
| 952 | Regenerate segment-aligned producer snapshots when upgrading to a Kafka version supporting Tiered Storage TLDR: KIP-952 changes the Remote Log Manager to retroactively generate missing producer snapshot files when uploading a log segment to remote storage if the segment predates Kafka 2.8.0 (which introduced segment-aligned snapshot files). Users upgrading from pre-2.8.0 Kafka and enabling Tiered Storage immediately encounter `NullPointerException`s because old segments lack snapshot files that the RSM API contract requires, blocking all remote log uploads. |
Tiered Storage | Discussion | Christo Lolov | 2023-07-14 | 2023-07-17 | KAFKA-15195 | ||
| 951 | Leader discovery optimizations for the client TLDR: Embeds new leader information (leader ID, epoch, host/port) directly in ProduceResponse and FetchResponse error payloads when a broker knows the current leader, eliminating or reducing the need for a separate Metadata RPC round-trip after a leadership change. The existing leader discovery path required an asynchronous Metadata refresh plus retry backoff before the client could resume producing or fetching, significantly increasing end-to-end latency during leader elections and cluster rolls. |
Protocol Client | Accepted | Mayank Shekhar Narula | 2023-07-13 | 2025-07-21 | KAFKA-15868 | 3.7 | Produce v10 Fetch v16 Metadata |
| 950 | Tiered Storage Disablement TLDR: Adds the ability to disable Tiered Storage on a per-topic basis after it has been enabled, including draining remote segments back to local storage before disabling. KIP-405 introduced Tiered Storage enablement but provided no mechanism to reverse it, leaving operators without a path to recover topics from tiered storage or roll back the feature. |
Tiered Storage | Accepted | Mehari Beyene | 2023-07-11 | 2024-08-16 | KAFKA-15132 | 3.9 | |
| 949 | Add flag to enable the usage of topic separator in MM2 DefaultReplicationPolicy TLDR: Adds replication.policy.internal.topic.separator.enabled (default true) to DefaultReplicationPolicy, allowing users to opt out of using replication.policy.separator for naming internal offset sync and checkpoint topics. KIP-690 introduced the separator into internal topic naming, breaking backward compatibility for users who upgraded and had existing internal topics named with the old convention. |
MirrorMaker | Accepted | Omnia Ibrahim | 2023-07-07 | 2023-08-16 | KAFKA-15102 | 3.6 | |
| 948 | Allow custom prefix for internal topic names in Kafka Streams TLDR: Introduces an internal.topics.prefix StreamsConfig that prepends a custom string to all Kafka Streams internal topic names independently of application.id. Multi-tenant clusters often enforce a topic namespace prefix tied to team ACLs, but application.id serves double duty as both consumer group ID and topic prefix, preventing them from being set independently. |
Streams | Discussion | Igor Buzatović | 2023-07-04 | 2023-10-05 | |||
| 946 | Modify exceptions thrown by Consumer APIs TLDR: Proposes standardizing and documenting the exceptions thrown by `KafkaConsumer` public APIs to give callers clear, actionable error types. Currently consumer APIs throw a mix of undocumented runtime exceptions, making it difficult for application code to handle errors correctly. |
Consumer | Discussion | Kirk True | 2023-06-29 | 2023-06-29 | |||
| 945 | Update threading model for Consumer TLDR: Refactors the `KafkaConsumer` threading model by moving heartbeating and coordinator communication to a dedicated background thread while keeping application callbacks on the calling thread. The existing implementation mixes application-thread and background-thread execution in ways that make the code fragile, hard to reason about, and prone to subtle concurrency bugs. |
Consumer | Discussion | Kirk True | 2023-06-29 | 2023-07-12 | KAFKA-14246 | 3.8 | |
| 942 | Add Power(ppc64le) support TLDR: Adds CI/CD pipeline support and testing for Apache Kafka on IBM Power (ppc64le) architecture. Kafka had no official support or validated builds for ppc64le, which is widely used in banking and HPC industries running on IBM infrastructure. |
Broker | Accepted | Vaibhav Nazare | 2023-06-19 | 2024-03-22 | KAFKA-15062 | ||
| 941 | Range queries to accept null lower and upper bounds TLDR: Changes RangeQuery.withRange(K lower, K upper) to treat null bounds as absent (using Optional.ofNullable), so null lower/upper bounds implicitly widen to unbounded rather than throwing or using null as a literal key. Callers receiving nullable query parameters from HTTP clients had to write explicit null-checking logic before constructing a RangeQuery. |
Streams | Accepted | Lucia Cerchie | 2023-06-15 | 2023-08-08 | KAFKA-15126 | 3.7 | |
| 940 | Broker extension point for validating record contents at produce time TLDR: Proposes a broker-side RecordValidationPolicy extension point that intercepts records before append and can reject them with InvalidRecordException. There is currently no broker-enforced mechanism to prevent misconfigured clients from bypassing schema registry serializers and producing incorrectly formatted records. |
Broker | Discussion | Edoardo Comar | 2023-06-07 | 2023-07-14 | |||
| 939 | Support Participation in 2PC TLDR: Enables Kafka brokers to act as a participant (not just coordinator) in an external two-phase commit (2PC) protocol by adding a `PrepareTransaction` API phase before the existing `EndTransaction`. Without native 2PC participation support, applications needing atomic cross-system transactions (Kafka + external DB) must implement fragile custom recovery logic outside the broker. |
Transactions Broker | Accepted | Artem Livshits | 2023-06-02 | 2024-07-23 | KAFKA-15370 | 4.1 | InitProducerId v6 |
| 938 | Add more metrics for measuring KRaft performance TLDR: Adds a set of KRaft-specific metrics including TimedOutBrokerHeartbeatCount, EventQueueOperationsStartedCount, NewActiveControllersCount, CurrentMetadataVersion, HandleLoadSnapshotCount, and LatestSnapshotGeneratedBytes/AgeMs. KRaft mode lacked sufficient observability into controller health, quorum performance, and metadata snapshot lifecycle. |
KRaft Metrics | Accepted | Colin McCabe | 2023-06-01 | 2023-07-12 | KAFKA-15183 | 3.7 | |
| 937 | Improve Message Timestamp Validation TLDR: Adds a new log.message.timestamp.before.max.ms config (and optionally log.message.timestamp.after.max.ms) for brokers/topics to reject producer records whose CreateTime timestamp is too far in the past or future, returning INVALID_TIMESTAMP. The existing log.message.timestamp.difference.max.ms config was a symmetric bound that could not distinguish past vs. future violations, and its default of Long.MAX_VALUE allowed producers misconfigured with nanosecond timestamps to corrupt log retention behavior. |
Broker | Accepted | Mehari Beyene | 2023-05-30 | 2025-06-03 | KAFKA-14991 | 3.6 | |
| 936 | Throttle number of active PIDs TLDR: KIP-936 proposes a configurable cap on the maximum number of active producer IDs (PIDs) tracked by `ProducerStateManager` per broker, with throttling or rejection when the limit is reached. Since KIP-679 made idempotent producers the default, every producer instance is assigned a PID; a large number of short-lived or restarting producers can cause `ProducerStateManager` to accumulate unbounded in-memory state, risking OOM errors. |
Broker Transactions | Discussion | Omnia Ibrahim | 2023-05-23 | 2024-06-13 | KAFKA-15063 | Produce | |
| 935 | Extend AlterConfigPolicy with existing configurations TLDR: Extends the `AlterConfigPolicy` interface to receive both the proposed config changes and the current (existing) configuration values, so policies can validate incremental updates in context. After KIP-339 introduced `IncrementalAlterConfigs`, policy implementations only see the delta, not the full resulting configuration, making context-aware validation impossible. |
Admin Broker | Discussion | Jorge Esteban Quilcate Otoya | 2023-05-19 | 2023-07-25 | KAFKA-15014 | ||
| 934 | Add DeleteTopicPolicy TLDR: Introduces a `DeleteTopicPolicy` plugin interface (analogous to `CreateTopicPolicy` from KIP-108) that brokers invoke before executing a topic deletion to allow operators to veto or control the deletion. Without a deletion policy, operators have no server-side safeguard to prevent accidental deletion of critical internal topics (e.g., `__consumer_offsets`, Connect internal topics) even when coarse-grained ACLs grant delete permission. |
Broker Admin | Discussion | Jorge Esteban Quilcate Otoya | 2023-05-19 | 2023-06-06 | KAFKA-15013 | DeleteTopics | |
| 933 | Publish metrics when source connector fails to poll data TLDR: Adds source-record-poll-error-total and source-record-poll-error-rate metrics at task granularity in kafka.connect:type=task-error-metrics to track failures in SourceTask#poll(). No existing metric distinguished between failures during polling from the source system versus failures when producing to Kafka, forcing operators to rely on log inspection. |
Connect Metrics | Discussion | Ravindranath | 2023-05-20 | 2023-06-24 | KAFKA-14952 | ||
| 932 | Queues for Kafka TLDR: Introduces Share Groups to Kafka, providing queue semantics with cooperative multi-consumer consumption, per-message acknowledgment, delivery counts, and redelivery of unacknowledged records. Classic consumer groups couple parallelism to partition count and lack per-message acknowledgment, making them unsuitable for work-queue patterns where tasks are independent, consumers vary dynamically, and failed work must be redelivered. |
Consumer Broker | Accepted | Andrew Schofield | 2023-05-15 | 2026-01-26 | KAFKA-16092 | 3.9 | FindCoordinator v6 ShareGroupHeartbeat v0/v1 ShareGroupDescribe v0/v1 ShareFetch v0/v1 ShareAcknowledge v0/v1 WriteShareGroupState v0 ReadShareGroupStateSummary v0 DescribeShareGroupOffsets v0 ListGroups AlterShareGroupOffsets DeleteShareGroupOffsets DeleteShareGroupState InitializeShareGroupState ReadShareGroupState |
| 931 | Flag to ignore unused message attribute field TLDR: Proposes a batch-header attribute bit to optionally omit the per-message attributes field (currently always present but unused since message format v1) from records, reducing per-record overhead. Discarded because space savings do not outweigh message conversion costs and planned v3 format changes will address this more comprehensively. |
Protocol Broker | Discarded | Luke Chen | 2023-05-12 | 2023-05-17 | |||
| 930 | Rename ambiguous Tiered Storage Metrics TLDR: Renames tiered storage metrics `RemoteBytesIn`/`RemoteBytesOut` to `RemoteBytesRead`/`RemoteBytesWritten` to align with Kafka's standard `BytesIn`/`BytesOut` semantics (in=written-to, out=read-from). The original naming was inverted relative to broker convention, causing operator confusion in dashboards. |
Tiered Storage Metrics | Accepted | Satish Duggana | 2023-05-12 | 2023-09-25 | KAFKA-15236 | 3.6 | Fetch |
| 928 | Making Kafka resilient to log directories becoming full TLDR: Proposes making Kafka (KRaft) resilient to log directories becoming full by detecting the condition, fencing the affected partitions, and alerting operators rather than crashing. Full log directories currently cause uncontrolled broker behavior; controlled fencing prevents data loss and enables operators to take remediation actions. |
Broker | Discussion | Christo Lolov | 2023-05-11 | 2024-10-14 | |||
| 927 | Improve the kafka-metadata-quorum output TLDR: Adds a `--human-readable` flag to the `kafka-metadata-quorum` CLI tool that converts `LastFetchTimestamp` and `LastCaughtUpTimestamp` fields from Unix epoch milliseconds to human-readable relative delay strings (e.g., `366 ms ago`). Raw epoch timestamps in the quorum replication status output require manual conversion to assess voter lag. |
KRaft Admin | Accepted | Federico Valeri | 2023-05-10 | 2023-05-25 | KAFKA-14982 | 3.6 | |
| 926 | introducing acks=min.insync.replicas config TLDR: Proposes a new `acks=min.insync.replicas` producer configuration value that dynamically binds the acks requirement to the broker's current `min.insync.replicas` setting for the target topic. This eliminates the need to keep producer `acks` configuration in sync with per-topic `min.insync.replicas` values as they change. |
Producer Broker | Discussion | Luke Chen | 2023-05-09 | 2023-05-12 | |||
| 925 | Rack aware task assignment in Kafka Streams TLDR: Adds rack-aware task assignment to Kafka Streams' HighAvailabilityTaskAssignor by preferring to assign StreamTasks to Streams clients located in the same rack as the TopicPartition replicas they need to read. Without rack awareness, Streams tasks could be assigned to instances in a different availability zone than the partition leader, incurring cross-rack network costs and higher fetch latency. |
Streams | Accepted | Hao Li | 2023-05-08 | 2024-05-20 | KAFKA-15022 | 3.6 | |
| 924 | customizable task assignment for Streams TLDR: KIP-924 promotes the `task.assignor.class` configuration in Kafka Streams to a public API, allowing users to select between `HighAvailabilityTaskAssignor` and `StickyTaskAssignor` (or supply a custom implementation). KIP-441 introduced `HighAvailabilityTaskAssignor` as the new default but the existing `StickyTaskAssignor` remains preferable in some use cases, and the existing backdoor config was internal with no stability guarantee. |
Streams | Accepted | A. Sophie Blee-Goldman | 2023-05-31 | 2024-06-19 | KAFKA-15045 | 3.8 | |
| 923 | Add A Grace Period to Stream Table Join TLDR: Adds a gracePeriod parameter to Joined to buffer stream-side records in a stream-table join, allowing them to match the correct versioned table entry rather than the latest value at arrival time. Without buffering, stream records that arrive slightly before a table update join with the previous version, producing incorrect results even when a versioned table is used. |
Streams | Accepted | Walker Carlson | 2023-04-25 | 2023-06-06 | KAFKA-14936 | 3.7 | |
| 922 | Add the traffic metric of the partition dimension TLDR: Adds per-partition metrics for `MessagesInPerSec`, `BytesInPerSec`, and `BytesOutPerSec` at the broker level. Topic-level aggregates mask partition-level traffic skew, so detecting hot spots or diagnosing ISR instability requires external tooling that cross-references partition leaders with topic metrics. |
Metrics Broker | Discussion | hudeqi | 2023-04-23 | 2023-04-23 | KAFKA-14907 | ||
| 921 | OpenJDK CRaC support TLDR: KIP-921 adds support for OpenJDK CRaC (Coordinated Restore at Checkpoint), allowing Kafka brokers and clients to checkpoint their JVM state to disk and restore it with near-zero startup time. Kafka's long JVM warmup time (connection establishment, metadata loading, JIT compilation) makes it a poor fit for elastic scaling scenarios where fast restart is critical. |
Broker | Discussion | Radim Vansa | 2023-04-21 | 2023-04-21 | |||
| 919 | Allow AdminClient to Talk Directly with the KRaft Controller Quorum and add Controller Registration TLDR: KIP-919 allows `AdminClient` to communicate directly with KRaft controller quorum nodes (not just broker nodes) when a new `bootstrap.controllers` config is specified, enabling admin operations like `DescribeQuorum` and controller-targeted `IncrementalAlterConfigs` without involving brokers. It also introduces a `ControllerRegistration` RPC so KRaft controllers register their endpoints and metadata (including ZK migration readiness) with the active controller, making controller topology discoverable via the admin API. |
KRaft Admin | Accepted | Colin McCabe | 2023-04-18 | 2024-04-15 | KAFKA-15230 | 3.7 | DescribeCluster v1 DescribeConfigs ControllerRegistration Metadata ApiVersions |
| 918 | MM2 Topic And Group Listener TLDR: Adds a `TopicAndGroupListener` API to MM2 that notifies registered listeners when the set of replicated topics or consumer groups changes. Currently there is no way to observe which topics and groups MM2 is actively replicating without parsing remote topic names or using monitoring heuristics that break with `IdentityReplicationPolicy`. |
MirrorMaker | Discussion | Daniel Urban | 2023-04-13 | 2023-04-18 | KAFKA-14903 | ||
| 917 | Additional custom metadata for remote log segment TLDR: Allows RemoteStorageManager.copyLogSegmentData() to return an optional opaque byte[] of custom metadata that is stored in RemoteLogSegmentMetadata and returned on every fetch. RSM implementations need to persist implementation-specific data (e.g., bucket name for load-balanced S3 placement) alongside standard segment metadata, but the existing API had no extension point for this. |
Tiered Storage | Accepted | Ivan Yurchenko | 2023-04-06 | 2023-08-03 | KAFKA-15107 | 3.6 | |
| 916 | MM2 distributed mode flow log context TLDR: Adds a flow.context MDC key (e.g., [primary→backup]) and embeds the replication flow in Connect thread names when running MirrorMaker2 in dedicated mode. Multiple MM2 replication flows share identical connector and task names, making log lines and thread names indistinguishable across flows during troubleshooting. |
MirrorMaker | Discussion | Daniel Urban | 2023-03-30 | 2023-06-12 | KAFKA-14652 | ||
| 915 | Txn and Group Coordinator Downgrade Foundation TLDR: Introduces versioned record types and downgrade support for the transaction coordinator (`__transaction_state`) and group coordinator (`__consumer_offsets`) by adding version headers and unknown-record skipping logic. Without this foundation, adding any new record field or type to these internal topics permanently prevents downgrading to an older broker version. |
Transactions Consumer | Accepted | Jeff Kim | 2023-03-15 | 2024-01-23 | KAFKA-14869 | 3.5 | |
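The downgrade safety described above rests on readers skipping record types they do not recognize. A minimal sketch of that replay behavior (record type IDs and names here are invented for illustration, not the actual coordinator schema):

```python
# Hypothetical sketch of the KIP-915 idea: a coordinator replaying its internal
# topic tolerates record types it does not know by skipping them instead of
# failing, which is what makes later downgrades safe.
KNOWN_RECORD_TYPES = {0: "offset_commit", 1: "group_metadata"}  # illustrative only

def replay(records):
    """Apply known records, skip unknown ones; returns (applied, skipped_count)."""
    applied, skipped = [], 0
    for record_type, payload in records:
        if record_type not in KNOWN_RECORD_TYPES:
            skipped += 1          # an older broker ignores the newer record type
            continue
        applied.append((KNOWN_RECORD_TYPES[record_type], payload))
    return applied, skipped

# A log containing a record type (7) introduced by a newer version.
log = [(0, b"g1"), (7, b"new-feature"), (1, b"meta")]
applied, skipped = replay(log)
# applied -> [("offset_commit", b"g1"), ("group_metadata", b"meta")], skipped -> 1
```

Without the skip branch, the first unknown type would abort replay, which is exactly why adding record types previously blocked downgrades.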
| 914 | DSL Processor Semantics for Versioned Stores TLDR: Enables versioned key-value stores (KIP-889) to be used as backing stores for `KTable` in DSL stream-table joins, so out-of-order stream records are joined against the table value at the record's event timestamp rather than the latest value. Without versioned stores, stream-table joins always use the latest table value regardless of event time, producing incorrect results for late-arriving stream records. |
Streams | Accepted | Victoria Xia | 2023-03-10 | 2023-04-13 | KAFKA-14834 | 3.5 | |
| 913 | Support decreasing send's block time without worrying about metadata's fetch TLDR: KIP-913 proposes adding a `KafkaProducer.getCluster(String topic, long maxBlockTimeMs)` public method that triggers an immediate metadata fetch and returns the resulting `Cluster` object, allowing applications to warm up the metadata cache before handling their first request. On cold start, the first `KafkaProducer.send()` incurs the full metadata-fetch latency because the local cache is empty; pre-fetching metadata during application initialization moves this cost out of the critical path. |
Producer | Discussion | fu.jian | 2023-03-06 | 2023-03-07 | |||
| 911 | Add source tag to Mirror source metric TLDR: Adds a `source` tag to `MirrorSourceConnector` metrics (alongside the existing `target`, `topic`, `partition` tags) to identify the source cluster alias. With `IdentityReplicationPolicy`, topic names do not contain the source alias, making source attribution impossible in multi-cluster mirroring setups. |
MirrorMaker Metrics | Accepted | Mickael Maison | 2023-02-23 | 2023-03-21 | KAFKA-14740 | 3.5 | |
| 910 | Update Source offsets for Source Connectors without producing records TLDR: Adds a SourceTask#updateOffsets() method allowing source connectors to commit updated offsets to the Connect offset store without producing any records to Kafka. Without this, offsets only advance when records are produced, causing offset staleness during quiet periods in the source system and risking inability to resume from the correct position if the source purges its change log. |
Connect | Discussion | Sagar Rao | 2023-02-23 | 2023-09-06 | KAFKA-3821 | ||
| 909 | DNS Resolution Failure Should Not Fail the Client TLDR: Changes Kafka client construction to tolerate transient DNS resolution failures by deferring the fatal ConfigException and retrying resolution in the background during the connection establishment phase. A client whose bootstrap.servers hostname cannot be resolved at construction time immediately throws a ConfigException and dies, even if the DNS entry is about to become available—an especially severe problem in containerized environments with slow DNS propagation. |
Client | Accepted | Philip Nee | 2023-02-22 | 2026-03-21 | KAFKA-14648 | 4.3 | |
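The deferred-failure behavior can be illustrated with a small retry loop; the resolver, hostnames, and retry count below are stand-ins, not the client's actual internals:

```python
import itertools

def resolve_with_retry(resolver, host, attempts=3):
    """Retry a transiently failing DNS lookup instead of failing fast
    (sketch of the KIP-909 behavior; `resolver` stands in for the real lookup)."""
    last_err = None
    for _ in range(attempts):
        try:
            return resolver(host)
        except OSError as err:  # socket.gaierror subclasses OSError
            last_err = err
    raise last_err

# Simulated DNS that fails twice before the record propagates.
answers = itertools.chain([OSError("boom"), OSError("boom")],
                          itertools.repeat(["10.0.0.5"]))
def flaky(host):
    answer = next(answers)
    if isinstance(answer, Exception):
        raise answer
    return answer

# resolve_with_retry(flaky, "broker-0.example.com") succeeds on the third try
```

The pre-KIP behavior corresponds to `attempts=1`: the first `gaierror` becomes a fatal `ConfigException` at construction time.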
| 907 | Add Boolean Serde to public interface TLDR: Adds `BooleanSerializer`, `BooleanDeserializer`, and `Serdes.Boolean()` to the public `org.apache.kafka.common.serialization` package. Boolean was the only primitive type missing from Kafka's built-in serialization library, requiring users to implement their own or convert to byte/integer. |
Streams Client | Accepted | Spacrocket | 2023-02-20 | 2023-03-06 | KAFKA-14491 | 3.5 | |
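The serde itself is tiny; a Python analogue of the single-byte encoding (the real Java classes encode true as `0x01` and pass `null` through, as other Kafka serdes do):

```python
def serialize_bool(value):
    """Encode a bool as one byte: True -> 0x01, False -> 0x00; None maps to None."""
    if value is None:
        return None
    return b"\x01" if value else b"\x00"

def deserialize_bool(data):
    """Decode one byte back to a bool; any nonzero byte reads as True."""
    if data is None:
        return None
    return data[0] != 0
```

Round-tripping through byte or integer serdes, the previous workaround, wastes space and loses the type at the schema level; a dedicated serde avoids both.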
| 906 | Tools migration guidelines TLDR: Establishes migration guidelines and compatibility rules for moving Kafka CLI tools from the `kafka.tools`/`kafka.admin` packages in the `core` module to `org.apache.kafka.tools` in the dedicated `tools` module. The ongoing module-split initiative (KAFKA-14524/14525) risks breaking users who invoke tools by their fully-qualified class name or via SPI arguments, so explicit compatibility contracts are needed. |
Admin | Accepted | Federico Valeri | 2023-02-15 | 2024-10-07 | KAFKA-14720 | 3.5 | |
| 905 | Broker interceptors TLDR: Proposes a `BrokerInterceptor` plugin point on the broker side that allows platform teams to inject cross-cutting logic (schema validation, serialization enforcement, privacy checks) into the produce/consume path without requiring changes to polyglot client libraries. Enforcing platform standards across diverse client languages today requires maintaining duplicate library extensions in every supported language, which is operationally expensive and error-prone. |
Broker | Discussion | David Mariassy | 2023-02-09 | 2023-02-12 | KAFKA-14700 | | Produce |
| 904 | Kafka Streams - Guarantee subtractor is called before adder if key has not changed TLDR: KIP-904 changes `KTable.groupBy()` to emit a single combined event when the grouping key is unchanged (carrying both the old and new value together) instead of two separate tombstone/insert events, guaranteeing the subtractor is applied before the adder in the downstream aggregate. With the two-event approach, a stream-table join could observe the intermediate state after the subtractor fires but before the adder fires, producing a transiently inconsistent aggregate value that is visible to downstream consumers. |
Streams | Accepted | Farooq Qaiser | 2023-02-05 | 2023-04-15 | KAFKA-12446 | 3.5 | |
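The ordering guarantee can be sketched as a single atomic update; the count-style aggregate and function names below are illustrative, not the Streams API:

```python
def apply_combined(aggregate, key, old, new, subtractor, adder):
    """Apply a combined (old, new) group-by event atomically: subtractor first,
    then adder, so no intermediate state is ever published (KIP-904 sketch)."""
    value = aggregate.get(key, 0)
    if old is not None:
        value = subtractor(value, old)
    if new is not None:
        value = adder(value, new)
    aggregate[key] = value
    return aggregate

agg = {"k": 10}
apply_combined(agg, "k", old=10, new=7,
               subtractor=lambda acc, v: acc - v,
               adder=lambda acc, v: acc + v)
# agg == {"k": 7}: downstream only ever sees 10 -> 7, never the transient 0
```

With two separate events, the state after the subtractor alone (here, 0) could be observed by a join before the adder ran.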
| 903 | Replicas with stale broker epoch should not be allowed to join the ISR TLDR: Enforces in KRaft mode that an AlterPartition request is rejected (INELIGIBLE_REPLICA) if any proposed ISR member has a broker epoch older than the epoch tracked by the leader at the time of the Fetch request. A race where a broker reboots with a new empty disk, gets re-registered, and is then added to the ISR by a stale AlterPartition from before the fence was lifted could allow that broker to become leader with an empty log, causing data loss. |
Broker KRaft | Accepted | Calvin Liu | 2023-01-12 | 2023-03-17 | KAFKA-14139 | 3.5 | Fetch v15 AlterPartition v3 |
| 902 | Upgrade Zookeeper to 3.8.2 TLDR: Upgrades Kafka's ZooKeeper dependency from 3.6.3 (EOL December 2022) to 3.8.2. ZooKeeper 3.6.3 no longer receives security or bug-fix patches; continuing to ship it exposes Kafka deployments to unpatched vulnerabilities and prevents adoption of improvements in the 3.8.x release line. |
Broker | Accepted | Christo Lolov | 2023-02-01 | 2023-09-20 | KAFKA-14661 | 3.6 | |
| 900 | KRaft kafka-storage.sh API additions to support SCRAM for Kafka Brokers TLDR: Extends kafka-storage.sh format with an --add-scram option that writes UserScramCredentialsRecord entries into the bootstrap.checkpoint, enabling SCRAM inter-broker authentication from the first cluster startup in KRaft mode. In ZooKeeper mode, SCRAM credentials could be seeded into ZooKeeper before brokers started; no equivalent mechanism existed for KRaft's __cluster_metadata bootstrap. |
KRaft Security | Accepted | Proven Provenzano | 2023-01-19 | 2023-02-28 | KAFKA-14084 | 3.5 | |
| 899 | Allow clients to rebootstrap TLDR: Allows Kafka clients to fall back to the bootstrap servers list and re-fetch cluster metadata when all currently known brokers become unreachable, rather than failing permanently. This solves the case where the entire broker set is replaced (e.g., IP changes, full cluster migration) and the cached metadata is entirely stale. |
Client Broker | Accepted | Ivan Yurchenko | 2023-01-18 | 2024-05-15 | KAFKA-8206 | 3.8 | |
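The fallback decision reduces to a simple rule, sketched below with a predicate standing in for a real connection attempt (names and addresses are illustrative):

```python
def pick_endpoints(known_brokers, bootstrap, reachable):
    """Return endpoints to contact: the cached broker list normally, or the
    original bootstrap list when every known broker is unreachable
    (sketch of the KIP-899 rebootstrap idea)."""
    alive = [b for b in known_brokers if reachable(b)]
    if alive:
        return alive
    return list(bootstrap)  # rebootstrap: cached metadata is entirely stale

stale = ["10.0.0.1:9092", "10.0.0.2:9092"]
bootstrap = ["kafka.example.com:9092"]
# When no cached broker responds, the client falls back to the bootstrap DNS name,
# which still resolves to the new cluster after a full broker replacement.
```

Before this change, a client in this situation retried the dead cached addresses forever.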
| 898 | Modernize Connect plugin discovery TLDR: Replaces reflective class scanning during Connect worker startup with a manifest-based plugin discovery mechanism where plugin JARs include a pre-generated index of their classes. Reflective scanning loads every class on the classpath to find connectors, significantly delaying worker startup and provisioning in cloud environments. |
Connect | Accepted | Greg Harris | 2023-01-17 | 2023-10-11 | KAFKA-14627 | 3.6 | |
| 896 | Remove old client protocol API versions in Kafka 4.0 TLDR: Sets a new minimum baseline for Kafka wire protocol API versions at Apache Kafka 2.1 (released 2018), removing support for all older protocol versions in Kafka 4.0. Maintaining backward compatibility with 9+ years of protocol API versions added significant code complexity and testing matrix size with diminishing value as modern clients stopped using them. |
Protocol Admin | Accepted | Ismael Juma | 2022-12-28 | 2025-07-10 | KAFKA-19444 | 3.7 | AlterReplicaLogDirs CreateAcls CreateDelegationToken CreateTopics DeleteAcls DeleteGroups DeleteTopics DescribeAcls DescribeConfigs DescribeDelegationToken DescribeLogDirs ExpireDelegationToken Fetch FindCoordinator JoinGroup ListOffsets Metadata OffsetCommit OffsetFetch OffsetForLeaderEpoch Produce RenewDelegationToken SaslHandshake |
| 895 | Dynamically refresh partition count of __consumer_offsets TLDR: Proposes dynamically reloading the partition count of `__consumer_offsets` at runtime after it is increased, without requiring a rolling broker restart. Currently the new partition count is only picked up on broker startup, so expanding the consumer offsets topic requires a disruptive rolling restart to redistribute group coordinators. |
Consumer Broker | Discussion | Christo Lolov | 2022-12-29 | 2023-01-02 | |||
| 894 | Use incrementalAlterConfig for syncing topic configurations TLDR: Migrates MirrorMaker2 topic configuration sync from the deprecated AlterConfigs RPC to IncrementalAlterConfigs, controlled by a use.incremental.alter.configs setting (default requested). The AlterConfigs API replaces the entire topic config, causing MirrorMaker2 to clear configurations it was not supposed to replicate (e.g., throttling replicas set by Cruise Control on the target cluster). |
MirrorMaker Admin | Accepted | Gantigmaa Selenge | 2022-12-15 | 2023-02-27 | KAFKA-14420 | 3.5 | |
| 893 | The Kafka protocol should support nullable structs TLDR: Adds nullable struct support to the Kafka RPC protocol schema, enabling struct fields to be null on the wire (serialized as INT8 -1/VARINT 0 presence byte before the struct bytes). This is needed for KIP-848's ConsumerGroupHeartbeat response, which must be able to return null assignment to indicate no change, rather than re-sending the full assignment every heartbeat. |
Protocol | Accepted | David Jacot | 2022-12-01 | 2022-12-08 | KAFKA-14425 | 3.5 | |
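A simplified picture of the presence-byte framing (a sketch of the wire idea only; the real schema distinguishes flexible and non-flexible request versions, so the exact prefix differs by version):

```python
def encode_nullable_struct(struct_bytes):
    """Prefix a struct with a presence byte: 0xff (INT8 -1) when null,
    0x01 followed by the struct's own encoding otherwise (KIP-893 sketch)."""
    if struct_bytes is None:
        return b"\xff"
    return b"\x01" + struct_bytes

def decode_nullable_struct(buf):
    """Invert the encoding: a leading 0xff means the struct is null."""
    if buf[0] == 0xFF:
        return None
    return buf[1:]
```

This is what lets a `ConsumerGroupHeartbeat` response say "no assignment change" with one byte instead of re-serializing the full assignment.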
| 892 | Transactional Semantics for StateStores TLDR: Introduces transactional semantics for RocksDB-backed Kafka Streams state stores, ensuring state is only persisted to disk once the corresponding changelog offset is durably committed, bounding state restore under EOS to under 1 second. Under the current design, state is written to the local store before changelog commits complete, so EOS crash recovery must replay the entire changelog from scratch, causing restore times proportional to changelog size that in production can span multiple days. |
Streams Transactions | Accepted | Nicholas Telford | 2022-11-21 | 2024-11-26 | KAFKA-14412 | ||
| 891 | Running multiple versions of Connector plugins TLDR: Allows Kafka Connect workers to load and run multiple versions of the same connector plugin simultaneously, decoupling connector version installation from migration of individual connector instances. Without multi-version support, upgrading a connector across all tasks requires a simultaneous cutover with no incremental rollback path, and running two connector versions requires two separate Connect clusters. |
Connect | Accepted | Snehashis Pal | 2022-11-20 | 2025-01-06 | KAFKA-14410 | 4.0 | |
| 890 | Transactions Server-Side Defense TLDR: Adds server-side transaction defense in the broker to reject out-of-order or delayed transactional writes that could cause hanging transactions (stalled LSO), incorrect transaction completion, or EOS violations. Network partitions and client bugs can deliver delayed `AddPartitionsToTxn` or `EndTxn` requests after epoch boundaries, leading to partially-committed state that blocks read-committed consumers and prevents log compaction. |
Transactions Broker | Accepted | Justine Olshan | 2022-11-18 | 2025-02-13 | KAFKA-14402 | 3.8 | Produce v11/v12 FindCoordinator v5 InitProducerId v5 AddPartitionsToTxn v5 AddOffsetsToTxn v4 EndTxn v4/v5 TxnOffsetCommit v4/v5 OffsetCommit |
| 889 | Versioned State Stores TLDR: KIP-889 introduces versioned (multi-version) state stores in Kafka Streams that retain multiple timestamped values per key, enabling the store to answer point-in-time lookups. Existing state stores keep only the latest value per key, making stream-table joins produce incorrect results for out-of-order records because the correct historical table value is no longer available. |
Streams | Accepted | Victoria Xia | 2022-11-16 | 2023-03-17 | KAFKA-14491 | 3.5 | |
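The point-in-time lookup at the heart of the proposal can be sketched in a few lines (illustrative only; the real store also bounds history by a retention period and is RocksDB-backed):

```python
class VersionedStore:
    """Minimal multi-version key-value store: keeps (timestamp, value) history
    per key and answers point-in-time lookups (KIP-889 sketch)."""
    def __init__(self):
        self._hist = {}  # key -> list of (ts, value), kept sorted by ts

    def put(self, key, ts, value):
        hist = self._hist.setdefault(key, [])
        hist.append((ts, value))
        hist.sort(key=lambda pair: pair[0])

    def get_as_of(self, key, ts):
        """Latest value whose timestamp is <= ts, or None if none exists yet."""
        candidate = None
        for t, v in self._hist.get(key, []):
            if t <= ts:
                candidate = v
            else:
                break
        return candidate

store = VersionedStore()
store.put("k", 100, "v1")
store.put("k", 200, "v2")
# store.get_as_of("k", 150) -> "v1": an out-of-order record at t=150 joins
# against v1, the table value at its event time, not the latest value v2
```

A latest-value-only store can answer only `get_as_of(key, infinity)`, which is why out-of-order joins were previously incorrect.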
| 888 | Batch describe ACLs and describe client quotas TLDR: Extends DescribeAclsRequest (v4) and DescribeClientQuotasRequest (v2) to accept multiple filter sets in a single request, returning batched results. Management tooling that operates on many users/groups had to issue one RPC per entity, incurring unnecessary per-request overhead and serializing what could be a parallel server-side operation. |
Security Admin | Discussion | Mickael Maison | 2022-11-15 | 2023-02-01 | KAFKA-14357 | | DescribeAcls DescribeClientQuotas |
| 887 | Add ConfigProvider to make use of environment variables TLDR: Adds EnvVarConfigProvider, a new ConfigProvider implementation that reads values from environment variables with an optional allowlist.pattern regex filter. No built-in mechanism existed to inject secrets or environment-specific values from process environment variables into Kafka client or Connect configurations. |
Admin Client | Accepted | Roman Schmitz | 2022-11-12 | 2023-04-06 | KAFKA-14376 | 3.5 | |
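The provider's allowlist behavior amounts to regex filtering over the environment; a sketch using an explicit dict in place of `os.environ` (the real provider is configured via its `allowlist.pattern` property):

```python
import re

def env_var_config(environ, allowlist_pattern=None):
    """Expose environment variables as config values, optionally filtered by a
    regex allowlist (sketch of the KIP-887 EnvVarConfigProvider shape)."""
    if allowlist_pattern is None:
        return dict(environ)
    pattern = re.compile(allowlist_pattern)
    return {k: v for k, v in environ.items() if pattern.fullmatch(k)}

env = {"KAFKA_PASSWORD": "s3cret", "PATH": "/usr/bin", "KAFKA_USER": "svc"}
# env_var_config(env, r"KAFKA_.*") keeps only the KAFKA_-prefixed entries,
# so unrelated process environment never leaks into resolved configs
```

The allowlist matters because, without it, any config reference could read arbitrary variables from the worker's environment.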
| 886 | Add Client Producer and Consumer Builders TLDR: Introduces a fluent builder API for `KafkaProducer` and `KafkaConsumer` with named methods, typed enum parameters, and Javadoc for each configuration option. The existing constructor takes an untyped `Properties` map, offering no IDE autocomplete, no type safety, and no discoverability of valid configuration values. |
Client Producer Consumer | Discussion | Daniel Scanteianu | 2022-11-10 | 2023-04-14 | KAFKA-14373 | ||
| 885 | Expose Broker's Name and Version to Clients TLDR: Proposes a new `DescribeBroker` or extended `ApiVersions` response that explicitly includes the broker's Kafka version string and name. Currently clients infer broker version by correlating `ApiVersions` response ranges with known per-version API changes, which breaks when a release adds no new API versions. |
Protocol Broker | Discussion | Travis Bischel | 2022-11-10 | 2022-12-02 | KAFKA-14377 | | ApiVersions DescribeCluster |
| 884 | Add config to configure KafkaClientSupplier in Kafka Streams TLDR: Adds default.client.supplier to StreamsConfig, allowing a custom KafkaClientSupplier class to be specified via configuration rather than requiring code changes to the KafkaStreams constructor call. Existing applications that want to swap the client supplier (e.g., for tracing or testing) must modify every instantiation site; a config-driven approach enables the change without touching application code. |
Streams | Accepted | Hao Li | 2022-11-07 | 2022-12-06 | KAFKA-14395 | 3.5 | |
| 882 | Kafka Connect REST API configuration validation timeout improvements TLDR: Modifies the Connect REST API's connector create/update endpoints to abort (rather than proceed) when configuration validation exceeds the request timeout, and proposes a per-request timeout override. The existing behavior would create/update the connector after validation completed even if the REST response had already timed out, resulting in phantom connectors that users did not know existed. |
Connect | Discussion | Yash Mayya | 2022-10-28 | 2023-07-10 | KAFKA-14353 | ||
| 881 | Rack-aware Partition Assignment for Kafka Consumers TLDR: Extends the Kafka consumer partition assignment algorithms to prefer assigning partitions to consumers whose rack matches the replica rack, minimizing cross-AZ fetches when the replication factor is lower than the number of AZs. The existing rack-aware fetch (KIP-392) only helped when every AZ had a local replica; with fewer replicas than AZs some consumers were always assigned partitions with no local replica. |
Consumer | Accepted | Rajini Sivaram | 2022-11-02 | 2023-03-29 | KAFKA-14352 | 3.4 | ConsumerProtocolAssignment ConsumerProtocolSubscription |
| 880 | X509 SAN based SPIFFE URI ACL within mTLS Client Certificates TLDR: KIP-880 proposes a `KafkaPrincipalBuilder` implementation that extracts SPIFFE URIs from the X.509 SAN (Subject Alternative Name) extension of mTLS client certificates and returns them as `KafkaPrincipal` objects for use in ACL rules. Istio-managed microservices in Kubernetes use SPIFFE-based SVID certificates for workload identity, but Kafka's existing mTLS principal extraction only reads the `CN` field, forcing operators to configure separate authentication mechanisms rather than reusing the Istio-provided identity. |
Security | Discussion | Bart Van Bos | 2022-10-29 | 2022-10-29 | KAFKA-14340 | ||
| 879 | Multi-level Rack Awareness TLDR: Extends Kafka's rack awareness to support hierarchical multi-level rack topologies (e.g., host → rack → zone → datacenter), so partition replicas are distributed across the most fault-isolated units possible. The current single-level `broker.rack` model cannot express datacenter or zone boundaries for stretch cluster deployments. |
Broker | Discussion | Viktor Somogyi | 2022-10-05 | 2022-11-29 | KAFKA-14281 | ||
| 878 | Autoscaling for Stateless & Statically Partitioned Streams TLDR: Enables Kafka Streams to detect when user input topic partition counts have changed and automatically resize internal repartition and changelog topics to match. Streams currently treats a partition count change as a fatal topology mismatch and shuts down, preventing any horizontal scaling of partitioned topologies without a full state migration. |
Streams | Accepted | A. Sophie Blee-Goldman | 2022-10-18 | 2023-02-23 | KAFKA-14318 | 3.5 | |
| 877 | Mechanism for plugins and connectors to register metrics TLDR: Adds a MetricsRegistry to the plugin configuration context so that Kafka plugins (partitioners, interceptors, Connect connectors, etc.) can register metrics that inherit parent component tags and reporters. Plugins that created their own Metrics instances got no parent tags, no reporter integration, and caused failures with singleton reporters like CruiseControlMetricsReporter when instantiated multiple times. |
Metrics Connect | Accepted | Mickael Maison | 2022-10-10 | 2025-09-29 | KAFKA-15995 | 4.1 | |
| 876 | Time based cluster metadata snapshots TLDR: Adds time-based (wall-clock interval) triggering for KRaft metadata snapshot generation, complementing the existing byte-threshold trigger. Without a time-based trigger, clusters with low metadata change rates may go very long periods without snapshots, causing slow controller startup due to full metadata log replay on boot. |
KRaft | Accepted | Jose Armando Garcia Sancio | 2022-10-07 | 2022-10-31 | KAFKA-14286 | 3.4 | |
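Combining the two triggers is a one-line disjunction; the parameter names below are illustrative, not the actual config keys:

```python
def should_snapshot(bytes_since_last, elapsed_ms, max_new_bytes, interval_ms):
    """Snapshot when either the existing size trigger or KIP-876's new
    wall-clock trigger fires (sketch; parameter names are invented)."""
    return bytes_since_last >= max_new_bytes or elapsed_ms >= interval_ms

# A quiet cluster: almost no metadata churn, but an hour has passed since the
# last snapshot, so the time trigger fires even though the byte trigger never would.
# should_snapshot(1024, 3_600_000, max_new_bytes=20 * 1024 * 1024, interval_ms=3_600_000)
```

The second trigger is what bounds log replay time on controller startup for clusters whose metadata rarely changes.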
| 875 | First-class offsets support in Kafka Connect TLDR: Adds GET/PATCH/DELETE /connectors/{connector}/offsets REST endpoints to Kafka Connect, enabling administrators to read, modify, and reset connector offsets without direct access to the internal offset topic. No official API existed for offset inspection or manipulation, forcing operators to write custom tooling against the internal __connect-offsets topic. |
Connect | Accepted | Chris Egerton | 2022-10-07 | 2023-04-11 | KAFKA-4107 | 3.5 | |
| 873 | ExceptionHandlingDeserializer, RetryDeserializer, PipeSerializer, PipeDeserializer TLDR: Proposes `ExceptionHandlingDeserializer`, `RetryDeserializer`, and `PipeDeserializer`/`PipeSerializer` wrappers that compose deserialization with retry logic, dead-letter routing, and chained format conversion. Consumer applications today have no standard mechanism to handle deserialization exceptions without catching raw exceptions and implementing retry/skip logic manually. |
Client Consumer | Discussion | Alex Collins | 2022-09-30 | 2022-09-30 | |||
| 870 | Retention policy based on record event time TLDR: Proposes adding retention.max.eventtime.ms, a new retention policy that truncates log segments once the maximum event timestamp in the segment exceeds the configured age, independent of broker wall-clock time. Existing time-based retention compares append time to wall clock, which aggressively deletes data during reprocessing scenarios when Kafka Streams re-ingests old events. |
Broker | Discussion | Nikolay Izhikov | 2022-09-20 | 2022-09-21 | KAFKA-13866 | ||
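A sketch of how event-time retention could select segments, measuring age against the partition's latest observed event time rather than the broker clock (the exact reference point is part of the proposal's design discussion, and the structures below are invented for illustration):

```python
def deletable_segments(segments, latest_event_ts, retention_ms):
    """Pick segments whose newest record's *event* timestamp is older than the
    retention window relative to the latest event time (KIP-870 sketch)."""
    return [s for s in segments
            if latest_event_ts - s["max_event_ts"] > retention_ms]

segments = [{"name": "s0", "max_event_ts": 1_000},
            {"name": "s1", "max_event_ts": 900_000}]
# With retention 500_000 and the latest event at 1_000_000, only s0 is deletable;
# reprocessing old events cannot trip wall-clock-based deletion here
```

Under the existing append-time policy, a Streams reprocessing job re-ingesting week-old events can see them deleted almost immediately.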
| 869 | Improve Streams State Restoration Visibility TLDR: Enriches Kafka Streams state restoration with additional metrics (bytes/records restored, restoration lag) and new callbacks in `StateRestoreListener` covering batch-level progress for active tasks. Previously, visibility into restoration progress was minimal, making it hard to estimate restoration completion time or trigger operational actions. |
Streams Metrics | Accepted | Guozhang Wang | 2022-09-16 | 2024-04-23 | KAFKA-10199 | 3.8 | |
| 868 | Metadata Transactions TLDR: Introduces metadata transactions in KRaft so the controller can atomically commit a set of metadata records larger than the Raft fetch size limit (currently hard-coded at 8 KB) by spanning multiple batches with transaction start/end markers. Creating a topic with thousands of partitions generates a `TopicRecord` + N `PartitionRecord`s that can exceed 8 KB, but partial commits during a controller failover would leave the metadata store in an inconsistent state. |
KRaft Protocol | Accepted | David Arthur | 2022-09-07 | 2022-10-14 | KAFKA-14305 | 3.6 | |
| 866 | ZooKeeper to KRaft Migration TLDR: Defines the migration path and new broker/controller APIs to move a Kafka cluster from ZooKeeper-based metadata to KRaft quorum without partition availability impact, writing metadata to both stores simultaneously during the bridge period to allow rollback. Completing KIP-500 required a concrete, reversible migration procedure since running ZooKeeper in production clusters has significant operational overhead and single-point-of-failure concerns. |
KRaft | Accepted | David Arthur | 2022-09-08 | 2024-11-12 | KAFKA-14304 | 3.4 | BrokerRegistration LeaderAndIsr StopReplica UpdateMetadata Metadata ApiVersions |
| 865 | Support --bootstrap-server in kafka-streams-application-reset TLDR: Adds --bootstrap-server (singular) as an alias to the existing --bootstrap-servers parameter in kafka-streams-application-reset.sh, deprecating the non-standard plural form. All other Kafka CLI tools use --bootstrap-server; the inconsistency in the streams reset tool causes user confusion. |
Streams Admin | Accepted | Nikolay Izhikov | 2022-08-31 | 2022-09-13 | KAFKA-12878 | 3.4 | |
| 864 | Add End-To-End Latency Metrics to Connectors TLDR: Adds end-to-end latency metrics for source connectors (transform-chain-source-record-time, convert-source-record-time) and sink connectors (record wall-clock lag, convert/transform time), providing a complete per-stage latency breakdown. Existing metrics only covered interaction with the external system (poll-batch-time, put-batch-latency), leaving the transformation and serialization pipeline unobservable. |
Connect Metrics | Discussion | Jorge Esteban Quilcate Otoya | 2022-08-30 | 2023-01-26 | KAFKA-14191 | ||
| 862 | Self-join optimization for stream-stream joins TLDR: Optimizes KStream-KStream inner self-joins (where both sides read from the same topic) to use a single state store instead of two identical stores. Self-joins in Kafka Streams were treated as regular joins, creating two stores that always contained identical data, doubling state storage and write amplification unnecessarily. |
Streams | Accepted | Vicky Papavasileiou | 2022-07-29 | 2022-09-21 | KAFKA-14209 | 3.8 | |
| 860 | Add client-provided option to guard against replication factor change during partition reassignments TLDR: Adds a validateOnly option to the AlterPartitionReassignments AdminClient API that lets callers verify a reassignment request (including replication factor impact) without executing it, and adds a targetReplicationFactor field to prevent accidental RF changes. AdminClient#describeTopics returns the full intermediate replica set during ongoing reassignments, making it impossible for callers to reliably compute the current RF without a TOCTOU race condition. |
Admin Broker | Accepted | Stanislav Kozlovski | 2022-07-28 | 2025-03-05 | KAFKA-14121 | 4.1 | Metadata UpdateMetadata AlterPartitionReassignments |
| 859 | Add Metadata Log Processing Error Related Metrics TLDR: Adds metadata-apply-error-count and metadata-load-error-count metrics to broker metadata processing, and MetadataErrorCount to the controller, tracking failures during MetadataDelta generation and MetadataImage application in KRaft mode. Errors during metadata log processing can leave the node's in-memory state inconsistent with no observable signal, making diagnosis impossible without grepping logs. |
Metrics KRaft | Accepted | Niket Goel | 2022-07-26 | 2022-08-04 | KAFKA-14114 | 3.3 | |
| 858 | Handle JBOD broker disk failure in KRaft TLDR: Extends KRaft mode to handle individual JBOD log directory failures by having the broker signal the controller via BrokerHeartbeat when a directory goes offline, triggering LeaderAndIsr updates for affected replicas. In ZooKeeper mode the broker directly notified the controller; in KRaft the broker's continued heartbeats masked per-directory failures, leaving partitions in that directory without leader failover. |
KRaft Broker | Accepted | Igor Soarez | 2022-07-26 | 2024-04-02 | KAFKA-9837 | 3.7 | BrokerRegistration v2 Heartbeat BrokerHeartbeat AssignReplicasToDirs |
| 857 | Streaming recursion in Kafka Streams TLDR: Adds a `KStream.recursively(UnaryOperator<KStream> op)` DSL operator that pipes a stream's output back into an earlier stage of the same topology, enabling iterative graph traversal and recursive algorithms natively. Recursive patterns currently require separate loopback topics and a second `KafkaStreams` instance, incurring extra serialization, latency, and operational complexity. |
Streams | Discussion | Nicholas Telford | 2022-07-26 | 2022-09-06 | KAFKA-14110 | ||
| 855 | Add schema.namespace parameter to SetSchemaMetadata SMT in Kafka Connect TLDR: Adds schema.namespace as a distinct parameter to the SetSchemaMetadata SMT, allowing the Avro schema namespace to be set independently from the schema name. Previously the only way to set a namespace was to embed it in schema.name as a fully qualified name, preventing users from reusing the original table/topic name while only changing the namespace. |
Connect | Discussion | Michael Negodaev | 2022-07-22 | 2022-08-23 | KAFKA-7883 | ||
| 854 | Separate configuration for producer ID expiry TLDR: Introduces a separate `producer.id.expiration.ms` configuration distinct from `transactional.id.expiration.ms` so idempotent-only producers (without a `transactional.id`) have their producer IDs expired independently and on a shorter horizon. Previously both expiration windows were conflated, preventing fine-grained control over producer state cleanup for non-transactional producers. |
Producer Transactions | Accepted | Justine Olshan | 2022-07-21 | 2022-09-14 | KAFKA-14097 | 3.4 | |
| 853 | KRaft Controller Membership Changes TLDR: Introduces dynamic KRaft controller membership changes—adding and removing voters from the Raft quorum—without requiring all controllers to be shut down and manually updated. Before this KIP, changing the controller quorum composition required a full cluster shutdown and manual on-disk state manipulation, making controller node replacement and cluster scaling operationally risky and disruptive. |
KRaft | Accepted | José Armando García Sancio | 2022-07-18 | 2025-12-03 | KAFKA-14094 | 3.9 | Fetch v17 Vote v1 BeginQuorumEpoch v1 EndQuorumEpoch v1 DescribeQuorum v2 FetchSnapshot v1 |
| 852 | Optimize calculation of size for log in remote tier TLDR: Adds a RemoteLogMetadataManager#remoteLogSize(TopicIdPartition) API and a RemoteLogSizeBytes metric, allowing RLMM implementations to compute total remote tier size without listing all segment metadata. The existing O(num_remote_segments) scan via listRemoteLogSegments() becomes prohibitively expensive as the number of remote segments grows, slowing the retention enforcement loop. |
Tiered Storage Metrics | Accepted | Divij Vaidya | 2022-07-01 | 2023-07-26 | KAFKA-14038 | 3.6 | |
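An RLMM implementation can satisfy the new size API in O(1) by maintaining a running total as segments are copied and deleted, rather than listing on every query. A sketch under invented names:

```python
class RemoteTierAccounting:
    """Keep a running total of remote tier bytes so size queries are O(1)
    instead of O(num_remote_segments) (sketch of what a KIP-852
    remoteLogSize() implementation can do internally)."""
    def __init__(self):
        self._sizes = {}  # segment_id -> bytes
        self._total = 0

    def segment_copied(self, segment_id, size_bytes):
        # Idempotent on re-copy: replace any previously recorded size.
        self._total += size_bytes - self._sizes.get(segment_id, 0)
        self._sizes[segment_id] = size_bytes

    def segment_deleted(self, segment_id):
        self._total -= self._sizes.pop(segment_id, 0)

    def remote_log_size(self):
        return self._total
```

The retention loop then asks one number per partition instead of paging through every segment's metadata.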
| 851 | Add requireStable flag into ListConsumerGroupOffsetsOptions TLDR: Adds a `requireStable` flag to `AdminClient.listConsumerGroupOffsets()` so callers can opt into seeing only stable (non-pending-transactional) committed offsets, consistent with the behavior available in `KafkaConsumer.committed()`. Previously the admin API always returned the latest committed offset regardless of in-flight transactions, making it impossible to observe only fully committed group state. |
Admin Transactions | Accepted | Guozhang Wang | 2022-07-01 | 2022-07-07 | | 3.3 | Fetch OffsetFetch |
| 849 | Expose logdirs total and usable space via kafka-log-dirs.sh TLDR: Adds totalBytes and usableBytes fields to every logDirs entry in the kafka-log-dirs.sh JSON output (version 2), exposing log directory disk capacity alongside replica offset data. KIP-827 added this data to the broker API but the CLI tool did not surface it, forcing operators to separately inspect disk usage. |
Admin | Discussion | Deng Ziming | 2022-06-21 | 2022-06-21 | |||
| 848 | The Next Generation of the Consumer Rebalance Protocol TLDR: Replaces the Classic stop-the-world consumer rebalance protocol with an incremental, server-side protocol where each consumer independently heartbeats and the broker's group coordinator manages assignment reconciliation without a synchronization barrier. The Classic protocol's full-group sync (all members stop, all rejoin) causes throughput disruption proportional to group size, and its thick client-side assignor design creates complex failure modes and slow recovery. |
Consumer | Accepted | David Jacot | 2022-06-14 | 2026-01-20 | KAFKA-14048 | 3.6 | OffsetCommit v9/v10 OffsetFetch v9/v10 ListGroups v5 ConsumerGroupHeartbeat v1 ConsumerGroupDescribe |
| 847 | Add ProducerIdCount metrics TLDR: KIP-847 adds a `ProducerIdCount` JMX metric to each broker that reports the number of active producer IDs tracked by `ProducerStateManager`. Operators had no lightweight way to monitor idempotent and transactional producer count; the only existing visibility (from KIP-360) required parsing log or snapshot data rather than querying a metric. |
Metrics Producer | Accepted | Artem Livshits | 2022-06-15 | 2022-07-18 | KAFKA-13999 | 3.5 | |
| 843 | Adding addMetricIfAbsent method to Metrics TLDR: Adds `Metrics.addMetricIfAbsent(MetricName, Measurable)` as an atomic get-or-create operation on the metrics registry. The existing two-step pattern (check existence, then `addMetric`) is not thread-safe: concurrent threads registering the same instance-level metric race and one throws `IllegalArgumentException`. |
Metrics | Accepted | Sagar Rao | 2022-05-23 | 2022-06-16 | KAFKA-13846 | 3.3 | |
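The race being closed is the classic check-then-act gap; a Python analogue of the atomic get-or-create (illustrative, not the Java `Metrics` API):

```python
import threading

class MetricsRegistry:
    """Get-or-create registration in one atomic step, the gap KIP-843 closes:
    two threads registering the same metric both end up sharing one instance
    instead of one of them failing (illustrative sketch)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._metrics = {}

    def add_metric_if_absent(self, name, factory):
        with self._lock:                   # single critical section: the
            if name not in self._metrics:  # check and the insert cannot interleave
                self._metrics[name] = factory()
            return self._metrics[name]

reg = MetricsRegistry()
m1 = reg.add_metric_if_absent("records-consumed", lambda: {"count": 0})
m2 = reg.add_metric_if_absent("records-consumed", lambda: {"count": 0})
# m1 is m2: the second call returns the existing metric instead of raising
```

With the old two-step pattern, the check and the `addMetric` call could interleave across threads, and the loser saw `IllegalArgumentException`.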
| 842 | Add richer group offset reset mechanisms TLDR: Extends group offset reset with richer InitialOffsetResetStrategy options (e.g., EARLIEST_LOCAL, BY_DURATION) to handle scenarios such as consumer groups encountering newly added partitions after a topic partition count increase. The default auto.offset.reset=latest causes newly added partitions to start at the latest offset, silently skipping data that was produced into those partitions before the consumer group's rebalance detected the new partition count. |
Consumer | Discussion | hudeqi | 2022-05-23 | 2024-05-31 | KAFKA-12478 | ||
| 841 | Fenced replicas should not be allowed to join the ISR in KRaft TLDR: Enforces in KRaft mode that fenced or in-controlled-shutdown replicas cannot join the ISR (new INELIGIBLE_REPLICA error on AlterPartition) or be elected leader, and persists the controlled-shutdown state in the metadata log. Without this enforcement, a leader could add a fenced replica to the ISR via a stale AlterPartition, allowing a broker with missing data to serve as leader and cause data loss. |
KRaft Broker | Accepted | David Jacot | 2022-05-17 | 2022-06-16 | KAFKA-13916 | 3.3 | AlterPartition v2 |
| 838 | Simulate batching and compression TLDR: KIP-838 proposes a tool that reads a sample of segment log data and simulates the compression ratio that would be achieved if batching and compression were enabled, outputting the results as JSON. Cluster operators serving producers they do not control have no way to demonstrate concrete byte-reduction numbers to those producers before asking them to change their configuration. |
Broker | Discussion | Sergio Troiano | 2022-05-16 | 2022-05-16 | |||
| 836 | Expose replication information of the cluster metadata TLDR: Exposes the `DescribeQuorum` API (KIP-595) through the `AdminClient` and adds voter lag computation in units of committed offset delta rather than raw log end offset. The API existed at the protocol level but was not accessible via the Admin client, and raw `LogEndOffset` comparison does not accurately reflect replication lag. |
Admin KRaft | Accepted | Niket Goel | 2022-05-06 | 2022-09-28 | KAFKA-13888 | 3.3 | DescribeQuorum v1 |
| 835 | Monitor KRaft Controller Quorum Health TLDR: Introduces a `NoOpRecord` written periodically to the `__cluster_metadata` log to advance the high-watermark, and exposes KRaft controller quorum health metrics so monitoring systems can verify that committed offsets are progressing. Without this, a quorum that has stalled (e.g. due to a missing majority) cannot be detected by watching offset advancement alone. |
Metrics KRaft | Accepted | Jose Armando Garcia Sancio | 2022-05-06 | 2022-05-20 | KAFKA-13883 | 3.3 | |
| 833 | Mark KRaft as Production Ready TLDR: Formally marks KRaft mode as production-ready for new clusters starting in Kafka 3.3 and deprecates ZooKeeper mode. This closes the multi-year KIP-500 effort by declaring the ZooKeeper dependency unnecessary for all production use cases covered by the new self-managed metadata quorum. |
KRaft | Accepted | Colin McCabe | 2022-05-03 | 2024-02-09 | KAFKA-14127 | 3.7 | |
| 831 | Add metric for log recovery progress TLDR: Adds `RemainingLogsToRecover` and `RemainingSegmentsToRecover` metrics so operators can monitor the progress of log recovery during broker startup. Log recovery on unclean shutdown is currently opaque: the broker reports nothing about progress, making it impossible to distinguish a slow recovery from a hung broker. |
Metrics Broker | Accepted | Luke Chen | 2022-04-19 | 2022-06-25 | KAFKA-13919 | 3.3 | |
| 830 | Allow disabling JMX Reporter TLDR: Adds an `auto.include.jmx.reporter` configuration (deprecated on introduction, removed in 4.0) that allows disabling the `JmxReporter`, which was previously always enabled regardless of the `metric.reporters` setting; from Kafka 4.0, `JmxReporter` is instead a standard entry in the default value of `metric.reporters`. Without this, operators in environments that do not use JMX cannot eliminate the JmxReporter overhead or consolidate metrics onto a single explicitly configured reporter. |
Metrics | Accepted | Mickael Maison | 2022-04-12 | 2022-06-08 | KAFKA-10360 | 3.4 | |
| 829 | (console-consumer) add print.topic property TLDR: Adds a `print.topic` property to `kafka-console-consumer` that prefixes each consumed message with the source topic name. When consuming from multiple topics via `--include`, there is no existing way to identify which topic a given record came from. |
Consumer Admin | Discussion | Evans Jahja | 2022-04-09 | 2022-04-22 | |||
| 828 | Add the corresponding validator to the configuration where the validator is missing TLDR: Adds missing validators to configuration parameters across AdminClientConfig, ConsumerConfig, ProducerConfig, SaslConfigs, and broker configs (e.g., NonNullValidator for serializer/deserializer classes, valid-value checks for security.protocol, compression.type, ssl.client.auth). Without validators, invalid values for these configs are only discovered at runtime when the misconfigured component attempts to use them. |
Admin | Discussion | RivenSun | 2022-04-08 | 2022-04-08 | KAFKA-13793 | 3.3 | |
| 827 | Expose logdirs total and usable space via Kafka API TLDR: KIP-827 adds `totalBytes` and `usableBytes` fields to the `DescribeLogDirs` API response (v4) and exposes them via `LogDirDescription` in the `AdminClient`. Storage metrics are available via JMX for trend monitoring and alerting, but automation and tooling that need a point-in-time disk capacity check (e.g. to verify a resize or confirm topic deletion freed space) previously had no Kafka-native API to query disk sizes. |
Admin Broker | Accepted | Mickael Maison | 2022-04-06 | 2022-09-29 | KAFKA-13958 | 3.3 | DescribeLogDirs |
| 826 | Define platforms supported TLDR: KIP-826 formally documents the set of operating systems, hardware architectures, and JVM versions that Apache Kafka officially supports and tests against. Without an explicit support matrix, users running Kafka on non-x86_64 platforms (e.g., aarch64, ppc64le) or with native dependencies like RocksDB and zstd have no authoritative guidance on whether their environment is supported. |
Broker | Discussion | Mickael Maison | 2022-03-29 | 2022-03-30 | |||
| 825 | introduce a new API to control when aggregated results are produced TLDR: Adds an EmitStrategy API to TimeWindowedKStream and SessionWindowedKStream with ON_WINDOW_CLOSE (emit only the final result when the window closes) and ON_WINDOW_UPDATE (current behavior: emit for every record). The existing suppress(Suppressed.untilWindowCloses()) approach uses a redundant in-memory buffer with its own changelog topic, adding CPU, memory, disk, and operational overhead to achieve what is logically a windowing-level concern. |
Streams | Accepted | Hao Li | 2022-03-11 | 2022-11-16 | KAFKA-13785 | 3.2 | |
| 824 | Allowing dumping segment logs limiting the batches in the output TLDR: Adds a `--max-batches` flag to `kafka-dump-log.sh` to limit the number of record batches printed when dumping segment logs. Without this, dumping a segment produces unbounded output, making it impractical to inspect producer batching behavior in near-real-time or to detect poor batching or compression configurations at scale. |
Admin Broker | Accepted | Sergio Troiano | 2022-03-03 | 2022-04-04 | KAFKA-13687 | 3.3 | |
| 823 | Update Admin::describeConfigs to allow fetching specific configurations TLDR: Extends `AdminClient.describeConfigs()` to accept an optional set of specific config key names within a `ConfigResource`, returning only those keys rather than all configurations. Currently the API always returns the entire config set for a resource, which is wasteful over large broker or topic configs when callers only need a small subset. |
Admin | Discussion | Vikas Singh | 2022-02-20 | 2022-02-21 | KAFKA-13517 | DescribeConfigs | |
| 821 | Connect Transforms support for nested structures TLDR: Extends Kafka Connect Single Message Transforms (SMTs) to support nested field access and modification using dot-notation paths (e.g., `field.subfield`). Existing SMTs only operate on root-level fields, requiring custom SMT implementations for any transformation involving nested Struct or Map fields. |
Connect | Discussion | Jorge Esteban Quilcate Otoya | 2022-02-08 | 2023-08-29 | KAFKA-13656 | ||
| 820 | Extend KStream process with new Processor API TLDR: Extends KStream with a process(ProcessorSupplier, Named, String...) method that integrates the new strongly-typed Processor API (KIP-478) directly into the DSL, replacing the legacy transform() and transformValues() methods with a unified processing operator. The existing Transformer-based operators were tied to the old ProcessorContext, lacked support for one-to-many output, and required multiple overlapping DSL methods to cover different combinations of value/key/context access. |
Streams | Accepted | Jorge Esteban Quilcate Otoya | 2022-02-08 | 2024-06-13 | KAFKA-13654 | 3.3 | |
| 819 | Merge multiple KStreams in one operation TLDR: Adds `KStream.merge(Collection<KStream>)` overloads so multiple streams can be merged in a single DSL call instead of chaining repeated binary `merge()` calls. Merging N streams via repeated binary merges creates N-1 merge nodes in the topology graph, complicating debugging and monitoring. |
Streams | Discussion | Nicholas Telford | 2022-01-31 | 2022-01-31 | KAFKA-13633 | ||
| 817 | Fix inconsistency in dynamic application log levels TLDR: Fixes inconsistencies in the dynamic log-level adjustment API introduced by KIP-412, specifically adding support for `OFF` and `ALL` log levels and returning an error instead of silently falling back to `DEBUG` for unrecognized level strings. The JMX-based approach and the Admin API approach behave differently for these edge cases, making dynamic log management unreliable. |
Admin | Discussion | Dongjin Lee | 2022-01-28 | 2022-01-28 | KAFKA-13625 | ||
| 816 | Topology changes without local state reset TLDR: Proposes stable topology identifiers so that adding, removing, or reordering subtopologies does not invalidate task IDs and local state stores for unchanged subtopologies. Currently any topology change that shifts topic-group ordinals invalidates all existing task directories, forcing a full state rebuild even for unaffected stores. |
Streams | Discussion | Nicholas Telford | 2022-01-25 | 2024-03-28 | KAFKA-13627 | ||
| 815 | Support max-timestamp in GetOffsetShell TLDR: Migrates `kafka-get-offsets.sh` from using `KafkaConsumer` to `AdminClient` and extends the tool to accept all `OffsetSpec` types (earliest, latest, max-timestamp, timestamp). The consumer-based implementation could not support the `max-timestamp` spec introduced in KIP-734 since that spec is only available via `AdminClient`. |
Admin | Accepted | Deng Ziming | 2022-01-14 | 2022-03-01 | KAFKA-13509 | 3.2 | ListOffsets Metadata |
| 814 | Static membership protocol should let the leader skip assignment TLDR: Extends the static membership protocol so that the `JoinGroup` response can tell a rejoining static leader to skip recomputing the partition assignment when the assignment is unchanged. Currently, when a static leader rejoins the group, the coordinator silently designates it leader without informing it, causing the leader to send a stale assignment that wastes CPU and can trigger unnecessary rebalances. |
Consumer | Accepted | David Jacot | 2022-01-19 | 2022-03-02 | KAFKA-13435 | 3.2 | JoinGroup |
| 813 | Shareable State Stores TLDR: KIP-813 introduces shareable state stores in Kafka Streams, allowing multiple Streams applications to read from a single physical state store without each application maintaining its own changelog-backed replica. Today each application that needs the same state must independently replicate the full changelog topic, duplicating storage and reprocessing costs. |
Streams | Accepted | Daan Gerits | 2022-01-17 | 2022-04-19 | KAFKA-10892 | 3.8 | |
| 811 | Add config repartition.purge.interval.ms to Kafka Streams TLDR: Rate-limits the explicit log-truncation requests that Kafka Streams sends for repartition topics so they are not issued on every commit. When `commit.interval.ms` is very low (e.g. under `exactly_once_v2`) the resulting flood of `DeleteRecords` RPCs creates significant broker overhead. |
Streams | Accepted | Nicholas Telford | 2021-12-20 | 2022-01-18 | KAFKA-13549 | 3.2 | |
| 810 | Allow producing records with null values in Kafka Console Producer TLDR: Adds a `null.marker` property to `kafka-console-producer.sh` that, when set, treats any matching input string as a null value, enabling production of tombstone records from the CLI. There is currently no way to produce null-valued records (tombstones) via the console producer, requiring users to rely on external tools like `kcat`. |
Producer Admin | Accepted | Mickael Maison | 2021-12-14 | 2022-02-01 | KAFKA-13595 | 3.2 | |
| 808 | Add support for different unix precisions in TimestampConverter SMT TLDR: Extends the `TimestampConverter` SMT to support unix timestamp precisions other than milliseconds (seconds, microseconds, and nanoseconds) via a new `unix.precision` config. The existing SMT hard-codes millisecond precision for Unix Long types, breaking interoperability with external systems that produce or expect timestamps at different granularities. |
Connect | Accepted | Julien Chanaud | 2021-12-09 | 2022-02-12 | KAFKA-13511 | 3.2 | |
| 807 | Refactor KafkaStreams exposed metadata hierarchy TLDR: Unifies the disparate Kafka Streams metadata APIs (`StreamsMetadata`, `TaskMetadata`, `ThreadMetadata`, `KeyQueryMetadata`) into a single, consistent query interface. Users currently must call multiple different methods to gather state across metadata classes, with no single entry point to retrieve combined metadata. |
Streams | Discussion | Josep Prat | 2021-12-03 | 2021-12-03 | KAFKA-12370 | ||
| 806 | Add session and window query over kv-store in IQv2 TLDR: Adds `WindowKeyQuery` and `WindowRangeQuery` implementations of the `Query` interface from KIP-796 (Interactive Queries v2) for querying window and session stores by key and time range. IQv2 launched without session/window query support, meaning Streams applications using window/session stores could not use the new unified IQ API. |
Streams | Accepted | Patrick Stuedi | 2021-12-01 | 2022-01-03 | KAFKA-13494 | 3.2 | |
| 805 | Add range and scan query over kv-store in IQv2 TLDR: Adds a `RangeQuery` implementation of the `Query` interface from KIP-796 (Interactive Queries v2) to support bounded range queries and full scans over key-value state stores. IQv2 launched without range/scan query support, leaving key-value stores accessible only via the point-lookup `KeyQuery`. |
Streams | Accepted | Vicky Papavasileiou | 2021-11-26 | 2021-12-17 | KAFKA-13492 | 3.2 | |
| 804 | OfflinePartitionsCount Tagged by Topic TLDR: KIP-804 tags the `OfflinePartitionsCount` controller metric with the topic name of the offline partition(s). The existing metric is an aggregate scalar with no topic label, preventing operators from building topic-scoped alerts, routing alerts to the right team, or filtering out noise from non-critical test topics. |
Metrics Broker | Discussion | Mason Joseph Legere | 2021-11-25 | 2021-11-26 | KAFKA-13484 | ||
| 803 | Add Task ID and Connector Name to Connect Task Context TLDR: Exposes the task ID and connector name to `SourceTask` and `SinkTask` implementations via the `TaskContext` interface so connectors can use them for custom metrics, logging, or debugging. While KIP-449 added these values to the SLF4J MDC for log enrichment, there was no programmatic API for connectors to access them directly. |
Connect Metrics | Discussion | Sarah Story | 2021-11-24 | 2021-11-24 | KAFKA-13477 | ||
| 802 | Validation Support for Kafka Connect SMT Options TLDR: Extends the Connect REST API (`PUT /connector-plugins/{name}/config/validate`) to also validate the configuration of SMTs, SMT predicates, and converters in addition to connector configurations. Currently there is no way to validate SMT or converter config before registering a connector—invalid configs are only discovered at connector startup. |
Connect | Discussion | Gunnar Morling | 2021-11-24 | 2021-12-21 | KAFKA-13478 | ||
| 801 | Implement an Authorizer that stores metadata in __cluster_metadata TLDR: Introduces `StandardAuthorizer`, a new built-in KRaft-native authorizer that stores ACLs in the `__cluster_metadata` topic instead of ZooKeeper. In KRaft mode the existing `AclAuthorizer` still required a ZooKeeper instance, which defeated the goal of removing ZooKeeper entirely. |
Security KRaft | Accepted | Colin McCabe | 2021-11-23 | 2022-02-04 | KAFKA-13646 | 3.2 | Envelope |
| 800 | Add reason to JoinGroupRequest and LeaveGroupRequest TLDR: Adds an optional `Reason` string field to `JoinGroupRequest` and `LeaveGroupRequest` so the broker can log why a consumer joined or left the group. Broker-side rebalance troubleshooting was hampered by the absence of a human-readable reason in these requests, since only the client side logged the reason. |
Consumer | Accepted | David Jacot | 2021-11-11 | 2022-04-04 | KAFKA-13451 | 3.2 | JoinGroup v8 LeaveGroup v5 |
| 799 | Align behaviour for producer callbacks with documented behaviour TLDR: Fixes an inconsistency in `KafkaProducer` callback behavior where `Callback.onCompletion` passes a non-null placeholder `RecordMetadata` in some error paths but null in others, contrary to the Javadoc contract specifying that exactly one of `metadata` or `exception` is null. The inconsistency was introduced in PR #4188 (KAFKA-6180) and can cause NPEs or incorrect error handling in callback implementations. |
Producer | Discussion | Seamus O Ceanainn | 2021-11-11 | 2021-12-03 | KAFKA-13448 | 3.2 | |
| 798 | Add possibility to write kafka headers in Kafka Console Producer TLDR: Adds parse.headers=true support to kafka-console-producer.sh's LineMessageReader, with configurable delimiters for header blocks, individual headers, and key-value pairs within a header. The console producer previously had no way to include record headers, limiting its usefulness for testing header-aware consumers and SMTs. |
Producer Admin | Accepted | Florin Akermann | 2021-11-10 | 2022-06-23 | KAFKA-13351 | 3.2 | |
| 796 | Interactive Query v2 TLDR: KIP-796 redesigns the Kafka Streams Interactive Query (IQ) API to be generic and extensible, replacing the typed `QueryableStoreTypes` with a `StateQueryRequest`/`StateQueryResult` model that decouples query types from store implementations. The existing IQ API is tightly coupled to specific store implementations and cannot be extended by third-party store plugins without breaking changes. |
Streams | Accepted | John Roesler | 2021-10-25 | 2022-01-28 | KAFKA-13479 | 3.2 | |
| 794 | Strictly Uniform Sticky Partitioner TLDR: Replaces the `UniformStickyPartitioner` with a strictly uniform variant that switches the target partition based on the number of records sent rather than on batch creation events. The original sticky partitioner sent disproportionately more records to slower brokers because batches were created more frequently for them, creating a positive-feedback loop that worsened the slowdown. |
Producer | Accepted | Artem Livshits | 2021-11-03 | 2022-04-13 | KAFKA-10888 | 3.3 | |
| 793 | Allow sink connectors to be used with topic-mutating SMTs TLDR: Adds originalTopic, originalKafkaPartition, and originalKafkaOffset fields to SinkRecord so that sink tasks overriding preCommit() can return the pre-transform offsets that the framework expects. When a topic-mutating SMT (e.g., RegexRouter) changes the topic name, the connector only sees the post-transform topic/partition/offset, causing preCommit() to return wrong offsets and breaking at-least-once offset tracking. |
Connect | Accepted | Diego Erdody | 2021-11-03 | 2023-07-07 | KAFKA-13431 | 3.6 | |
| 791 | Add Record Metadata to StateStoreContext TLDR: Adds `recordMetadata()` to the `StateStoreContext` public interface so state stores can access the source record's offset and partition metadata during processing. `AbstractProcessorContext` already implements this method internally, but it is not exposed through the `StateStoreContext` interface, preventing stores from implementing Read-Your-Writes consistency or offset tracking. |
Streams | Accepted | Patrick Stuedi | 2021-11-01 | 2021-11-16 | KAFKA-13426 | 3.2 | |
| 788 | Allow configuring num.network.threads per listener TLDR: Allows `num.network.threads` to be configured per listener via the `listener.name.<NAME>.num.network.threads` notation so each listener's thread pool can be sized independently. A single global `num.network.threads` value is wasteful when listeners carry very different traffic volumes (e.g. inter-broker vs. external client listeners). |
Broker | Accepted | Mickael Maison | 2021-10-26 | 2021-11-10 | KAFKA-7589 | 3.2 | |
| 786 | Emit Metric Client Quota Values TLDR: Adds JMX metrics that expose the configured quota limits (bytes/sec for produce/fetch and request quota) tagged at the client-id/user granularity alongside the existing throttle-time metrics. Without the quota ceiling value, operators cannot perform capacity planning or set meaningful alerts because they have no way to know how close clients are to being throttled. |
Metrics Admin | Discussion | Mason Joseph Legere | 2021-10-17 | 2021-11-25 | KAFKA-13395 | ||
| 785 | Automatic storage formatting TLDR: KIP-785 adds automatic storage formatting to KRaft brokers on first startup when storage has not already been formatted, eliminating the need to run `kafka-storage.sh format` as a separate step. The manual formatting step adds operational friction to KRaft deployments, especially in containerized and automated provisioning environments. |
KRaft | Discussion | Igor Soarez | 2021-10-18 | 2021-11-29 | KAFKA-13382 | ||
| 784 | Add top-level error code field to DescribeLogDirsResponse TLDR: Adds a top-level `errorCode` field to the `DescribeLogDirsResponse` so authorization failures (e.g., missing `Describe` on the Cluster resource) are surfaced explicitly rather than returning an empty response. Currently, an empty response is ambiguous—clients cannot distinguish a legitimate empty result from an authorization failure or an unknown server error. |
Protocol Admin | Accepted | Mickael Maison | 2021-10-18 | 2022-01-25 | KAFKA-13527 | 3.2 | DescribeLogDirs |
| 783 | Add TaskId field to StreamsException TLDR: Ensures all exceptions propagated to the `StreamsUncaughtExceptionHandler` are consistently wrapped as `StreamsException`, and adds a `TaskId` field (with accessor) so handlers can identify which task failed. Previously only some exceptions were wrapped and none identified the failing task, making it impossible to write reliable handler logic or reason about the type of failure. |
Streams | Accepted | A. Sophie Blee-Goldman | 2021-10-18 | 2021-10-26 | KAFKA-13381 | 3.1 | |
| 782 | Expandable batch size in producer TLDR: Introduces a `batch.max.bytes` producer config that controls the maximum size at which a batch is drained from the accumulator, decoupling it from `batch.size` which governs memory allocation. Currently `batch.size` serves both roles, forcing operators to choose between memory waste (large allocation) and low throughput (small batch), with no way to tune the two concerns independently. |
Producer | Discussion | Luke Chen | 2021-10-16 | 2021-11-05 | |||
| 781 | Improve MirrorMaker2's client configuration TLDR: Fixes MirrorMaker 2 client configuration propagation so that replication-level configs such as `A->B.producer.batch.size` are correctly applied to the respective producer, consumer, and admin clients. The existing MM2 configuration parsing silently ignored any client-specific overrides that were not common connection properties. |
MirrorMaker | Discussion | Dongjin Lee | 2021-10-11 | 2021-10-22 | KAFKA-13365 | ||
| 780 | Support fine-grained compression options TLDR: Adds per-codec fine-grained compression configuration options (e.g., compression level, strategy, checksum mode) to producer, broker, and topic configs, going beyond the single compression.level parameter. Kafka allowed only default codec parameters with limited level tuning, preventing operators from trading off compression ratio vs. speed at the topic level for use cases like real-time ingest vs. long-term retention. |
Producer Broker | Discussion | Dongjin Lee | 2021-10-10 | 2025-08-18 | KAFKA-13361 | ||
| 779 | Allow Source Tasks to Handle Producer Exceptions TLDR: Extends Connect's `errors.tolerance` mechanism to cover producer write failures from source tasks, so that with `errors.tolerance=all` a record that cannot be written (e.g., record-too-large) is skipped and the task keeps running instead of being killed. Previously, any producer write failure from a source task immediately killed the connector with no opportunity for the connector to log, route to a DLQ, or skip the bad record. |
Connect | Accepted | Knowles Atchison Jr | 2021-10-05 | 2021-11-29 | KAFKA-13348 | 3.2 | |
| 778 | KRaft to KRaft Upgrades TLDR: KIP-778 defines the rolling upgrade and downgrade protocol for Kafka clusters running in KRaft mode, gating new RPC versions and metadata record formats behind a cluster-wide metadata version (`metadata.version`) that is only bumped once all nodes are upgraded. Without this mechanism, mixed-version KRaft clusters could produce or process record formats incompatible with older nodes, making safe rolling upgrades impossible without a disruptive "double roll". |
KRaft Protocol | Accepted | David Arthur | 2021-09-24 | 2023-04-24 | KAFKA-13410 | 3.3 | UpdateFeatures ApiVersions |
| 777 | Improved testability for Admin client TLDR: Establishes a consistent convention for `AdminClient` `*Result` class constructors—package-private by default, with static factory methods for test mocking—to make mock-based testing easier without exposing internal implementation details. Previous KIPs proposed either making all constructors public (risking ABI breakage) or keeping them package-private (blocking testability), leaving users unable to write reliable mock Admin tests. |
Admin Testing | Discussion | Tom Bentley | 2021-09-24 | 2021-09-24 | KAFKA-13285 | ||
| 775 | Custom partitioners in foreign key joins TLDR: Enables Kafka Streams foreign-key (FK) table joins to work correctly when the primary or foreign-key tables use custom partitioners by making the subscription and response topics copartition with the relevant tables. Previously FK joins silently produced missing results if either table was partitioned with a non-default partitioner, because lookups could be routed to Streams instances that did not hold the required state. |
Streams | Accepted | Victoria Xia | 2021-09-15 | 2021-10-01 | KAFKA-13261 | 3.1 | |
| 774 | Deprecate public access to Admin client's *Result constructors TLDR: Makes all `*Result` class constructors in `AdminClient` package-private, resolving the inconsistency where most had package-private constructors but some were accidentally made public. Public constructors on result classes caused a breaking API change during Kafka 3.0 development when signatures were changed without noticing a constructor was public. |
Admin | Discussion | Tom Bentley | 2021-09-09 | 2021-09-15 | KAFKA-13285 | ||
| 773 | Differentiate consistently metric latency measured in millis and nanos TLDR: Renames the producer metrics `bufferpool-wait-time-total`, `io-waittime-total`, and `iotime-total` to include `-ns` or `-ms` suffixes (e.g., `bufferpool-wait-time-ns-total`) to make the unit explicit and consistent with the rest of the metric family. These three metrics deviate from the established naming convention where nanosecond metrics include `-ns` in their name, causing confusion about units. |
Metrics Producer | Accepted | Josep Prat | 2021-09-03 | 2021-09-07 | KAFKA-13243 | 3.1 | |
| 769 | Connect APIs to list all connector plugins and retrieve their configuration definitions TLDR: KIP-769 adds a `connectorsOnly=false` query parameter to `GET /connector-plugins` so it lists all installed plugin types (connectors, SMTs, converters, header converters, predicates), and adds a new `GET /connector-plugins/{plugin}/config` endpoint returning the plugin's configuration definition. Previously, only connector plugins were discoverable via the REST API; other plugin types had to be found by inspecting the Connect worker's classpath or documentation. |
Connect | Accepted | Mickael Maison | 2021-08-19 | 2022-04-11 | KAFKA-13510 | 3.2 | |
| 767 | Connect Latency Metrics TLDR: KIP-767 introduces per-record latency metrics (average and max) to Kafka Connect that measure the delay between a record's source timestamp and its successful delivery to the sink. Connect currently provides no common latency instrumentation, forcing operators to implement ad-hoc monitoring outside the framework. |
Connect Metrics | Discussion | Jordan Bull | 2021-08-09 | 2021-09-01 | |||
| 764 | Configurable backlog size for creating Acceptor TLDR: Adds a `socket.listen.backlog.size` broker configuration so the TCP listen backlog of the `Acceptor` socket can be raised above the OS default, preventing a burst of new TCP connections (e.g. after a preferred-leader election) from filling the SYN backlog and dropping subsequent connection attempts. Under high partition counts a rolling restart or leader election causes hundreds of clients to reconnect simultaneously, overwhelming the default kernel TCP backlog. |
Broker | Accepted | Haruki Okada | 2021-07-22 | 2021-10-20 | KAFKA-9648 | 3.2 | |
| 763 | Range Queries with Open Endpoints TLDR: Adds unbounded (null-terminated) range and reverse-range queries to Kafka Streams `ReadOnlyKeyValueStore` so callers can scan from a lower bound to the end of the store, or from an upper bound to the beginning. The existing `range(K from, K to)` required both bounds, forcing workarounds for open-ended scans. |
Streams | Accepted | Patrick Stuedi | 2021-07-18 | 2021-07-30 | KAFKA-4064 | 3.1 | |
| 762 | Delete Committed Connect Records TLDR: Extends Kafka Connect to automatically delete records from source topics once all SinkConnectors that subscribe to those topics have processed them, enabling low-latency storage-efficient pipelines. Without this, Connect pipelines that use Kafka purely as a transport medium (not for long-term retention) must either over-provision retention or risk SinkConnectors falling behind a short retention window. |
Connect | Discussion | Ryanne Dolan | 2021-07-14 | 2021-07-16 | |||
| 761 | Add Total Blocked Time Metric to Streams TLDR: Introduces a new `blocked-time-total` metric in Kafka Streams that accumulates the total wall-clock time a stream thread spends blocked waiting on Kafka (poll, commit, flush) rather than processing records. Existing metrics like `poll-ratio` and `processing-ratio` are instantaneous ratios from a single poll loop iteration and are too coarse-grained to identify systemic blocking bottlenecks. |
Streams Metrics | Accepted | Rohan Desai | 2021-07-07 | 2021-09-24 | KAFKA-13229 | 3.1 | |
| 760 | Minimum value for segment.ms and segment.bytes TLDR: Introduces broker-level `min.topic.segment.ms` and `min.topic.segment.bytes` configurations that enforce a minimum floor on the corresponding topic-level settings. Small values for `segment.ms` or `segment.bytes` cause an explosion of log segment files per partition, which can exhaust file descriptors or memory and crash brokers. |
Broker | Discussion | Badai Aqrandista | 2021-07-06 | 2021-09-10 | KAFKA-7760 | ||
| 759 | Unneeded repartition canceling TLDR: Adds a `markAsPartitioned()` operator to the Kafka Streams DSL so users can declare that a key-changing operation (`selectKey`/`map`) preserves the existing partitioning, suppressing the repartition topic that would otherwise be inserted before a `groupByKey`. Every key-changing operation currently forces a repartition before any stateful aggregation, even when the stream is already correctly partitioned by the new key, wasting network bandwidth and adding end-to-end latency. |
Streams | Accepted | Ivan Ponomarev | 2021-06-24 | 2023-08-03 | KAFKA-4835 | ||
| 755 | Add new AUTO_CREATE ACL for auto topic creation TLDR: Introduces a new `AUTO_CREATE` ACL operation that grants permission specifically for auto-topic creation (triggered by `MetadataRequest`) distinct from the `CREATE` ACL used for explicit `CreateTopicsRequest`. Currently a single `CREATE` ACL covers both auto-creation and explicit creation, preventing administrators from allowing auto-creation (which uses broker defaults) without also granting full topic creation rights. |
Security | Discussion | Christopher L. Shannon | 2021-06-08 | 2021-06-23 | KAFKA-12916 | Metadata CreateTopics | |
| 754 | Make Scala case classes final TLDR: Makes Scala case classes in the Kafka codebase `final` to align with Scala best practices for algebraic data types (ADTs) that represent immutable data. Non-final case classes break pattern-match exhaustiveness checking and allow subclassing that can violate the value-semantics contract of case classes. |
Broker | Discussion | Matthew de Detrich | 2021-06-08 | 2021-06-08 | KAFKA-12913 | ||
| 749 | Add --files and --file-separator options to the ConsoleProducer TLDR: Adds `--files` and `--file-separator` options to `kafka-console-producer.sh` to allow bulk-producing records from one or more files in a single invocation. Currently the console producer can only read from stdin or a single file via shell redirection, requiring complex shell scripting to ingest multiple log files. |
Producer Admin | Discussion | Wenbing Shen | 2021-06-04 | 2021-06-16 | KAFKA-12891 | ||
| 748 | Add Broker Count Metrics TLDR: Introduces two new KRaft controller metrics: `ActiveBrokerCount` (registered, unfenced brokers) and `FencedBrokerCount` (brokers that registered but are fenced). Without these metrics, a broker that is alive but failing to emit metrics or logs is indistinguishable from a broker that never registered, making cluster health assessment unreliable. |
Metrics KRaft | Accepted | Ryan Dielhenn | 2021-05-26 | 2021-08-26 | KAFKA-12882 | 3.1 | |
| 746 | Revise KRaft Metadata Records TLDR: Revises the KRaft metadata record schemas introduced in KIP-631 by adding flexible versions, missing fields (e.g. for KIP-455 reassignment), and proper `entityType` annotations. Several record types were incorrectly defined without flexible versions, and some lacked fields required by subsequent KIPs. |
KRaft Protocol | Accepted | Colin McCabe | 2021-06-02 | 2022-06-01 | KAFKA-12931 | 3.0 | |
| 745 | Connect API to restart connector and tasks TLDR: Extends the Connect REST API's `POST /connectors/{name}/restart` endpoint with `includeTasks` and `onlyFailed` query parameters so a connector and its (failed) tasks can be restarted in a single call. Previously the restart endpoint restarted only the `Connector` object, forcing operators to discover and restart each failed task individually with separate REST calls. |
Connect | Accepted | Randall Hauch | 2021-06-01 | 2021-07-27 | KAFKA-4793 | 3.0 | |
| 744 | Migrate TaskMetadata and ThreadMetadata to an interface with internal implementation TLDR: Converts `TaskMetadata` and `ThreadMetadata` from concrete classes to interfaces with internal implementations, separating the public API contract from the implementation details. Users should never instantiate these metadata objects directly—making them concrete classes with public constructors prevents the Streams team from safely evolving the internal representation. |
Streams | Accepted | Josep Prat | 2021-05-27 | 2021-07-13 | KAFKA-12849 | 3.0 | |
| 743 | Remove config value 0.10.0-2.4 of Streams built-in metrics version config TLDR: KIP-743 removes the `0.10.0-2.4` config value for `built.in.metrics.version` in Kafka Streams (Kafka 3.0), leaving `latest` as the only valid value. The legacy metric structure that predates KIP-444 was effectively deprecated when `latest` became the default in 2.5, and maintaining dual metric code paths is error-prone; new metrics added after KIP-444 exist only in the `latest` structure, so users on `0.10.0-2.4` already receive a mix of old and new metric formats. |
Streams | Discussion | Bruno Cadonna | 2021-05-20 | 2021-05-21 | KAFKA-12519 | 3.0 | |
| 742 | Change default serde to be null |
Streams | Discussion | | | | | ||
| 740 | Clean up public API in TaskId TLDR: Changes `TaskMetadata.taskId()` to return a `TaskId` object instead of its `String` encoding so callers receive a strongly-typed value with direct access to sub-topology and partition fields. The `String` representation required callers to parse the encoding themselves, and made it impossible to add new `TaskId` fields without breaking existing parsing code. |
Streams | Accepted | A. Sophie Blee-Goldman | 2021-05-13 | 2021-08-25 | KAFKA-12779 | 3.0 | |
| 738 | Removal of Connect's internal converter properties TLDR: Makes the deprecated `internal.key.converter` and `internal.value.converter` Connect worker properties fatal on startup so workers immediately fail rather than silently producing corrupted internal topic data. Workers configured with a non-default internal converter cannot read internal topic data written by workers with a different converter, leading to hard-to-diagnose data corruption. |
Connect | Accepted | Chris Egerton | 2021-05-05 | 2021-06-09 | KAFKA-12717 | 3.0 | |
| 737 | Add canTrackSource to ReplicationPolicy TLDR: KIP-737 adds a `canTrackSource()` method to the `ReplicationPolicy` interface in MirrorMaker 2, allowing implementations to declare whether they can reverse-map a topic name back to its source cluster. Without this capability, the `IdentityReplicationPolicy` (needed for MirrorMaker 1 migration) cannot coexist cleanly with MM2's offset syncing logic, which assumes topic names encode source cluster identity. |
MirrorMaker | Discussion | Matthew de Detrich | 2021-05-10 | 2021-05-10 | KAFKA-9726 | 3.0 | |
| 736 | Report the true end to end fetch latency TLDR: Proposes fetch latency metrics that exclude the time a fetch request deliberately waits in purgatory for `fetch.min.bytes`/`fetch.max.wait.ms` to be satisfied, so the reported latency reflects actual processing time. The existing request-time metrics fold the intentional purgatory wait into total fetch time, making the metric dominated by configured wait behavior rather than true end-to-end latency. |
Metrics Broker | Discussion | Ming Liu | 2021-04-27 | 2021-08-12 | KAFKA-12713 | Fetch | |
| 735 | Increase default consumer session timeout TLDR: Increases the consumer's default `session.timeout.ms` from 10 seconds to 45 seconds so that consumer groups tolerate transient network blips and GC pauses without rebalancing. In multi-tenant cloud environments, brief unavailability commonly exceeds the old 10-second default, triggering spurious rebalances that are far more disruptive than the slightly longer failure-detection time. |
Consumer | Accepted | Jason Gustafson | 2021-04-21 | 2021-07-30 | KAFKA-12874 | 3.0 | |
| 734 | Improve AdminClient.listOffsets to return timestamp and offset for the record with the largest timestamp TLDR: Adds OffsetSpec.MaxTimestampSpec to AdminClient.listOffsets(), returning the offset and timestamp of the record carrying the highest message timestamp in a partition. Provides a lightweight liveness indicator for partitions without requiring a consumer or metric scrape, noting that MM2 timestamp preservation means this may not equal the latest append time. |
Admin | Accepted | Thomas Scott | 2021-04-15 | 2024-04-10 | KAFKA-12541 | 3.0 | ListOffsets v7 |
| 733 | change Kafka Streams default replication factor config TLDR: Changes the default value of `replication.factor` in Kafka Streams from `1` to `-1` (use broker default) to align with the broker-side default replication factor. The hardcoded default of `1` is unsafe for production—users must remember to override it—while `-1` (available since Kafka 2.4, KIP-464) delegates to the broker's `default.replication.factor`. |
Streams | Accepted | Matthias J. Sax | 2021-04-14 | 2021-05-05 | KAFKA-8531 | 3.0 | |
| 732 | Deprecate eos-alpha and replace eos-beta with eos-v2 TLDR: Deprecates `processing.guarantee=eos-alpha` in Kafka Streams 3.0 (originally called `exactly_once`) and renames `eos-beta` (introduced in KIP-447) to `eos-v2`. The `eos-alpha` mode uses per-task producers with higher overhead; `eos-v2` (thread-level producer) is more efficient and scalable, and keeping both modes adds significant code complexity. |
Streams Transactions | Accepted | A. Sophie Blee-Goldman | 2021-04-14 | 2021-04-28 | KAFKA-12574 | 3.0 | |
| 731 | Record Rate Limiting for Kafka Connect TLDR: Adds producer-level throughput quotas to Kafka Connect tasks, configurable per connector, to prevent a single connector from exhausting shared network bandwidth or overwhelming downstream systems. Connect has no QoS mechanism, so a high-throughput connector (e.g. MirrorMaker) can starve other connectors or saturate external endpoints that have no rate-limiting of their own. |
Connect | Discussion | Ryanne Dolan | 2021-04-09 | 2021-05-14 | KAFKA-12645 | ||
| 730 | Producer ID generation in KRaft mode TLDR: Replaces the ZooKeeper-based block allocation scheme for producer IDs with a KRaft controller-managed `AllocateProducerIds` RPC, where brokers request blocks of IDs from the active controller. Without this, EOS and idempotent producers cannot function in a fully ZooKeeper-free KRaft cluster. |
KRaft Transactions | Accepted | David Arthur | 2021-04-05 | 2023-01-19 | KAFKA-12620 | 3.0 | InitProducerId AllocateProducerIds |
| 729 | Custom validation of records on the broker prior to log append TLDR: Adds a broker-side plugin interface `RecordInterceptor` (similar to `AuthorizerInterceptor`) that allows custom validation logic to be applied to records before they are appended to the log. There is no mechanism for operators to enforce application-level data quality constraints (e.g., schema compliance) on the broker, forcing consumers to handle malformed records via dead-letter queues or crashes. |
Broker | Discussion | Soumyajit Sahu | 2021-04-01 | 2021-04-14 | |||
| 727 | Add --under-preferred-replica-partitions option to describe topics command TLDR: KIP-727 adds an `--under-preferred-replica-partitions` option to `kafka-topics.sh --describe` to list partitions whose current leader is not the preferred (first) replica. Without this filter, identifying unbalanced partition leadership that causes broker egress skew requires parsing the full topic description output or inspecting controller logs. |
Admin Broker | Discussion | Wenbing Shen | 2021-03-26 | 2021-04-10 | KAFKA-12556 | ||
| 724 | Drop support for message formats v0 and v1 TLDR: Removes support for legacy Kafka message formats v0 and v1 (pre-0.11) in Kafka 4.0, after deprecating them in 3.0. These formats predate idempotent producers, transactions, record headers, and partition leader epochs, creating correctness risks and maintenance burden for code that must handle all three format variants. |
Protocol Broker | Accepted | Ismael Juma | 2021-03-21 | 2024-12-24 | KAFKA-12944 | 3.0 | |
| 723 | Add socket.tcp.no.delay property to Kafka Config TLDR: Makes `TCP_NODELAY` configurable on broker sockets (defaulting to off for inter-broker communication) to reduce packet-per-second rates on clusters with large numbers of partitions. With `TCP_NODELAY` always enabled, a 4-broker cluster with 30,000 partitions generated ~140,000 TCP packets/second at load because each inter-broker heartbeat was sent as a separate small packet, triggering traffic-shaper limits in cloud environments. |
Broker | Discussion | Andrei Iatsuk | 2021-03-17 | 2021-03-21 | KAFKA-12481 | ||
| 722 | Enable connector client overrides by default TLDR: Changes the default value of `connector.client.config.override.policy` from `None` to `All`, enabling connector-level client configuration overrides by default in Kafka 3.0. The original default of `None` was chosen conservatively for backward compatibility, but left the KIP-458 per-connector client override feature effectively disabled out of the box. |
Connect | Accepted | Randall Hauch | 2021-03-16 | 2021-06-23 | KAFKA-12483 | 3.0 | |
| 721 | Enable connector log contexts in Connect Log4j configuration TLDR: Enables connector log context enrichment (connector name, task ID) by default in the Connect `log4j.properties` configuration shipped with Kafka distributions. KIP-449 added the MDC context injection capability but intentionally left it disabled by default; Kafka 3.0's major version bump is the appropriate milestone to enable it. |
Connect | Accepted | Randall Hauch | 2021-03-16 | 2021-06-23 | KAFKA-12484 | 3.0 | |
| 720 | Deprecate MirrorMaker v1 TLDR: KIP-720 officially deprecates MirrorMaker v1 with the intent to remove it in a future major release. MirrorMaker 2 (built on Kafka Connect) is a superset of v1 in functionality and reliability, and maintaining both implementations divides engineering effort. |
MirrorMaker | Accepted | Ryanne Dolan | 2021-03-06 | 2021-08-02 | KAFKA-12436 | 3.0 | |
| 719 | Deprecate Log4J Appender TLDR: KIP-719 deprecates the `kafka-log4j-appender` module, which ships the legacy Log4j 1.x `KafkaAppender`, from the Apache Kafka distribution. The appender's dependency on Log4j 1.x (EOL, CVE-affected) conflicts with the Log4j2 upgrade done elsewhere in the codebase, causing both Log4j versions to coexist on the classpath with unpredictable behavior. |
Broker | Accepted | Dongjin Lee | 2021-03-02 | 2022-01-24 | KAFKA-12399 | 3.8 | |
| 718 | Make KTable Join on Foreign key unopinionated TLDR: Fixes `KTable` foreign key join to respect the `Materialized` store type passed by the user for the internal subscription store, instead of always using RocksDB. The KIP-213 implementation hardcodes RocksDB for the internal subscription store regardless of the `Materialized` configuration, causing test failures on Windows (RocksDB file locking) and violating the contract of in-memory materialization. |
Streams | Discussion | Marco Lotz | 2021-03-01 | 2021-04-18 | KAFKA-10383 | ||
| 716 | Allow configuring the location of the offset-syncs topic with MirrorMaker2 TLDR: Adds an `offset-syncs.topic.location` config (`source` or `target`) to MirrorMaker2 to allow the `mm2-offset-syncs` topic to be stored on the target cluster instead of the source. By default MirrorMaker2 writes `mm2-offset-syncs` to the source cluster, requiring write permissions on the source—problematic in secured environments where MirrorMaker2 should be read-only on the source. |
MirrorMaker | Accepted | Mickael Maison | 2021-02-26 | 2021-06-14 | KAFKA-12379 | 3.0 | |
| 714 | Client metrics and observability TLDR: Introduces a broker-side `GetTelemetrySubscriptions`/`PushTelemetry` protocol API pair so brokers can selectively request metrics from clients on a configurable push interval. There is currently no way for a cluster operator or hosted service provider to centrally observe Kafka client metrics without requiring changes to client application code or a client restart, making troubleshooting a cross-team coordination problem. |
Metrics Protocol Client | Accepted | Magnus Edenhill | 2021-02-07 | 2024-04-23 | KAFKA-15601 | 3.7 | GetTelemetrySubscriptions PushTelemetry Produce OffsetCommit |
| 713 | Validation of Enums in configuration TLDR: KIP-713 adds built-in enum validation support to Kafka's `ConfigDef` framework so that enum-typed configuration values produce error messages listing all valid options when an invalid value is provided. Current validation failures for enum configs produce opaque exceptions with no indication of acceptable values, degrading the operator experience. |
Admin | Discussion | Jeremy Custenborder | 2021-02-05 | 2021-02-05 | KAFKA-12301 | ||
| 712 | Shallow Mirroring TLDR: Proposes a zero-copy MirrorMaker path that transfers raw record batches from source to destination broker without deserializing/re-serializing individual records ('shallow mirroring'). The current MirrorMaker pipeline copies bytes multiple times: from the network buffer into `ConsumerRecord`, then from `ConsumerRecord` into `ProducerRecord`, each requiring full deserialization and re-serialization of record batches. |
MirrorMaker | Discussion | Henry Cai | 2021-02-04 | 2021-02-05 | KAFKA-12295 | Produce Fetch | |
| 711 | Deprecate org.apache.kafka.streams.errors.BrokerNotFoundException TLDR: Deprecates `org.apache.kafka.streams.errors.BrokerNotFoundException` since the Streams runtime has not thrown it for years following its removal from the code path. The unused exception class pollutes the public API and misleads users who might write catch clauses for an exception that will never be thrown. |
Streams | Discussion | Chia-Ping Tsai | 2021-02-03 | 2021-02-03 | KAFKA-12281 | 4.2 | |
| 710 | Full support for distributed mode in dedicated MirrorMaker 2.0 clusters TLDR: Enables full distributed mode for dedicated MirrorMaker2 clusters by enabling the Connect REST server for follower-to-leader forwarding and switching to lazy evaluation of config provider references in connector configs. Without the REST server, followers could detect dynamic configuration changes (new topics matching TopicFilter) but could not forward the reconfiguration to the leader, causing MM2 to stall. |
MirrorMaker Connect | Accepted | Daniel Urban | 2021-01-26 | 2023-02-08 | KAFKA-10586 | 3.5 | |
| 708 | Rack aware StandbyTask assignment for Kafka Streams TLDR: Adds rack-aware placement of Kafka Streams standby tasks so they are assigned to instances in a different rack/AZ than the corresponding active task, driven by user-defined client tags (`client.tag.*`) and the `rack.aware.assignment.tags` config. Without rack awareness, a single AZ failure could take down both an active task and its standby simultaneously, eliminating fast-failover recovery. |
Streams Consumer | Accepted | Levani Kokhreidze | 2021-01-25 | 2023-05-12 | KAFKA-6718 | 3.2 | |
| 707 | The future of KafkaFuture TLDR: Adds `KafkaFuture.toCompletionStage()` to allow conversion of `KafkaFuture` instances to Java's standard `CompletionStage`/`CompletableFuture` for interoperability with JDK async APIs and third-party libraries. `KafkaFuture` was designed when Java lacked `CompletionStage`; now that Java 8+ `CompletionStage` is universal, users must awkwardly wrap or unwrap `KafkaFuture` when composing async admin client operations. |
Admin Client | Accepted | Tom Bentley | 2021-01-22 | 2021-04-30 | KAFKA-6987 | 3.0 | |
| 705 | Selectively Disable Topology Optimizations TLDR: Splits the `topology.optimization` configuration into fine-grained options (`REUSE_KTABLE_SOURCE_TOPICS`, `MERGE_REPARTITION_TOPICS`, etc.) so individual optimizations can be selectively enabled or disabled. The existing `none` / `all` binary choice caused problems because the source-changelog reuse optimization has unsafe side effects (it assumes symmetric serdes), yet users who wanted other optimizations had no way to exclude it. |
Streams | Discussion | Almog Gavra | 2021-01-13 | 2021-01-13 | KAFKA-12192 | ||
| 704 | Send a hint to the partition leader to recover the partition TLDR: Adds a recovery hint, delivered via `LeaderAndIsr` and acknowledged through `AlterPartition`, that tells a partition leader elected by unclean leader election to recover the partition's state before resuming normal operation. Unclean leader election can cause data loss and state inconsistency with transaction/group coordinators because the new leader may discard records committed by previous leaders; this hint allows recovery without user intervention. |
Broker | Accepted | Raman Verma | 2021-01-11 | 2022-03-23 | KAFKA-13587 | 3.2 | AlterPartition v1 LeaderAndIsr |
| 703 | Add a metric for reporting idle connections closed TLDR: Introduces a new JMX metric `connection-close-total` (tagged by `close-reason=idle`) on the `Selector` to count the number of connections closed due to exceeding `connection.max.idle.ms`. Currently, idle connection closures are only visible at TRACE log level, making it impossible to monitor this behavior from standard metrics/alerting infrastructure. |
Metrics Broker | Discussion | Pere Urbon | 2021-01-08 | 2021-01-08 | KAFKA-12166 | ||
| 702 | The control plane needs to force the validation of requests from the controller TLDR: KIP-702 proposes a `control.plane.force.controller.requests.enable` broker config that, when enabled, causes the control-plane listener to reject all non-controller requests (`LeaderAndIsr`, `UpdateMetadata`, `StopReplica`, `ControlledShutdown` are the only allowed types). Without enforcement, a misconfigured client pointing its `bootstrap.servers` at the control-plane listener can mix data traffic with controller traffic, degrading controller performance and violating the isolation designed by KIP-291. |
Protocol Broker | Discussion | Wenbing Shen | 2021-01-05 | 2021-01-06 | KAFKA-10891 | Metadata LeaderAndIsr StopReplica UpdateMetadata | |
| 700 | Add Describe Cluster API TLDR: Introduces a dedicated `DescribeCluster` API (ApiKey 60) in the Kafka protocol so `AdminClient.describeCluster()` no longer piggybacks on the `Metadata` API. The `Metadata` API was designed for producers and consumers to refresh topic partition metadata; adding Admin-only fields like authorized cluster operations to it misuses the API's purpose and adds overhead to high-frequency metadata refreshes. |
Protocol Admin | Accepted | David Jacot | 2020-12-14 | 2021-01-22 | KAFKA-10851 | 2.8 | Metadata v11 DescribeCluster |
| 699 | Update FindCoordinator to resolve multiple Coordinators at a time TLDR: Extends the `FindCoordinator` RPC to accept a batch of group/transaction IDs and return all their coordinator addresses in a single request. Currently `FindCoordinator` only resolves one coordinator per request, causing O(N) sequential round-trips for Admin API operations like `deleteConsumerGroups` that operate on many groups simultaneously. |
Protocol Admin | Accepted | Mickael Maison | 2020-12-11 | 2021-06-24 | KAFKA-12663 | 3.0 | FindCoordinator v4 |
| 698 | Add Explicit User Initialization of Broker-side State to Kafka Streams TLDR: KIP-698 adds an `addBrokerSideInitialization()` method (a broker-side counterpart to `KafkaStreams.cleanUp()`) and an `internal.topics.creation.mode` config (ENABLED/DISABLED) so users can explicitly pre-create internal topics (repartition, changelog) before the first application run. Previously, internal topics were silently recreated as empty topics when deleted between rebalances, causing undetected data loss without any user notification. |
Streams | Accepted | Bruno Cadonna | 2020-11-30 | 2021-01-29 | KAFKA-10357 | ||
| 697 | Stricter parsing of addresses in configs TLDR: Standardizes the parsing and validation of `bootstrap.servers` (and similar list configs like `quorum.voters`) to consistently accept and normalize whitespace around delimiters. The config was documented as a comma-separated list, but the actual parsing behavior varied across client types, and trailing/leading spaces caused silent misconfigurations. |
Client | Discussion | Tom Bentley | 2020-12-09 | 2020-12-09 | KAFKA-10713 | ||
| 696 | Update Streams FSM to clarify ERROR state meaning TLDR: Adds a `PENDING_ERROR` transient state to the `KafkaStreams` FSM with transitions `RUNNING/REBALANCING → PENDING_ERROR → ERROR`, and removes the `ERROR → PENDING_SHUTDOWN` transition to make `ERROR` a true terminal state. KIP-663 removed the automatic transition to `ERROR` when all threads die, making the original `ERROR` state definition (no threads running) ambiguous and logically inconsistent. |
Streams | Accepted | Walker Carlson | 2020-12-02 | 2021-01-26 | KAFKA-10555 | 2.8 | |
| 695 | Further Improve Kafka Streams Timestamp Synchronization TLDR: Improves Kafka Streams timestamp synchronization by using the last-seen record timestamp as a lower bound for a partition's unknown future timestamp when deciding which task input to process next. The existing `max.task.idle.ms` idle wait is symmetric—it waits the same duration regardless of how far ahead other partitions are—leading to excessive waiting or out-of-order processing when partitions have very different timestamps. |
Streams | Accepted | John Roesler | 2020-11-18 | 2021-02-22 | KAFKA-10091 | 3.0 | |
| 694 | Support Reducing Partitions for Topics TLDR: Proposes the ability to reduce the number of partitions for a topic via `AdminClient` and the broker, with data redistribution handled by merging trailing partitions into earlier ones during cleanup. Topics can only grow in partition count today, leading to ever-increasing partition totals in clusters with many short-lived or bursty topics that cause disk I/O degradation and increased failure blast radius. |
Admin Broker | Discussion | George Shu | 2020-12-02 | 2021-03-09 | |||
| 693 | Client-side Circuit Breaker for Partition Write Errors TLDR: Introduces a client-side per-partition circuit breaker in the Kafka producer that stops routing new records to partitions with persistent write errors, redirecting traffic to healthy partitions. When disk failures cause a subset of partitions to have high write latency, the shared producer buffer fills up with blocked batches, eventually blocking writes to all partitions including healthy ones. |
Producer | Discussion | George Shu | 2020-12-04 | 2021-06-08 | KAFKA-12793 | ||
| 692 | Make AdminClient value object constructors public TLDR: Makes the constructors of `AdminClient` result value objects (e.g., `TopicDescription`, `MemberDescription`) public so that test code can instantiate them directly without reflection or mocking frameworks. Currently most of these constructors are package-private, forcing developers to use complex mocking setups to construct realistic `AdminClient` responses in unit tests. |
Admin Testing | Discussion | Noa Resare | 2020-12-03 | 2020-12-03 | KAFKA-10490 | ||
| 691 | Enhance Transactional Producer Exception Handling TLDR: Adds a `fencedOnCommit` classification to transactional producer exceptions to distinguish fatal epoch-fenced errors (where the producer must abort and re-initialize) from transient errors (where only an `abortTransaction()` is needed). Previously, all `ProducerFencedException` and `OutOfOrderSequenceException` types required full `initTransactions()` restarts, causing unnecessary downtime for recoverable faults. |
Transactions Producer | Accepted | Boyang Chen | 2020-11-24 | 2023-01-20 | KAFKA-10733 | ||
| 690 | Add additional configuration to control MirrorMaker 2 internal topics naming convention TLDR: KIP-690 adds `heartbeatsTopic()`, `checkpointsTopic(clusterAlias)`, and `offsetSyncsTopic(clusterAlias)` methods to the `ReplicationPolicy` interface so MirrorMaker 2 internal topic names can be overridden via `replication.policy.separator` or a custom `ReplicationPolicy` implementation. The previous hardcoded names (`heartbeats`, `<alias>.checkpoints.internal`, `mm2-offset-syncs.<alias>.internal`) could not be changed, blocking deployment in clusters that enforce topic naming conventions or prohibit auto-topic creation. |
MirrorMaker | Accepted | Omnia Ibrahim | 2020-12-01 | 2023-07-05 | KAFKA-10777 | 3.1 | |
| 689 | Extend `StreamJoined` to allow more store configs TLDR: Adds `withLoggingEnabled()` and `withLoggingDisabled()` methods to `StreamJoined` so users can control changelog topic creation and logging for the state stores used in stream-stream joins. KIP-479 introduced `StreamJoined` to extend `Materialized`-like options to stream-stream joins, but changelog logging control was deferred and not implemented. |
Streams | Accepted | Leah Thomas | 2020-11-30 | 2020-12-07 | KAFKA-9126 | 2.8 | |
| 688 | Support dynamic update of delete.topic.enable config TLDR: Makes `delete.topic.enable` dynamically reconfigurable via `AdminClient` so administrators can temporarily enable topic deletion without rolling the cluster. Changing this setting previously required editing `server.properties` on every broker and performing two full cluster rolls (one to enable deletion, one to re-disable it), creating significant operational overhead. |
Admin Broker | Discussion | Prateek Agarwal | 2020-11-25 | 2020-11-25 | KAFKA-10646 | ||
| 687 | Automatic Reloading of Security Store TLDR: Adds an explicit `AlterConfig` handler for TLS keystore/truststore reload so that the security store reload still works after all `AlterConfig` requests are forwarded to the active KRaft controller. The existing reload relied on the broker receiving the ZooKeeper notification directly; controller-side forwarding broke this mechanism. |
Security KRaft | Accepted | Boyang Chen | 2020-11-25 | 2021-09-08 | KAFKA-10345 | ||
| 686 | API to ensure Records policy on the broker TLDR: Adds a broker-side `RecordsPolicy` plugin interface that intercepts produce requests and can reject records that do not conform to a required schema or format. Client-side SerDe enforcement is insufficient because any client can write arbitrary bytes; enforcing data format at the broker is the only way to guarantee schema compliance across all producers. |
Broker Security | Discussion | Nikolay Izhikov | 2020-11-17 | 2020-11-19 | KAFKA-10732 | ||
| 685 | Loosen permission for listing reassignments TLDR: Lowers the required permission to call `ListPartitionReassignments` from `Cluster:Describe` to `Topic:Describe` so that end users can distinguish under-replicated partitions caused by ongoing reassignments from those caused by broker failures. Since KIP-455 introduced this API requiring `Cluster:Describe`, ordinary users lost the ability to differentiate reassignment-related URPs from failure-related URPs, regressing the observability improvement from KIP-352. |
Security Admin | Discussion | David Jacot | 2020-11-10 | 2020-11-10 | KAFKA-10216 | ||
| 684 | Support mutual TLS authentication on SASL_SSL listeners TLDR: Enables mutual TLS (mTLS) client authentication on `SASL_SSL` listeners by honoring the `ssl.client.auth` broker configuration for those listeners. Kafka currently ignores `ssl.client.auth` on `SASL_SSL` listeners, preventing operators from requiring TLS client certificates in addition to SASL credentials for defense-in-depth authentication. |
Security | Accepted | Rajini Sivaram | 2020-11-09 | 2021-03-03 | KAFKA-10700 | 2.8 | |
| 683 | Add recursive support to Connect Cast and ReplaceField transforms, and support for casting complex types to either a native or JSON string TLDR: KIP-683 extends the Kafka Connect `Cast` and `ReplaceField` SMTs to support recursive field traversal into nested `Struct`, `Map`, and `Array` fields, and adds an option to cast complex-type fields to their JSON string representation. Previously these transforms operated only on top-level flat fields, making it impossible to transform or project nested schema structures common in real-world message formats without writing custom SMTs. |
Connect | Discussion | Joshua Grisham | 2020-11-08 | 2020-11-08 | KAFKA-10640 | ||
| 682 | Connect TimestampConverter support for multiple fields and multiple input formats TLDR: Extends the `TimestampConverter` SMT to support converting multiple fields in a single transform configuration and accepting multiple input date format patterns via comma-separated lists. Currently, each `TimestampConverter` instance can only convert a single field, requiring N chained transform instances for N timestamp fields, degrading performance. |
Connect | Discussion | Joshua Grisham | 2020-11-06 | 2021-11-22 | KAFKA-10627 | ||
| 681 | Rename master key in delegation token feature TLDR: Renames the broker config `delegation.token.master.key` to `delegation.token.secret.key` and deprecates the old name while maintaining backward compatibility. The term "master key" uses racially charged terminology; the more neutral "secret key" more accurately describes the config's purpose. |
Security | Discussion | Tom Bentley | 2020-11-06 | 2020-11-06 | KAFKA-10201 | 3.0 | |
| 680 | TopologyTestDriver should not require a Properties argument TLDR: Adds no-arg and wall-clock-time-only constructors to `TopologyTestDriver` so tests that require no special configuration do not need to supply an empty `Properties` object. The single existing constructor required a `Properties` argument even for tests that need no configuration, adding boilerplate; the new constructors also generate a unique `application.id` by default to keep concurrently running tests isolated. |
Streams Testing | Accepted | Rohit Deshpande | 2020-11-03 | 2020-12-04 | KAFKA-10629 | 2.8 | |
| 679 | Producer will enable the strongest delivery guarantee by default TLDR: Changes the producer defaults to `enable.idempotence=true`, `acks=all`, `retries=MAX_INT`, and `max.in.flight.requests.per.connection=5` to enable the strongest available delivery guarantee (idempotent, in-order, acknowledged by all in-sync replicas) by default. Previously the defaults (`enable.idempotence=false`, `acks=1`) prioritized backward compatibility over correctness, leaving users exposed to duplicates and data loss without explicit configuration. |
Producer | Accepted | Cheng Tan | 2020-10-19 | 2022-02-14 | KAFKA-10619 | 2.8 | Produce |
| 676 | Respect logging hierarchy TLDR: KIP-676 fixes dynamic log level updates (the `BROKER_LOGGER` config resource) and `kafka-configs.sh` to honor Log4j's logger hierarchy when dynamically setting log levels, so that setting a parent logger's level also applies to all child loggers as Log4j intends. Previously, the broker stored and applied only the explicitly named logger, silently ignoring the hierarchy and forcing operators to enumerate every child logger individually. |
Admin | Accepted | Tom Bentley | 2020-10-07 | 2021-04-30 | KAFKA-10469 | 2.8 | |
| 674 | API to Aggregate Metrics in Kafka Streams TLDR: Adds a `KafkaStreamsMetricsAggregator` interface and a built-in aggregating `MetricsReporter` implementation that can roll up Kafka Streams task-level and thread-level metrics into application-level aggregates. Commercial monitoring services often impose limits on the number of metrics reported; a Kafka Streams application with many tasks/partitions can easily exceed these limits. |
Streams Metrics | Discussion | Bruno Cadonna | 2020-09-29 | 2020-10-13 | KAFKA-10484 | ||
| 673 | Emit JSONs with new auto-generated schema TLDR: Extends Kafka's request/response debug logging (`RequestChannel`) to emit structured JSON using the auto-generated `*JsonConverter` classes (e.g., `FetchRequestDataJsonConverter`) instead of the existing pseudo-JSON format. The existing log format is not valid JSON and cannot be reliably parsed by tools like `jq`, Elasticsearch, or Druid, limiting its utility for request tracing and analysis. |
Protocol Broker | Accepted | Anastasia Vela | 2020-09-24 | 2020-12-16 | KAFKA-10525 | 2.8 | Produce Fetch |
| 671 | Introduce Kafka Streams Specific Uncaught Exception Handler TLDR: Introduces a Kafka Streams-specific `StreamsUncaughtExceptionHandler` that receives uncaught exceptions from stream threads and returns one of `REPLACE_THREAD`, `SHUTDOWN_CLIENT`, or `SHUTDOWN_APPLICATION` to control recovery behavior. The existing Java `UncaughtExceptionHandler` fires after the thread is already dead, preventing in-thread recovery actions like graceful shutdown or thread replacement. |
Streams | Accepted | Walker Carlson | 2020-09-11 | 2020-12-04 | KAFKA-9331 | 2.8 | |
| 669 | Preserve Source Partition in Kafka Streams from context TLDR: Adds a `preserve.source.partition` configuration to Kafka Streams sink nodes that instructs the built-in `StreamPartitioner` to route output records to the same partition number as the source partition rather than computing a partition from the record key. In pipeline use cases where input and output topics have the same partition count, data locality and ordering guarantees depend on routing to the matching partition, but the key-based partitioner cannot guarantee this when the source partitioner logic is unknown. |
Streams | Discussion | satya k | 2020-09-03 | 2020-09-10 | KAFKA-10448 | ||
| 668 | Expose REST endpoint to list converter plugins TLDR: Adds a `GET /connector-plugins?pluginType=converter` REST endpoint to Kafka Connect that lists the available converter plugin classes. The existing `GET /connector-plugins` endpoint only returns connectors and SMTs; there is no programmatic way to enumerate available converters without inspecting worker logs. |
Connect | Discussion | Rupesh Kumar Patel | 2020-09-02 | 2020-09-02 | KAFKA-4279 | ||
| 666 | Add Instant-based methods to ReadOnlySessionStore TLDR: Adds `Instant`-based query methods to the `ReadOnlySessionStore` interactive query API, aligning it with the `Instant`-based API already introduced for `ReadOnlyWindowStore` in KIP-358. The session store API still used millisecond `long` parameters, creating an inconsistency that forced users to manage timestamp conversions manually. |
Streams | Accepted | Jorge Esteban Quilcate Otoya | 2020-08-28 | 2021-04-30 | KAFKA-10434 | 3.0 | |
| 665 | Kafka Connect Hash SMT TLDR: Adds a `HashField` SMT that hashes the value of specified fields using a configurable algorithm (SHA-256, SHA-512, MD5, etc.) and replaces the original field value with the hash. Sensitive PII fields (SSNs, email addresses, etc.) passed through Connect pipelines need to be obfuscated before landing in downstream systems, and no out-of-the-box SMT provided cryptographic hashing. |
Connect Security | Discussion | Brandon Brown | 2020-08-27 | 2020-10-27 | KAFKA-10299 | ||
| 663 | API to Start and Shut Down Stream Threads TLDR: Adds `KafkaStreams.addStreamThread()` and `KafkaStreams.removeStreamThread()` methods to allow dynamic scaling of stream thread count at runtime without restarting the Streams client. Currently, if a stream thread dies due to an uncaught exception, the only recovery option is to restart the entire `KafkaStreams` client, causing unnecessary downtime. |
Streams | Accepted | Bruno Cadonna | 2020-08-24 | 2021-01-29 | KAFKA-6943 | 2.8 | |
| 662 | Throw Exception when Source Topics of a Streams App are Deleted TLDR: Changes Kafka Streams behavior when source topics are deleted at runtime: instead of silently shutting down all stream threads, the application now throws a `MissingSourceTopicException` via the streams uncaught exception handler so the application can handle or surface the error explicitly. The silent shutdown with an ERROR log is insufficient for production systems that need explicit notification and recovery logic when source topics disappear. |
Streams | Accepted | Bruno Cadonna | 2020-08-21 | 2020-08-31 | KAFKA-10355 | 2.7 | |
| 661 | Expose task configurations in Connect REST API TLDR: Adds a `GET /connectors/{connector}/tasks/{task}/config` REST endpoint to expose the computed task configuration as applied by the connector, not just the raw connector config. Connectors like `MirrorSourceConnector` or `JdbcSourceConnector` transform the connector config before distributing it to tasks; the resulting per-task config is not currently retrievable via the REST API. |
Connect | Accepted | Mickael Maison | 2020-08-20 | 2020-12-10 | KAFKA-10833 | 2.8 | |
| 659 | Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size TLDR: Adds a `window.size.ms` config to `StreamsConfig` and deprecates the no-arg constructors of `TimeWindowedSerde`/`TimeWindowedDeserializer` that default window size to `Long.MAX_VALUE`, making window size explicit. Without window size being passed through, consumers performing windowed aggregations initialize deserializers with `Long.MAX_VALUE` as window size, causing incorrect window-end time calculations and runtime errors. |
Streams | Accepted | Leah Thomas | 2020-08-19 | 2021-01-21 | KAFKA-10366 | 2.8 | |
| 656 | MirrorMaker2 Exactly-once Semantics TLDR: Adds an optional exactly-once semantics (EOS) mode to MirrorMaker 2 by implementing a `MirrorSinkTask` that manages consumer offsets transactionally, ensuring each record is delivered to the target cluster exactly once. MirrorMaker 2 is built on Kafka Connect's source connector framework, which provides only at-least-once delivery, meaning duplicate records can be produced to target clusters on task restarts. |
MirrorMaker Transactions | Discussion | Ning Zhang | 2020-08-03 | 2020-11-26 | KAFKA-10339 | 3.5 | |
| 655 | Windowed Distinct Operation for Kafka Streams API TLDR: Adds a windowed distinct() operation to the Kafka Streams DSL that deduplicates records within a configurable time window using a state store. Applications consuming topics written with at-least-once semantics have no native deduplication primitive, forcing teams to implement custom stateful processors that are error-prone and difficult to reuse. |
Streams | Accepted | Ivan Ponomarev | 2020-08-06 | 2026-03-27 | KAFKA-10369 | ||
| 654 | Aborted transaction with non-flushed data should throw a non-fatal exception TLDR: KIP-654 introduces a `TransactionAbortedException` (extending `ApiException`) thrown when a transaction is aborted via `KafkaProducer.abortTransaction()` while there are still non-flushed pending records in the accumulator. The existing `KafkaException("Failing batch since transaction was aborted")` is treated as fatal by many error-handling frameworks, forcing clients to restart producers unnecessarily when the abort is actually a recoverable, expected flow. |
Transactions Producer | Accepted | Gokul Srinivas | 2020-08-06 | 2020-09-10 | KAFKA-10186 | 2.7 | |
| 653 | Upgrade log4j to log4j2 TLDR: Replaces the log4j 1.x dependency on the Kafka broker/server side with log4j2 (2.x) and its corresponding SLF4J bindings. Log4j 1.x reached end of life in 2015, exposes known CVEs (e.g., CVE-2019-17571), and forces operators to use an obsolete configuration format unfamiliar to most users who know log4j2 syntax. |
Broker | Accepted | Dongjin Lee | 2020-08-05 | 2024-11-19 | KAFKA-9366 | 4.0 | |
| 651 | Support PEM format for SSL certificates and private key TLDR: Adds support for PEM-encoded certificates and private keys directly in Kafka SSL configuration via new `ssl.keystore.type=PEM` and inline `ssl.keystore.certificate.chain`/`ssl.keystore.key` configs, eliminating the need to manage JKS or PKCS12 keystore files. Managing JKS/PKCS12 keystores in containerized and secret-management environments (Vault, Kubernetes Secrets) is cumbersome; PEM strings can be injected directly from environment variables or secret stores. |
Security | Accepted | Rajini Sivaram | 2020-08-03 | 2020-08-10 | KAFKA-10338 | 2.7 | |
| 649 | Dynamic client configuration TLDR: KIP-649 adds a `reconfigure()` method to Kafka producer and consumer clients allowing select configurations (e.g., `acks`, `session.timeout.ms`) to be changed at runtime without restarting the client. Today all client configuration is fixed at construction time, requiring a full restart cycle to tune misbehaving clients. |
Client | Discussion | Ryan Dielhenn | 2020-07-29 | 2020-09-18 | KAFKA-10325 | Produce Metadata JoinGroup DescribeConfigs DescribeClientQuotas | |
| 648 | Renaming getter method for Interactive Queries TLDR: Renames `KeyQueryMetadata` getter methods from `getActiveHost()`, `getStandbyHosts()`, and `getPartition()` to `activeHost()`, `standbyHosts()`, and `partition()` to match Kafka's convention of not using `get` prefixes for accessor methods. The methods were added in KIP-535 with `get` prefixes that violated the established Kafka API naming convention. |
Streams | Accepted | John Thomas | 2020-07-28 | 2020-08-04 | KAFKA-6144 | 2.5 | |
| 646 | Serializer API should support ByteBuffer TLDR: Extends the Serializer interface with a new serialize(topic, headers, data) overload returning ByteBuffer, and updates the Partitioner interface to accept ByteBuffer, enabling zero-copy serialization that avoids allocating intermediate byte arrays. Kafka's Serializer API returned only byte[], requiring a copy from any ByteBuffer-based encoder before the result could be placed in the producer record buffer. |
Client | Discussion | Chia-Ping Tsai | 2020-07-22 | 2025-05-08 | KAFKA-5761 | ||
| 640 | Add log compression analysis tool TLDR: Introduces a `kafka-compression-analyzer.sh` tool that reads an existing log segment and reports the estimated compressed size for each supported compression algorithm (none, gzip, snappy, lz4, zstd). Choosing a compression algorithm currently requires manually producing data with each codec and measuring bandwidth and log size, which is time-consuming and error-prone. |
Broker Admin | Discussion | Christopher Beard | 2020-06-26 | 2020-11-27 | KAFKA-10281 | ||
| 637 | Include min.insync.replicas in MetadataResponse to make Producer smarter in partitioning events TLDR: Proposes including `min.insync.replicas` per-partition in the `MetadataResponse` so that producers can proactively avoid routing to partitions that cannot satisfy the configured `acks=all` requirement. With `acks=all`, a partition with fewer ISRs than `min.insync.replicas` will reject produce requests; the producer currently has no way to detect this without attempting the write and receiving a `NOT_ENOUGH_REPLICAS` error. |
Producer Protocol | Discussion | Arvin Zheng | 2020-07-03 | 2020-07-03 | KAFKA-10230 | Metadata | |
| 636 | Make RPC error codes and messages tagged fields TLDR: Converts the `ErrorCode` and `ErrorMessage` fields in all RPC messages using flexible (tagged-field-capable) versions to tagged fields, so they occupy zero bytes when the error code is `NONE`. On happy-path responses, `ErrorCode=0` and `ErrorMessage=null` are present in every response but carry no information; making them tagged fields saves 2-3 bytes per response sub-entity, which compounds for batched responses. |
Protocol | Discussion | Tom Bentley | 2020-07-01 | 2020-07-02 | |||
| 635 | GetOffsetShell: support for multiple topics and consumer configuration override TLDR: Extends `GetOffsetShell` (via `kafka-get-offsets.sh`) to support querying multiple topics in a single invocation, adds `--bootstrap-servers` as an alias for `--broker-list`, and allows passing additional consumer configuration overrides. The tool currently accepts only a single topic per invocation and uses the non-standard `--broker-list` argument name, requiring users to script around it to fetch offsets for multiple topics. |
Admin | Accepted | Arseniy Tashoyan | 2018-06-01 | 2020-10-14 | KAFKA-5235 | 3.0 | |
| 633 | Deprecate 24-hour Default Grace Period for Windowed Operations in Streams TLDR: KIP-633 deprecates the 24-hour default grace period for windowed operations (window aggregates, stream-stream joins) by removing APIs that silently apply it and replacing them with two explicit factory methods: one requiring a grace period parameter and one with a `WithNoGrace` suffix that applies a grace period of zero. The 24-hour default caused results to be suppressed for up to 24 hours, which is almost always wrong for production use cases and caused widespread confusion among Streams users. |
Streams | Accepted | A. Sophie Blee-Goldman | 2021-03-31 | 2021-08-19 | KAFKA-8613 | 3.0 | |
| 632 | Add DirectoryConfigProvider TLDR: Adds a `DirectoryConfigProvider` that reads configuration values from files in a directory (one file per key, filename = key name) rather than from a properties file. On Kubernetes, Secrets are mounted as a directory with one file per key; the existing `FileConfigProvider` (KIP-297) reads only `.properties` files, making it ergonomically incompatible with the Kubernetes Secret directory mount pattern. |
Admin | Accepted | Tom Bentley | 2020-06-29 | 2020-08-06 | KAFKA-7370 | ||
| 631 | The Quorum-based Kafka Controller TLDR: Details the architectural changes to the Kafka Controller when operating in KRaft (KIP-500) mode: the controller becomes a Raft-based quorum member storing metadata in an internal `__cluster_metadata` topic, eliminating ZooKeeper as the metadata store. The existing ZooKeeper-dependent controller architecture is a scalability and operational bottleneck; the KRaft controller enables faster failover, larger partition counts, and removal of the ZooKeeper dependency. |
KRaft | Discussion | Colin McCabe | 2020-06-24 | 2024-09-20 | 2.8 | Fetch Metadata LeaderAndIsr StopReplica UpdateMetadata BrokerRegistration UnregisterBroker | |
| 630 | Kafka Raft Snapshot TLDR: KIP-630 defines the snapshot mechanism for the Raft-replicated `__cluster_metadata` log in KRaft mode: snapshot creation by the active controller, transfer to voter and observer replicas via a new `FetchSnapshot` RPC, and installation to bound log growth. Without snapshots the `__cluster_metadata` log grows unboundedly and new replicas must replay the entire log from the beginning to catch up. |
KRaft | Accepted | Jose Armando Garcia Sancio | 2020-06-24 | 2021-09-28 | KAFKA-10310 | 2.8 | FetchSnapshot |
| 629 | Use racially neutral terms in our codebase TLDR: Replaces racially charged terminology (`blacklist`/`whitelist`) throughout the Kafka codebase and configuration with neutral alternatives such as `include`/`exclude`. Several tool and MirrorMaker configurations (e.g., MirrorMaker's `topics.blacklist`, the console consumer's `--whitelist` flag) used these terms, which are now renamed with backward-compatible aliases. |
Admin | Accepted | Xavier Léauté | 2020-06-19 | 2020-09-29 | KAFKA-10201 | 2.7 | |
| 628 | ConsumerPerformance's multi-thread implementation TLDR: Restores multi-thread support to `kafka-consumer-perf-test.sh` by implementing a `ConsumerPerfThread` class for the new consumer API, making the `--threads` option functional again. Multi-thread consumer performance testing was removed when the tool was migrated from the old consumer to the new consumer API, leaving no way to benchmark consumer throughput with multiple threads. |
Consumer | Discussion | Jiamei.xie | 2020-06-18 | 2020-07-16 | KAFKA-10136 | ||
| 627 | Expose Trogdor-specific JMX Metrics for Tasks and Agents TLDR: Exposes Trogdor (Kafka's fault injection/benchmarking framework) agent and task state as JMX metrics (e.g., `active-agents-count`, `active-tasks-count`, `task-state`), eliminating the need to poll the Trogdor REST API for cluster health. Monitoring Trogdor cluster health requires REST API polling, which is burdensome compared to standard JMX-based monitoring used for all other Kafka components. |
Testing Metrics | Accepted | Sam Pal | 2020-06-16 | 2020-07-06 | KAFKA-8528 | ||
| 626 | Rename StreamsConfig config variable name TLDR: Renames the `StreamsConfig` Java constant for `topology.optimization` from `TOPOLOGY_OPTIMIZATION` to `TOPOLOGY_OPTIMIZATION_CONFIG` to match the `_CONFIG` suffix convention used by all other Streams configuration constants. The misnamed constant was an oversight that made the API inconsistent and harder to discover by convention. |
Streams | Accepted | Matthias J. Sax | 2020-06-16 | 2020-06-19 | KAFKA-10168 | 2.7 | |
| 625 | Richer encodings for integral-typed protocol fields TLDR: Extends the Kafka binary protocol to allow variable-length integer encodings (`varint`/`varlong`) for regular (non-length, non-tagged) integer fields in message schemas, complementing the unsigned varint encoding already used for array lengths in flexible versions (KIP-482). Fixed-width 4-byte or 8-byte integers in protocol fields waste bytes for small values; variable-length encoding can reduce wire size significantly for fields that predominantly carry small values. |
Protocol | Discussion | Tom Bentley | 2020-06-15 | 2020-06-16 | KAFKA-9927 | Metadata | |
| 622 | Add currentSystemTimeMs and currentStreamTimeMs to ProcessorContext TLDR: Exposes `currentSystemTimeMs()` and `currentStreamTimeMs()` methods on `ProcessorContext` in Kafka Streams, sourcing the timestamps from the Streams runtime rather than `System.currentTimeMillis()`. This makes wall-clock and stream-time usage testable because `MockTime` can control both values during unit tests. |
Streams Testing | Accepted | William Bottrell | 2020-06-09 | 2022-04-14 | KAFKA-10062 | 3.0 | |
| 618 | Exactly-Once Support for Source Connectors TLDR: Adds exactly-once delivery support for Kafka Connect source connectors by writing source records and their offsets atomically within a Kafka transaction, using the transactional producer API. Without EOS, source connector offsets are committed separately from the records they correspond to, creating a window where records can be duplicated on Connect worker restart. |
Connect Transactions | Accepted | Chris Egerton | 2020-05-16 | 2023-06-07 | KAFKA-10000 | 3.2 | |
| 617 | Allow Kafka Streams State Stores to be iterated backwards TLDR: Adds `backwardRange()`, `backwardAll()`, and `backwardFetch()` methods to `ReadOnlyKeyValueStore`, `ReadOnlyWindowStore`, and `ReadOnlySessionStore` in Kafka Streams for reverse-order iteration. All existing store iteration APIs return results in ascending order; there is no way to iterate from newest to oldest or in reverse key order without materializing all results in memory. |
Streams | Accepted | Jorge Esteban Quilcate Otoya | 2020-05-19 | 2020-07-29 | KAFKA-9929 | 2.7 | |
| 616 | Rename implicit Serdes instances in kafka-streams-scala TLDR: Renames the implicit `Serde` instances in the Kafka Streams Scala DSL from names matching the type (e.g., `implicit val String: Serde[String]`) to `implicit val stringSerde: Serde[String]` to avoid name clashes on wildcard imports. When the implicit val name matches a type name, a wildcard import shadows the type itself, making it impossible to use the type name in scope alongside the implicit. |
Streams | Accepted | Yuriy Badalyants | 2020-05-19 | 2021-01-20 | KAFKA-10020 | 2.7 | |
| 614 | Add Prefix Scan support for State Stores TLDR: Adds a `prefixScan(prefix, serializer)` method to `ReadOnlyKeyValueStore` and its RocksDB/in-memory implementations in Kafka Streams for efficiently retrieving all keys sharing a common byte prefix. Currently there is no native prefix query; developers must use `range()` with a manually computed upper bound or iterate the full key space, incurring unnecessary I/O. |
Streams | Accepted | Sagar Rao | 2020-05-12 | 2020-10-27 | KAFKA-10648 | 2.8 | |
| 613 | Add end-to-end latency metrics to Streams TLDR: Introduces `end-to-end-latency-avg` and `end-to-end-latency-max` task-level metrics in Kafka Streams that measure the time from when a source record's timestamp was created (its event time) to when it is processed. There is no existing metric that captures the full pipeline latency from event time to processing time, making it difficult to bound or tune real-time application SLOs. |
Streams Metrics | Accepted | A. Sophie Blee-Goldman | 2020-05-11 | 2020-07-15 | KAFKA-9983 | 2.6 | |
| 612 | Ability to Limit Connection Creation Rate on Brokers TLDR: Introduces a `max.connection.creation.rate` quota, configurable broker-wide, per listener, and per IP, to throttle the rate of new TCP connection creation, with quota violation responses that delay or close new connections. Rapid client restarts or new deployments can cause connection storms that saturate broker CPU with TLS handshakes and connection setup, degrading request latencies for existing clients. |
Broker | Accepted | Anna Povzner | 2020-05-10 | 2020-12-08 | KAFKA-10023 | 2.7 | DescribeConfigs DescribeClientQuotas AlterClientQuotas |
| 610 | Error Reporting in Sink Connectors TLDR: Extends Kafka Connect's dead letter queue (DLQ) error reporting to cover failures that occur within `SinkTask.put()` in addition to the existing coverage for SMT and converter failures. The KIP-298 DLQ mechanism only captured errors in the transform and converter phases; once a record was handed to the connector's `put()` method, errors had no DLQ path and were simply logged or caused task failure. |
Connect | Accepted | Aakash Shah | 2020-05-06 | 2020-06-18 | KAFKA-9971 | 2.6 | |
| 608 | Expose Kafka Metrics in Authorizer TLDR: Passes the broker's `Metrics` instance to authorizer plugins via a new `AuthorizerServerInfo` context object so authorizers can register and expose their own JMX metrics using the broker's existing metric infrastructure. Authorizer plugins previously had no access to the broker's metrics registry, forcing them to create their own `Metrics` objects with separate JMX reporters. |
Security Metrics | Accepted | Jeff Huang | 2020-05-05 | 2021-01-20 | KAFKA-9958 | ||
| 607 | Add Metrics to Kafka Streams to Report Properties of RocksDB TLDR: Adds RocksDB-level metrics to Kafka Streams task metrics (e.g., `rocksdb-total-sst-files-size`, `rocksdb-live-sst-files-size`, `rocksdb-estimate-num-keys`, `rocksdb-block-cache-usage`) using RocksDB property queries rather than the statistics API. Existing KIP-471 RocksDB metrics use the RocksDB statistics API which has measurable performance overhead; property-based metrics are cheaper and expose memory/disk usage not covered by statistics. |
Streams Metrics | Accepted | Bruno Cadonna | 2020-05-05 | 2020-09-07 | KAFKA-9924 | 2.7 | |
| 606 | Add Metadata Context to MetricsReporter TLDR: Introduces a `MetricsContext` metadata object passed to `MetricsReporter` plugins, exposing metadata about whether the reporting component is a broker, Connect worker, or client library (producer/consumer/admin/streams). Without this context, metric reporters could not determine which component was emitting metrics, making it impossible to route or label metrics correctly in multi-component deployments. |
Metrics Client | Accepted | Xavier Léauté | 2020-05-01 | 2020-05-26 | KAFKA-9960 | 2.6 | |
| 605 | Expand Connect Worker Internal Topic Settings TLDR: Exposes additional topic-level configuration properties (e.g., `cleanup.policy`, `min.insync.replicas`, `retention.ms`, compression type) for each of the three Connect internal topics (`config.storage.topic`, `offset.storage.topic`, `status.storage.topic`) so workers can create them with the correct settings. Previously only `replication.factor` and `partitions` were configurable, leading to internal topics being created with broker defaults that were often inappropriate for Connect's durability and compaction requirements. |
Connect | Accepted | Randall Hauch | 2020-04-30 | 2020-05-15 | KAFKA-9931 | 2.6 | |
| 604 | Remove ZooKeeper Flags from the Administrative Tools TLDR: Removes the deprecated `--zookeeper` flag from all Kafka administrative CLI tools (`kafka-topics.sh`, `kafka-consumer-groups.sh`, etc.), completing the migration to broker-based APIs. Previous KIPs (KIP-455, KIP-497, KIP-554, KIP-543) added the broker-side equivalents and KIP-555 deprecated the ZooKeeper flags; this KIP finalizes their removal as part of the KIP-500 ZooKeeper elimination effort. |
Admin KRaft | Accepted | Colin McCabe | 2020-04-29 | 2021-11-15 | KAFKA-9945 | 2.6 | |
| 602 | Change default value for client.dns.lookup TLDR: Changes the default value of `client.dns.lookup` from `default` to `use_all_dns_ips` so clients automatically try all IP addresses returned by DNS before marking a hostname as unreachable. The previous default tried only the first IP address, causing connection failures when a hostname resolved to multiple IPs and the first IP was unavailable (e.g. during rolling restarts behind a DNS load balancer). |
Client | Accepted | Badai Aqrandista | 2020-04-20 | 2020-06-04 | KAFKA-9313 | 2.6 | |
| 599 | Throttle Create Topic, Create Partition and Delete Topic Operations TLDR: Introduces per-client-id and per-broker mutation rate quotas (`create_topics`, `delete_topics`, `create_partitions`) enforced by the controller to throttle the rate of topic creation, deletion, and partition expansion requests. Unthrottled topic mutation requests can overload the Kafka controller, delaying leader elections and making partitions temporarily unavailable, particularly in shared multi-tenant clusters. |
Broker Admin | Accepted | David Jacot | 2020-04-21 | 2020-08-06 | KAFKA-9915 | 2.7 | CreateTopics v6 DeleteTopics v5 CreatePartitions v3 |
| 597 | MirrorMaker2 internal topics Formatters TLDR: Adds `MessageFormatter` implementations that decode and pretty-print records from MirrorMaker 2 internal topics (offset syncs, checkpoints, heartbeats) so their binary content can be inspected via `kafka-console-consumer`. MirrorMaker 2 internal topics use binary encoding with no existing formatter, so debugging offset or checkpoint issues previously required custom decoding tools. |
MirrorMaker | Accepted | Mickael Maison | 2020-04-16 | 2020-08-24 | KAFKA-10232 | 2.7 | |
| 596 | Safely abort Producer transactions during application shutdown TLDR: Adds a `close(Duration timeout)` overload to `KafkaProducer` that automatically calls `abortTransaction()` when the producer is in a non-recoverable error state before closing, instead of requiring users to manually invoke `abortTransaction` inside a nested try-catch. The existing error handling pattern for transactional producers is complex because `abortTransaction()` itself can throw if the producer is already fenced or in a fatal state. |
Transactions Producer | Discussion | Xiang Zhang | 2020-04-14 | 2020-05-15 | KAFKA-9592 | ||
| 595 | A Raft Protocol for the Metadata Quorum TLDR: Defines a Kafka-native Raft consensus protocol (`KRaft`) for the cluster metadata quorum, adapting Kafka's existing leader-follower log replication to support leader election, epoch tracking, and linearizable reads without ZooKeeper. This is the core protocol layer underpinning KIP-500's goal to eliminate ZooKeeper from Kafka's control plane. |
KRaft Protocol | Accepted | Jason Gustafson | 2020-04-15 | 2023-04-12 | KAFKA-9876 | 2.7 | BeginQuorumEpoch DescribeQuorum EndQuorumEpoch LeaderChangeMessage Vote Fetch |
| 594 | Expose output topic names from TopologyTestDriver TLDR: Adds a `producedTopicNames()` method to `TopologyTestDriver` that returns all topic names the topology wrote to during a test run, including internal repartition and changelog topics. This enables catch-all assertions that detect unexpected topic writes, which is impossible today without explicitly naming every output topic in each test. |
Streams Testing | Accepted | Andrew Coates | 2020-04-14 | 2020-05-07 | KAFKA-9865 | 2.6 | |
| 592 | Replicate mirrormaker topics from earliest TLDR: Changes MirrorMaker 1.0's default `auto.offset.reset` from `latest` to `earliest` so that replication topics are consumed from the beginning. The current `latest` default causes data loss when MirrorMaker is subscribed via regex patterns and a matching topic is created while MirrorMaker is running—records produced before MirrorMaker catches up are permanently skipped. |
MirrorMaker | Discussion | Jeff Widman | 2020-04-11 | 2020-05-04 | KAFKA-4668 | ||
| 591 | Add Kafka Streams config to set default state store TLDR: Adds a new Kafka Streams config `default.dsl.store` that lets users globally switch the DSL operator state store type (e.g., `rocksdb` vs `in_memory`) without modifying each operator individually via `Materialized`. Currently, switching all DSL state stores from RocksDB to in-memory (e.g., for Kubernetes deployments without local disks) requires tedious per-operator `Materialized` configurations. |
Streams | Accepted | Matthias J. Sax | 2020-04-09 | 2022-02-04 | KAFKA-13281 | 3.2 | |
| 590 | Redirect Zookeeper Mutation Protocols to The Controller TLDR: Routes all ZooKeeper mutation RPCs (AlterConfig, CreateAcls, DeleteAcls, AlterPartitionReassignments, etc.) exclusively through the controller during the KIP-500 bridge release. This ensures no broker writes directly to ZooKeeper, making the controller the single writer and enabling safe migration to KRaft. |
KRaft Protocol | Accepted | Boyang Chen | 2020-04-03 | 2020-11-02 | KAFKA-9705 | 2.7 | RequestHeader Metadata LeaderAndIsr CreateTopics Envelope |
| 589 | Add API to update Replica state in Controller TLDR: Replaces the ZooKeeper-based log-dir failure notification path (writing znodes under `/log_dir_event_notification`) with a new broker-to-controller RPC (`AlterReplicaLogDirs` extension) that directly notifies the controller of offline log directories. The ZooKeeper watch-based notification is inherently tied to ZooKeeper and must be eliminated as part of the KIP-500 ZooKeeper removal effort. |
Protocol Broker | Accepted | David Arthur | 2020-04-03 | 2021-06-23 | KAFKA-9837 | 3.7 | |
| 588 | Allow producers to recover gracefully from transaction timeouts TLDR: Adds a TRANSACTION_TIMED_OUT error code returned when a producer's transaction times out, allowing the same producer instance to recover by re-initializing with the bumped epoch via InitProducerId rather than requiring the application to create a new producer. Previously, a timed-out transaction surfaced as a fatal ProducerFencedException indistinguishable from being fenced by a competing producer, forcing unnecessary producer reconstruction. |
Transactions Producer | Accepted | Boyang Chen | 2020-04-02 | 2022-08-16 | KAFKA-9803 | ||
| 587 | Suppress detailed responses for handled exceptions in security-sensitive environments TLDR: Proposes suppressing exception stack traces and detailed error messages from Kafka Connect REST API responses in security-sensitive deployments, building on Connect's pluggable REST extension mechanism (`rest.extension.classes`). Detailed exception messages in REST responses risk leaking internal system information, violating security policies that require information minimization in API error bodies. |
Connect Security | Discussion | Connor Penhale | 2020-04-02 | 2020-05-06 | KAFKA-9766 | ||
| 586 | Deprecate commit records without record metadata TLDR: Deprecates the no-argument `SourceTask.commitRecord(SourceRecord)` method in Kafka Connect in favor of the `commitRecord(SourceRecord, RecordMetadata)` overload introduced by KIP-382. The old method is never called by the framework anymore and retaining it causes confusion for connector developers about which callback to implement. |
Connect | Accepted | Mario Molina | 2020-04-02 | 2020-05-15 | KAFKA-9780 | 2.6 | |
| 585 | Filter and Conditional SMTs TLDR: KIP-585 introduces `Filter` and `Predicate`-based conditional Single Message Transformations (SMTs) in Kafka Connect, allowing records to be dropped or transformed only when a specified predicate matches. Without conditional logic, all SMTs apply unconditionally to every record, requiring connector-specific code to implement routing or selective transformation. |
Connect | Accepted | Tom Bentley | 2020-03-24 | 2020-05-20 | KAFKA-7052 | 2.6 | |
| 584 | Versioning scheme for features TLDR: Introduces a cluster-wide feature versioning system that allows brokers to advertise supported feature ranges and enables rolling upgrades where new features are activated only after all brokers support them. ApiVersions negotiation existed for RPC versions but provided no mechanism to coordinate non-RPC cluster-wide behavioral changes (e.g., new storage formats or replication semantics) across heterogeneous broker versions. |
Protocol Broker | Accepted | Kowshik | 2020-03-20 | 2026-01-31 | KAFKA-9755 | 2.7 | ApiVersions UpdateFeatures LeaderAndIsr AlterConfigs |
| 581 | Value of optional null field which has default value TLDR: Adds a `replace.null.with.default` setting (default `true`, preserving existing behavior) to JsonConverter; when set to `false`, null field values that have a schema default are preserved as null in the JSON output instead of being substituted with the default. Debezium and similar CDC connectors emit explicit nulls to represent deleted columns, but JsonConverter was silently replacing them with the schema default, corrupting the change event semantics. |
Connect | Accepted | Cheng Pan | 2020-03-18 | 2023-03-21 | KAFKA-8713 | 3.5 | |
| 580 | Exponential Backoff for Kafka Clients TLDR: Replaces the static retry backoff in Kafka clients with an exponential backoff with configurable base, multiplier, and maximum so that metadata fetch retries spread out under broker failures. A fixed small retry interval (100ms default) causes all clients to retry metadata fetches simultaneously after a broker failure, amplifying load on the recovering cluster and delaying metadata convergence. |
Client | Accepted | Sanjana Kaundinya | 2020-03-13 | 2020-07-01 | KAFKA-9800 | 3.7 | |
| 579 | new exception on min.insync.replicas > replication.factor TLDR: Adds validation to reject topic creation or config updates where min.insync.replicas exceeds the topic's replication.factor, throwing a new InvalidConfigurationException immediately. Currently this misconfiguration is silently accepted, causing every acks=all produce to fail with NOT_ENOUGH_REPLICAS, making the topic permanently unwritable. |
Broker | Discussion | Paolo Moriello | 2020-03-13 | 2020-03-30 | KAFKA-4680 | ||
| 578 | Add configuration to limit number of partitions TLDR: Introduces `max.partition.count` broker and per-topic configuration limits that the broker enforces on `CreateTopics` and `CreatePartitions` requests to prevent partition counts from exceeding safe operational thresholds. Without enforcement, operators can create topics with partition counts that cause live-locked clusters where even remediation (topic deletion) becomes unresponsive. |
Broker Admin | Discussion | Gokul Ramanan Subramanian | 2020-03-11 | 2020-07-14 | KAFKA-9590 | Metadata CreateTopics CreatePartitions AlterPartitionReassignments | |
| 576 | Support dynamic update of more broker configs related to replication TLDR: Extends Kafka's dynamic broker configuration (KIP-226) to cover additional replication-related settings including `fetch.max.bytes`, `replica.fetch.max.bytes`, `replica.fetch.backoff.ms`, `replica.lag.time.max.ms`, and others, enabling live tuning without broker restarts. These configs currently require a full broker restart to change, which is operationally disruptive during incidents involving replication lag or throttling. |
Broker Admin | Discussion | Cheng Tan | 2020-03-08 | 2020-03-24 | KAFKA-9683 | ||
| 575 | build a Kafka-Exporter by Java TLDR: Proposes adding a JVM-level metrics exporter for Kafka that exposes JVM heap, GC, thread, and class-loading metrics via JMX or HTTP in a standard format. No official JVM metrics exporter existed in the Kafka ecosystem, forcing operators to run separate JVM exporter agents alongside Kafka or forgo JVM-level observability. |
Metrics | Discussion | francis lee | 2020-03-05 | 2020-05-04 | KAFKA-9660 | ||
| 574 | CLI Dynamic Configuration with file input TLDR: Adds file input support to kafka-configs.sh so complex structured configurations (JSON, nested lists) can be passed via a file rather than inline on the command line. The existing --add-config option is limited to simple key=value pairs, making it impossible to set structured dynamic config values through the CLI. |
Admin | Accepted | Aneel Nazareth | 2020-02-26 | 2020-04-01 | KAFKA-9612 | 2.6 | |
| 573 | Enable TLSv1.3 by default TLDR: KIP-573 changes the default value of `ssl.enabled.protocols` to include only TLSv1.2 and TLSv1.3, removing the obsolete and insecure TLSv1 and TLSv1.1 from the default negotiation list. Older TLS versions have known cryptographic weaknesses and are deprecated by RFC 8446, yet Kafka continued to advertise them by default because TLSv1.3 required JDK 11+. |
Security | Accepted | Nikolay Izhikov | 2020-02-21 | 2020-06-03 | KAFKA-9320 | 2.6 | |
| 572 | Improve timeouts and retries in Kafka Streams TLDR: Introduces a dedicated `task.timeout.ms` configuration for Kafka Streams that causes a task to fail (rather than block indefinitely) when a single operation exceeds the timeout, decoupling Streams-level timeout handling from the embedded client retry behavior. Previously Streams relied entirely on internal Kafka client retries, which blocked the entire `StreamThread` for all tasks when one client retried, and gave users no control over the overall operation timeout. |
Streams | Accepted | Matthias J. Sax | 2020-02-19 | 2021-02-19 | KAFKA-9274 | 2.8 | |
| 571 | Add option to force remove members in StreamsResetter TLDR: Adds a `--force-remove` option to the Kafka Streams application resetter tool that calls `AdminClient.removeMembersFromConsumerGroup` to evict all static and dynamic members from the consumer group. This unblocks application resets when leftover group members hold up the group coordinator past the session timeout, particularly when session.timeout.ms is configured to a high value. |
Streams Admin | Accepted | feyman | 2020-02-13 | 2020-05-29 | KAFKA-9146 | 2.6 | LeaveGroup |
| 570 | Add leader epoch in StopReplicaRequest TLDR: Adds a leader epoch field to `StopReplicaRequest` so brokers can fence out-of-order or stale stop-replica commands that arrive after a newer leader epoch has been established. Without an epoch check, a delayed `StopReplicaRequest` from a previous controller term could incorrectly stop a replica that is legitimately serving a newer leader, causing data unavailability. |
Protocol Broker | Accepted | David Jacot | 2020-02-11 | 2020-04-30 | KAFKA-9539 | 2.6 | StopReplica LeaderAndIsr |
| 569 | DescribeConfigsResponse - Update the schema to include additional metadata information of the field TLDR: Extends the `DescribeConfigsResponse` schema to include additional metadata per configuration entry: `type` (string, int, boolean, etc.), `documentation`, and `validValues`, alongside the existing `isSensitive`/`isReadOnly` fields. GUI tools like Confluent Control Center and Lenses need config type and validation metadata to render appropriate input controls and prevent invalid submissions without requiring hardcoded knowledge of each config. |
Protocol Admin | Accepted | Shailesh Panwar | 2020-02-05 | 2020-06-01 | KAFKA-9494 | 2.6 | DescribeConfigs |
| 568 | Explicit rebalance triggering on the Consumer TLDR: Adds a `KafkaConsumer.enforceRebalance(reason)` method that forces the broker to trigger an immediate rebalance by sending a `JoinGroupRequest` with `generation=-1`. Kafka Streams required the ability to force immediate rebalances for its cooperative protocol (task warm-up transfers, version probing), but the only existing mechanism—unsubscribe then resubscribe—was asynchronous and unreliable. |
Consumer | Accepted | A. Sophie Blee-Goldman | 2020-02-07 | 2020-04-30 | KAFKA-9525 | 2.6 | |
| 567 | Kafka Cluster Audit TLDR: Introduces a structured audit log framework for Kafka that delivers authorization and administrative event notifications to a pluggable audit backend (e.g., a Kafka topic). Kafka has no built-in audit trail, making it impossible to satisfy regulatory compliance requirements (SOC2, GDPR, HIPAA) that mandate a tamper-evident record of who accessed or modified cluster resources. |
Security Admin | Discussion | Igor Martemyanov | 2020-01-23 | 2025-11-26 | KAFKA-9413 | ||
| 566 | Add rebalance callbacks to ConsumerInterceptor TLDR: Extends ConsumerInterceptor to also implement ConsumerRebalanceListener, adding onPartitionsRevoked/onPartitionsAssigned callbacks to the interceptor lifecycle. Without this, interceptors that maintain per-partition state (e.g., for metrics or tracing) have no way to clean up or reinitialize state when partitions are reassigned. |
Consumer | Discussion | Tommy Becker | 2020-01-23 | 2020-01-24 | |||
| 565 | Using AclCommand,avoid call the global method loadcache in SimpleAclAuthorizer TLDR: Fixes `AclCommand` (the `kafka-acls.sh` CLI) to avoid invoking `SimpleAclAuthorizer.loadCache()` when not strictly necessary for the requested operation (e.g., add/delete ACL). On clusters with 20,000+ topics, `loadCache()` reads all ACL znodes from ZooKeeper into memory, turning a simple ACL mutation into a minutes-long operation. |
Security Admin | Accepted | StevenLuMT | 2019-12-07 | 2020-01-21 | KAFKA-9424 | ||
| 564 | Add new cached authorizer:change the dim of cache TLDR: Introduces a result-level authorization cache in the broker's ACL authorizer, keyed on `(principal, operation, resource)` tuples, so repeated authorization checks for the same subject skip ACL list traversal and return a cached decision. Without caching, each request triggers a full ACL scan over all entries which becomes a CPU bottleneck on clusters with tens of thousands of ACLs. |
Security | Accepted | StevenLuMT | 2019-12-07 | 2020-01-21 | KAFKA-9452 | ||
| 563 | Add 'tail -n' feature for ConsoleConsumer TLDR: Adds a tail -n <count> mode to kafka-console-consumer.sh that reads the last N messages from each partition by seeking to offset (LEO - N) before consuming. There is currently no way to quickly inspect the last N messages of a partition from the CLI without consuming from the beginning or knowing exact offsets. |
Consumer Admin | Discussion | huxihx | 2020-01-20 | 2020-01-20 | KAFKA-9322 | ||
| 562 | Allow fetching a key from a single partition rather than iterating over all the stores on an instance TLDR: KIP-562 adds a `partition` parameter to `KafkaStreams.store()` so callers can query a specific partition of a local state store rather than always querying all local partitions. The existing API forces the store wrapper to iterate over all partitions on the instance, making it impossible for a query-routing layer (as designed in KIP-535) to execute a targeted sub-query against a single active or standby partition replica. |
Streams | Discussion | Navinder Pal Singh Brar | 2020-01-17 | 2020-02-06 | KAFKA-9445 | 2.5 | |
| 561 | Regex Support for ConsumerGroupCommand TLDR: KIP-561 adds regex support to `kafka-consumer-groups.sh` for the `--describe`, `--delete`, and `--reset-offsets` operations, allowing operators to target multiple groups with a single regex pattern. Without regex support, bulk operations against groups with common naming conventions require scripting around the CLI or enumerating every group explicitly. |
Admin Consumer | Discussion | Alex Dunayevsky | 2019-12-23 | 2020-01-16 | KAFKA-7817 | ||
| 560 | Auto infer external topic partitions in stream reset tool TLDR: Adds an `--all-user-topics` flag to the Kafka Streams application reset tool that automatically discovers all external user topics with committed offsets and resets them, without requiring the user to explicitly list input topic names. Currently, operators must manually enumerate all external input topics and pass them to the reset tool, which is error-prone and cumbersome for applications with many input topics. |
Streams Admin | Discussion | high.lee | 2020-01-15 | 2020-07-06 | KAFKA-9042 | ||
| 559 | Make the Kafka Protocol Friendlier with L7 Proxies TLDR: Adds the group's `protocol_type` and `protocol_name` fields to the `JoinGroup` and `SyncGroup` APIs so that intermediaries can interpret the otherwise opaque member metadata and assignment payloads. L7 proxies and service meshes need this information embedded in the Kafka protocol itself to parse group membership messages for routing, observability, and policy enforcement without out-of-band mechanisms. |
Protocol Client | Accepted | David Jacot | 2020-01-14 | 2020-02-28 | KAFKA-9437 | 2.4 | JoinGroup v6 SyncGroup v4 Produce Fetch DescribeGroups SaslAuthenticate CreateDelegationToken RenewDelegationToken ExpireDelegationToken DescribeDelegationToken |
| 558 | Track the set of actively used topics by connectors in Kafka Connect TLDR: Adds a `/connectors/{name}/topics` REST endpoint to Kafka Connect that returns the set of Kafka topics actively used by a connector since it was started, tracking topic consumption and production at the framework level. There is no existing mechanism to determine which topics a source connector dynamically produces to or which exact topic names a regex-subscribed sink connector is consuming. |
Connect | Accepted | Konstantine Karantasis | 2020-01-14 | 2020-03-08 | KAFKA-9422 | 2.5 | |
| 557 | Add emit on change support for Kafka Streams TLDR: Adds an `emit-on-change` processing mode to Kafka Streams aggregations so that downstream records are only forwarded when the aggregated value actually changes, suppressing redundant identical updates. Currently, Kafka Streams only supports `emit-on-update` (always forward) and `emit-on-window-close` (suppress until window closes), with no way to deduplicate identical successive aggregation results. |
Streams | Accepted | Richard Yu | 2020-01-11 | 2021-03-24 | KAFKA-12508 | 2.6 | |
| 555 | Deprecate direct Zookeeper access in Kafka administrative tools TLDR: Deprecates the `--zookeeper` flag across all Kafka administrative CLI tools (e.g., `kafka-topics.sh`, `kafka-configs.sh`) in preparation for KIP-500's full ZooKeeper removal, marking the flag as unsupported and routing all operations through the broker Admin API instead. Direct ZooKeeper access from CLI tools bypasses broker-side authorization and couples tooling to ZooKeeper's internal metadata schema. |
Admin KRaft | Accepted | Colin McCabe | 2020-01-09 | 2020-04-20 | KAFKA-9397 | 2.5 | |
| 554 | Add Broker-side SCRAM Config API TLDR: Adds `describeUserScramCredentials()` and `alterUserScramCredentials()` methods to `AdminClient` as the broker-based API for managing SCRAM user credentials, replacing the ZooKeeper-direct path used by `kafka-configs.sh`. SCRAM credential management previously required direct ZooKeeper access, which must be eliminated as part of the KIP-500 ZooKeeper removal effort. |
Security Admin | Accepted | Colin McCabe | 2020-05-01 | 2020-09-01 | KAFKA-10259 | 2.7 | AlterUserScramCredentials DescribeUserScramCredentials |
| 551 | Expose disk read and write metrics TLDR: Adds two broker-level JMX gauges — TotalDiskReadBytes and TotalDiskWriteBytes — measuring physical disk I/O bytes excluding page cache hits. Without direct disk I/O metrics, operators cannot distinguish whether broker latency is caused by page cache misses versus network or CPU pressure. |
Metrics Broker | Accepted | Colin McCabe | 2019-12-10 | 2020-01-16 | KAFKA-9292 | 2.6 | |
| 550 | Mechanism to Delete Stray Partitions on Broker TLDR: Introduces a mechanism for brokers to detect and clean up stray partitions — log directories present on disk but unknown to the controller — that accumulate when partition reassignment completes while the original broker is offline. Stray partitions waste disk space, can cause log directory scanning overhead, and are invisible to operators with no automated cleanup path. |
Broker | Discussion | Dhruvil Shah | 2019-12-07 | 2020-01-06 | KAFKA-7362 | LeaderAndIsr StopReplica | |
| 548 | Add Option to enforce rack-aware custom partition reassignment execution TLDR: KIP-548 proposes making `kafka-reassign-partitions.sh --execute` enforce rack-aware placement by default, rejecting a reassignment plan where any replica set has two replicas in the same rack, with an explicit `--disable-rack-aware` flag to override. Currently, rack awareness is only validated when the `--generate` option computes the plan; manually crafted or tool-generated reassignments can silently violate rack isolation, reducing fault domain separation. |
Broker Admin | Discussion | Satish Bellapu | 2019-11-22 | 2019-11-23 | KAFKA-9205 | ||
| 547 | Extend ConsumerInterceptor to allow modification of Consumer Commits TLDR: Extends `ConsumerInterceptor` with an `onCommit(Map<TopicPartition, OffsetAndMetadata>)` callback that allows the interceptor to rewrite the `OffsetAndMetadata` map before it is committed, enabling programmatic injection of commit metadata. Commit metadata (the string field in `OffsetAndMetadata`) is only settable by callers who construct their own `OffsetAndMetadata` objects; interceptors observing commits have no way to enrich them with cross-cutting metadata like timestamps or trace IDs. |
Consumer | Discussion | Eric Azama | 2019-11-18 | 2019-11-18 | OffsetCommit | ||
| 546 | Add Client Quota APIs to the Admin Client TLDR: Adds dedicated `describeClientQuotas()` and `alterClientQuotas()` methods to `AdminClient` with a rich quota entity model supporting entity types (user, client-id, IP) and quota types (produce rate, consume rate, request percentage, connection rate). Managing quotas via the existing `describeConfigs`/`alterConfigs` APIs is awkward because quotas do not fit the key-value config model and the APIs cannot return all quota metadata (e.g., which entity types have quotas applied). |
Admin | Accepted | Brian Byrne | 2019-10-28 | 2020-08-24 | KAFKA-7740 | 2.6 | AlterClientQuotas DescribeClientQuotas |
| 545 | support automated consumer offset sync across clusters in MM 2.0 TLDR: KIP-545 adds a `sync.group.offsets.enabled` MirrorMaker 2 configuration that, when combined with `emit.checkpoints.enabled`, periodically writes translated consumer offsets into the target cluster's `__consumer_offsets` topic automatically. Previously, offset translation from the source cluster was available via `MirrorClient.remoteConsumerOffsets()` but had to be driven by external tooling; without automatic sync, consumers failing over to the target cluster cannot resume from their last committed position without manual intervention. |
MirrorMaker | Accepted | Ning Zhang | 2019-10-23 | 2020-07-19 | KAFKA-9076 | 2.7 | |
| 544 | Make metrics exposed via JMX configurable TLDR: Introduces two new configurations, `metrics.include` (whitelist regex) and `metrics.exclude` (blacklist regex), applied to JMX MBean names to limit which metrics are registered in JMX. On large clusters with many partitions, Kafka registers tens of thousands of MBeans, causing monitoring agents that enumerate JMX metrics via RMI to time out. |
Metrics | Accepted | Xavier Léauté | 2019-10-22 | 2020-03-31 | KAFKA-9106 | 2.6 | |
| 543 | Expand ConfigCommand's non-ZK functionality TLDR: Expands ConfigCommand (kafka-configs.sh) to support all broker config operations via KafkaAdminClient rather than direct ZooKeeper access, achieving ZK-parity without ZK dependency. This is required for KRaft mode (KIP-500) where ZooKeeper is eliminated and all administrative operations must route through the broker. |
Admin KRaft | Accepted | Brian Byrne | 2019-10-22 | 2020-04-17 | KAFKA-9082 | 2.5 | |
| 542 | Partition Reassignment Throttling TLDR: KIP-542 adds per-reassignment throttle controls to partition reassignment, separating reassignment replication traffic from normal ISR replication traffic. The existing KIP-73 quota applies to all non-ISR replication, so throttling reassignment also throttles catch-up replication for out-of-sync replicas, and throttle settings must be manually updated if leadership changes during reassignment. |
Broker Admin | Discussion | Viktor Somogyi | 2019-10-22 | 2019-12-13 | KAFKA-8330 | LeaderAndIsr | |
| 541 | Create a fetch.max.bytes configuration for the broker TLDR: Adds a broker-side fetch.max.bytes configuration that caps the total bytes any single FetchRequest can return regardless of what the client requests. A consumer configured with a very large fetch.max.bytes can monopolize broker I/O threads, degrading throughput fairness for other consumers sharing the same broker. |
Broker Consumer | Accepted | Colin McCabe | 2019-10-21 | 2019-11-07 | KAFKA-9101 | ||
| 540 | Implement per key stream time tracking TLDR: Proposes per-key stream time tracking in Kafka Streams, where stream time advances independently for each key rather than as a single high-watermark across all records in a partition. The global per-partition stream time model incorrectly advances time for all keys when any key's record has a high timestamp, causing premature window closing and grace period expiry for keys whose own data is late or sparse. |
Streams | Discussion | Richard Yu | 2019-10-19 | 2019-11-10 | KAFKA-8769 | ||
| 539 | Add mechanism to flush records out in low volume suppression buffers TLDR: KIP-539 adds a `Suppressed.untilWallClockTimeLimit(Duration, BufferConfig)` factory method that flushes suppression buffer entries based on wall-clock elapsed time rather than waiting indefinitely for stream time to advance. In low-traffic scenarios, stream time cannot advance because no new records arrive to update it, leaving buffered records permanently stuck in suppression operators and preventing downstream results from being emitted. |
Streams | Discussion | Richard Yu | 2019-10-17 | 2019-10-22 | KAFKA-8769 | ||
| 538 | Add a metric tracking the number of open connections with a given SSL cipher type TLDR: KIP-538 adds a per-listener JMX metric (`kafka.common.network:type=selector-metrics,cipher-suite=<suite>,protocol=<protocol>,name=connections`) counting the number of open SSL connections using each cipher suite and protocol version. Operators managing cipher security have no way to verify which cipher suites are actually in use because the information is only logged at DEBUG/TRACE level and not aggregated. |
Metrics Security | Accepted | Colin McCabe | 2019-10-16 | 2019-10-24 | KAFKA-9091 | 2.5 | |
| 537 | Increase default zookeeper session timeout TLDR: Increases the default value of zookeeper.session.timeout.ms from 6000ms to 18000ms. Cloud and containerized environments experience higher network jitter than on-premises datacenters, causing spurious ZooKeeper session expirations that trigger unnecessary controller failovers and partition leader elections. |
Broker | Accepted | Jason Gustafson | 2019-10-15 | 2019-10-25 | KAFKA-9102 | 2.5 | |
| 536 | Propagate broker start time to Admin API TLDR: Adds a `startTimeMs` field to the `DescribeCluster` Admin API response (and the `Node` class) carrying each broker's start time in milliseconds. The only existing way to retrieve broker start times was to query ZooKeeper directly (`/brokers/ids/<id>`), which is a privileged operation and will be unavailable post-KIP-500. |
Admin Protocol | Discussion | Noa Resare | 2019-10-14 | 2019-11-04 | Metadata UpdateMetadata | ||
| 535 | Allow state stores to serve stale reads during rebalance TLDR: Adds a `stale.metadata.allowed` configuration for Kafka Streams Interactive Queries that allows standby replicas to serve reads during an active rebalance, returning potentially stale state rather than throwing `InvalidStateStoreException`. During rebalances, all IQ requests fail because active tasks are being migrated, causing complete read unavailability even though standbys hold a recent copy of the state. |
Streams | Discussion | Navinder Pal Singh Brar | 2019-10-13 | 2020-03-31 | KAFKA-6144 | 2.6 | |
| 534 | Reorganize checkpoint system in log cleaner to per partition TLDR: Fixes log compaction to ensure that tombstone records and transaction abort markers are retained for at least `delete.retention.ms` milliseconds from the point the compacting segment was written, not just from the segment roll time. Under the prior implementation, tombstones could be physically removed before `delete.retention.ms` had elapsed from the last possible read of the preceding non-tombstone record, violating the semantic guarantee that consumers reading within the retention window will observe deletions. |
Broker | Accepted | Richard Yu | 2019-09-02 | 2019-10-18 | KAFKA-8522 | 3.1 | |
| 533 | Add default api timeout to AdminClient TLDR: Adds a default.api.timeout.ms config to AdminClient that bounds the total wall-clock time for any admin operation including retries, with a per-operation override. Without it, AdminClient timeout semantics are controlled only by request.timeout.ms (per-attempt), making it impossible to reason about or bound the total time a single logical operation takes. |
Admin | Accepted | Jason Gustafson | 2019-10-09 | 2019-11-27 | KAFKA-8503 | 2.5 | |
| 532 | Broker Consumer Lag metrics in size and time TLDR: Proposes broker-side consumer lag metrics expressed in both message count and estimated time-behind-tail, using PageCache hit/miss signals to infer whether consumer fetches are served from cache or disk. Existing consumer lag metrics are client-side only; brokers have no intrinsic view of how far behind consumers are or whether they are about to fall off the PageCache, which degrades fetch performance. |
Metrics Consumer | Discussion | NIKHIL BHATIA | 2019-08-26 | 2019-10-09 | KAFKA-8833 | Fetch | |
| 531 | Drop support for Scala 2.11 in Kafka 2.5 TLDR: Drops Scala 2.11 as a supported build target starting with Kafka 2.5, retaining only Scala 2.12 and 2.13. Scala 2.11 reached end-of-life in November 2017 and maintaining it imposes non-trivial CI/CD costs since every commit must be compiled and tested across all supported Scala versions. |
Broker | Accepted | Ismael Juma | 2019-01-20 | 2020-05-11 | KAFKA-9324 | 2.5 | |
| 528 | Deprecate PartitionGrouper configuration and interface TLDR: Deprecates and removes the PartitionGrouper configuration and interface from Kafka Streams. The interface was undocumented in its constraints and easily misused, and no production use cases existed; the internal task-to-partition assignment is better handled by the Streams assignment logic directly. |
Streams | Accepted | Matthias J. Sax | 2019-09-19 | 2019-09-24 | KAFKA-8927 | 2.4 | |
| 527 | Add VoidSerde to Serdes TLDR: KIP-527 adds a `Serdes.Void()` serde to the Kafka Streams API that serializes/deserializes `Void` (always null), providing a type-safe representation of topics where keys or values are always null. Using `BytesSerde` or `ByteArraySerde` for null-keyed topics produces a misleading `KStream<Bytes, V>` type that requires defensive null checks throughout the topology. |
Streams | Accepted | Nikolay Izhikov | 2019-09-19 | 2019-10-19 | KAFKA-8455 | 2.5 | |
| 526 | Reduce Producer Metadata Lookups for Large Number of Topics TLDR: Proposes allowing producers to pre-declare a set of known topics via a new `pre.known.topics` configuration, enabling the producer to batch metadata requests for all those topics at startup rather than fetching metadata on-demand per topic as records are first produced. For producers writing to a large, stable set of topics (e.g., thousands of per-tenant topics), on-demand metadata fetching causes high latency for first records to each new topic as individual metadata round-trips serialize. |
Producer | Discussion | Brian Byrne | 2019-09-17 | 2020-01-23 | KAFKA-8904 | 2.5 | |
| 525 | Return topic metadata and configs in CreateTopics response TLDR: Extends the `CreateTopicsResponse` to include the newly created topic's metadata (partition count, replication factor) and effective configuration (including broker defaults for unspecified configs). Previously `CreateTopics` returned only success/error status, forcing clients to make separate `Metadata` and `DescribeConfigs` round-trips to learn the actual configuration of a just-created topic. |
Protocol Admin | Accepted | Rajini Sivaram | 2019-09-13 | 2019-10-08 | KAFKA-8907 | 2.4 | CreateTopics v5 Metadata DescribeConfigs |
| 524 | Allow users to choose config source when describing configs TLDR: Adds `--dynamic` and `--all` flags to `kafka-configs.sh --describe --entity-type brokers` so operators can explicitly request only dynamic overrides or all configurations (static + dynamic + defaults) respectively. Without this, the tool only shows dynamic overrides by default, with no way to inspect static broker configs or view the effective value when no dynamic override exists. |
Admin | Accepted | Jason Gustafson | 2019-09-12 | 2019-10-15 | KAFKA-9040 | 2.5 | |
| 521 | Enable redirection of Connect's log4j messages to a file by default TLDR: Changes Kafka Connect's default log4j configuration to write worker logs to a file (`connect.log`) in addition to stdout, mirroring the behavior of Kafka broker and ZooKeeper startup scripts. By default, Connect logs only to stdout, meaning logs are lost unless the process is managed by a supervisor that captures stdout, and log rotation is unavailable without manual log4j reconfiguration. |
Connect | Accepted | Konstantine Karantasis | 2019-09-10 | 2019-10-02 | KAFKA-5609 | 2.4 | |
| 518 | Allow listing consumer groups per state TLDR: Extends the `ListGroups` API and `AdminClient.listConsumerGroups()` to accept an optional set of `ConsumerGroupState` filters, returning only groups in the specified states. Without this, operators had to call `DescribeGroups` for every group individually to determine state, which was prohibitively expensive on clusters with hundreds or thousands of groups. |
Admin Consumer | Accepted | Mickael Maison | 2019-09-06 | 2020-05-28 | KAFKA-9130 | 2.6 | ListGroups v4 DescribeGroups |
| 517 | Add consumer metrics to observe user poll behavior TLDR: Adds consumer metrics including `poll-idle-ratio`, `time-between-poll-avg`, `time-between-poll-max`, and `last-poll-seconds-ago` to expose how frequently and how promptly the application calls `Consumer.poll()`. Without these metrics, there is no observable signal to distinguish whether a `max.poll.interval.ms` violation is caused by slow record processing or by the application neglecting to call `poll()` frequently enough. |
Consumer Metrics | Accepted | Kevin Lu | 2019-09-04 | 2019-10-17 | KAFKA-8874 | 2.4 | |
| 516 | Topic Identifiers & Topic Deletion State Improvements TLDR: Assigns a unique `Uuid` (topic ID) to each topic at creation time, propagates it in metadata responses and fetch requests, and uses it to disambiguate stale vs. current topic replicas after delete-and-recreate operations. Identifying topics solely by name causes race conditions when a topic is deleted and recreated with the same name: brokers that were offline during deletion continue to serve stale data under the recycled name. |
Protocol Broker | Accepted | Lucas Bradstreet | 2019-09-04 | 2021-06-29 | KAFKA-8872 | 4.1 | Produce v13 Fetch v13 Metadata LeaderAndIsr StopReplica UpdateMetadata OffsetCommit CreateTopics DeleteTopics DeleteRecords AddPartitionsToTxn TxnOffsetCommit DescribeConfigs AlterConfigs AlterReplicaLogDirs DescribeLogDirs CreatePartitions Vote BeginQuorumEpoch EndQuorumEpoch |
| 515 | Enable ZK client to use the new TLS supported authentication TLDR: Adds configuration properties (`zookeeper.ssl.client.enable`, `zookeeper.clientCnxnSocket`, `zookeeper.ssl.keystore.*`, `zookeeper.ssl.truststore.*`) to enable TLS-encrypted communication between Kafka brokers and ZooKeeper 3.5.x+. Previously, Kafka brokers could only communicate with ZooKeeper in plaintext, leaving ZooKeeper coordination traffic unencrypted even in security-hardened environments. |
Security | Accepted | Pere Urbon | 2019-08-29 | 2020-02-18 | KAFKA-8843 | 2.5 | |
| 514 | Add a bounded flush() API to Kafka Producer TLDR: Adds a `Producer.flush(long timeout, TimeUnit unit)` overload that blocks until all buffered records are sent or the timeout expires, throwing `TimeoutException` on expiry. The existing `flush()` has no timeout, meaning it can block indefinitely if the broker is slow or network connectivity is degraded, giving callers no ability to bound flush duration. |
Producer | Discussion | kun du | 2019-08-27 | 2019-10-21 | KAFKA-7711 | ||
| 513 | Distinguish between Key and Value serdes in scala wrapper library for kafka streams TLDR: KIP-513 proposes introducing `KeySerde[K]` and `ValueSerde[V]` wrapper types in the Kafka Streams Scala API so the compiler can statically distinguish serdes intended for keys (which must be configured with `isKey=true` for schema-registry serdes) from serdes intended for values. Without the distinction, code using the Confluent Schema Registry serde must manually manage the `isKey` flag, and mixing key and value serdes compiles successfully but fails at runtime. |
Streams | Discussion | Mykhailo Yeromenko | 2019-08-23 | 2019-10-21 | |||
| 512 | make Record Headers available in onAcknowledgement and onComplete TLDR: KIP-512 exposes record headers in the `ProducerInterceptor.onAcknowledgement()` callback alongside the existing topic, partition, offset, and timestamp. Without headers, use cases like latency measurement (correlating request/response via header IDs) and distributed tracing (propagating trace context) cannot be implemented in producer interceptors. |
Producer Client | Accepted | Renuka Metukuru | 2019-08-23 | 2024-09-06 | KAFKA-8830 | 4.1 | |
| 511 | Collect and Expose Client's Name and Version in the Brokers TLDR: Extends `ApiVersionsRequest` (v3+) to carry `client_software_name` and `client_software_version` fields from the client library, which the broker stores and exposes via `DescribeClientQuotas` and internal metrics. Cluster operators have no visibility into which client library versions are connected to their brokers, making it difficult to assess upgrade impact or debug misbehaving clients. |
Protocol Client Metrics | Accepted | David Jacot | 2019-08-21 | 2019-12-16 | KAFKA-8855 | 2.4 | ApiVersions v3 |
| 510 | Metrics library upgrade TLDR: KIP-510 proposes upgrading Kafka's internal metrics library from the abandoned Yammer Metrics 2.2.0 to Dropwizard Metrics 4.x, gaining bug fixes, JDK 9+ support, and improved reservoir implementations. The current library has not been maintained since 2012, lacks support for modern JDK features, and contains known bugs that the newer version addresses. |
Metrics | Discussion | Mario Molina | 2019-08-21 | 2019-08-21 | KAFKA-8721 | ||
| 509 | Rebalance and restart Producers TLDR: KIP-509 proposes a producer-side load-balancing mechanism analogous to consumer group rebalancing, where producers coordinate which source partitions each instance is responsible for without requiring Kafka Connect. In Kubernetes-style environments where producers autoscale horizontally, there is no built-in coordination primitive to shard source partitions across producer replicas. |
Producer | Discussion | Werner Dähn | 2019-08-19 | 2019-10-21 | KAFKA-8812 | ||
| 508 | Make Suppression State Queriable TLDR: Exposes interactive query access to the suppression buffer in Kafka Streams so callers can query only the finalized (emitted) records from a `KTable#suppress()` operator via a dedicated `ReadOnlyKeyValueStore`. The existing workaround of materializing the downstream table required double storage and could not provide an iterator over only the finalized state held by the suppression buffer. |
Streams | Discussion | Dongjin Lee | 2019-08-16 | 2020-09-28 | KAFKA-8403 | ||
| 507 | Securing Internal Connect REST Endpoints TLDR: Secures the internal Connect REST endpoint (`POST /connectors/{name}/tasks`) used for follower-to-leader task config relay by requiring that requests to it be signed with a session key shared among the workers in the cluster. Even when the external Connect REST API is secured, this internal endpoint remains exposed and can be called by unauthenticated external clients, bypassing task isolation. |
Connect Security | Accepted | Chris Egerton | 2019-08-12 | 2019-10-02 | KAFKA-8804 | 2.4 | |
| 506 | Allow setting SCRAM password via Admin interface TLDR: Extends the `AdminClient` (and the underlying `AlterUserScramCredentials` RPC) to allow creating, altering, and deleting SCRAM credentials for Kafka users. SCRAM credential management currently requires running `kafka-configs.sh` with a direct ZooKeeper connection, making it impossible for application code to manage user credentials without a ZooKeeper dependency. |
Security Admin | Discussion | Tom Bentley | 2019-08-09 | 2019-08-12 | KAFKA-8780 | ||
| 505 | Add new public method to only update assignment metadata in consumer TLDR: KIP-505 adds a `Consumer.updateAssignmentMetadata(Duration timeout)` public method that triggers partition assignment, offset fetching, and group coordination without returning any records. The `poll(0)` hack used for this purpose has undefined semantics in the new `poll(Duration)` API (a zero duration may not provide enough time for the coordinator round-trip), and there is no dedicated no-fetch method for callers that only need assignment state to be current. |
Consumer | Discussion | Jungtaek Lim | 2019-08-09 | 2019-08-09 | KAFKA-8776 | ||
| 504 | Add new Java Authorizer Interface TLDR: Replaces the deprecated Scala kafka.security.auth.Authorizer trait with a new Java interface org.apache.kafka.server.authorizer.Authorizer, adding support for async authorization, broker start interceptors, and ACL metadata. The existing Scala trait cannot be depended on without pulling in kafka-core, creates compatibility barriers for third-party authorizer implementations (Ranger, Sentry), and lacks async support needed for remote authorization services. |
Security | Accepted | Rajini Sivaram | 2019-08-06 | 2019-09-24 | KAFKA-8865 | 2.4 | |
| 503 | Add metric for number of topics marked for deletion TLDR: Adds broker-side JMX metrics for the number of topics currently marked for deletion and the total number of replica-deletion tasks queued in the controller. When many topics are deleted simultaneously, the controller can become overwhelmed processing deletion tasks, but operators have no metric to track how many deletions remain pending or to determine when the queue will drain. |
Metrics Broker | Accepted | David Arthur | 2019-08-05 | 2019-08-13 | KAFKA-8753 | 2.4 | |
| 502 | Connect SinkTask.put(...) to specify ArrayList<SinkRecord> in Signature TLDR: Proposes changing `SinkTask.put(Collection<SinkRecord>)` to `SinkTask.put(List<SinkRecord>)` to expose the full `List` interface to connector implementors. The `Collection` type in the current signature prevents connector developers from using index-based access, `subList()`, and other `List`-specific methods that simplify batch processing logic. |
Connect | Discussion | Cyrus Vafadari | 2019-08-03 | 2019-09-24 | KAFKA-8749 | ||
| 501 | Avoid out-of-sync or offline partitions when follower fetch requests not processed in time TLDR: KIP-501 proposes tracking when a follower last sent a fetch request separately from when the leader finished processing it, so that a follower is not evicted from the ISR under `replica.lag.time.max.ms` merely because the leader was slow to service its fetches. When a leader's I/O is slow (GC pause, disk stall), it may not process a follower's fetch request within `replica.lag.time.max.ms` even though the follower is actively fetching, causing ISR shrinkage and potential offline partitions that are not caused by follower issues. |
Broker | Discussion | Satish Duggana | 2019-07-30 | 2021-06-29 | KAFKA-8733 | ||
| 500 | Replace ZooKeeper with a Self-Managed Metadata Quorum TLDR: Replaces Apache ZooKeeper as Kafka's metadata store with a self-managed Raft-based metadata quorum (KRaft), in which a set of controller nodes (which may be co-located with brokers) replicate the metadata log as leader and followers. ZooKeeper is an external dependency that limits metadata scalability (partition count), complicates deployment, and creates an operational boundary between Kafka and its metadata store. |
KRaft | Accepted | Colin McCabe | 2019-07-29 | 2020-07-09 | KAFKA-9119 | 2.8 | BrokerRegistration AddOffsetsToTxn AddPartitionsToTxn AlterConfigs AlterPartitionReassignments AlterUserScramCredentials CreateAcls CreateDelegationToken CreatePartitions DeleteAcls DescribeAcls DescribeClientQuotas DescribeConfigs DescribeDelegationToken DescribeUserScramCredentials ElectLeaders EndTxn ExpireDelegationToken InitProducerId ListPartitionReassignments Metadata RenewDelegationToken TxnOffsetCommit UpdateFeatures BrokerHeartbeat Fetch LeaderAndIsr |
| 499 | Unify connection name flag for command line tool TLDR: Standardizes all Kafka CLI tools to use `--bootstrap-server` as the sole flag for specifying broker connection strings, removing or deprecating inconsistent alternatives like `--broker-list` used in some tools. Different CLI tools use different flag names for the same concept, creating confusion for operators who must remember tool-specific flag names. |
Admin | Accepted | Mitchell Henderson | 2019-07-31 | 2019-08-29 | KAFKA-8507 | 2.5 | |
| 498 | Add client-side configuration for maximum response size to protect against OOM TLDR: KIP-498 adds a `max.response.size` client-side config that is passed to `NetworkReceive` as the maximum allowable response size, causing the client to throw `InvalidReceiveException` instead of attempting to allocate a multi-hundred-MB buffer when it reads a malformed response header. The vulnerability is triggered when a producer configured with `security.protocol=PLAINTEXT` connects to an SSL listener: the broker sends a TLS alert whose first four bytes decode to ~350 MB, causing an OOM on the client. |
Client | Discussion | Alexandre Dupriez | 2019-07-28 | 2019-08-06 | KAFKA-4090 | ||
| 497 | Add inter-broker API to alter ISR TLDR: Introduces a new inter-broker `AlterIsrRequest`/`AlterIsrResponse` RPC that allows partition leaders to update the ISR via the controller over the Kafka protocol instead of writing directly to ZooKeeper. ZooKeeper-based ISR updates are a scalability bottleneck and a prerequisite for KIP-500's ZooKeeper-free architecture since the controller must be the single authoritative source for ISR state. |
Protocol Broker | Accepted | Jason Gustafson | 2019-06-26 | 2020-04-16 | KAFKA-8836 | 2.7 | |
| 496 | Administrative API to delete consumer offsets TLDR: Adds AdminClient#deleteConsumerGroupOffsets() to allow deletion of committed offsets for a stopped consumer group on specific partitions, without deleting the group itself. Previously operators could only delete entire groups or wait for retention-based expiry; there was no way to selectively reset offsets on individual partitions for a dead group. |
Admin Consumer | Accepted | Jason Gustafson | 2019-07-23 | 2019-09-20 | KAFKA-8730 | 2.4 | OffsetDelete |
| 495 | Dynamically Adjust Log Levels in Connect TLDR: Adds a dynamic log level REST API (`PUT /admin/loggers/{logger}`) to Kafka Connect workers, allowing log levels to be changed at runtime without restarting the worker. Previously, changing log levels in Connect required editing `log4j.properties` and restarting the worker, which often masked bugs by resetting in-memory state. |
Connect Admin | Accepted | Arjun Satish | 2019-07-22 | 2023-10-11 | KAFKA-7772 | 3.6 | |
| 492 | Add java security providers in Kafka Security config TLDR: Adds a `security.providers` configuration accepting a comma-separated list of `java.security.Provider` class names that Kafka will dynamically register into the JVM security framework at startup. Custom `KeyManager`/`TrustManager` algorithms referenced via `ssl.keymanager.algorithm` require their `Provider` to be registered in `java.security` JVM properties, forcing operators to modify JVM startup files rather than Kafka configuration. |
Security | Accepted | Sai Sandeep Mopuri | 2019-07-15 | 2019-10-20 | KAFKA-8669 | 2.4 | |
| 490 | New metric to count offsets expired without being consumed by a consumer group TLDR: KIP-490 proposes a topic-level config `non.consumed.offsets.groups` listing consumer groups for which the broker emits a JMX metric counting messages that have been deleted by retention before being consumed by the group. Without this metric, consumers have no signal that their offset has been overtaken by the log-start-offset, meaning messages were silently lost due to retention. |
Metrics Consumer | Discussion | Jose Morales Aragon | 2019-07-12 | 2019-07-24 | |||
| 488 | Clean up Sum,Count,Total Metrics TLDR: Consolidates and renames the overlapping `Count`, `Sum`, `Total`, and `SampledTotal` metric classes in Kafka's metrics framework, eliminating redundant stat implementations and clarifying the semantics of cumulative vs. windowed counts. The existing metric hierarchy has multiple classes with confusingly similar names and behaviors, some of which are duplicates, making it difficult for developers to choose the right metric type. |
Metrics | Accepted | John Roesler | 2019-07-11 | 2019-07-22 | KAFKA-8696 | 2.4 | |
| 487 | Automatic Topic Creation on Producer TLDR: Proposes an `allow.auto.create.topics` configuration for the producer client that controls whether producing to a non-existent topic triggers an auto-creation metadata request. The current auto-topic-creation is triggered implicitly during metadata fetches, conflating metadata retrieval with topic creation—a side effect that is surprising, makes metadata requests non-idempotent, and bypasses AdminClient-based topic lifecycle management. |
Producer Client | Discussion | Justine Olshan | 2019-07-10 | 2019-08-06 | KAFKA-8657 | ||
| 484 | Expose metrics for group and transaction metadata loading duration TLDR: Adds `group-metadata-load-duration-ms` and `txn-metadata-load-duration-ms` histograms to the broker's `GroupCoordinator` and `TransactionCoordinator` respectively, measuring how long it takes to load partition state after a coordinator leadership change. When a consumer group coordinator partition moves to a new broker, offset loading can take arbitrarily long but appears externally as unexplained consumer group inactivity with no diagnostic metric. |
Metrics Consumer Transactions | Accepted | Anastasia Vela | 2019-06-25 | 2019-08-01 | KAFKA-6263 | 2.4 | |
| 482 | The Kafka Protocol should Support Optional Tagged Fields TLDR: Defines a tagged-fields encoding extension for the Kafka wire protocol that allows optional fields to be appended to any request or response version without incrementing the schema version, using a compact varint tag+length encoding. Adding any new optional field to an RPC currently requires bumping the request version and updating all codec implementations, making incremental protocol evolution costly and slowing the pace of protocol improvement. |
Protocol | Accepted | Colin McCabe | 2019-06-25 | 2019-09-06 | KAFKA-8885 | 2.4 | Fetch Metadata |
| 481 | SerDe Improvements for Connect Decimal type in JSON TLDR: Adds a `decimal.format` configuration to the `JsonConverter` (values: `BASE64`, `NUMERIC`) so that Connect Decimal schema fields can be serialized as human-readable decimal strings (e.g., `10.2345`) in addition to the existing BASE64-encoded binary representation. All existing JSON systems represent decimals as numeric strings, making Connect's BASE64 encoding incompatible with any JSON consumer that is not Kafka Connect–aware. |
Connect | Accepted | Almog Gavra | 2019-06-24 | 2019-09-17 | KAFKA-8595 | 2.4 | |
| 480 | Sticky Partitioner TLDR: Introduces sticky partitioning in the default producer partitioner (and a standalone `UniformStickyPartitioner`) that sends consecutive records without an explicit key to the same partition until a batch is completed, then rotates to a new partition. The previous round-robin behavior assigned each keyless record to a different partition, producing many small batches and higher latency even when `linger.ms=0`, because records for the same partition never accumulate. |
Producer | Accepted | Justine Olshan | 2019-06-24 | 2019-08-15 | KAFKA-8601 | 2.4 | |
| 479 | Add StreamJoined config object to Join TLDR: Adds a StreamJoined config object to KStream join operations, allowing users to specify custom SerDes, custom state store suppliers, and named repartition topics for both sides of a stream-stream join. Previously there was no unified way to configure the two windowed state stores backing a stream join, and SerDes had to be inferred or set globally. |
Streams | Accepted | Bill Bejeck | 2019-06-18 | 2019-09-25 | KAFKA-8558 | 2.4 | |
| 478 | Strongly typed Processor API TLDR: Adds compile-time type safety to the Kafka Streams Processor API by introducing a typed `Processor<KIn, VIn, KOut, VOut>`, `ProcessorSupplier<KIn, VIn, KOut, VOut>`, and related interfaces so the processor chain's key-value types are checked at compile time. The original Processor API used raw types for inputs and outputs, allowing type mismatches to fail only at runtime. |
Streams | Accepted | John Roesler | 2019-06-15 | 2021-06-16 | KAFKA-8396 | 2.7 | |
| 477 | Add PATCH method for connector config in Connect REST API TLDR: Adds HTTP PATCH support to the /connectors/{id}/config endpoint in the Kafka Connect REST API, enabling partial connector configuration updates that merge the provided fields into the existing config without replacing the entire document. The existing PUT-only approach required a GET-modify-PUT cycle that was non-atomic and error-prone, with a race window where concurrent modifications could overwrite each other. |
Connect | Accepted | Ivan Yurchenko | 2019-06-13 | 2024-05-09 | KAFKA-16445 | 3.8 | |
| 476 | Add Java AdminClient Interface TLDR: Converts AdminClient from an abstract class to a Java interface, enabling proper mocking, multiple implementations, and cleaner dependency injection. The abstract class design was a Java 7 workaround; with Java 8 as the minimum version, default interface methods can replace base class implementations while giving users testable interfaces. |
Admin | Accepted | Andrew Coates | 2019-05-31 | 2019-08-06 | KAFKA-8454 | 2.4 | |
| 475 | New Metrics to Measure Number of Tasks on a Connector TLDR: Adds connector-level task count metrics to Kafka Connect—`connector-total-task-count`, `connector-running-task-count`, `connector-paused-task-count`, `connector-failed-task-count`, `connector-unassigned-task-count`—per connector instance. The existing task count metric is only available at the worker level, preventing per-connector task monitoring and making it impossible to detect connectors that are creating too many or too few tasks. |
Connect Metrics | Accepted | Cyrus Vafadari | 2019-05-30 | 2019-07-03 | KAFKA-8447 | 2.4 | |
| 473 | Enable KafkaLog4JAppender to use SASL Authentication Callback Handlers TLDR: Extends `KafkaLog4JAppender` to support pluggable SASL authentication callback handlers via `sasl.login.callback.handler.class` and `sasl.client.callback.handler.class` properties, consistent with KIP-86. The existing `KafkaLog4JAppender` with SASL (added by KIP-425) requires credentials to be hardcoded in the JAAS configuration, preventing use of dynamic credential providers like AWS IAM or Vault for log appenders. |
Security Client | Discussion | Ryan Pridgeon | 2019-05-23 | 2019-05-24 | KAFKA-8419 | ||
| 471 | Expose RocksDB Metrics in Kafka Streams TLDR: Exposes RocksDB internal statistics (block cache hits, compaction stats, memtable metrics, etc.) as Kafka Streams JMX metrics via the existing metrics reporter infrastructure. RocksDB is the storage backend for stateful Streams operators, but its performance bottlenecks (cache pressure, write stalls) were invisible without attaching a separate JVM profiler or reading dump files. |
Streams Metrics | Accepted | Bruno Cadonna | 2019-05-17 | 2019-06-21 | KAFKA-6498 | 2.4 | |
| 470 | TopologyTestDriver test input and output usability improvements TLDR: Introduces strongly-typed `TestInputTopic<K,V>` and `TestOutputTopic<K,V>` wrappers for `TopologyTestDriver` with fluent pipe/read methods, deprecating the raw `ConsumerRecordFactory` and `pipeInput`/`readOutput` APIs. Existing `TopologyTestDriver` tests require verbose boilerplate—constructing `ConsumerRecord` objects manually and passing deserializers on every read—making topology tests significantly larger than the application code under test. |
Streams Testing | Accepted | Jukka Karvanen | 2019-05-15 | 2019-10-04 | KAFKA-8233 | 2.4 | |
| 467 | Augment ProduceResponse error messaging for specific culprit records TLDR: Extends `ProduceResponse` (v8) with per-record error details that identify which specific records within a produce request caused a validation failure (invalid magic, missing key on a compacted topic, non-monotonic offsets), not just the partition-level error code. When a `ProduceRequest` contains multiple records and one is invalid, the current response returns only a partition-level `INVALID_RECORD` error with no indication of which record is the culprit. |
Protocol Broker | Accepted | Guozhang Wang | 2019-05-06 | 2019-08-02 | KAFKA-8729 | 2.4 | Produce v8 |
| 466 | Add support for List<T> serialization and deserialization TLDR: Adds `ListSerializer`, `ListDeserializer`, and `ListSerde` to the Kafka clients library for serializing and deserializing `List<T>` types with a configurable inner serde. Kafka Streams DSL operations that aggregate values into lists (e.g., `aggregate`) have no built-in `List` serde, forcing users to implement custom serialization. |
Streams Client | Accepted | Daniyar Yeralin | 2019-05-02 | 2021-05-14 | KAFKA-8326 | 3.0 | |
| 465 | Add Consolidated Connector Endpoint to Connect REST API TLDR: Adds `?expand=status` and `?expand=info` query parameters to the `GET /connectors` REST endpoint so clients can retrieve connector status and/or configuration for all connectors in a single request. Without this, fetching the status of N connectors requires N+1 HTTP calls (`GET /connectors` to list names, then `GET /connectors/{name}/status` for each), causing O(N) latency at scale. |
Connect | Accepted | dan norwood | 2019-05-01 | 2019-05-16 | KAFKA-8309 | 2.3 | |
| 462 | Use local thread id for KStreams TLDR: Changes Kafka Streams thread IDs to use a JVM-local thread counter rather than a shared static counter across multiple KafkaStreams instances in the same JVM, ensuring stable and unique group.instance.id values for static membership. Multiple KafkaStreams instances in one JVM shared a static thread counter, causing thread IDs to be non-deterministic across restarts when instances initialized concurrently, breaking static membership semantics. |
Streams Consumer | Accepted | Boyang Chen | 2019-04-25 | 2019-05-03 | KAFKA-8285 | 2.3 | |
| 461 | Improve Replica Fetcher behavior at handling partition failure TLDR: Changes the replica fetcher thread to isolate partition-level errors so that a fetch failure for one partition only removes that partition from tracking, allowing the thread to continue fetching all other healthy partitions. Currently, an unrecoverable exception in one partition causes the entire replica fetcher thread to terminate, leaving all other partitions it was serving to become under-replicated. |
Broker | Accepted | Aishwarya Gune | 2019-04-25 | 2019-08-02 | KAFKA-8346 | 2.3 | StopReplica |
| 460 | Admin Leader Election RPC TLDR: Extends the AdminClient leader election API with an explicit election type in the `ElectLeaders` RPC, supporting unclean leader election in addition to preferred replica election. Previously only preferred replica election was accessible via the AdminClient; triggering an unclean election to restore availability when the ISR is empty required direct ZooKeeper manipulation. |
Admin Broker | Accepted | Jose Armando Garcia Sancio | 2019-04-24 | 2019-06-04 | KAFKA-8286 | 2.4 | ElectLeaders v1 |
| 458 | Connector Client Config Override Policy TLDR: Adds a connector.client.config.override.policy configuration to the Connect worker to control which producer/consumer config properties connectors are allowed to override from their task configs. Without this, any connector (including untrusted ones deployed to a shared cluster) could override sensitive worker-level client configs like security credentials or bootstrap servers. |
Connect Security | Accepted | Magesh kumar Nandakumar | 2019-04-19 | 2019-05-17 | KAFKA-8265 | 2.3 | |
| 455 | Create an Administrative API for Replica Reassignment TLDR: Introduces an `AdminClient` API for incremental replica reassignment and cancellation of in-progress reassignments, replacing the ZooKeeper-based `/admin/reassign_partitions` node interface. The ZooKeeper-based interface lacked error codes, security controls, incremental updates, and the ability to cancel an ongoing reassignment. |
Admin Broker | Accepted | Colin McCabe | 2019-04-16 | 2020-04-20 | KAFKA-8345 | 2.4 | ListPartitionReassignments AlterPartitionReassignments LeaderAndIsr |
| 454 | Expansion of the ConnectClusterState interface TLDR: KIP-454 extends the `ConnectClusterState` interface (introduced in KIP-382 for REST extensions) with additional metadata including connector and task status, worker information, and connector configuration. The original interface exposed only minimal cluster metadata, preventing REST extensions from implementing meaningful request validation or routing logic. |
Connect | Accepted | Chris Egerton | 2019-04-14 | 2019-05-09 | KAFKA-8231 | 2.3 | |
| 453 | Add close() method to RocksDBConfigSetter TLDR: Adds a close() method to the RocksDBConfigSetter interface, called when the RocksDB instance is shut down, allowing users to release native RocksObjects they created in setConfig(). RocksJava options objects (e.g., BlockBasedTableConfig, BloomFilter) extend AbstractNativeReference and must be explicitly closed to free C++ heap memory; without a close hook, user-created objects leaked native memory. |
Streams | Accepted | A. Sophie Blee-Goldman | 2019-04-13 | 2019-05-17 | KAFKA-8324 | 2.3 | |
| 452 | Tool to view cluster status TLDR: Proposes a kafka-cluster-status.sh CLI tool that displays active brokers, rack assignments, controller identity, listener addresses, and Kafka version without requiring ZooKeeper or JMX access. Operators currently have no single CLI command to get a quick health overview of the cluster; they must query ZooKeeper directly or parse JMX metrics from each broker individually. |
Admin | Discussion | Łukasz Antoniak | 2019-04-12 | 2019-04-12 | KAFKA-6393 | | Metadata |
| 450 | Sliding Window Aggregations in the DSL TLDR: Implements sliding windows in Kafka Streams where each record creates exactly two new windows (one ending at the record's timestamp, one starting at it), enabling precise window boundary semantics without size-aligned bucketing. Tumbling and hopping windows align boundaries to fixed grid intervals, which loses the fine-grained event-time resolution needed for use cases like computing metrics over a sliding N-millisecond window relative to each event. |
Streams | Accepted | Leah Thomas | 2020-07-20 | 2020-07-20 | | 2.7 | |
| 449 | Add connector contexts to log messages in Connect workers TLDR: Adds structured logging context (connector name, task ID) to Connect worker log messages using MDC (Mapped Diagnostic Context), so that log entries from different connectors and tasks running in the same JVM can be attributed to their source. Connect worker logs intermix output from multiple threads handling different connectors and tasks, making it extremely difficult to trace a connector failure through log lines without manual correlation. |
Connect | Accepted | Randall Hauch | 2019-04-02 | 2019-05-16 | KAFKA-3816 | 2.3 | |
| 448 | Add State Stores Unit Test Support to Kafka Streams Test Utils TLDR: KIP-448 adds `MockStoreFactory` and mock `KeyValueStoreBuilder`, `WindowStoreBuilder`, and `SessionStoreBuilder` implementations to the `kafka-streams-test-utils` module so developers can build state stores backed by simple in-memory maps for unit testing. Writing unit tests for DSL operators previously required hand-rolling complex EasyMock-based store builder mocks, making test code verbose and fragile. |
Streams Testing | Discussion | Yishun Guan | 2019-04-01 | 2019-10-14 | KAFKA-6460 | ||
| 447 | Producer scalability for exactly once semantics TLDR: Scales exactly-once semantics (EOS) to applications with many input partitions by fencing zombie producers through the consumer group coordinator (consumer generation passed via `sendOffsetsToTransaction(ConsumerGroupMetadata)`), so a single transactional producer per process can safely commit offsets for many partitions. The original EOS design (KIP-98) tied fencing to a unique transactional ID per input partition, forcing Kafka Streams to run one producer per task and creating heavy per-producer overhead for applications with many tasks. |
Transactions Producer | Accepted | Jason Gustafson | 2019-03-09 | 2026-03-04 | KAFKA-8587 | 2.5 | OffsetCommit TxnOffsetCommit |
| 446 | Add changelog topic configuration to KTable suppress TLDR: Adds a `Suppressed.withLoggingDisabled()` option and a `Materialized`-style `SuppressedInternal` configuration to allow users to configure the changelog topic created for a `KTable.suppress()` operation, including setting `cleanup.policy=delete` with a shorter retention. The suppress operator's changelog topic inherits the same aggressive compaction defaults as other KTable changelogs, but suppress data is short-lived and highly compactable, wasting storage and increasing compaction load. |
Streams | Accepted | Maarten Duijn | 2019-03-25 | 2020-02-25 | KAFKA-8147 | 2.7 | |
| 445 | In-memory Session Store TLDR: Adds an in-memory `SessionStore` implementation to Kafka Streams via a new `Stores.inMemorySessionStore()` factory method. The session store type previously lacked an in-memory variant, forcing all session-windowed aggregations to use RocksDB even for workloads that fit in heap and do not need persistence. |
Streams | Accepted | A. Sophie Blee-Goldman | 2019-03-29 | 2019-04-26 | KAFKA-8029 | 2.3 | |
| 444 | Augment metrics for Kafka Streams TLDR: Adds extensive Kafka Streams JMX metrics covering thread state, task counts (active, standby, restored), store sizes, record processing rates, commit rates, and punctuator invocation rates. The existing Streams metrics were too sparse to support production monitoring dashboards, alerting thresholds, or autoscaling decisions based on processing lag. |
Streams Metrics | Accepted | Guozhang Wang | 2019-03-29 | 2020-08-20 | KAFKA-6820 | 2.4 | |
| 443 | Return to default segment.ms and segment.index.bytes in Streams repartition topics TLDR: Removes Kafka Streams' overrides of `segment.ms` (set to 1 minute) and `segment.index.bytes` (set to 52428800) for internal repartition topics, reverting to broker defaults. The aggressive `segment.ms=1min` override caused excessive log segment rolling on low-traffic repartition topics, leading to large numbers of small segments that impacted compaction performance and were the root cause of KAFKA-7190. |
Streams Broker | Accepted | Guozhang Wang | 2019-03-27 | 2019-04-03 | KAFKA-7190 | 2.3 | |
| 442 | Return to default max poll interval in Streams TLDR: Restores `max.poll.interval.ms` in Kafka Streams' internal consumer to the standard Kafka default (`300000ms`) instead of the previously overridden `Integer.MAX_VALUE` (effectively infinite). Streams set an infinite poll interval because it did not call `poll()` during state restoration prior to Kafka 1.0; since 1.0 introduced polling during restore, the infinite override is no longer needed and masks real liveness failures. |
Streams Consumer | Accepted | John Roesler | 2019-03-27 | 2019-04-04 | KAFKA-6399 | 2.3 | |
| 441 | Smooth Scaling Out for Kafka Streams TLDR: Introduces a `warmup` task state in Kafka Streams where a new instance restores standby changelogs before being assigned active tasks, so the prior active owner retains the active assignment until the new owner is sufficiently caught up. After a rebalance, stateful tasks assigned to a new instance could block processing for hours while rebuilding their state stores from scratch, causing severe availability degradation during scale-out. |
Streams Consumer | Accepted | Guozhang Wang | 2019-03-15 | 2020-05-01 | KAFKA-8019 | 2.5 | |
| 440 | Extend Connect Converter to support headers TLDR: Extends the Connect Converter interface with header-aware serialize/deserialize methods that receive the Kafka record headers alongside the value. Converters previously had no access to headers, making it impossible to encode schema metadata (e.g., schema registry IDs) in headers rather than embedding them in the value payload. |
Connect | Accepted | Yaroslav Tkachenko | 2019-03-12 | 2019-09-25 | KAFKA-7273 | 2.4 | |
| 439 | Cleanup built-in Store interfaces TLDR: Cleans up the ReadOnlyWindowStore and ReadOnlySessionStore interfaces by standardizing return types (KeyValueIterator everywhere instead of mixed WindowStoreIterator/KeyValueIterator) and renaming methods for clarity. The existing window/session store interfaces had inconsistent method names and mixed return types that confused users and made it hard to write correct store queries. |
Streams | Discussion | Matthias J. Sax | 2019-03-11 | 2019-05-28 | KAFKA-8088 | ||
| 438 | Expose task, connector IDs in Connect API TLDR: Exposes the task ID and connector name to `SourceTask` and `SinkTask` implementations via a `TaskContext` object passed to `initialize()`, allowing connector code to include its own task identity in log messages and custom metrics. Connect tasks currently lack programmatic access to their assigned task ID, making connector-generated logs and metrics unattributable to the specific task instance. |
Connect Admin | Discussion | Ryanne Dolan | 2019-03-04 | 2019-03-05 | |||
| 437 | Custom replacement for MaskField SMT TLDR: Extends the `MaskField` SMT to accept a configurable `replacement` value so masked fields can be replaced with a custom string or number (e.g. `***-***-****`) instead of the type's null equivalent. The original `MaskField` only replaced values with `null` or zero, making it unusable for PII redaction scenarios that require a recognizable sentinel value rather than nulling the field. |
Connect | Accepted | Valeria Vasylieva | 2019-02-28 | 2020-05-15 | KAFKA-6755 | 2.6 | |
| 436 | Add a metric indicating start time TLDR: Adds a process.start.time.ms metric to Kafka brokers and clients, recording the Unix timestamp when the process started. Without a start time metric, operators cannot determine how long a process has been running from metrics alone, making it impossible to detect silent restarts or measure process uptime in alerting systems. |
Metrics Broker | Accepted | Stanislav Kozlovski | 2019-02-24 | 2019-03-20 | |||
| 435 | Internal Partition Reassignment Batching TLDR: Introduces incremental batching for partition reassignment so new replicas are brought in-sync one at a time before decommissioning old ones, rather than all concurrently. Reassigning many replicas simultaneously causes fetch amplification proportional to the replication factor and makes throttle configuration unpredictable because ISR traffic bypasses throttle limits. |
Broker | Discussion | Viktor Somogyi | 2019-02-22 | 2019-12-04 | KAFKA-6794 | ListPartitionReassignments | |
| 434 | Add Replica Fetcher and Log Cleaner Count Metrics TLDR: Adds JMX metrics for the current count of active replica fetcher threads and log cleaner threads on each broker. When fetcher or cleaner threads die due to unrecoverable exceptions, there is currently no metric signal—operators only notice the problem after replicas begin lagging or disks fill up. |
Metrics Broker | Accepted | Viktor Somogyi | 2019-02-22 | 2020-04-30 | KAFKA-7981 | 2.4 | |
| 431 | Support of printing additional ConsumerRecord fields in DefaultMessageFormatter TLDR: Adds `print.offset`, `print.partition`, `print.headers`, and related formatting options to the `DefaultMessageFormatter` in `kafka-console-consumer.sh`. The console consumer currently can only print key, value, and timestamp—fields like partition, offset, and headers require file-system access to the broker (via `kafka-dump-log.sh`) or external tools like `kafkacat`. |
Consumer Admin | Accepted | Mateusz Zakarczemny | 2019-02-18 | 2020-10-03 | KAFKA-6733 | 2.7 | |
| 430 | Return Authorized Operations in Describe Responses TLDR: Adds an `authorized_operations` field to DescribeTopics, DescribeGroups, and DescribeCluster responses, populated when the request sets the `include_authorized_operations` flag, listing which ACL operations the requesting principal is authorized to perform on the described resource. Without this, clients like Kafka Manager or Cruise Control must issue separate ACL describe requests (requiring additional permissions) to determine what operations a user can perform. |
Security Admin Protocol | Accepted | Rajini Sivaram | 2019-02-12 | 2019-05-24 | KAFKA-7922 | 2.3 | DescribeGroups Metadata |
| 429 | Kafka Consumer Incremental Rebalance Protocol TLDR: Introduces an incremental cooperative rebalance protocol for the Kafka consumer client, where members surrender only the partitions that need to move rather than releasing all partitions on each rebalance. The existing eager stop-the-world rebalance causes all consumers in a group to pause processing while the coordinator reassigns all partitions, creating significant latency spikes for stateful applications like Kafka Streams. |
Consumer | Accepted | Boyang Chen | 2019-02-12 | 2021-05-11 | KAFKA-8019 | 2.4 | JoinGroup |
| 428 | Add in-memory window store TLDR: Adds an in-memory implementation of the windowed state store for Kafka Streams, available via Stores.inMemoryWindowStore(). Only a persistent (RocksDB-backed) windowed store existed, but some use cases — short retention windows, latency-sensitive stateful ops — benefit from keeping all state in heap memory without the serialization overhead of RocksDB. |
Streams | Accepted | A. Sophie Blee-Goldman | 2019-02-06 | 2019-02-20 | KAFKA-4730 | 2.3 | |
| 426 | Persist Broker Id to Zookeeper TLDR: Proposes persisting auto-generated broker IDs to ZooKeeper in addition to the local meta.properties file, so a broker that loses its disk can recover its original ID on restart. In large clusters with frequent disk failures, losing the meta.properties file causes a broker to generate a new ID, creating ghost entries in ZooKeeper and requiring manual cleanup. |
Broker | Discussion | Kan Li | 2019-02-04 | 2019-02-08 | KAFKA-1070 | ||
| 425 | Add some Log4J Kafka Appender Properties for Producing to Secured Brokers TLDR: KIP-425 adds `SaslMechanism` and `ClientJaasConf` configuration properties to the `KafkaLog4jAppender` so it can produce to brokers requiring SASL/PLAIN authentication with inline JAAS configuration rather than only GSSAPI with a JVM-level JAAS config file. This unblocks Log4j-based audit logging to secured Kafka clusters that use PLAIN or other SASL mechanisms without requiring a separate JAAS config file on the producer host. |
Security Producer | Accepted | Rohan Desai | 2019-02-03 | 2019-05-28 | KAFKA-7896 | 2.3 | |
| 424 | Allow suppression of intermediate events based on wall clock time TLDR: Extends Kafka Streams' KTable suppress operator with a wall-clock-time-based suppression option (via `Suppressed.untilTimeLimit(Duration, BufferConfig)`) in addition to the existing stream-time-based semantics. Stream-time-based suppression stalls indefinitely if no new events arrive, making it unusable for rate-capping aggregate emission in sparse or bursty IoT workloads where real time must govern flush cadence. |
Streams | Discussion | Jonathan Gordon | 2019-01-23 | 2020-05-01 | KAFKA-7748 | ||
| 423 | Add JoinReason to Consumer Join Group Protocol TLDR: Proposes adding a `joinReason` field to the `JoinGroup` request so that brokers can log the specific reason a consumer initiated a rebalance (new member, metadata change, leader rejoin, etc.), improving rebalance diagnostics as a complement to KIP-345's static membership work. Without structured join reasons, diagnosing why unnecessary rebalances occur requires correlating timestamps across broker and client logs with no explicit causal signal. |
Consumer Protocol | Discussion | Boyang Chen | 2019-01-28 | 2019-02-24 | KAFKA-7728 | JoinGroup LeaveGroup | |
| 422 | Add client quota management to the AdminClient API TLDR: Proposes adding client quota management (create, alter, describe, delete) to the `AdminClient` API, covering user, client-ID, and IP quota entities. Previously, client quota configuration required direct ZooKeeper writes (`/config/users/`, `/config/clients/`), bypassing broker-side authorization and making programmatic quota management impossible without a ZooKeeper dependency. |
Admin | Discarded | Yaodong Yang | 2018-12-23 | 2020-04-09 | KAFKA-7740 | 2.6 | DescribeConfigs AlterConfigs IncrementalAlterConfigs |
| 421 | Support resolving externalized secrets in AbstractConfig TLDR: Extends Kafka's config system so that `AbstractConfig` resolves config values from external sources (Vault, environment variables, files) at startup through the `ConfigProvider` SPI, using a `${provider:key}` syntax in config files. Secrets like passwords and credentials are currently stored in plaintext in server.properties and connector configs, with no standard mechanism to retrieve them from secret managers. |
Connect Admin | Accepted | TEJAL ADSUL | 2019-01-21 | 2019-05-13 | KAFKA-7847 | 2.3 | |
| 420 | Add Single Value Fetch in Session Stores TLDR: Adds a `fetch(Bytes key, long startTime, long endTime)` single-point lookup method to the `SessionStore` interface, enabling efficient O(log N) retrieval of a specific session window by key and exact time bounds. Without single-value fetch, retrieving a specific session required a range scan over all sessions for a key, even when the exact start and end time of the target session are known. |
Streams | Accepted | Guozhang Wang | 2019-01-19 | 2019-01-31 | KAFKA-7652 | 2.3 | |
| 418 | A method-chaining way to branch KStream TLDR: Replaces KStream#branch(Predicate...) with a fluent split()/branch()/defaultBranch() API returning a named Map<String, KStream> instead of an array. Eliminates the impedance mismatch between Java generics and arrays, removes magic-number array indexing, and enables dynamic branch construction without unchecked casts. |
Streams | Accepted | Ivan Ponomarev | 2019-01-18 | 2021-01-19 | KAFKA-5488 | 2.8 | |
| 417 | Allow JmxTool to connect to a secured RMI port TLDR: Extends `JmxTool` (run via `kafka-run-class.sh kafka.tools.JmxTool`) with `--jmx-ssl-enable` and `--jmx-auth-prop` options to connect to JMX RMI ports secured with SSL and password authentication respectively. When JMX is configured with SSL or password auth, `JmxTool` fails with `ConnectIOException` because it uses a plain unauthenticated `JMXConnector`, making the tool unusable against secured production brokers. |
Admin Metrics | Accepted | Fangbin Sun | 2019-01-16 | 2019-05-11 | KAFKA-7455 | 2.3 | |
| 416 | Notify SourceTask of ACK'd offsets, metadata TLDR: Adds a callback to SourceTask notifying it of the downstream Kafka offsets after produced records are ACK'd by the broker. MirrorMaker 2 and other replication connectors need downstream offsets to implement cross-cluster offset translation, but WorkerSourceTask previously discarded this information. |
Connect | Discussion | Ryanne Dolan | 2018-12-15 | 2019-10-05 | KAFKA-7815 | 2.4 | |
| 415 | Incremental Cooperative Rebalancing in Kafka Connect TLDR: Replaces Connect's eager stop-the-world rebalance with an incremental cooperative protocol where only tasks that need to move are stopped and restarted, leaving unaffected tasks running throughout the rebalance. The existing eager protocol revokes all task assignments at the start of every rebalance, causing a full restart of all connectors whenever any single connector is added, updated, or removed. |
Connect | Accepted | Konstantine Karantasis | 2019-01-11 | 2019-05-17 | KAFKA-5505 | 2.3 | |
| 414 | Expose Embedded ClientIds in Kafka Streams TLDR: Exposes the client IDs of Kafka Streams' embedded producer, consumer, restore-consumer, and admin client instances via ThreadMetadata. Without this, operators cannot correlate Streams thread metrics with the corresponding low-level client metrics (e.g., producer byte rates) because the embedded client IDs were opaque internal strings. |
Streams Metrics | Accepted | Guozhang Wang | 2019-01-09 | 2019-02-08 | KAFKA-7798 | 2.2 | |
| 412 | Extend Admin API to support dynamic application log levels TLDR: Extends the `AdminClient` API with a new `BROKER_LOGGER` config resource type, usable with `describeConfigs` and `incrementalAlterConfigs`, to allow dynamic modification of log4j logger levels without requiring JMX or a broker restart. Static `log4j.properties` configuration can only be applied at startup, preventing runtime log-level adjustments critical for production diagnosis. |
Admin | Accepted | Stanislav Kozlovski | 2019-01-08 | 2021-07-29 | KAFKA-7800 | 2.4 | AlterConfigs IncrementalAlterConfigs |
| 411 | Make default Kafka Connect worker task client IDs distinct TLDR: KIP-411 changes the default `client.id` for Kafka clients created by Connect worker tasks to include the connector name and task ID (e.g. `connector-consumer-conn1-2`), making per-task producer and consumer JMX MBeans uniquely named. Previously all tasks on a worker inherited the same default client ID, causing JMX MBean naming conflicts that made per-task metric collection impossible. |
Connect | Accepted | Paul Davidson | 2018-12-21 | 2019-05-23 | KAFKA-5061 | 2.3 | |
| 410 | Add metric for request handler thread pool utilization by request type TLDR: KIP-410 proposes adding a per-request-type JMX metric (`RequestHandlerPoolUsagePercent`) measuring what fraction of request-handler thread pool time is consumed by each API request type (Produce, Fetch, etc.). When the request handler pool is saturated, this metric lets operators identify which request type is the bottleneck, which is not distinguishable from the existing aggregate pool utilization metrics. |
Metrics Broker | Discussion | Mayuresh Gharat | 2018-12-20 | 2019-01-02 | KAFKA-7681 | ||
| 409 | Allow creating under-replicated topics and partitions TLDR: Allows `CreateTopics` and `CreatePartitions` requests to succeed even when fewer brokers are available than the requested replication factor, creating partitions in an under-replicated state initially. Currently, any broker outage—even a routine rolling restart—blocks all topic and partition creation because the broker enforces that all replica slots be assigned to live brokers before acknowledging the request. |
Broker | Discussion | Mickael Maison | 2018-12-18 | 2019-12-18 | CreateTopics CreatePartitions | ||
| 407 | Kafka Connect support override worker kafka api configuration with connector configuration that post by rest api TLDR: KIP-407 proposes allowing per-connector overrides of worker-level Kafka client configs by accepting `consumer.*` and `producer.*` prefixed properties in the connector config POSTed via the REST API. Without this, all sink connectors on a worker share the same consumer settings (e.g. `auto.offset.reset`) configured in the worker properties file, making it impossible to run connectors with different offset-reset semantics on the same worker. |
Connect | Discussion | laomei | 2018-12-17 | 2018-12-17 | KAFKA-7723 | ||
| 406 | GlobalStreamThread should honor custom offset policy. TLDR: Makes `GlobalStreamThread` automatically handle `InvalidOffsetException` by resetting offsets according to the application's `auto.offset.reset` policy and rebuilding the global state store, instead of propagating the exception to the user. Previously, receiving an `InvalidOffsetException` from global store topics required a manual application restart, whereas stream threads already handle this automatically. |
Streams | Discussion | Richard Yu | 2018-12-16 | 2020-10-26 | KAFKA-7480 | ||
| 405 | Kafka Tiered Storage TLDR: Introduces tiered storage for Kafka, offloading older log segments to remote object storage (S3, GCS, etc.) while keeping only recent segments on local broker disks, governed by a new RemoteLogManager and RemoteStorageManager plugin interface. Local disk capacity proportional to retention × throughput limited Kafka's ability to serve as a long-term data store, requiring expensive broker hardware just to maintain retention. |
Tiered Storage | Accepted | Harsha | 2018-12-14 | 2025-06-11 | KAFKA-7739 | 3.5 | Fetch v14 ListOffsets v8 |
| 403 | Increase ProducerPerformance precision by using nanoTime TLDR: Switches ProducerPerformance's latency measurement from System.currentTimeMillis() to System.nanoTime() for sub-millisecond precision. The ms-granularity timer caused many samples to report 0ms latency on fast systems, skewing percentile calculations and making it impossible to benchmark high-throughput producers accurately. |
Producer | Discussion | Kevin Lu | 2018-12-11 | 2018-12-19 | KAFKA-7722 | ||
| 402 | Improve fairness in SocketServer processors TLDR: Changes SocketServer's Processor threads to interleave processing of new and existing connections rather than draining all new connections first. Under connection storms, the old behavior blocked existing connections from being processed until the storm subsided, causing memory buildup and response latency spikes for established clients. |
Broker | Accepted | Rajini Sivaram | 2018-12-11 | 2019-01-21 | KAFKA-7719 | 2.2 | |
| 400 | Improve exit status in case of errors in ConsoleProducer TLDR: Adds a `--exit-on-producer-error` flag to `kafka-console-producer.sh` and changes the default async-mode behavior to track send callback failures and exit with a non-zero status code after all records are sent if any failed. In async mode, the existing `ConsoleProducer` swallows send errors and always exits with status 0, making it impossible for shell scripts to detect data-loss failures. |
Producer | Discussion | Kamal Kang | 2018-11-12 | 2018-12-14 | |||
| 399 | Extend ProductionExceptionHandler to cover serialization exceptions TLDR: Extends the Kafka Streams `ProductionExceptionHandler` to also cover serialization exceptions that occur before records are handed to the producer, allowing custom logic to skip, reroute, or fail on serialization errors. The original `ProductionExceptionHandler` only intercepted exceptions from the producer callback, leaving pre-send serialization failures unhandled. |
Streams | Accepted | Kamal Chandraprakash | 2018-12-03 | 2023-03-29 | KAFKA-7499 | 3.5 | |
| 395 | Encrypt-then-MAC Delegation token metadata TLDR: Proposes switching delegation token metadata serialization to use Encrypt-then-MAC (AES-CBC + HMAC) rather than storing plaintext token metadata in ZooKeeper. Delegation token metadata stored in ZooKeeper is readable by anyone with ZK access; an authenticated encryption scheme ensures confidentiality and integrity of token information at rest. |
Security | Discussion | Attila Sasvári | 2018-11-29 | 2018-12-03 | KAFKA-7691 | ||
| 394 | Require member.id for initial join group request TLDR: Changes the consumer group join protocol so that the initial `JoinGroup` request with `member.id=UNKNOWN_MEMBER_ID` returns a `MEMBER_ID_REQUIRED` error with a server-assigned member ID, and the client must retry with that member ID to complete the join. When clients repeatedly retry `JoinGroup` without completing the handshake (due to connection timeouts, infinite `max.poll.interval.ms`, or restart loops), the broker accumulates dangling member state entries that are never cleaned up, bloating coordinator memory. |
Consumer Protocol | Accepted | Boyang Chen | 2018-11-26 | 2020-06-01 | KAFKA-7824 | 2.2 | JoinGroup |
| 393 | Time windowed serde to properly deserialize changelog input topic TLDR: Fixes `TimeWindowedDeserializer` to correctly decode windowed keys read from Streams changelog topics by accounting for the sequence number appended by `WindowKeySchema` during serialization. The existing `TimeWindowedSerde` uses `WindowKeySchema.fromStoreKey()` which does not strip the sequence number, producing incorrect window timestamps when a Streams application reads back its own changelog. |
Streams | Accepted | Shawn Nguyen | 2018-11-25 | 2019-01-16 | KAFKA-7110 | 2.2 | |
| 392 | Allow consumers to fetch from closest replica TLDR: Adds a `client.rack` consumer configuration and corresponding broker-side rack-aware fetch logic (`ReplicaSelector` plugin) that routes consumer fetch requests to the closest in-sync replica rather than always to the partition leader. In multi-AZ or multi-DC Kafka deployments, all consumer fetches go to the leader regardless of replica locality, incurring expensive cross-AZ network traffic and latency. |
Consumer Broker | Accepted | Jason Gustafson | 2018-10-31 | 2019-11-05 | KAFKA-8443 | 2.3 | Fetch |
| 390 | Support Compression Level TLDR: Adds per-codec compression level configuration (`compression.gzip.level`, `compression.lz4.level`, `compression.zstd.level`) at the producer, broker, and topic levels. Kafka's fixed default compression level is a suboptimal trade-off for many workloads; some use cases prioritize smaller compressed size (higher level) while others prioritize CPU throughput (lower level), and the choice can yield up to 156% more messages/sec in benchmarks. |
Producer Broker | Accepted | Dongjin Lee | 2018-11-18 | 2024-05-21 | KAFKA-7632 | 3.8 | |
| 389 | Introduce a configurable consumer group size limit TLDR: Introduces a `group.max.size` broker configuration that enforces an upper bound on the number of members in a single consumer group, rejecting `JoinGroup` requests with `GROUP_MAX_SIZE_REACHED` when the limit is exceeded. Without a group size limit, arbitrarily large consumer groups can form, causing rebalance times and coordinator memory usage to grow unboundedly, degrading performance for all groups managed by that coordinator. |
Consumer Broker | Accepted | Boyang Chen | 2018-11-18 | 2019-01-18 | KAFKA-7610 | 2.1.1 | |
| 388 | Add observer interface to record request and response TLDR: Introduces an observer/interceptor interface that intercepts every Kafka request and response at the broker level, enabling audit trail collection across topics, principals, and request types. Solves the lack of a native audit mechanism for tracking which application produces or consumes data on which topics over time. |
Protocol Broker | Discussion | Lincong Li | 2018-11-06 | 2018-12-14 | KAFKA-7596 | Produce | |
| 387 | Fair Message Consumption Across Partitions in KafkaConsumer TLDR: Proposes a round-robin partition-interleaved fetch strategy in `KafkaConsumer.poll()` so that `max.poll.records` is distributed evenly across all assigned partitions rather than filled greedily from the last-fetched partition. Under the current implementation, a partition with a large backlog can monopolize the entire `max.poll.records` budget, starving other partitions and breaking fairness guarantees. |
Consumer | Discussion | CHIENHSING WU | 2018-11-05 | 2018-11-05 | KAFKA-3932 | ||
| 385 | Avoid throwing away prefetched data TLDR: Proposes that the consumer retains prefetched fetch responses across pause() calls instead of discarding them, so data already transferred is not re-fetched when the partition is resumed. When using the pause/resume API, the consumer currently discards any in-flight prefetch for paused partitions, wasting network bandwidth by re-fetching the same data after resume. |
Consumer | Discussion | Zahari Dichev | 2018-10-21 | 2018-10-25 | KAFKA-7526 | ||
| 384 | Add config for incompatible changes to persistent metadata TLDR: Proposes a separate `metadata.version` (or similar) config distinct from `inter.broker.protocol.version` to control when brokers begin writing incompatible persistent metadata formats (e.g., `__consumer_offsets` schema). Using `inter.broker.protocol.version` alone for this purpose prevents safe downgrade once the new schema has been written, conflating wire protocol versioning with storage format versioning. |
Broker | Discussion | Jason Gustafson | 2018-10-19 | 2018-10-20 | KAFKA-7481 | ||
| 382 | MirrorMaker 2.0 TLDR: Replaces MirrorMaker 1 with MirrorMaker 2, a Kafka Connect-based replication framework that preserves partition semantics, syncs ACLs and topic configurations, supports exactly-once delivery, enables active-active topologies, and allows configuration changes without cluster restarts. MirrorMaker 1 had critical limitations: default topic configuration loss, no ACL/config sync, loss of semantic partitioning, and no consumer migration support across clusters. |
MirrorMaker | Accepted | Ryanne Dolan | 2018-10-11 | 2023-02-07 | KAFKA-7500 | 2.4 | |
| 381 | Connect: Tell about records that had their offsets flushed in callback TLDR: Adds an `offsetsFlushed` callback and a `commitRecord(SourceRecord, RecordMetadata)` overload to the Connect `SourceTask` API so connectors know precisely which records had their source offsets durably committed. Without this, a SourceTask has no way to track whether a given polled record's offset was actually flushed, making it impossible to maintain correct state for exactly-once or at-least-once semantics. |
Connect | Discussion | Per Steffensen | 2018-10-10 | 2018-12-12 | KAFKA-5716 | ||
| 380 | Detect outdated control requests and bounced brokers using broker generation TLDR: KIP-380 introduces a monotonically increasing `brokerEpoch` field assigned at each broker registration in ZooKeeper so that brokers and the controller can detect and reject control requests (LeaderAndIsr, StopReplica, ControlledShutdown) that belong to a previous incarnation of the same broker. Without a generation counter, a quickly bouncing broker could accidentally process stale control messages sent to its previous incarnation, leaving replicas offline or causing incorrect HWM truncation. |
Broker | Accepted | Patrick Huang | 2018-10-09 | 2018-11-13 | KAFKA-7235 | 2.2 | Metadata LeaderAndIsr StopReplica UpdateMetadata ControlledShutdown |
| 379 | Multiple Consumer Group Management TLDR: Extends kafka-consumer-groups.sh to accept multiple --group arguments and an --all-groups flag so describe/delete/reset-offsets operations can target several groups or all groups in a single invocation. Running a single consumer group operation required launching a separate JVM per group, making bulk operations on many groups extremely slow and resource-intensive. |
Consumer Admin | Accepted | Alex Dunayevsky | 2018-10-03 | 2019-01-16 | KAFKA-7471 | 2.4 | |
| 377 | TopicCommand to use AdminClient TLDR: KIP-377 migrates `kafka-topics.sh` from direct ZooKeeper connections to the `AdminClient` Kafka protocol API. Direct ZooKeeper access bypasses broker-enforced policies (e.g., `CreateTopicPolicy`), requires ZooKeeper connectivity separate from broker connectivity, and blocks the eventual removal of ZooKeeper from Kafka. |
Admin | Accepted | Viktor Somogyi | 2018-09-27 | 2019-02-21 | KAFKA-7433 | 2.2 | Metadata |
| 376 | Implement AutoClosable on appropriate classes that want to be used in a try-with-resource statement TLDR: KIP-376 makes `KafkaStreams`, `KafkaChannel`, `ConsumerInterceptor`, `MetricsReporter`, and several other Kafka and Streams components implement `java.lang.AutoCloseable` so they can be used in try-with-resources blocks. These classes hold resources and have `close()` methods but did not implement the interface, preventing resource-safe usage patterns and complicating test teardown. |
Client | Accepted | Yishun Guan | 2018-09-25 | 2018-11-17 | KAFKA-7402 | 2.2 | |
| 375 | Kafka Clients - make Metadata#TOPIC_EXPIRY_MS configurable TLDR: Makes `Metadata.TOPIC_EXPIRY_MS` (the TTL for unused topic metadata in the client cache) configurable via a new `metadata.max.idle.ms` producer/consumer config. In high-throughput, low-latency systems, periodic metadata refresh triggered by topic expiry adds unexpected latency spikes to the caller thread. |
Client | Discussion | Pavel Moukhataev | 2018-09-24 | 2018-10-03 | |||
| 374 | Add '--help' option to all available Kafka CLI commands TLDR: Standardizes `--help` support across all Kafka CLI tools so every script prints usage information when `--help` is passed. Several commands previously threw an `UnrecognizedOptionException` on `--help` instead of printing help text, creating an inconsistent user experience. |
Admin | Accepted | Attila Sasvári | 2018-09-19 | 2018-11-19 | KAFKA-7418 | 2.2 | |
| 373 | Allow users to create delegation tokens for other users TLDR: Allows users with sufficient privilege to create delegation tokens on behalf of other users by adding an optional `owner_principal` field to the `CreateDelegationToken` request. Previously, delegation tokens could only be created for the currently authenticated user, preventing service accounts and administrators from pre-provisioning tokens for other principals. |
Security | Accepted | Manikumar Reddy O. | 2018-09-18 | 2020-01-31 | KAFKA-6945 | 3.3 | CreateAcls CreateDelegationToken DeleteAcls DescribeAcls DescribeDelegationToken |
| 372 | Naming Repartition Topics for Joins and Grouping TLDR: Adds a named repartition topic API to Kafka Streams joins and grouping operations, allowing users to assign stable, user-defined names to internal repartition topics. Auto-generated repartition topic names include positional counters that shift when topology structure changes, causing Streams to create new topics and lose state when the application is updated. |
Streams | Accepted | Bill Bejeck | 2018-09-12 | 2018-10-03 | KAFKA-7406 | 2.1 | |
| 371 | Add a configuration to build custom SSL principal name TLDR: Adds an `ssl.principal.mapping.rules` config that accepts a list of regex substitution rules to transform an SSL certificate's X.500 Distinguished Name into a short `KafkaPrincipal` username. By default, SSL principals are the full DN string (`CN=writeuser,OU=Unknown,...`), which is unwieldy for ACLs and operator tooling. |
Security | Accepted | Manikumar Reddy O. | 2018-09-05 | 2018-10-10 | KAFKA-5462 | 2.2 | |
| 370 | Remove Orphan Partitions TLDR: KIP-370 introduces a mechanism for brokers to detect and clean up "orphan" partitions — log directories belonging to partitions that were reassigned away while the broker was offline. When such a broker comes back online it neither leads nor follows those partitions, so retention never runs and the data accumulates indefinitely, wasting disk space. |
Broker | Discussion | xiongqi wu | 2018-09-05 | 2022-08-30 | KAFKA-7362 | ||
| 368 | Allow SASL Connections to Periodically Re-Authenticate TLDR: Adds support for periodic SASL re-authentication on existing connections by introducing a `connections.max.reauth.ms` broker configuration, after which the client must re-authenticate before the next request or the connection is closed. Long-lived Kafka connections bypass token expiry for SASL/OAUTHBEARER and prevent immediate revocation of access when SASL/SCRAM credentials are rotated or ACLs are changed. |
Security | Accepted | Ron Dagostino | 2018-08-20 | 2020-02-03 | KAFKA-7352 | 2.2 | SaslHandshake SaslAuthenticate |
| 366 | Make FunctionConversions deprecated TLDR: KIP-366 deprecates the public `FunctionConversions` object in the Kafka Streams Scala API and replaces it with a private equivalent. The object was accidentally exposed in the public API but contains only internal type conversion implicits, and its public presence prevents the project from evolving the conversions without going through the KIP process. |
Streams | Accepted | Joan Goyeau | 2018-08-23 | 2018-09-19 | KAFKA-7399 | 2.1 | |
| 365 | Materialized, Serialized, Joined, Consumed and Produced with implicit Serde TLDR: Adds Scala implicit conversions in the `kafka-streams-scala` module so that `Materialized`, `Consumed`, `Produced`, `Joined`, and `Serialized` objects automatically pick up the correct `Serde` from implicit scope without requiring explicit serde arguments. Without implicits, every DSL call in Scala requires verbose serde boilerplate identical to the Java API, negating the ergonomic benefit of the Scala wrapper. |
Streams | Accepted | Joan Goyeau | 2018-08-24 | 2018-09-11 | KAFKA-7396 | 2.1 | |
| 363 | Allow performance tools to print final results to output file TLDR: Adds a `--print-metrics` flag and an `--output-file` option to `ProducerPerformance` and `ConsumerPerformance` tools so final benchmark results can be persisted to a file in JSON or CSV format. Without this, users must capture stdout manually, making automated performance regression pipelines fragile. |
Admin | Discussion | Attila Sasvári | 2018-08-17 | 2018-11-25 | KAFKA-7289 | ||
| 361 | Add Consumer Configuration to Disable Auto Topic Creation TLDR: Adds an `allow.auto.create.topics` configuration to the consumer client (default `true` for backward compatibility) that controls whether the consumer's metadata requests trigger auto topic creation on the broker. When a consumer subscribes to a topic before a producer starts, broker-side `auto.create.topics.enable=true` causes the consumer's metadata request to create the topic with broker defaults (wrong partition count / replication factor) rather than waiting for the intended topic configuration. |
Consumer | Accepted | Dhruvil Shah | 2018-08-21 | 2019-05-08 | KAFKA-7320 | 2.3 | Metadata |
| 359 | Verify leader epoch in produce requests TLDR: Adds the leader epoch to ProduceRequest and validates it against the partition leader's current epoch before accepting the write, rejecting stale requests with FENCED_LEADER_EPOCH. Without this fencing, a producer holding a stale leader reference (after a leader failover) could successfully append records to the wrong broker if routing metadata was not refreshed in time, potentially causing duplicate or out-of-order writes. |
Broker Producer | Accepted | Jason Gustafson | 2018-08-18 | 2018-09-06 | KAFKA-7383 | ||
| 358 | Migrate Streams API to Duration instead of long ms times TLDR: Replaces all `long`-typed millisecond duration parameters in the Kafka Streams public API (`Stores.windowedBy`, `KStream.groupByKey().windowedBy`, `KafkaStreams.close`, etc.) with `java.time.Duration`. Using raw `long` milliseconds provides no compile-time unit safety and makes API semantics ambiguous — callers cannot tell whether a parameter is seconds, milliseconds, or nanoseconds. |
Streams | Accepted | Nikolay Izhikov | 2018-08-16 | 2018-10-09 | KAFKA-7277 | 2.1 | |
| 357 | Add support to list ACLs per principal TLDR: Adds a `--principal` filter option to `kafka-acls.sh --list` so operators can retrieve all ACLs assigned to a specific principal in one command. Currently, `--list` only filters by resource; finding all ACLs for a principal requires iterating the full ACL list and grepping. |
Security Admin | Accepted | Manikumar Reddy O. | 2018-08-16 | 2018-09-18 | KAFKA-5690 | 2.1 | DescribeAcls |
| 356 | Add withCachingDisabled() to StoreBuilder TLDR: KIP-356 adds `StoreBuilder.withCachingDisabled()` as an explicit API to disable the record cache on a Kafka Streams state store. `StoreBuilder` initializes with caching disabled by default but only exposes `withCachingEnabled()`, making it impossible to programmatically toggle caching back off after enabling it, which is required when chaining builder configurations. |
Streams | Accepted | Guozhang Wang | 2018-08-14 | 2018-08-17 | KAFKA-6998 | 2.1 | |
| 354 | Add a Maximum Log Compaction Lag TLDR: Adds a max.compaction.lag.ms topic configuration that guarantees a dirty log segment will be compacted within a specified time, regardless of the dirty ratio. Without a maximum lag, GDPR-style delete markers (tombstones) could remain in the log indefinitely if the dirty ratio never reached the min.cleanable.dirty.ratio threshold, violating data deletion SLAs. |
Broker | Accepted | xiongqi wu | 2018-08-09 | 2018-12-06 | KAFKA-7321 | 2.3 | |
| 353 | Improve Kafka Streams Timestamp Synchronization TLDR: Formalizes Kafka Streams stream-time tracking by defining partition time as the max timestamp seen on each partition independently, and task time as the min across all partition times, with configurable behavior for empty partitions. The ad-hoc timestamp synchronization logic was non-deterministic when a task consumed from multiple input partitions — processing order depended on fetch scheduling, producing non-repeatable results. |
Streams | Accepted | Guozhang Wang | 2018-08-03 | 2018-08-15 | KAFKA-3514 | 2.1 | |
| 352 | Distinguish URPs caused by reassignment TLDR: Adds a new under-replicated metric that excludes replicas catching up during an active reassignment, and a separate metric specifically for reassignment-lagging replicas. During reassignment, the standard UnderReplicatedPartitions metric fires even when the desired replication factor is met, making it impossible to use URP alerts without false positives during planned reassignments. |
Broker Admin | Accepted | Jason Gustafson | 2018-08-02 | 2019-10-17 | KAFKA-8834 | 2.5 | |
| 351 | Add --under-min-isr option to describe topics command TLDR: Adds an `--under-min-isr` flag to `kafka-topics.sh --describe` that filters the output to show only partitions whose ISR size is below `min.insync.replicas`. The existing `UnderMinIsrPartitionCount` JMX metric shows a count but not which partitions are affected, forcing operators to describe every topic and parse ISR sizes manually to identify the specific under-min-ISR partitions. |
Admin Broker | Accepted | Kevin Lu | 2018-08-02 | 2019-04-05 | KAFKA-7236 | 2.3 | |
| 350 | Allow kafka-topics.sh to take brokerid as parameter to show partitions associated with it TLDR: Adds a `--broker-id` filter to `kafka-topics.sh --describe` to list only partitions assigned to (as leader or replica) a specific broker. Currently, operators must pipe `--describe` output through `grep` to find partitions on a given broker, which is fragile and slow in large clusters. |
Admin | Discussion | Ratish Ravindran | 2018-08-02 | 2018-08-02 | KAFKA-7232 | ||
| 346 | Improve LogCleaner behavior on error TLDR: Changes LogCleaner behavior on compaction failure from halting the entire cleaner thread (which stops all compaction) to pausing only the affected partition and logging the error with full context. Previously a single corrupt log or compaction bug would freeze all log compaction across the broker, silently leaving dirty partitions uncompacted until a broker restart. |
Broker | Accepted | Stanislav Kozlovski | 2018-07-23 | 2018-11-24 | KAFKA-7215 | 2.1 | |
| 345 | Introduce static membership protocol to reduce consumer rebalances TLDR: Introduces static consumer group membership via a `group.instance.id` configuration, allowing a consumer to rejoin after a restart with its previous member identity and partition assignment without triggering a group rebalance. Every consumer restart or temporary disconnect causes a full group rebalance, which for stateful Streams applications means expensive state migration and processing downtime proportional to the number of group members. |
Consumer | Accepted | Boyang Chen | 2018-07-22 | 2019-09-09 | KAFKA-6145 | 2.3 | Heartbeat JoinGroup OffsetCommit SyncGroup LeaveGroup DescribeGroups FindCoordinator |
| 342 | Add support for custom SASL extensions in OAuthBearer authentication TLDR: Adds a pluggable SASL extensions mechanism for the OAuthBearer authentication protocol, allowing clients to attach arbitrary key-value pairs to the `SASL/OAUTHBEARER` `ClientInitialResponse` message. Existing SASL extensions in SCRAM are hard-coded for delegation tokens; OAuthBearer clients receiving third-party JWT tokens cannot add custom claims to the token and have no other channel to pass auxiliary data to a custom server-side callback. |
Security | Accepted | Stanislav Kozlovski | 2018-07-16 | 2018-08-14 | KAFKA-7169 | 2.1 | |
| 341 | Update Sticky Assignor's User Data Protocol TLDR: Updates the `StickyAssignor`'s user-data protocol to encode the previous assignment generation ID alongside the partition assignment, allowing the leader to detect and discard stale assignment reports from members that missed a generation. Without the generation ID, a consumer rejoining after a missed rebalance reports an outdated assignment, causing the sticky assignor's leader to start from a suboptimal partition distribution. |
Consumer | Accepted | Vahid Hashemian | 2018-07-13 | 2019-04-16 | KAFKA-7026 | 2.3 | |
| 340 | Allow kafka-reassign-partitions.sh and kafka-log-dirs.sh to take admin client property file TLDR: Adds a `--command-config` argument to both `kafka-reassign-partitions.sh` and `kafka-log-dirs.sh` so that arbitrary `AdminClient` properties (SSL, SASL, etc.) can be supplied via a properties file. Both tools hardcode `AdminClient` construction with only `bootstrap.servers`, making them unusable against TLS-secured clusters. |
Admin | Discussion | Dong Lin | 2018-07-11 | 2018-08-07 | KAFKA-7147 | 2.1 | |
| 339 | Create a new IncrementalAlterConfigs API TLDR: Introduces an IncrementalAlterConfigs RPC that supports SET, DELETE, APPEND, and SUBTRACT operations on individual config keys rather than replacing the entire config. The existing AlterConfigs RPC is a full replace — callers must first DescribeConfigs to avoid accidentally resetting unknown keys — making concurrent config changes from multiple clients unsafe and error-prone. |
Admin Protocol | Accepted | Colin McCabe | 2018-07-11 | 2019-04-13 | KAFKA-7466 | 2.3 | IncrementalAlterConfigs AlterConfigs |
| 338 | Support to exclude the internal topics in kafka-topics.sh command TLDR: Adds an `--exclude-internal` flag to `kafka-topics.sh` to suppress internal topics (e.g., `__consumer_offsets`, `__transaction_state`) from `--list` and `--describe` output. Previously, operators had to pipe output through `grep -v '^__'` to filter internal topics, with no built-in mechanism. |
Admin | Accepted | Chia-Ping Tsai | 2018-07-11 | 2018-10-24 | KAFKA-7139 | 2.1 | |
| 333 | Add faster mode of rebalancing TLDR: Proposes a fast-path rebalance mechanism for consumers that are returning from failure, allowing them to resume from their last committed offset rather than waiting for a full group rebalance to complete. After a crash and restart, the standard rebalance protocol introduces a lag proportional to the gap between the last committed offset and the current LEO, which is unacceptable for low-latency applications. |
Consumer | Discussion | Richard Yu | 2018-07-05 | 2018-07-23 | KAFKA-7132 | JoinGroup LeaveGroup | |
| 332 | Update AclCommand to use AdminClient API TLDR: Updates kafka-acls.sh to use the AdminClient API (CreateAcls/DeleteAcls/DescribeAcls RPCs) instead of writing directly to ZooKeeper via SimpleAclAuthorizer. Previously the tool required ZooKeeper credentials and bypassed the broker, meaning it couldn't work with custom authorizer implementations or in environments without ZK access. |
Security Admin | Accepted | Manikumar Reddy O. | 2018-07-01 | 2019-01-03 | KAFKA-7117 | 2.1 | |
| 331 | Add default implementation to close() and configure() for Serializer, Deserializer and Serde TLDR: Adds default (no-op) implementations to `Serializer.close()`, `Serializer.configure()`, `Deserializer.close()`, `Deserializer.configure()`, and their `Serde` equivalents as Java 8 default interface methods. Users implementing custom serdes are forced to write boilerplate no-op methods for lifecycle hooks they do not need. |
Client | Discussion | Chia-Ping Tsai | 2018-07-01 | 2018-10-23 | KAFKA-6161 | 2.3 | |
| 330 | Add retentionPeriod in SessionBytesStoreSupplier TLDR: KIP-330 adds a `retentionPeriod()` method to `SessionBytesStoreSupplier`, parallel to the existing method on `WindowBytesStoreSupplier`. Without it, the `StoreBuilder` wrapper cannot determine the appropriate changelog topic retention period for session stores, causing the changelog to use an incorrect or default retention that may not cover the full session window. |
Streams | Accepted | Guozhang Wang | 2018-06-27 | 2018-07-02 | KAFKA-7101 | 2.1 | |
| 328 | Ability to suppress updates for KTables TLDR: Introduces `KTable#suppress()` to Kafka Streams so that a KTable can buffer intermediate updates and emit only final or time-delimited results downstream, replacing the cache-based deduplication with an explicit, event-time-aware suppression operator. The existing record cache provided only a best-effort, commit-interval-based suppression that could not guarantee that only a single final result per key was emitted within a time window. |
Streams | Accepted | John Roesler | 2018-06-20 | 2020-07-07 | KAFKA-6556 | ||
| 325 | Extend Consumer Group Command to Show Beginning Offsets TLDR: Extends `kafka-consumer-groups.sh --describe` to include the beginning offset (log-start-offset) alongside current-offset, log-end-offset, and lag for each partition. Without the beginning offset, operators cannot determine whether lag is due to slow consumption or to data being retained/expired from the partition. |
Consumer Admin | Discussion | Vahid Hashemian | 2018-06-26 | 2018-08-16 | KAFKA-7089 | ||
| 324 | Add method to get metrics() in AdminClient TLDR: Adds a metrics() method to the AdminClient class, returning the same JMX metrics map available on producer, consumer, and streams clients. AdminClient was the only Kafka client without a metrics() method, making it impossible to monitor connection pool health, request rates, or error rates for admin operations. |
Admin Metrics | Accepted | Yishun Guan | 2018-06-26 | 2018-10-24 | KAFKA-6986 | 2.1 | |
| 321 | Update TopologyDescription to better represent Source and Sink Nodes TLDR: Updates `TopologyDescription` to expose `SourceNode` and `SinkNode` as typed objects with structured accessors (topics, pattern, etc.) instead of returning string representations only. Users who want to perform runtime introspection on topology source/sink nodes are forced to parse `toString()` output, which is brittle and not part of the stable API contract. |
Streams | Accepted | Nishanth Pradeep | 2018-06-20 | 2018-08-09 | KAFKA-6966 | 2.1 | |
| 320 | Allow fetchers to detect and handle log truncation TLDR: Adds client-side log truncation detection to the consumer and replica fetcher: when a fetch response reveals that the broker's log end offset has moved backward (indicating leader truncation after unclean election), the client detects the divergence using epoch-fenced offset validation and resets to the last safe offset. Without this, consumers silently re-read or skip records after a leader failover with unclean election enabled, causing data corruption or loss that is invisible to the application. |
Broker Consumer | Discussion | Jason Gustafson | 2018-06-20 | 2019-08-23 | KAFKA-6880 | 2.2 | Fetch v9 |
| 319 | Replace segments with segmentInterval in WindowBytesStoreSupplier TLDR: Replaces the segments() method on WindowBytesStoreSupplier with a segmentInterval() method returning the duration of each segment in milliseconds rather than the segment count. The store cache uses segment interval (not count) to size itself correctly; returning segment count required callers to do the division themselves and made the interface semantically incorrect. |
Streams | Accepted | John Roesler | 2018-06-20 | 2018-08-23 | KAFKA-7080 | 2.1 | |
| 317 | Add end-to-end data encryption functionality to Apache Kafka TLDR: Proposes a pluggable end-to-end encryption framework for Kafka that encrypts record payloads at the producer before they are written to the broker and decrypts them at the consumer, keeping plaintext away from brokers and ZooKeeper. This addresses GDPR and financial-sector at-rest encryption requirements that TLS-in-transit alone does not satisfy because brokers still store and process plaintext records. |
Security | Discussion | Sönke Liebau | 2018-06-18 | 2020-04-28 | |||
| 316 | Command-line overrides for ConnectDistributed worker properties TLDR: Adds `--override key=value` command-line argument support to `connect-distributed.sh` so that runtime-determined properties (e.g., `rest.port`) can be injected without modifying the properties file. Currently, `ConnectDistributed` accepts exactly one argument (the properties file path), requiring wrapper scripts to generate a temporary file for any runtime-variable config. |
Connect | Discussion | Kevin Lafferty | 2018-06-14 | 2018-06-15 | KAFKA-7060 | ||
| 315 | Stream Join Sticky Assignor TLDR: Proposes a StickyJoinAssignor for Kafka consumers doing stream-stream joins, ensuring co-partitioned topics always assign the same partition number to the same consumer while minimizing reassignment churn on rebalance. Standard assignors (RangeAssignor, RoundRobinAssignor) don't guarantee that co-partitioned topics land on the same consumer, which is a hard requirement for correct stream join semantics. |
Streams Consumer | Discussion | Mike Freyberger | 2018-06-13 | 2019-04-08 | |||
| 313 | Add KStream.flatTransform and KStream.flatTransformValues TLDR: KIP-313 adds `KStream.flatTransform()` and `KStream.flatTransformValues()` operators to the Kafka Streams DSL, enabling stateful one-to-many record transformations. The existing `KStream.transform()` only supports one-to-one (or zero-to-one) transformations via `Transformer`, requiring users to drop to the lower-level Processor API for stateful fan-out operations. |
Streams | Accepted | Bruno Cadonna | 2018-06-08 | 2019-04-16 | KAFKA-4217 | 2.2 | |
| 312 | Add Overloaded StreamsBuilder Build Method to Accept java.util.Properties TLDR: Adds an overloaded `StreamsBuilder.build(Properties)` method so that topology optimization settings (from KIP-295) can be passed at build time without requiring a `StreamsConfig` object. The workaround was to call `build()` without properties and pass optimization config separately in the `KafkaStreams` constructor, which is awkward and inconsistent. |
Streams | Accepted | Bill Bejeck | 2018-06-07 | 2018-06-29 | KAFKA-7027 | 2.1 | |
| 309 | Add toUpperCase support to sasl.kerberos.principal.to.local rule TLDR: Extends the sasl.kerberos.principal.to.local.rules parser to support a /U modifier that uppercases the resulting Linux username, mirroring the existing Hadoop auth_to_local rule syntax (HADOOP-13984). Some enterprise environments require uppercase usernames when mapping Kerberos principals, which was not possible with the existing lowercase-only /L support. |
Security | Accepted | Manikumar Reddy O. | 2018-06-04 | 2019-09-20 | KAFKA-6883 | 2.4 | |
| 307 | Allow to define custom processor names with KStreams DSL TLDR: Tracks the implementation status of the named-operations feature from KIP-307, including the `NamedOperation` interface and `as(String)` factory methods on `Produced`, `Consumed`, `Joined`, `Grouped`, and `Suppressed`. Without named operations, processor topology node names are auto-generated as opaque identifiers, making it difficult to correlate application logic with topology diagrams and metrics. |
Streams | Discussion | John Roesler | 2019-05-01 | 2019-05-01 | 2.3 | |
| 306 | Configuration for Delaying Response to Failed Authentication TLDR: Adds a `connection.failed.authentication.delay.ms` broker config that delays the error response to a client that fails authentication, throttling brute-force and misconfigured reconnect storms. A misconfigured application reconnecting with invalid credentials can saturate broker network threads with authentication work, causing a denial-of-service for legitimate clients. |
Security Broker | Accepted | Dhruvil Shah | 2018-05-19 | 2018-08-31 | KAFKA-6950 | 2.1 | |
| 305 | Add Connect primitive number converters TLDR: Adds IntegerConverter, LongConverter, FloatConverter, DoubleConverter, and ShortConverter to Kafka Connect's built-in converter library, enabling numeric primitives as record keys or values without JSON or Avro overhead. Connect had String and bytes converters but no primitive numeric converters, forcing connector authors to implement their own or convert numbers to strings. |
Connect | Accepted | Randall Hauch | 2018-05-17 | 2018-05-30 | KAFKA-6913 | 2.0 | |
| 304 | Connect runtime mode improvements for container platforms TLDR: Proposes a new Connect runtime mode where each connector runs in its own isolated worker process, eliminating cross-connector interference from rebalances, crashes, and JAR conflicts in the current shared distributed mode. In distributed mode, deploying or crashing one connector triggers a full rebalance affecting all other connectors on all workers, introducing unpredictable latency spikes in unrelated pipelines. |
Connect | Discussion | Saulius Valatka | 2018-05-16 | 2018-05-17 | |||
| 303 | Add Dynamic Routing in Streams Sink TLDR: Adds a TopicNameExtractor interface to the Kafka Streams DSL's to() sink operator, allowing the output topic name to be computed dynamically per-record from the record's key, value, and headers at runtime. All Streams topologies required output topic names to be known at topology build time, making it impossible to route records to dynamically chosen topics based on message content. |
Streams | Accepted | Guozhang Wang | 2018-05-15 | 2018-05-25 | KAFKA-4936 | 2.0 | |
| 302 | Enable Kafka clients to use all DNS resolved IP addresses TLDR: KIP-302 adds a `use_all_dns_ips` option to the existing `client.dns.lookup` config, making Kafka clients iterate over all A records returned by DNS for a hostname rather than connecting to only the first resolved IP. This enables HA patterns where broker hostnames resolve to multiple load-balancer IPs, so a client can fail over to an alternate IP transparently instead of marking the broker as unavailable. |
Client | Accepted | Edoardo Comar | 2018-05-14 | 2018-10-24 | KAFKA-6863 | 2.1 | |
| 301 | Schema Inferencing for JsonConverter TLDR: KIP-301 adds schema inference to `JsonConverter` in Kafka Connect, allowing it to automatically derive a Connect schema from arbitrary JSON payloads when `schemas.enable=false`. Without schema inference, `JsonConverter` with `schemas.enable=true` requires the specific `{schema, payload}` envelope format and rejects plain JSON, while `schemas.enable=false` produces schemaless maps and primitives that downstream SMTs and converters cannot inspect structurally. |
Connect | Discussion | Allen Tang | 2018-05-12 | 2018-05-15 | KAFKA-6895 | ||
| 300 | Add Windowed KTable API in StreamsBuilder TLDR: Adds StreamsBuilder#table() overloads that materialize a windowed changelog topic into a windowed KTable backed by either a persistent or in-memory windowed state store. Kafka Streams had no DSL method to consume a pre-existing windowed changelog topic (e.g., output of another Streams app) directly into a local windowed store for downstream join or query operations. |
Streams | Accepted | Boyang Chen | 2018-05-11 | 2019-05-17 | KAFKA-6840 | ||
| 298 | Error Handling in Connect TLDR: KIP-298 adds configurable error handling to Kafka Connect covering deserialization errors, transformation failures, and sink write failures, with options to log, skip (dead-letter queue), or fail on each error type. Before this change, any record-level error caused the entire task to fail and restart, providing no way to skip or quarantine bad records. |
Connect | Accepted | Arjun Satish | 2018-05-08 | 2019-04-28 | KAFKA-6738 | 2.0 | |
| 297 | Externalizing Secrets for Connect Configurations TLDR: Introduces a ConfigProvider SPI for Kafka Connect that resolves ${provider:path:key} placeholders in connector configs from external secret stores like Vault, AWS Secrets Manager, or Kubernetes Secrets. Connector configurations containing passwords and credentials were stored in plaintext in Connect's internal config topic, accessible to anyone with Read permission on that topic. |
Connect Security | Accepted | Robert Yokota | 2018-05-07 | 2018-08-28 | KAFKA-6886 | 2.0 | |
| 296 | Add connector level configurability for producer/consumer client configs TLDR: Allows individual Kafka Connect connectors to override worker-level producer/consumer client configurations by prefixing properties with producer. or consumer. in connector config. Previously all source/sink tasks on a worker inherited a single set of producer/consumer settings, offering no per-connector flexibility. |
Connect | Discussion | Allen Tang | 2018-05-07 | 2018-05-12 | KAFKA-6890 | ||
| 295 | Add Streams Configuration Allowing for Optional Topology Optimization TLDR: Adds a `topology.optimization` config to Kafka Streams (defaulting to `none`) that enables optional DSL-level topology optimizations such as merging redundant repartition topics. Without an explicit opt-in config, enabling automatic topology rewrites would be a breaking change for existing applications that depend on the current physical topology structure. |
Streams | Accepted | Bill Bejeck | 2018-05-07 | 2018-06-07 | KAFKA-6874 | 2.0 | |
| 294 | Enable TLS hostname verification by default TLDR: Changes the default value of ssl.endpoint.identification.algorithm from empty string (hostname verification disabled) to 'https' (hostname verification enabled). Without hostname verification, TLS connections are vulnerable to man-in-the-middle attacks even with valid certificates, as the client does not verify that the certificate's CN/SAN matches the broker hostname it connected to. |
Security | Accepted | Rajini Sivaram | 2018-05-04 | 2018-05-21 | KAFKA-3665 | 2.0 | |
| 292 | Add transformValues() method to KTable TLDR: Adds transformValues() methods to the KTable interface in Kafka Streams DSL, allowing stateful, per-task value transformation with access to state stores and the full ValueTransformerWithKey lifecycle (init/close). The KTable interface previously only had mapValues(), which is stateless, thread-unsafe across tasks, and lacks state store access. |
Streams | Accepted | Andrew Coates | 2018-05-02 | 2018-05-14 | KAFKA-6849 | 2.0 | |
| 291 | Separating controller connections and requests from the data plane TLDR: Introduces a dedicated `control.plane.listener.name` config that routes controller-to-broker RPC traffic over a separate listener, isolating it from the client data-plane listener. Prevents noisy client traffic from starving controller requests (UpdateMetadata, LeaderAndIsr, StopReplica) on the same network threads, improving controller responsiveness during high load. |
Broker Protocol | Accepted | Lucas Wang | 2018-05-02 | 2018-12-05 | KAFKA-4453 | 2.2 | Metadata UpdateMetadata |
| 290 | Support for Prefixed ACLs TLDR: KIP-290 adds prefix-based ACL resource patterns to Kafka's authorization model, allowing a single ACL to grant access to all resources whose names start with a given prefix. Without prefix ACLs, granting a principal access to a namespace of topics requires creating one ACL per topic name, which is operationally expensive and requires ACL updates whenever new topics are added. |
Security | Accepted | Piyush Vijay | 2018-05-01 | 2018-06-13 | KAFKA-6841 | 2.0 | DescribeAcls CreateAcls DeleteAcls |
| 289 | Improve the default group id behavior in KafkaConsumer TLDR: Changes `KafkaConsumer` behavior so that using the default empty `group.id` raises a clear `InvalidGroupIdException` at construction time when subscribe-based APIs are used, rather than at the first `poll()` call with a confusing error. The inconsistency between `subscribe` (which fails at poll) and `assign` (which works fine without a group.id) causes unexpected runtime failures for users who omit `group.id`. |
Consumer | Accepted | Vahid Hashemian | 2018-04-27 | 2018-08-16 | KAFKA-6774 | 2.2 | JoinGroup |
| 285 | Connect Rest Extension Plugin TLDR: Introduces a ConnectRestExtension SPI allowing third parties to register custom JAX-RS resources, filters, and providers into the Connect REST server for authentication, authorization, and custom endpoints. Connect's REST API had no extension points for security middleware, forcing teams to proxy the REST API or modify the Connect source to enforce access control. |
Connect | Accepted | Magesh kumar Nandakumar | 2018-04-11 | 2018-05-30 | KAFKA-6776 | 2.0 | |
| 284 | Set default retention ms for Streams repartition topics to Long.MAX_VALUE TLDR: Changes the default `retention.ms` for Kafka Streams auto-created repartition topics from 7 days to `Long.MAX_VALUE` (effectively infinite). After KIP-220/KIP-204, repartition topics are transient and managed by Streams' own compaction/purge logic; a finite retention can prematurely delete records with old timestamps during reprocessing or bootstrapping scenarios. |
Streams | Accepted | Khaireddine Rezgui | 2018-04-07 | 2018-04-12 | KAFKA-6535 | 2.0 | |
| 283 | Efficient Memory Usage for Down-Conversion TLDR: Implements lazy, chunked down-conversion for older fetch protocol clients, converting only the requested bytes at a time rather than materializing the entire partition's unconverted data in heap memory. When brokers must down-convert messages to an older format for legacy consumers, they previously loaded entire log segments into JVM heap, causing OutOfMemoryErrors under concurrent down-conversion load. |
Broker Protocol | Accepted | Dhruvil Shah | 2018-04-06 | 2018-06-12 | KAFKA-6927 | 2.0 | Fetch |
| 282 | Add the listener name to the authentication context TLDR: Adds the listener name to the AuthenticationContext passed to Kafka's pluggable PrincipalBuilder, allowing principal mapping logic to vary per listener (e.g., map clients differently on the internal vs external listener). Without the listener name, PrincipalBuilder implementations had to infer the listener from the client's IP address, which is fragile in NAT or load-balanced environments. |
Security Broker | Accepted | Mickael Maison | 2018-04-05 | 2018-05-16 | KAFKA-6750 | 2.0 | |
| 281 | ConsumerPerformance: Increase Polling Loop Timeout and Make It Reachable by the End User TLDR: KIP-281 increases the default `kafka-consumer-perf-test.sh` polling loop timeout from 1 second to 10 seconds and exposes it as a configurable `--timeout` command-line parameter. On topics with large partition counts, iterating over polled batches takes longer than 1 second, causing the benchmark tool to exit prematurely and report far fewer consumed records than actually exist. |
Consumer | Accepted | Alex Dunayevsky | 2018-04-04 | 2018-06-09 | KAFKA-6743 | 2.0 | |
| 280 | Enhanced log compaction TLDR: KIP-280 enhances log compaction to support a client-side ordering guarantee: records produced with a sequence number are compacted respecting producer order, not just offset order, preventing a compacted log from presenting a consumer with a state that never existed on the producer. Current compaction is based purely on broker-assigned offsets, which can produce semantically incorrect state for producers that send causally ordered updates. |
Broker | Accepted | Luís Cabral | 2018-04-05 | 2020-04-30 | KAFKA-7061 | ||
| 279 | Fix log divergence between leader and follower after fast leader fail over TLDR: KIP-279 fixes log divergence scenarios not addressed by KIP-101 (clean and unclean leader elections) by adding epoch validation to the `OffsetForLeaderEpoch` response: the leader now returns an error if the requested epoch is larger than its own, forcing the follower to fetch updated leader epoch information before truncating. Without this, rapid double-failover sequences could cause a follower to skip truncation or truncate to the wrong offset, leaving permanently diverged log lineages. |
Broker | Accepted | Anna Povzner | 2018-04-04 | 2020-06-10 | KAFKA-6361 | 2.0 | OffsetForLeaderEpoch |
| 278 | Add version option to Kafka's commands TLDR: Adds a --version flag to all Kafka CLI tools (kafka-topics.sh, kafka-console-producer.sh, etc.) that prints the tool's version string and exits. There was no standard way to determine which Kafka version a CLI binary corresponded to, making it difficult to diagnose version mismatches in mixed-version deployments or when tools are copied to remote hosts. |
Admin | Accepted | Sasaki Toru | 2018-04-02 | 2018-05-24 | KAFKA-2061 | 2.0 | |
| 277 | Fine Grained ACL for CreateTopics API TLDR: Changes CreateTopics ACL checks from requiring Describe+Create on the Cluster resource to requiring Create permission on the specific Topic resource being created, enabling per-topic create ACLs. The cluster-level ACL for topic creation was asymmetric with DeleteTopics (which used topic-level ACLs) and prevented granting users the ability to manage only their own topic namespace. |
Security Admin | Accepted | Edoardo Comar | 2018-03-29 | 2018-05-25 | KAFKA-6726 | 2.0 | CreateTopics DeleteTopics |
| 276 | Add StreamsConfig prefix for different consumers TLDR: Adds main.consumer., restore.consumer., and global.consumer. config prefixes to StreamsConfig, allowing users to set different config values for each of Kafka Streams' three internal consumer clients. Streams previously used a single consumer. prefix shared by all three consumers, making it impossible to tune fetch sizes, session timeouts, or security settings independently for restore vs main consumption. |
Streams Consumer | Accepted | Boyang Chen | 2018-03-29 | 2019-01-01 | KAFKA-6657 | 2.0 | |
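The layered prefix resolution KIP-276 describes can be sketched as follows (a minimal Python model, not the actual `StreamsConfig` code): a per-client prefix such as `restore.consumer.` overrides the shared `consumer.` prefix, which in turn overrides the unprefixed default.

```python
# Sketch of prefixed config resolution: most-specific prefix wins.

def resolve_consumer_config(props, client_prefix, key):
    """Try client-specific prefix, then the shared 'consumer.' prefix, then bare key."""
    for full_key in (client_prefix + key, "consumer." + key, key):
        if full_key in props:
            return props[full_key]
    return None

props = {
    "max.poll.records": "500",
    "consumer.max.poll.records": "1000",
    "restore.consumer.max.poll.records": "100",
}
# Main consumer has no specific override, so it falls back to "consumer.".
main_value = resolve_consumer_config(props, "main.consumer.", "max.poll.records")
restore_value = resolve_consumer_config(props, "restore.consumer.", "max.poll.records")
```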
| 274 | Kafka Streams Skipped Records Metrics TLDR: Adds per-operator skipped-records-rate and skipped-records-total metrics to Kafka Streams, broken down by thread, task, and processor node. The existing skipped-records metrics were only aggregated at the thread level, making it impossible to identify which specific operator or topology node was encountering and skipping malformed or null-key records. |
Streams Metrics | Accepted | John Roesler | 2018-03-28 | 2018-05-20 | KAFKA-6376 | 2.0 | |
| 273 | Kafka to support using ETCD beside Zookeeper TLDR: Proposes making etcd an alternative metadata store to ZooKeeper by abstracting the ZooKeeper client calls behind a pluggable metadata store interface. In Kubernetes and other cloud-native environments, etcd is the native distributed coordination system; requiring ZooKeeper adds operational complexity for operators who want to avoid running a separate ZooKeeper ensemble. |
Broker | Discussion | Balint Molnar | 2018-03-28 | 2018-03-29 | KAFKA-6598 | ||
| 272 | Add API version tag to broker's RequestsPerSec metric TLDR: Adds an API version tag to the broker's `RequestsPerSec` JMX metric, breaking it down into per-API-version counters (e.g., `ProduceRequest.v5`, `FetchRequest.v9`). During rolling upgrades, operators cannot determine what fraction of clients have upgraded to a new API version, making it unsafe to change `message.format.version` or deprecate old request versions. |
Metrics Protocol | Accepted | Allen Wang | 2018-03-19 | 2018-04-02 | KAFKA-6514 | 2.0 | |
| 271 | Add NetworkClient redirector TLDR: KIP-271 proposes a `NetworkClientRedirector` plugin interface that intercepts broker address resolution in `NetworkClient`, allowing clients to remap internal broker hostnames to externally reachable addresses. Clients connecting to Kafka from outside a private network (e.g., from outside Kubernetes or AWS VPC) fail after the initial bootstrap because subsequent metadata responses contain internal DNS names unreachable from the client. |
Client | Discussion | Sergey Korytnikov | 2018-03-16 | 2018-12-18 | KAFKA-6669 | ||
| 270 | A Scala Wrapper Library for Kafka Streams TLDR: Provides an official Scala wrapper library over the Kafka Streams Java DSL to improve type inference, reduce boilerplate, and add compile-time serde type safety. The raw Java API requires excessive type annotations and cannot type-check serde mismatches at compile time when used from Scala. |
Streams | Discussion | Debasish Ghosh | 2018-03-16 | 2018-03-23 | KAFKA-6670 | 2.1 | |
| 268 | Simplify Kafka Streams Rebalance Metadata Upgrade TLDR: Redesigns the Kafka Streams rebalance metadata versioning scheme to support rolling upgrades between metadata versions without requiring full application shutdowns, enabling single rolling bounce upgrades. The original metadata upgrade mechanism (used in 0.10.1.0) required the entire application to be stopped before upgrading, because mixed-version clusters would fail to interpret newer metadata formats. |
Streams Consumer | Accepted | Matthias J. Sax | 2018-03-15 | 2021-07-08 | KAFKA-6054 | 2.0 | |
| 267 | Add Processor Unit Test Support to Kafka Streams Test Utils TLDR: KIP-267 adds a `MockProcessorContext` to the `kafka-streams-test-utils` module so developers can unit-test `Processor`, `Transformer`, and `ValueTransformer` implementations in isolation without starting a full `TopologyTestDriver`. Previously writing such tests required custom mock code for `ProcessorContext` that reproduced complicated internal behavior including state store registration, forwarding capture, and punctuator scheduling. |
Streams Testing | Accepted | John Roesler | 2018-03-08 | 2018-03-27 | KAFKA-6473 | 2.0 | |
| 266 | Fix consumer indefinite blocking behavior TLDR: Introduces a default.api.timeout.ms config for KafkaConsumer and adds Duration-typed overloads to blocking APIs (poll, commitSync, partitionsFor, etc.), replacing the previous indefinite-blocking behavior. Consumer APIs like partitionsFor() and commitSync() with no timeout argument could block indefinitely, causing application threads to hang permanently on broker connectivity issues. |
Consumer | Accepted | Richard Yu | 2018-03-04 | 2018-09-22 | KAFKA-6608 | 2.0 | |
| 265 | Make Windowed Serde to public APIs TLDR: Promotes windowed Serdes (TimeWindowedSerde, SessionWindowedSerde) from internal packages to the public o.a.k.streams.kstream package and adds DEFAULT_WINDOWED_KEY_SERDE_INNER_CLASS / DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS to StreamsConfig. Users consuming windowed KTable changelog topics had to implement their own windowed deserialization since the internal serdes were not part of the public API. |
Streams | Accepted | Guozhang Wang | 2018-03-01 | 2018-04-16 | KAFKA-4831 | 2.0 | |
| 264 | Add a consumer metric to record raw fetch size TLDR: Adds a `fetch-size-total-bytes` consumer metric that tracks raw (compressed, on-wire) bytes received from the broker, separate from the existing `bytes-fetched` metric which measures decompressed bytes. The decompressed metric overstates wire bandwidth consumption, preventing accurate measurement of actual network utilization and compression ratio effectiveness. |
Consumer Metrics | Discussion | Vahid Hashemian | 2018-02-27 | 2018-05-25 | KAFKA-3999 | ||
| 261 | Add Single Value Fetch in Window Stores TLDR: Adds a `fetch(K key, long time)` single-point lookup to the `WindowStore` interface that retrieves the value for an exact window start time, in addition to the existing range query APIs. Without this, fetching the value for a known window required a `fetch(key, from, to)` range query even when the exact window start timestamp is known, unnecessarily scanning multiple windows. |
Streams | Accepted | Guozhang Wang | 2018-02-22 | 2019-01-19 | KAFKA-6560 | 1.1 | |
| 258 | Allow to Store Record Timestamps in RocksDB TLDR: Stores record timestamps alongside key-value data in RocksDB state stores for Kafka Streams KTables, enabling time-aware operations and stream-time punctuators on materialized state. Previously KTables had no access to record timestamps, blocking correct event-time semantics and DSL features that depend on per-record time. |
Streams | Accepted | Matthias J. Sax | 2018-02-21 | 2022-05-20 | KAFKA-6521 | 2.3 | |
| 257 | Configurable Quota Management TLDR: KIP-257 introduces a pluggable `ClientQuotaCallback` interface, registered via the broker's `client.quota.callback.class` config, letting operators customize how quotas are computed and enforced for each request's user principal and client ID. The built-in quota management supports only static per-user/per-client-id quotas with fixed hierarchical defaults, which cannot express custom policies such as partition-based or group-based quota allocation in multi-tenant clusters. |
Admin Broker | Accepted | Rajini Sivaram | 2018-02-21 | 2018-04-06 | KAFKA-6576 | 2.0 | |
| 254 | JsonConverter Exception Handeling TLDR: KIP-254 proposes adding a configurable error handler to `JsonConverter` in Kafka Connect so that individual malformed JSON messages can be skipped or dead-lettered rather than crashing the task. A single unparseable JSON record in an input topic causes the entire Connect task to fail and enter a restart loop, blocking all subsequent records. |
Connect | Discussion | Prasanna Subburaj | 2018-02-12 | 2018-02-12 | KAFKA-6490 | 2.0 | |
| 253 | Support in-order message delivery with partition expansion TLDR: Proposes a mechanism to guarantee key-ordered delivery even when a topic's partition count changes, by maintaining a stable mapping from key to old partition during the transition window. Expanding partitions today causes records with the same key to be routed to a different partition (since `hash(key) % numPartitions` changes), breaking the ordering guarantee relied upon by stateful consumers. |
Producer Broker | Discussion | Dong Lin | 2018-02-09 | 2018-07-21 | Produce Fetch Metadata LeaderAndIsr StopReplica UpdateMetadata SyncGroup | ||
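Why expansion breaks key ordering can be shown directly: the default partitioner routes by `hash(key) % numPartitions`, so changing the partition count reroutes most keys. The sketch below uses `crc32` as a stand-in for Kafka's murmur2 hash; the effect is the same for any hash function.

```python
# Demonstrates key rerouting when a topic grows from 3 to 4 partitions.
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Stand-in for Kafka's default partitioner (which uses murmur2).
    return zlib.crc32(key) % num_partitions

keys = [f"user-{i}".encode() for i in range(100)]
moved = [k for k in keys if partition_for(k, 3) != partition_for(k, 4)]
```

For a well-mixed hash, roughly three quarters of keys land on a different partition after the change, which is why a stable key-to-old-partition mapping is needed during the transition.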
| 252 | Extend ACLs to allow filtering based on ip ranges and subnets TLDR: Extends the Kafka ACL model to support CIDR-based IP range matching in the `host` field of an ACL entry, in addition to exact IP addresses and the `*` wildcard. Currently, granting access to a subnet requires creating one ACL entry per individual IP address, which is unmanageable in environments with dynamic or numerous client IP addresses. |
Security | Discussion | Sönke Liebau | 2018-01-31 | 2018-05-03 | KAFKA-4759 | ||
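The proposed host-field matching can be sketched with Python's stdlib `ipaddress` module (function name and rule shape are illustrative, not Kafka's authorizer code): a host entry may be `*`, an exact IP, or a CIDR block.

```python
# Sketch of ACL host matching extended with CIDR ranges.
import ipaddress

def host_matches(acl_host: str, client_ip: str) -> bool:
    if acl_host == "*":
        return True                      # wildcard: any host
    if "/" in acl_host:                  # CIDR range, e.g. "10.0.0.0/24"
        return ipaddress.ip_address(client_ip) in ipaddress.ip_network(acl_host)
    return acl_host == client_ip         # exact match (current behavior)

in_subnet = host_matches("10.0.0.0/24", "10.0.0.57")
outside = host_matches("10.0.0.0/24", "10.0.1.57")
```

One `/24` entry replaces up to 256 exact-IP ACL entries.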
| 251 | Allow timestamp manipulation in Processor API TLDR: Adds a ProcessorContext#setTimestamp() method to the Streams Processor API, allowing processors to override the timestamp that will be used for output records and punctuation scheduling. Without this, the timestamp of output records is always inherited from the input record, making it impossible to implement custom timestamp semantics like event-time enrichment or timestamp correction in the low-level API. |
Streams | Accepted | Matthias J. Sax | 2018-01-31 | 2019-07-31 | KAFKA-6454 | 2.0 | |
| 250 | Add Support for Quorum-based Producer Acknowledgment TLDR: Adds an `acks=quorum` producer acknowledgment mode that waits for a configurable quorum (e.g., majority) of in-sync replicas to acknowledge rather than all ISRs, reducing tail latency caused by the slowest ISR member. With `acks=all`, produce P999 latency spikes to `replica.lag.time.max.ms` (often 10s) because a lagging-but-still-in-ISR follower determines the worst-case latency. |
Producer Broker | Discussion | Bryan D | 2018-01-23 | 2020-07-26 | KAFKA-6477 | ||
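The two acknowledgment predicates can be contrasted in a short sketch (illustrative names; not broker code): `acks=all` waits for every ISR member, while the proposed `acks=quorum` waits for a majority of the ISR, so a single slow follower no longer sets the tail latency.

```python
# Sketch: when may the leader acknowledge a produce request?

def all_acked(isr: set, acked: set) -> bool:
    return isr <= acked                       # every ISR member has the record

def quorum_acked(isr: set, acked: set) -> bool:
    return len(isr & acked) > len(isr) // 2   # a majority of the ISR has it

isr = {"broker-1", "broker-2", "broker-3"}
acked = {"broker-1", "broker-2"}  # broker-3 is lagging but still in the ISR
```

With the ISR above, `acks=all` must wait for broker-3, while the quorum mode can respond immediately.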
| 249 | Add Delegation Token Operations to KafkaAdminClient TLDR: Adds CreateDelegationToken, RenewDelegationToken, ExpireDelegationToken, and DescribeDelegationToken operations to the Java AdminClient API. KIP-48 implemented the broker-side protocol and request/response types but omitted the AdminClient surface, leaving users without a programmatic way to manage delegation tokens. |
Admin Security | Discussion | Manikumar Reddy O. | 2018-01-14 | 2018-01-17 | KAFKA-6447 | 2.0 | |
| 247 | Add public test utils for Kafka Streams TLDR: Introduces a public `TopologyTestDriver` (and supporting `ConsumerRecordFactory`, `OutputVerifier`) in a stable `kafka-streams-test-utils` artifact for unit-testing Streams topologies without a real Kafka cluster. Previously, the only test utilities were internal classes in the test JAR with no stability guarantees, making it impractical to unit-test Streams applications. |
Streams Testing | Accepted | Matthias J. Sax | 2018-01-08 | 2018-01-24 | KAFKA-3625 | 1.1 | |
| 245 | Use Properties instead of StreamsConfig in KafkaStreams constructor TLDR: Deprecates `KafkaStreams` constructors that accept a `StreamsConfig` argument, retaining only the `Properties`-based constructors. Since `StreamsConfig` is immutable and itself constructed from `Properties`, passing a `StreamsConfig` to the constructor is pure boilerplate with no added value. |
Streams | Accepted | Boyang Chen | 2017-12-25 | 2018-03-26 | KAFKA-6386 | 2.0 | |
| 244 | Add Record Header support to Kafka Streams Processor API TLDR: Adds Headers access to the Kafka Streams Processor API via ProcessorContext#headers() and exposes headers in ConsumerRecordFactory and OutputVerifier for testing, enabling processors to read, write, and propagate record headers. Headers were available in the producer and consumer APIs but inaccessible in Streams, blocking use cases like distributed tracing context propagation through stream topologies. |
Streams | Accepted | Jorge Esteban Quilcate Otoya | 2017-12-21 | 2022-09-29 | KAFKA-6395 | 2.0 | |
| 243 | Make ProducerConfig and ConsumerConfig constructors public TLDR: Makes ProducerConfig(Properties/Map) and ConsumerConfig(Properties/Map) constructors public, enabling programmatic access to default configuration values. Kafka Streams previously had to hard-code producer/consumer default values because the only available constructors were package-private, risking drift when upstream defaults changed. |
Client | Accepted | Matthias J. Sax | 2017-12-18 | 2018-01-12 | KAFKA-6382 | 1.1 | |
| 242 | Mask password in Kafka Connect Rest API response TLDR: Masks `PASSWORD`-type config fields in all Kafka Connect REST API responses (`/connectors/{name}`, `/connectors/{name}/config`, `/tasks`) by replacing their values with `[hidden]`. The REST API currently returns connector configurations including database passwords and secret keys in plaintext, exposing credentials to anyone with HTTP access to the Connect REST port. |
Connect Security | Discussion | Vincent Meng | 2017-12-18 | 2018-04-12 | KAFKA-5117 | 2.0.2 | |
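The masking rule is simple to sketch: any config whose definition has type `PASSWORD` is replaced with `[hidden]` before the REST response is serialized. The config-type table below is illustrative, not Connect's actual `ConfigDef`.

```python
# Sketch of PASSWORD-field masking in a REST response payload.

CONFIG_TYPES = {
    "connection.url": "STRING",
    "connection.user": "STRING",
    "connection.password": "PASSWORD",
}

def mask_config(config: dict) -> dict:
    return {
        key: "[hidden]" if CONFIG_TYPES.get(key) == "PASSWORD" else value
        for key, value in config.items()
    }

masked = mask_config({
    "connection.url": "jdbc:postgresql://db:5432/app",
    "connection.user": "svc",
    "connection.password": "s3cret",
})
```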
| 240 | AdminClient.listReassignments() AdminClient.describeReassignments() TLDR: Adds `AdminClient.listReassignments()` and `AdminClient.describeReassignments()` methods (and corresponding protocol) to expose in-progress partition reassignments to admin tooling. Kafka's admin client has no visibility into ongoing partition reassignments, making it impossible to programmatically check current reassignment state before issuing new ones or building tooling that coordinates replica movements. |
Admin | Discussion | Tom Bentley | 2017-12-15 | 2017-12-18 | KAFKA-6379 | ||
| 239 | Add queryableStoreName() to GlobalKTable TLDR: KIP-239 adds `GlobalKTable.queryableStoreName()` to the Kafka Streams API, mirroring the same method already available on `KTable`. Without it, code that constructs a `GlobalKTable` cannot retrieve the state store name for use in Interactive Queries without storing it separately at construction time. |
Streams | Accepted | Zhihong Yu | 2017-12-12 | 2018-01-03 | KAFKA-6265 | 1.1 | |
| 238 | Expose Kafka cluster ID in Connect REST API TLDR: Exposes the underlying Kafka cluster ID in the Connect REST API response for `GET /` (the root resource). Without this, operators managing multiple Connect clusters connected to different Kafka clusters cannot verify the Kafka cluster association from the Connect REST API alone, requiring access to broker configs or ZooKeeper. |
Connect Admin | Accepted | Ewen Cheslack-Postava | 2017-12-11 | 2018-01-08 | KAFKA-6311 | 1.1 | |
| 237 | More Controller Health Metrics TLDR: Adds controller health metrics including EventQueueSize and EventQueueTimeMs gauges for the ControllerEventManager and RequestRateAndQueueTimeMs timers per broker for the ControllerChannelManager. Operators had limited visibility into controller processing latency and queue depth, making it difficult to diagnose controller overload or stalls. |
Metrics Broker | Accepted | Dong Lin | 2017-12-07 | 2018-01-04 | KAFKA-3473 | 2.0 | |
| 236 | Interruptible Partition Reassignment TLDR: KIP-236 proposes making partition reassignment interruptible and cancellable via the admin API, allowing in-flight reassignments to be stopped without the dangerous manual ZooKeeper manipulation currently required. A single active reassignment batch cannot be aborted cleanly, and large reassignments can cause sustained cluster performance degradation with no escape valve. |
Admin Broker | Discussion | Tom Bentley | 2017-12-06 | 2019-03-31 | KAFKA-6359 | ||
| 235 | Add DNS alias support for secured connection TLDR: Allows Kafka clients to resolve all A/CNAME records behind a DNS alias when using SASL/Kerberos, so the Kerberos service principal is constructed from the canonical hostname rather than the alias. When a DNS alias is listed in `bootstrap.servers`, the Java client performs Kerberos authentication against the alias string, which has no corresponding Kerberos service principal, causing `SaslException`. |
Security | Accepted | Jonathan Skrzypek | 2017-12-05 | 2018-10-24 | KAFKA-6195 | 2.1 | |
| 234 | add support for getting topic defaults from AdminClient TLDR: Adds a `describeClusterDefaultConfigs()` method to `AdminClient` that returns the broker-level default values for all topic configs without requiring an existing topic. Users creating topics via the AdminClient cannot determine broker default values ahead of time, leading to topics being created with incorrect configs (e.g., `delete` cleanup policy instead of `compact`) that must be fixed post-creation. |
Admin | Discussion | dan norwood | 2017-12-04 | 2017-12-04 | KAFKA-6309 | ||
| 233 | Simplify StreamsBuilder#addGlobalStore TLDR: KIP-233 simplifies `StreamsBuilder.addGlobalStore()` by removing the `sourceName` and `processorName` string parameters, which were internal Processor API implementation details leaking into the DSL API. Users had to supply arbitrary internal node names that served no semantic purpose at the DSL level, making the API unnecessarily complex and error-prone. |
Streams | Accepted | Panuwat Anawatmongkhon | 2017-12-04 | 2018-03-21 | KAFKA-6138 | 1.1 | |
| 231 | Improve the Required ACL of ListGroups API TLDR: Relaxes the ACL requirement for ListGroups from requiring Describe permission on the Cluster resource to returning only groups the caller has Describe permission on, silently filtering others. The existing design required Cluster-level Describe to list any groups, which is an overly broad privilege — service accounts only needed to see their own groups but had to be granted cluster-wide visibility. |
Security Admin | Accepted | Vahid Hashemian | 2017-11-29 | 2018-10-24 | KAFKA-5638 | 2.1 | ListGroups |
| 229 | DeleteGroups API TLDR: Adds a `DeleteGroups` admin API and corresponding `--delete` option to the `kafka-consumer-groups.sh` tool for explicitly deleting consumer groups backed by the new offset storage (`__consumer_offsets`). In the new consumer group management model, groups can only be removed when their committed offsets expire (default 1-day retention), leaving the offset store inflated for inactive groups with no explicit deletion path. |
Consumer Admin Protocol | Accepted | Vahid Hashemian | 2017-11-28 | 2018-01-26 | KAFKA-6275 | 1.1 | |
| 228 | Negative record timestamp support TLDR: Allows negative timestamps in Kafka records (representing dates before Unix epoch 1970-01-01) by removing the current validation that rejects records with timestamp < 0. Kafka is increasingly used to store historical data including pre-1970 events, but negative timestamps were unconditionally rejected, forcing workarounds like timestamp offsets. |
Streams Producer | Discussion | Konstantin Chukhlomin | 2017-11-28 | 2018-08-17 | KAFKA-6048 | ||
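The need for negative timestamps follows directly from the encoding: Kafka record timestamps are milliseconds since the Unix epoch, so any pre-1970 instant is negative, as this small sketch shows.

```python
# Pre-epoch events map to negative millisecond timestamps.
from datetime import datetime, timezone

def to_record_timestamp(dt: datetime) -> int:
    return int(dt.timestamp() * 1000)

# Apollo 11 landing, 1969-07-20 20:17 UTC: before the epoch, hence negative.
moon_landing = to_record_timestamp(datetime(1969, 7, 20, 20, 17, tzinfo=timezone.utc))
epoch = to_record_timestamp(datetime(1970, 1, 1, tzinfo=timezone.utc))
```

Under the pre-KIP validation, any such record is rejected at produce time.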
| 227 | Introduce Incremental FetchRequests to Increase Partition Scalability TLDR: Introduces incremental FetchRequests where followers only send partition-level deltas (additions/removals) rather than the full partition list on every fetch, and allows leaders to omit unchanged partition metadata from responses. Sending the full partition set in every FetchRequest and FetchResponse creates O(partitions) overhead that scales poorly as partition count grows into the hundreds of thousands. |
Protocol Consumer | Accepted | Colin McCabe | 2017-11-21 | 2022-09-11 | KAFKA-6254 | 1.1 | Fetch |
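The core idea behind incremental fetch sessions can be sketched as a set delta (illustrative model only; the real protocol also tracks per-partition fetch state and session epochs): instead of resending the full partition list, the client sends only what was added or removed since the previous fetch.

```python
# Sketch: the incremental request carries O(changes), not O(partitions).

def fetch_delta(previous: set, current: set) -> dict:
    return {"added": current - previous, "removed": previous - current}

previous = {("topic-a", 0), ("topic-a", 1), ("topic-b", 0)}
current = {("topic-a", 0), ("topic-a", 1), ("topic-c", 0)}
delta = fetch_delta(previous, current)
```

With hundreds of thousands of partitions and a typically unchanged fetch set, the delta is empty almost every time, which is the scalability win.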
| 226 | Dynamic Broker Configuration TLDR: KIP-226 introduces dynamic broker configuration via the `AdminClient` and `kafka-configs.sh`, allowing broker properties like SSL keystores, thread pool sizes, and log configurations to be updated at runtime without a broker restart. Before this change, every broker configuration change required a rolling restart, making tuning and credential rotation expensive operations. |
Admin Broker | Accepted | Rajini Sivaram | 2017-11-20 | 2018-05-01 | KAFKA-6240 | 1.1 | Metadata UpdateMetadata DescribeConfigs |
| 224 | Add configuration parameter `retries` to Streams API TLDR: Adds `retries` and `retry.backoff.ms` configuration parameters to the Kafka Streams API, governing how Streams retries `TimeoutException` errors from global state store bootstrap and `KafkaProducer` timeouts. Without a retry mechanism, a transient broker unavailability during global state bootstrap kills the entire `KafkaStreams` instance with no recovery. |
Streams | Accepted | Matthias J. Sax | 2017-11-09 | 2017-11-20 | KAFKA-6122 | 1.1 | |
| 223 | Add per-topic min lead and per-partition lead metrics to KafkaConsumer TLDR: KIP-223 adds per-partition consumer lead metrics (`records-lead`, `records-lead-avg`, `records-lead-min`) and a global `records-lead-min` metric to `KafkaConsumer`'s `consumer-fetch-manager-metrics` group. Lag measures how far behind the high watermark a consumer is, but there was no metric for how far ahead of the log-start-offset the consumer is—a value near zero signals that the consumer is about to stall or lose data due to log retention. |
Consumer Metrics | Accepted | huxihx | 2017-11-09 | 2017-12-13 | KAFKA-6184 | 2.0 | |
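The lead/lag distinction is just two subtractions, sketched below: lag is the distance to the high watermark (how far behind the consumer is), lead is the distance from the log start offset (how close the consumer is to losing unread records to retention).

```python
# Sketch of the two per-partition measurements.

def consumer_lag(high_watermark: int, position: int) -> int:
    return high_watermark - position       # records behind the log tip

def consumer_lead(position: int, log_start_offset: int) -> int:
    return position - log_start_offset     # records of headroom before retention

# Retention has deleted up to offset 950; the consumer is at 955 of 1000.
lag = consumer_lag(1000, 955)
lead = consumer_lead(955, 950)
```

Here the lag of 45 looks unremarkable, but the lead of 5 signals the consumer is five records from having unconsumed data deleted, which is exactly the condition a lag-only dashboard misses.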
| 222 | Add Consumer Group operations to Admin API TLDR: Adds `describeConsumerGroups`, `listConsumerGroups`, and `deleteConsumerGroups` operations to the `AdminClient` Java API. The existing tooling for consumer group inspection existed only in CLI scripts that used internal Scala classes or ZooKeeper directly, making programmatic group management impossible without shell-outs. |
Admin Consumer | Accepted | Jorge Esteban Quilcate Otoya | 2017-11-05 | 2018-11-08 | KAFKA-6058 | 2.0 | |
| 221 | Enhance DSL with Connecting Topic Creation and Repartition Hint TLDR: Allows Kafka Streams DSL users to specify a `repartitionFactor` hint on repartition operations and to explicitly name and configure repartition topics via a `Repartitioned` object, giving control over the number of repartition topic partitions independent of the upstream sub-topology. Without this, downstream parallelism was rigidly tied to upstream source partition counts, preventing optimal scaling of stateful operations. |
Streams | Accepted | Jeyhun Karimov | 2017-11-04 | 2020-05-19 | KAFKA-6037 | 2.6 | |
| 220 | Add AdminClient into Kafka Streams' ClientSupplier TLDR: Adds AdminClient to the Kafka Streams KafkaClientSupplier interface so users can inject a custom AdminClient instance and Streams can use it for internal topic management, replacing the internal StreamsKafkaClient. Streams had duplicated admin functionality in StreamsKafkaClient that predated the public AdminClient, and lacked a hook for users to supply a custom configured instance. |
Streams Admin | Accepted | Guozhang Wang | 2017-11-03 | 2017-12-19 | KAFKA-6170 | 1.1 | |
| 219 | Improve quota communication TLDR: KIP-219 changes Kafka's quota enforcement to send a throttle notification header in the response that tells the client how long it will be throttled, rather than delaying the response itself. Delaying the response ties up broker I/O threads for the throttle duration and causes producer `max.block.ms` timeouts when throttle times are long, since the producer cannot distinguish a throttled response from a network failure. |
Broker Admin | Accepted | Jiangjie Qin | 2017-11-01 | 2018-04-26 | KAFKA-6028 | 2.0 | Produce Fetch |
| 218 | Make KafkaFuture.Function java 8 lambda compatible TLDR: Converts `KafkaFuture.Function` and `KafkaFuture.BiConsumer` from abstract classes to `@FunctionalInterface`-annotated interfaces. As abstract classes with a single abstract method, they cannot be used with Java 8 lambda expressions, forcing users to write verbose anonymous class syntax in `AdminClient` callback code. |
Admin Client | Accepted | Steven Aerts | 2017-11-01 | 2018-01-30 | KAFKA-6018 | 1.1 | |
| 217 | Expose a timeout to allow an expired ZK session to be re-created TLDR: Proposed adding a `zookeeper.reconnect.timeout.ms` config to bound how long a broker retries ZooKeeper reconnection before giving up and exiting, to allow Kubernetes/orchestrators to detect the failure and restart the pod. The existing behavior retries ZK session re-creation indefinitely with no way to signal failure to an external health checker. (Cancelled — resolved by doing infinite retry instead.) |
Broker | Discarded | Jun Rao | 2017-10-27 | 2017-11-06 | KAFKA-5473 | 1.1 | |
| 216 | IQ should throw different exceptions for different errors TLDR: Introduces typed subclasses of `InvalidStateStoreException` for Interactive Queries—`StreamsNotRunningException`, `InvalidStateStorePartitionException`, and `UnknownStateStoreException`—to distinguish the root cause of a failed state store lookup. The single `InvalidStateStoreException` thrown for all IQ errors forces callers to parse exception messages or use catch-all retry logic regardless of whether the error is transient (rebalance) or permanent (wrong store name). |
Streams | Accepted | vitojeng | 2017-10-27 | 2020-02-02 | KAFKA-5876 | 3.0 | |
| 215 | Add topic regex support for Connect sinks TLDR: Adds a `topics.regex` configuration option to Connect sink connectors allowing them to subscribe to a dynamic set of topics via a Java regex pattern, in addition to the existing static `topics` list. Connect sinks hardcoded to a topic list require connector reconfiguration (and a rebalance) whenever new matching topics are created, making dynamic topic consumption impossible. |
Connect | Accepted | Jeff Klukas | 2017-10-27 | 2018-01-08 | KAFKA-3073 | 1.1 | |
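The subscription change can be sketched as recomputing the effective topic set from the pattern (a simplified model; the real connector re-evaluates the pattern via consumer pattern subscription as topic metadata changes):

```python
# Sketch: a topics.regex subscription resolved against the current topic list.
import re

def matching_topics(pattern: str, all_topics: list) -> list:
    regex = re.compile(pattern)
    return sorted(t for t in all_topics if regex.fullmatch(t))

topics = ["metrics.host1", "metrics.host2", "logs.app"]
subscribed = matching_topics(r"metrics\..*", topics)
```

When `metrics.host3` later appears, re-evaluating the same pattern picks it up without reconfiguring the connector.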
| 214 | Add zookeeper.max.in.flight.requests config to the broker TLDR: Adds a broker-side zookeeper.max.in.flight.requests config to cap the number of concurrent outstanding ZooKeeper requests sent by the pipelined ZooKeeperClient wrapper. The scatter-gather ZooKeeperClient introduced in KAFKA-5501 could overwhelm ZooKeeper by issuing unbounded parallel requests, and ZooKeeper's own zookeeper.globalOutstandingLimit provided only cluster-wide, cross-connection throttling insufficient to protect against a single misbehaving broker. |
Broker | Accepted | Onur Karaman | 2017-10-25 | 2017-11-01 | KAFKA-5894 | 1.1 | |
| 213 | Support non-key joining in KTable TLDR: Adds foreign-key KTable-KTable joins to the Kafka Streams DSL, allowing a table to join another table on a field of its value (a foreign key) rather than on its primary key, with Streams managing the internal re-keying and subscription topics. Previously, joining on a non-key attribute required manually re-partitioning and maintaining custom state, which was complex and error-prone. |
Streams | Discussion | Adam Bellemare | 2021-05-16 | 2021-05-16 | | 2.4 | |
| 212 | Enforce set of legal characters for connector names TLDR: Enforces a whitelist of legal characters for Kafka Connect connector names (alphanumeric, `.`, `-`, `_`) and rejects empty names at creation time. Special characters in connector names break REST API URL routing and make it impossible to delete or modify connectors created with those names. |
Connect | Accepted | Sönke Liebau | 2017-10-23 | 2018-01-31 | KAFKA-4930 | 1.1 | |
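The whitelist described above reduces to a small validation rule, sketched here (the regex is illustrative of the described character set, not copied from Connect's source):

```python
# Sketch: connector names must be non-empty and use only [a-zA-Z0-9._-].
import re

VALID_CONNECTOR_NAME = re.compile(r"[a-zA-Z0-9._-]+")

def is_valid_connector_name(name: str) -> bool:
    return bool(VALID_CONNECTOR_NAME.fullmatch(name))

ok = is_valid_connector_name("jdbc-sink_v2.1")
bad_empty = is_valid_connector_name("")
bad_chars = is_valid_connector_name("my connector?!")  # breaks REST URL routing
```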
| 211 | Revise Expiration Semantics of Consumer Group Offsets TLDR: Changes offset expiration semantics so that committed offsets for a consumer group are only expired after both `offsets.retention.minutes` has elapsed since the last commit AND the consumer group itself is empty (no active members). Under the old semantics, offsets for inactive partitions within an otherwise active group expired on their own schedule, causing active consumers to lose their committed position and reset to `auto.offset.reset`. |
Consumer | Accepted | Vahid Hashemian | 2017-10-18 | 2019-05-14 | KAFKA-4682 | 2.1 | |
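The revised rule is a conjunction, sketched below (illustrative function, not broker code): an offset is eligible for expiration only when the retention window has elapsed AND the group has no active members. Under the old semantics the second condition did not exist.

```python
# Sketch of the KIP-211 offset-expiration predicate.

def offset_expired(now_ms, last_commit_ms, retention_ms, group_is_empty):
    return group_is_empty and (now_ms - last_commit_ms) >= retention_ms

DAY_MS = 24 * 60 * 60 * 1000
# Two days since the last commit to a quiet partition, but the group still
# has live members: the offset survives under the new semantics.
active_group = offset_expired(2 * DAY_MS, 0, DAY_MS, group_is_empty=False)
empty_group = offset_expired(2 * DAY_MS, 0, DAY_MS, group_is_empty=True)
```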
| 210 | Provide for custom error handling when Kafka Streams fails to produce TLDR: KIP-210 adds a `ProductionExceptionHandler` interface to Kafka Streams with a `handle(ProducerRecord, Exception)` method that returns `CONTINUE` or `FAIL`, plus a `default.production.exception.handler` Streams config to register it. Without this hook, any `ApiException` thrown when Streams writes to a downstream topic (e.g. due to a record exceeding `max.request.size`) tears down the entire `StreamThread`, requiring a manual restart. |
Streams | Accepted | Matt Farmer | 2017-10-18 | 2018-06-13 | KAFKA-6086 | 1.1 | |
| 209 | Connection String Support TLDR: KIP-209 proposes adding a URI-style connection string format (e.g., `kafka://user:pass@broker1:9092,broker2:9092?acks=all`) as an alternative to `Properties` maps for configuring Kafka clients. The `HashMap`-based configuration API requires verbose boilerplate and makes it harder to store and share connection configs as a single string. |
Client | Discussion | Clebert Suconic | 2017-10-16 | 2017-11-04 | |||
| 208 | Add SSL support to Kafka Connect REST interface TLDR: Adds TLS/SSL and optional mutual TLS (mTLS) authentication to the Kafka Connect REST interface via new `listeners.https.*` and `rest.ssl.*` configs. The Connect REST API transmits connector configurations (which often contain passwords) over plain HTTP, and has no authentication mechanism to prevent unauthorized connector creation or modification. |
Connect Security | Accepted | Jakub Scholz | 2017-10-09 | 2018-01-30 | KAFKA-4029 | 1.1 | |
| 207 | Offsets returned by ListOffsetsResponse should be monotonically increasing even during a partition leader change TLDR: KIP-207 changes the new partition leader to refuse `ListOffsets` requests for a partition until its high watermark has caught up to its own log-end offset at the time of election, returning a retriable `LeaderNotAvailableException` (or a new precise error code in newer protocol versions) during that brief window. After a leader election, the new leader's HWM may be behind the old leader's HWM, causing `ListOffsets` to return a lower offset than before the election, breaking the monotonicity assumption that consumers and connectors (e.g. Spark Streaming) rely on. |
Broker Protocol | Accepted | Colin McCabe | 2017-10-05 | 2018-12-03 | KAFKA-2334 | 2.2 | ListOffsets |
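The leader-side gate can be sketched as a simple check (illustrative model; the real handler distinguishes request versions and error codes): reject `ListOffsets` with a retriable error until the high watermark reaches the log-end offset recorded at election time.

```python
# Sketch: a newly elected leader defers ListOffsets until its HWM catches up,
# so clients never observe an offset lower than one returned before the election.

def handle_list_offsets(high_watermark: int, log_end_offset_at_election: int):
    if high_watermark < log_end_offset_at_election:
        return ("LEADER_NOT_AVAILABLE", None)   # retriable; client backs off and retries
    return ("NONE", high_watermark)

just_elected = handle_list_offsets(high_watermark=90, log_end_offset_at_election=100)
caught_up = handle_list_offsets(high_watermark=100, log_end_offset_at_election=100)
```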
| 206 | Add support for UUID serialization and deserialization TLDR: Adds UUIDSerializer and UUIDDeserializer as built-in Kafka client SerDes for java.util.UUID. Users frequently use UUIDs as record keys but had no native support, requiring conversion to String (lossy) or byte[] (manual 16-byte encoding/decoding) as a workaround. |
Client | Accepted | Jakub Scholz | 2017-09-26 | 2018-09-10 | KAFKA-4932 | 2.1 | |
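The manual `byte[]` workaround mentioned above can be sketched with Python's stdlib `uuid` module — the 16-byte binary roundtrip that the built-in serde spares users from hand-writing:

```python
# Sketch of a compact 16-byte UUID serde (illustrative, not Kafka's UUIDSerializer).
import uuid

def serialize_uuid(u: uuid.UUID) -> bytes:
    return u.bytes                      # big-endian 16-byte representation

def deserialize_uuid(data: bytes) -> uuid.UUID:
    return uuid.UUID(bytes=data)

original = uuid.UUID("12345678-1234-5678-1234-567812345678")
round_tripped = deserialize_uuid(serialize_uuid(original))
```

The binary form is 16 bytes versus 36 for the string representation, which is why the lack of a native serde pushed users toward this error-prone manual encoding.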
| 205 | Add all() and range() API to ReadOnlyWindowStore TLDR: Adds all() and range(K from, K to, long fromTime, long toTime) APIs to ReadOnlyWindowStore in Kafka Streams, enabling full scans and key-range scans over windowed state stores. The existing ReadOnlyWindowStore only offered fetch-by-key methods, requiring callers to know keys in advance and making it impossible to iterate or scan the store without prior knowledge. |
Streams | Accepted | Richard Yu | 2017-09-25 | 2017-11-17 | KAFKA-4499 | 1.1 | |
| 204 | Adding records deletion operation to the new Admin Client API TLDR: Adds `deleteRecords(Map<TopicPartition, RecordsToDelete>)` to the new Java `AdminClient` API, exposing the existing `DeleteRecordsRequest` (API key 21) that was previously accessible only via the legacy Scala admin client. The new `AdminClient` introduced in KIP-117 lacked record deletion support, forcing users to fall back to the deprecated Scala tool or rely on time-based retention. |
Admin | Accepted | Paolo Patierno | 2017-09-18 | 2017-11-15 | KAFKA-5925 | 1.1 | |
| 203 | Add toLowerCase support to sasl.kerberos.principal.to.local rule TLDR: Extends sasl.kerberos.principal.to.local.rules with a /L modifier that lowercases the resulting principal name, enabling mapping of Kerberos principals like user@REALM to lowercase Linux usernames. Without this, the default principal extraction produces case-sensitive names that don't match lowercase Linux users, causing authentication failures. |
Security | Accepted | Manikumar Reddy O. | 2017-09-18 | 2019-09-02 | KAFKA-5764 | 1.1 | |
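A hedged sketch of the rule syntax (realm and rule are illustrative placeholders; the trailing `/L` is the lowercasing modifier this KIP adds, following Hadoop-style auth_to_local rules):

```properties
# Map User@EXAMPLE.COM -> "user": strip the realm, then lowercase via /L.
sasl.kerberos.principal.to.local.rules=RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//L,DEFAULT
```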
| 202 | Move merge() from StreamsBuilder to KStream TLDR: Moves the merge(KStream) method from StreamsBuilder to the KStream interface itself, so two streams are merged via stream1.merge(stream2) instead of builder.merge(stream1, stream2). The original placement on StreamsBuilder was unintuitive, as merging is a stream-to-stream operation that should be expressed on the KStream object. |
Streams | Accepted | Richard Yu | 2017-09-17 | 2017-10-29 | KAFKA-5765 | 1.0 | |
| 201 | Rationalising Policy interfaces TLDR: KIP-201 proposes rationalising Kafka's broker-side policy interfaces (`CreateTopicPolicy`, `AlterConfigPolicy`) into a unified, consistent API with a common lifecycle and configuration contract. The existing policy interfaces have inconsistent method signatures and lack a `close()` lifecycle hook, making it difficult to implement policies that hold resources. |
Admin Broker | Discussion | Tom Bentley | 2017-09-12 | 2019-08-13 | KAFKA-5693 | CreateTopics DeleteTopics DeleteRecords AlterConfigs CreatePartitions | |
| 198 | Remove ZK dependency from Streams Reset Tool TLDR: Replaces the `--zookeeper` parameter in `kafka-streams-application-reset.sh` with `--bootstrap-servers`, migrating the Streams reset tool from direct ZooKeeper access to the `AdminClient` API. Retaining ZooKeeper access in client-side tools couples them to the ZooKeeper topology and prevents future ZooKeeper removal from Kafka. |
Streams Admin | Accepted | Matthias J. Sax | 2017-09-08 | 2017-09-12 | KAFKA-5862 | 1.0 | |
| 197 | Connect REST API should include the connector type when describing a connector TLDR: KIP-197 adds a `type` field (`source` or `sink`) to the JSON responses of `GET /connectors/{name}` and `GET /connectors/{name}/status` in the Kafka Connect REST API. Without the type field, client tools and UIs must infer connector type from naming conventions in the connector class name, which is unreliable. |
Connect Admin | Accepted | Zhihong Yu | 2017-09-08 | 2017-09-20 | KAFKA-5657 | 1.0 | |
| 196 | Add metrics to Kafka Connect framework TLDR: Adds comprehensive JMX metrics to Kafka Connect covering connector state, task state, worker utilization, source record rates, sink record rates, and error rates at the worker, connector, and task level. The existing Connect metrics were limited to counts of connectors and tasks per worker, providing no actionable data for monitoring SLAs, detecting failures, or capacity planning. |
Connect Metrics | Accepted | Randall Hauch | 2017-09-07 | 2021-01-04 | KAFKA-2376 | 1.0 | |
| 195 | AdminClient.createPartitions TLDR: Adds `AdminClient.createPartitions()` (backed by new `CreatePartitionsRequest`/`CreatePartitionsResponse` protocol APIs) to programmatically increase the partition count of existing topics. The only existing mechanism to increase partition count is the `kafka-topics.sh --alter` shell script, which communicates directly with ZooKeeper; a proper network API is needed to decouple topic management from ZooKeeper access. |
Admin | Accepted | Tom Bentley | 2017-09-07 | 2017-09-19 | KAFKA-5856 | 1.0 | CreatePartitions |
| 192 | Provide cleaner semantics when idempotence is enabled TLDR: Fixes two idempotent producer edge cases: ensures `RecordMetadata` returned for records in a failed batch contains the correct assigned sequence number instead of -1, and enforces that `enable.idempotence=true` requires `acks=all` and a non-zero `retries` setting. The existing behavior allowed users to configure idempotence with settings that silently undermined its guarantees, and returned uninformative metadata on failure. |
Producer Transactions | Accepted | Apurva Mehta | 2017-08-29 | 2017-09-21 | KAFKA-5793 | 1.0 | Produce Metadata |
| 191 | KafkaConsumer.subscribe() overload that takes just Pattern TLDR: Adds a subscribe(Pattern pattern) overload to KafkaConsumer that uses a no-op ConsumerRebalanceListener, making pattern-based subscriptions usable without boilerplate. The only existing pattern-based subscribe required passing a ConsumerRebalanceListener even when no rebalance handling was needed, forcing users to instantiate a dummy listener. |
Consumer | Accepted | Attila Kreiner | 2017-08-24 | 2017-09-01 | KAFKA-5726 | 1.0 | |
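The pattern argument is an ordinary `java.util.regex.Pattern` matched against topic names; a minimal sketch of the matching semantics (topic names are invented for illustration):

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class PatternSubscription {
    public static void main(String[] args) {
        // With KIP-191, consumer.subscribe(pattern) needs no listener argument;
        // topics whose names match the regex are subscribed.
        Pattern pattern = Pattern.compile("metrics\\..*");
        List<String> topics = List.of("metrics.cpu", "metrics.disk", "logs.app");
        List<String> subscribed = topics.stream()
                .filter(t -> pattern.matcher(t).matches())
                .collect(Collectors.toList());
        System.out.println(subscribed); // [metrics.cpu, metrics.disk]
    }
}
```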
| 190 | Handle client-ids consistently between clients and brokers TLDR: Removes the overly restrictive character whitelist for `client.id` enforced by the Java client, aligning with broker behavior that imposes no such restrictions. The character restrictions prevented third-party clients (e.g., librdkafka) using valid but non-whitelisted client IDs from being handled consistently, causing quota and metric tagging failures. |
Client | Accepted | Mickael Maison | 2017-08-21 | 2018-01-03 | KAFKA-5735 | 1.0 | |
| 189 | Improve principal builder interface and add support for SASL TLDR: Replaces the existing `PrincipalBuilder` interface (which only supports SSL) with an extended `KafkaPrincipalBuilder` interface that provides typed callbacks for both SSL (`SslAuthenticationContext`) and SASL (`SaslAuthenticationContext`) authentication contexts, enabling custom principal extraction from SASL credentials. The old interface had no SASL support and relied on `TransportLayer`/`Authenticator` internals that are not stable public API. |
Security | Accepted | Jason Gustafson | 2017-08-17 | 2017-09-13 | KAFKA-5783 | 1.0 | |
| 188 | Add new metrics to support health checks TLDR: Adds broker-side error rate metrics per request type and per error code, and a `NetworkProcessorAvgIdlePercent` metric per network thread, to expose health signals useful for automated health checks. Without error rate metrics, the only way to detect systematic request failures is by examining consumer/producer logs or observing indirect symptoms like lag. |
Metrics Broker | Accepted | Rajini Sivaram | 2017-08-16 | 2017-10-05 | KAFKA-5746 | 1.0 | Fetch |
| 187 | Add cumulative count metric for all Kafka rate metrics TLDR: Adds a cumulative `total` attribute alongside the existing rate attribute for all JMX rate metrics in Kafka (e.g., `MessagesInPerSec` gains a `total` counter). Kafka's JMX rate metrics expose only the windowed rate, not the running total, whereas the Yammer metrics library exposes both — forcing operators to integrate the rate externally to compute totals. |
Metrics | Accepted | Rajini Sivaram | 2017-08-16 | 2018-06-22 | KAFKA-5738 | 1.0 | |
| 186 | Increase offsets retention default to 7 days TLDR: Increases the default `offsets.retention.minutes` from 1440 minutes (24 hours) to 10080 minutes (7 days). New users are surprised to find their consumer group offsets evicted after 24 hours of inactivity, causing unexpected data reprocessing or skipping when the application restarts. |
Consumer Broker | Accepted | Ewen Cheslack-Postava | 2017-08-09 | 2018-06-15 | KAFKA-3806 | 2.0 | |
| 185 | Make exactly once in order delivery the default producer setting TLDR: KIP-185 proposes changing the default producer configuration to enable idempotent delivery (`enable.idempotence=true`) and set `acks=all`, `max.in.flight.requests.per.connection=5`, and `retries=MAX_INT`, making exactly-once in-order delivery per partition the default behavior. The historical default (at-most-once, no ordering guarantee) was the weakest delivery semantic Kafka offers and was chosen for throughput, but Kafka 0.11 introduced idempotence at near-zero overhead, making the weak default no longer justified. |
Producer Transactions | Discussion | Apurva Mehta | 2017-08-08 | 2023-09-06 | KAFKA-5494 | 1.0 | Produce |
| 183 | Change PreferredReplicaLeaderElectionCommand to use AdminClient TLDR: Adds an AdminClient RPC and a corresponding --bootstrap-server option to kafka-preferred-replica-election.sh for triggering preferred leader elections without ZooKeeper access. The tool previously required --zookeeper access, forcing operators to have ZooKeeper credentials even in environments where direct ZK access was restricted. |
Admin | Accepted | Tom Bentley | 2017-08-02 | 2019-01-28 | KAFKA-5692 | 2.2 | Metadata UpdateMetadata |
| 182 | Reduce Streams DSL overloads and allow easier use of custom storage engines TLDR: KIP-182 reduces the KafkaStreams DSL API surface by replacing many overloaded per-parameter methods with a small set of parameter-object overloads using `Produced`, `Consumed`, `Serialized`, `Joined`, and `Materialized` config objects. The explosion of method overloads (e.g. 8 for `KStream.print()`) made the DSL hard to navigate in IDEs and made it difficult to add per-operator state-store options without adding yet more overloads. |
Streams | Accepted | Damian Guy | 2017-07-26 | 2018-12-12 | KAFKA-5651 | 1.0 | |
| 180 | Add a broker metric specifying the number of consumer group rebalances in progress TLDR: KIP-180 adds JMX metrics counting the number of consumer groups in each coordinator state (`PreparingRebalance`, `CompletingRebalance`, `Stable`, `Dead`, `Empty`) and renames the confusingly named `AwaitingSync` state to `CompletingRebalance`. Without per-state metrics, operators cannot distinguish between a cluster that is continuously rebalancing and one that is stable, making it impossible to set meaningful rebalance-rate alerts. |
Consumer Metrics | Accepted | Colin McCabe | 2017-07-20 | 2017-08-18 | KAFKA-5565 | 1.1 | |
| 178 | Size-based log directory selection strategy | Discussion | Tom Bentley | 2017-07-19 | 2017-07-19 | ||||
| 177 | Consumer perf tool should count rebalance time TLDR: Adds rebalance duration tracking to the `ConsumerPerformance` tool, reporting time spent in rebalance separately from message consumption time. Without this, performance benchmarks that trigger rebalances conflate consumer group coordination overhead with actual throughput, making cross-version comparisons misleading when rebalance behavior changes. |
Consumer Metrics | Accepted | huxihx | 2017-07-13 | 2017-08-09 | KAFKA-5358 | 1.0 | |
| 176 | Remove deprecated new-consumer option for tools TLDR: Removes the `--new-consumer` flag from `ConsoleConsumer`, `ConsumerPerformance`, and `ConsumerGroupCommand` CLI tools, as the new consumer is now the only supported implementation. The new consumer was already the default, making the flag redundant and misleading; the old ZooKeeper-based consumer it distinguished against had already been removed. |
Admin Consumer | Accepted | Paolo Patierno | 2017-07-17 | 2018-06-06 | KAFKA-5588 | 2.0 | |
| 175 | Additional '--describe' views for ConsumerGroupCommand TLDR: Extends kafka-consumer-groups.sh --describe with additional flags: --members (show group members and their assignments), --offsets (show committed offsets and lag), and --state (show high-level group state). The original single-view output mixed members without assignments and partitions without consumers, producing confusing tables with many empty columns for large groups. |
Admin Consumer | Accepted | Vahid Hashemian | 2017-07-03 | 2017-12-14 | KAFKA-5526 | 1.1 | |
| 174 | Deprecate and remove internal converter configs in WorkerConfig TLDR: Deprecates and then removes `internal.key.converter` and `internal.value.converter` from Kafka Connect `WorkerConfig`, hardcoding `JsonConverter` as the only supported internal format. These configs were exposed prematurely before classloader isolation existed; they have caused more operator confusion than flexibility, and no one needs to swap internal converters in practice. |
Connect | Accepted | Umesh | 2017-07-03 | 2018-05-22 | KAFKA-5540 | 2.0 | |
| 173 | Add prefix to StreamsConfig to enable setting default internal topic configs TLDR: KIP-173 adds a `topic.` prefix to `StreamsConfig` so any property starting with that prefix is forwarded as a default topic config when Kafka Streams creates internal topics (repartition, changelog). Previously there was no way to set default topic-level configs (e.g. `topic.replication.factor`, `topic.min.insync.replicas`) globally for all internally managed topics without using the cumbersome `StateStoreSupplier` API. |
Streams | Accepted | Damian Guy | 2017-06-30 | 2017-07-21 | KAFKA-3741 | 1.0 | |
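A minimal config sketch (values are illustrative): any `topic.`-prefixed property in the Streams app config becomes a default topic config for internally created repartition and changelog topics.

```properties
# Streams application config
topic.replication.factor=3
topic.min.insync.replicas=2
```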
| 171 | Extend Consumer Group Reset Offset for Stream Application TLDR: Extends the `kafka-streams-application-reset` tool to support all the offset reset strategies already available in `kafka-consumer-groups --reset-offsets` (e.g., `--to-datetime`, `--by-duration`, `--to-offset`, `--shift-by`), not just `--to-earliest`. The Streams resetter only supported returning to the earliest offset, making it impossible to reset to a specific timestamp or relative offset without switching to the lower-level consumer groups tool. |
Consumer Streams Admin | Accepted | Jorge Esteban Quilcate Otoya | 2017-06-26 | 2020-02-08 | KAFKA-5520 | 1.1 | |
| 169 | Lag-Aware Partition Assignment Strategy TLDR: Proposes a LagAwareAssignor partition assignment strategy that distributes partitions across consumer group members to equalize total consumer lag rather than partition count. RangeAssignor and RoundRobinAssignor assign by count; when partitions have highly asymmetric lag, count-balanced assignments cause some consumers to be idle while others are overwhelmed. |
Consumer | Discussion | Grant Neale | 2017-06-18 | 2017-06-18 | KAFKA-5337 | ||
| 168 | Add GlobalTopicCount and GlobalPartitionCount metric per cluster TLDR: Adds `GlobalTopicCount` and `GlobalPartitionCount` gauge metrics on the Kafka Controller, reflecting the total number of topics and partitions across the entire cluster. No existing metric gives a cluster-wide topic/partition count; operators must call `listTopics()` or query ZooKeeper to answer this basic capacity question. |
Metrics Broker | Accepted | Abhishek Mendhekar | 2017-06-16 | 2017-07-19 | KAFKA-5461 | 1.0 | |
| 167 | Add interface for the state store restoration process TLDR: Adds a `StateRestoreListener` interface to Kafka Streams with callbacks for restoration start, progress, and completion, plus a bulk `restoreAll(Collection<KeyValue>)` method on `BatchingStateRestoreCallback` for stores that can optimize bulk writes (e.g., RocksDB's `WriteBatch`). State store restoration currently restores records one-by-one and provides no progress visibility, making recovery time unpredictable and RocksDB's efficient bulk write path unused. |
Streams | Accepted | Bill Bejeck | 2017-06-01 | 2017-09-28 | KAFKA-5363 | 1.0 | |
| 166 | Add a tool to make amounts of replicas and leaders on brokers balanced TLDR: Proposes a `kafka-replica-balancer.sh` tool that rebalances the distribution of partition replicas and preferred leaders evenly across all brokers. After broker additions, decommissions, or rolling restarts, replicas become imbalanced (some brokers hold more leaders than others), degrading throughput; the existing `kafka-preferred-replica-election.sh` only elects preferred leaders without rebalancing replica assignment. |
Admin Broker | Discussion | Ma Tianchi | 2017-06-01 | 2017-06-22 | KAFKA-5319 | ||
| 164 | Add UnderMinIsrPartitionCount and per-partition UnderMinIsr metrics TLDR: Adds a `UnderMinIsrPartitionCount` gauge metric at the broker level and per-partition `UnderMinIsr` indicators, tracking how many partitions led by the broker have fewer in-sync replicas than `min.insync.replicas`. The existing `UnderReplicatedPartitions` metric counts partitions below full replication but does not distinguish between partitions that merely lag versus partitions that will reject `acks=all` produce requests, making availability monitoring imprecise. |
Metrics Broker | Accepted | Dong Lin | 2017-05-28 | 2017-11-08 | KAFKA-5341 | 1.0 | |
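The distinction the TLDR draws can be sketched as two predicates (illustrative code, not broker internals): a partition can be under-replicated while still accepting `acks=all` writes.

```java
public class IsrChecks {
    // Under-replicated: ISR smaller than the full replica set.
    static boolean underReplicated(int isrSize, int replicationFactor) {
        return isrSize < replicationFactor;
    }

    // Under min ISR: the partition will reject acks=all produce requests.
    static boolean underMinIsr(int isrSize, int minInsyncReplicas) {
        return isrSize < minInsyncReplicas;
    }

    public static void main(String[] args) {
        // RF=3, min.insync.replicas=2, one follower lagging (ISR=2):
        // under-replicated, yet acks=all writes still succeed.
        System.out.println(underReplicated(2, 3)); // true
        System.out.println(underMinIsr(2, 2));     // false
    }
}
```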
| 163 | Lower the Minimum Required ACL Permission of OffsetFetch TLDR: Lowers the minimum ACL permission required for the `OffsetFetch` API from `READ` on the group to `DESCRIBE` on the group, so read-only operators can inspect consumer group offsets without being granted full `READ` access. The `READ` permission on a consumer group also grants the ability to join the group and consume data, which is excessive for monitoring-only use cases. |
Security Protocol | Accepted | Vahid Hashemian | 2017-05-26 | 2017-09-07 | KAFKA-4585 | 1.0 | |
| 162 | Enable topic deletion by default TLDR: Changes the default value of delete.topic.enable from false to true and removes the commented-out line from the default server.properties. Topic deletion had been gated behind an opt-in flag for years; the feature had been stable for two years and the default-off behavior was a consistent source of confusion in development environments. |
Admin Broker | Accepted | Gwen Shapira | 2017-05-26 | 2017-07-20 | KAFKA-5384 | 1.0 | |
| 161 | streams deserialization exception handlers TLDR: Introduces a DeserializationExceptionHandler SPI for Kafka Streams that lets users define custom behavior (log-and-continue or fail) when a record fails to deserialize. Without this, a single malformed record (poison pill) in a Streams input topic terminates the entire stream task, requiring manual offset manipulation or application code changes to skip bad records. |
Streams | Accepted | Eno Thereska | 2017-05-25 | 2018-06-13 | KAFKA-5157 | 1.0 | |
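A minimal config sketch of the log-and-continue option, using the handler class shipped with Streams:

```properties
# Streams application config: skip poison pills instead of failing the task.
default.deserialization.exception.handler=org.apache.kafka.streams.errors.LogAndContinueExceptionHandler
```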
| 160 | Augment KStream.print(), KStream.writeAsText() to allow users pass in extra parameters in the printed string TLDR: KIP-160 adds overloads to `KStream.print()` and `KStream.writeAsText()` that accept a `KeyValueMapper<K, V, String>` so callers can produce custom-formatted output strings instead of the fixed `[key, value]` default. The hardcoded format prevents users from emitting contextually useful representations (e.g. including a timestamp prefix or field projection) without forking Streams internals. |
Streams | Accepted | jameschien | 2017-05-22 | 2017-07-06 | KAFKA-4830 | 1.0 | |
| 158 | Kafka Connect should allow source connectors to set topic-specific settings for new topics TLDR: KIP-158 enables Kafka Connect source connectors to declare per-topic configuration (replication factor, partition count, cleanup policy, etc.) so the Connect runtime creates destination topics automatically if they do not exist, controlled by a new `topic.creation.enable` worker config. Previously, Connect relied on broker auto-topic-creation or required manual pre-creation, which was error-prone and impossible when `auto.create.topics.enable=false`. |
Connect | Accepted | Randall Hauch | 2017-05-19 | 2020-06-05 | KAFKA-5295 | 2.6 | |
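A hedged sketch of the configs as finally implemented (values illustrative): the enable switch lives in the worker config, while the `topic.creation.default.*` settings go in the source connector config.

```properties
# Worker config
topic.creation.enable=true

# Source connector config: defaults for auto-created destination topics
topic.creation.default.replication.factor=3
topic.creation.default.partitions=6
```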
| 154 | Add Kafka Connect configuration properties for creating internal topics TLDR: Adds `offset.storage.*`, `config.storage.*`, and `status.storage.*` config prefixes to Kafka Connect `WorkerConfig`, allowing operators to set replication factor, cleanup policy, compaction settings, and other topic configs for the three internal Connect topics. Previously, internal topic creation used hardcoded defaults (replication.factor=1 for offsets, etc.), making production deployments with replication impossible without manual topic pre-creation. |
Connect | Accepted | Randall Hauch | 2017-05-05 | 2017-05-18 | KAFKA-4667 | 1.0 | |
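An illustrative distributed-worker fragment using the prefixes this KIP adds (values are examples, not recommendations):

```properties
# Production-grade settings for Connect's three internal topics
offset.storage.replication.factor=3
offset.storage.partitions=25
config.storage.replication.factor=3
status.storage.replication.factor=3
status.storage.partitions=5
```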
| 153 | Include only client traffic in BytesOutPerSec metric TLDR: Adds `ReplicationBytesInPerSec` and `ReplicationBytesOutPerSec` metrics to the broker, and changes `BytesOutPerSec` to exclude inter-broker replication traffic (counting only client consumer bytes). The existing `BytesOutPerSec` mixes consumer client traffic with follower replication traffic, making it impossible to measure actual client-facing egress bandwidth. |
Metrics Broker | Accepted | Jun Rao | 2017-05-05 | 2017-05-11 | KAFKA-5194 | 1.0 | |
| 152 | Improve diagnostics for SASL authentication failures TLDR: Changes broker SASL authentication to send a structured SaslAuthenticateResponse with an error code and human-readable message before closing the connection on authentication failure. Previously the broker silently closed the TCP connection on auth failure; clients could not distinguish authentication failures from network errors, making security incidents hard to diagnose. |
Security | Accepted | Rajini Sivaram | 2017-05-04 | 2019-02-12 | KAFKA-4764 | 1.0 | |
| 151 | Expose Connector type in REST API TLDR: Adds a type field (source or sink) and a version field to the Kafka Connect REST API responses for connector plugin listings. The Connect REST API previously returned only connector class names, with no reliable way to determine connector type without brittle class-name heuristics. |
Connect Admin | Accepted | dan norwood | 2017-05-02 | 2017-07-28 | KAFKA-4343 | 1.0 | |
| 150 | Kafka-Streams Cogroup TLDR: KIP-150 introduces the `CogroupedKStream` API to the Kafka Streams DSL, allowing multiple input streams to be co-grouped and aggregated into a single KTable using a single shared state store per key space. Without co-grouping, combining N streams into a single aggregate requires N intermediate KTables each with its own state store plus a chain of outer joins, creating high state amplification and complex topologies. |
Streams | Accepted | Kyle Winkelman | 2017-05-01 | 2019-10-29 | KAFKA-6049 | 2.5 | |
| 149 | Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner TLDR: Extends `ValueTransformer`, `ValueMapper`, and `ValueJoiner` interfaces in Kafka Streams to expose the associated record key as a read-only parameter alongside the value. Without key access, developers must use the heavier `Transformer`/`KeyValueMapper` interfaces even when only inspecting the key, producing unnecessary `KeyValue` object allocations. |
Streams | Accepted | Jeyhun Karimov | 2017-05-01 | 2021-02-19 | KAFKA-4001 | 1.1 | |
| 148 | Add a connect timeout for client TLDR: Proposes a socket.connection.setup.timeout.ms configuration to time out TCP connection attempts that stall without completing the three-way handshake. When a broker's OS crashed mid-connection, the client's non-blocking connect could stall for minutes until the TCP stack timed out, blocking metadata updates and halting produce/consume. |
Client | Discussion | pengwei | 2017-04-23 | 2017-05-29 | KAFKA-4862 | ||
| 146 | Classloading Isolation in Connect TLDR: KIP-146 introduces classloader isolation for Kafka Connect plugins (connectors, transformations, converters), giving each plugin its own isolated `ClassLoader` so conflicting library versions between plugins or between a plugin and the Connect framework do not interfere. Without isolation, loading multiple connectors with conflicting transitive dependencies into the same JVM causes `ClassCastException` or incorrect behavior that is hard to diagnose. |
Connect | Accepted | Konstantine Karantasis | 2017-04-28 | 2017-05-18 | KAFKA-3487 | 1.0 | |
| 145 | Expose Record Headers in Kafka Connect TLDR: Exposes Kafka record headers to Kafka Connect by adding a Headers abstraction to ConnectRecord with typed values (via Schema) and a HeaderConverter SPI for serialization/deserialization. KIP-82 introduced headers at the broker/client level, but the Connect framework had no mechanism to propagate or transform header data in connectors and SMTs. |
Connect | Accepted | Michael André Pearce | 2017-04-29 | 2018-01-31 | KAFKA-5142 | 1.1 | |
| 144 | Exponential backoff for broker reconnect attempts TLDR: Adds a reconnect.backoff.max.ms configuration to all Kafka clients, enabling exponential backoff with ±20% jitter on broker reconnect attempts, capped at the new maximum. The existing constant reconnect.backoff.ms (default 100ms) caused connection storms against brokers that were down for extended periods. |
Client | Accepted | Ismael Juma | 2017-04-27 | 2017-05-10 | KAFKA-3878 | 1.0 | |
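The schedule can be modeled in a few lines (an illustrative sketch, not the client's internal code): the delay doubles per consecutive failure, is capped at `reconnect.backoff.max.ms`, and gets ±20% jitter so a fleet of clients does not reconnect in lockstep.

```java
import java.util.concurrent.ThreadLocalRandom;

public class ReconnectBackoff {
    // Exponential backoff with a cap and +/-20% jitter.
    static long backoffMs(int consecutiveFailures, long baseMs, long maxMs) {
        long exp = Math.min(baseMs * (1L << Math.min(consecutiveFailures, 20)), maxMs);
        double jitter = 0.8 + 0.4 * ThreadLocalRandom.current().nextDouble();
        return (long) (exp * jitter);
    }

    public static void main(String[] args) {
        // e.g. reconnect.backoff.ms=100, reconnect.backoff.max.ms=10000
        for (int f = 0; f <= 8; f++)
            System.out.println("failure " + f + ": ~" + backoffMs(f, 100, 10_000) + " ms");
    }
}
```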
| 143 | Controller Health Metrics TLDR: KIP-143 adds controller health metrics including `ActiveControllerCount`, `EventQueueSize`, `EventQueueTimeMs`, and `ControllerState` to detect slow progress or deadlocks in the Kafka controller. Existing controller metrics are insufficient to distinguish a healthy controller from one that is alive but making no progress on the event queue. |
Metrics Broker | Accepted | Ismael Juma | 2017-04-25 | 2017-05-24 | KAFKA-5135 | 1.0 | |
| 142 | Add ListTopicsRequest to efficiently list all the topics in a cluster TLDR: Proposes a dedicated `ListTopicsRequest` protocol RPC that returns only topic names (and optionally a single metadata flag per topic), avoiding the full broker+partition metadata payload of `MetadataRequest`. In large clusters, calling `MetadataRequest` to list topics returns megabytes of broker and partition detail not needed for a simple topic enumeration. |
Admin Protocol | Discussion | Colin McCabe | 2017-04-20 | 2017-04-24 | Metadata | ||
| 140 | Add administrative RPCs for adding, deleting, and listing ACLs TLDR: Adds `CreateAcls`, `DeleteAcls`, and `DescribeAcls` protocol RPCs and exposes them via the `AdminClient` API as part of the KIP-117 admin API rollout. Without these RPC endpoints, ACL management requires direct ZooKeeper access or the Kafka command-line tools, preventing programmatic ACL management from JVM clients. |
Security Admin Protocol | Accepted | Colin McCabe | 2017-04-14 | 2017-07-20 | KAFKA-3266 | 1.0 | DescribeAcls CreateAcls DeleteAcls |
| 138 | Change punctuate semantics TLDR: KIP-138 refactors the Kafka Streams `Punctuator` API to support two independent scheduling modes: `STREAM_TIME` (driven by record timestamps) and `WALL_CLOCK_TIME` (driven by system wall-clock time), each schedulable independently per processor. Previously `punctuate()` was triggered only by stream-time advance, so wall-clock-based periodic actions (e.g., flushing caches) stalled whenever any input partition stopped receiving records. |
Streams | Accepted | Michal Borowiecki | 2017-04-03 | 2017-09-08 | KAFKA-5233 | 1.0 | |
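The difference between the two clocks can be shown with a toy model (illustrative, not Streams internals): a STREAM_TIME punctuation fires only when record timestamps advance past the schedule, while a WALL_CLOCK_TIME punctuation is driven by system time even when no records arrive.

```java
import java.util.ArrayList;
import java.util.List;

public class PunctuationModel {
    // Punctuation timestamps fired as the chosen clock advances from
    // lastFired to now, on an interval-based schedule.
    static List<Long> fire(long lastFired, long now, long intervalMs) {
        List<Long> fired = new ArrayList<>();
        for (long t = lastFired + intervalMs; t <= now; t += intervalMs)
            fired.add(t);
        return fired;
    }

    public static void main(String[] args) {
        // Stream time advanced to 3500ms by incoming records: three firings.
        System.out.println(fire(0, 3500, 1000)); // [1000, 2000, 3000]
        // No records (stream time stuck at 0): STREAM_TIME never fires,
        // but a WALL_CLOCK_TIME schedule, fed by system time, still would.
        System.out.println(fire(0, 0, 1000));    // []
    }
}
```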
| 137 | Enhance TopicCommand --describe to show topics marked for deletion TLDR: Extends `kafka-topics.sh --describe` to include a `MarkedForDeletion: true` indicator for topics pending deletion, and ensures these topics are labeled correctly when combined with `--under-replicated-partitions` or `--unavailable-partitions`. Previously, `--describe` showed no deletion status (visible only in `--list`), causing confusion when a topic appeared under-replicated because it was being deleted. |
Admin | Accepted | Mickael Maison | 2017-03-30 | 2017-05-18 | KAFKA-4291 | ||
| 136 | Add Listener name to SelectorMetrics tags TLDR: Adds listenerName and securityProtocol tags to the socket-server-metrics group so per-network-processor metrics (connection-count, connection-close-rate, etc.) can be filtered by listener. With multiple listeners on a broker, network processor IDs were bare integers, making it impossible to attribute socket-level metrics to a specific listener without knowing the processor-to-listener mapping. |
Metrics Broker | Accepted | Edoardo Comar | 2017-03-30 | 2017-05-12 | KAFKA-4982 | 1.0 | |
| 135 | Send of null key to a compacted topic should throw non-retriable error back to user TLDR: KIP-135 changes the broker to return a new non-retriable `INVALID_RECORD_KEY` error code when a producer sends a null key to a log-compacted topic, instead of the retriable `CorruptRecordException` currently returned. The retriable error causes the producer to exhaust its retry budget and eventually throw a `TimeoutException`, hiding the real cause and delaying failure feedback. |
Producer Broker | Discussion | Mayuresh Gharat | 2017-03-21 | 2017-04-19 | KAFKA-4808 | Produce | |
| 134 | Delay initial consumer group rebalance TLDR: Introduces a group.initial.rebalance.delay.ms config that delays the initial rebalance of a new consumer group, giving members time to join before the first assignment is computed. Without this delay, each member joining triggers a new rebalance, causing O(N) rebalances for an N-member group startup — each one requiring stateful Streams apps to replay changelog topics. |
Consumer | Accepted | Damian Guy | 2017-03-21 | 2019-04-03 | KAFKA-4925 | 1.0 | JoinGroup |
| 133 | Describe and Alter Configs Admin APIs TLDR: Adds DescribeConfigs and AlterConfigs admin API endpoints to the Kafka wire protocol, enabling clients to read and modify topic and broker configurations without direct ZooKeeper access. Before this, tools like kafka-configs.sh required ZooKeeper connectivity, preventing config management in environments where ZooKeeper was locked down. |
Admin Protocol | Accepted | Ismael Juma | 2017-03-16 | 2018-05-12 | KAFKA-3267 | 1.0 | DescribeConfigs AlterConfigs |
| 131 | Add access to OffsetStorageReader from SourceConnector TLDR: Exposes an `OffsetStorageReader` to `SourceConnector` (not just `SourceTask`) so connector implementations can read committed source offsets in the `Connector` class itself. Without this, a `SourceConnector` background thread that scans for new source data (e.g. new files) cannot determine which files have already been fully processed, preventing task reconfiguration based on offset state. |
Connect | Accepted | Florian Hussonnois | 2017-03-07 | 2020-05-04 | KAFKA-4794 | 2.6 | |
| 129 | Streams Exactly-Once Semantics TLDR: Adds exactly-once processing semantics to Kafka Streams by building on KIP-98 transactions and idempotent producer: each stream task wraps its consume-process-produce cycle in a transaction so output records and consumer offsets are committed atomically. Without this, a Streams task crash could re-process input after recovery and produce duplicate output records, degrading the application-level guarantee to at-least-once. |
Streams Transactions | Accepted | Guozhang Wang | 2017-02-28 | 2017-03-30 | KAFKA-4923 | 1.0 | |
| 128 | Add ByteArrayConverter for Kafka Connect TLDR: Adds a ByteArrayConverter to Kafka Connect that passes raw byte[] values through without schema or conversion overhead. Connect's existing converters (JsonConverter, AvroConverter) always deserialize and re-serialize data through the Connect data model, making exact byte-for-byte mirroring and pass-through pipelines inefficient. |
Connect | Accepted | Ewen Cheslack-Postava | 2017-02-25 | 2017-03-14 | KAFKA-4783 | 1.0 | |
| 125 | ZookeeperConsumerConnector to KafkaConsumer Migration and Rollback TLDR: Defines a migration and rollback protocol for upgrading consumer groups from the ZooKeeperConsumerConnector (old high-level consumer using ZooKeeper for group coordination) to the new KafkaConsumer (broker-coordinated group protocol) without downtime. The two consumers use incompatible coordination systems, making a direct in-place migration risky and previously requiring full consumer group downtime. |
Consumer | Discussion | Onur Karaman | 2017-02-20 | 2017-02-21 | KAFKA-4513 | OffsetCommit DescribeGroups | |
| 124 | Request rate quotas TLDR: KIP-124 introduces request-rate quotas to Kafka that throttle clients based on the CPU time their requests consume (network thread + I/O thread time), in addition to the existing byte-rate quotas. A consumer with `fetch.max.wait.ms=0` can send an unbounded number of small fetch requests, overwhelming broker threads without triggering byte-rate quotas. |
Broker Admin | Accepted | Rajini Sivaram | 2017-02-17 | 2017-03-30 | KAFKA-4195 | Fetch OffsetFetch | |
| 122 | Add Reset Consumer Group Offsets tooling TLDR: Adds a `kafka-consumer-groups.sh --reset-offsets` subcommand that can reset consumer group offsets to earliest, latest, a specific offset, a specific datetime, or a relative shift, with `--dry-run` and `--execute` modes. Previously, offset resets required writing custom consumer code that called `KafkaConsumer#seek()`, deploying it, and then rolling it back — a slow and error-prone process. |
Consumer Admin | Accepted | Jorge Esteban Quilcate Otoya | 2017-02-08 | 2018-11-08 | KAFKA-4743 | 1.0 | |
| 121 | Add KStream peek method TLDR: Adds a peek(ForeachAction) method to KStream that applies a side-effect action (e.g., logging, metrics increment) to each record and passes the record unchanged to the next operator. KStream lacked a non-terminal inspection point; developers had to use map() with an identity transformation to observe records mid-pipeline, which could trigger unnecessary repartitioning. |
Streams | Accepted | Steven Schlansker | 2017-02-06 | 2017-04-27 | KAFKA-4720 | ||
| 120 | Cleanup Kafka Streams builder API TLDR: Deprecates TopologyBuilder and KStreamBuilder and replaces them with Topology and StreamsBuilder in the org.apache.kafka.streams package, and adds a TopologyDescription interface for inspecting the processor graph. The old classes leaked internal DSL methods on the processor API builder and mixed DSL and PAPI abstractions, making the public API inconsistent and hard to evolve. |
Streams | Accepted | Matthias J. Sax | 2017-02-03 | 2017-09-08 | KAFKA-3856 | 1.0 | |
| 119 | Drop Support for Scala 2.10 in Kafka 0.11 TLDR: Drops Scala 2.10 support from Kafka's build system starting with Kafka 0.11, retaining only Scala 2.11 and 2.12 cross-compilation. Scala 2.10 reached end-of-life in March 2015; maintaining it triples CI build time per PR, and library authors have already dropped 2.10 support, creating dependency resolution issues. |
Client | Accepted | Ismael Juma | 2017-02-03 | 2017-03-08 | KAFKA-4422 | 1.0 | |
| 118 | Drop Support for Java 7 TLDR: Drops official support and build compatibility for Java 7 in Kafka 2.0, requiring Java 8 as the minimum runtime. Java 7 reached end-of-life in April 2015 and contains unpatched security vulnerabilities; supporting it constrained Kafka's ability to use Java 8 language features (lambdas, streams, default methods) and APIs like CompletableFuture. |
Client | Accepted | Ismael Juma | 2017-02-03 | 2018-05-22 | KAFKA-4423 | 2.0 | |
| 117 | Add a public AdminClient API for Kafka admin operations TLDR: Introduces a stable public `AdminClient` Java API backed by the Kafka binary protocol (not ZooKeeper directly), exposing operations like `createTopics`, `deleteTopics`, `describeTopics`, `listTopics`, and `describeCluster`. Management tools and proxies previously had no stable programmatic API for admin operations and were forced to use internal Scala classes or spawn shell scripts. |
Admin | Accepted | Colin McCabe | 2017-02-02 | 2017-07-20 | KAFKA-3265 | 1.0 | Metadata |
| 115 | Enforce offsets.topic.replication.factor upon __consumer_offsets auto topic creation TLDR: Enforces the `offsets.topic.replication.factor` config during auto-creation of `__consumer_offsets` by blocking auto-creation until enough brokers are available to satisfy the configured replication factor. Previously, `__consumer_offsets` was auto-created with however many brokers were live at the time, potentially creating a single-replica offsets topic in a cluster configured for RF=3. |
Consumer Broker | Accepted | Onur Karaman | 2017-01-25 | 2017-02-02 | KAFKA-3959 | 1.0 | Metadata CreateTopics |
| 114 | KTable state stores and improved semantics TLDR: Clarifies and formalizes KTable materialization semantics in Kafka Streams, introducing a Materialized class that allows users to explicitly name and configure state stores for any KTable operation, and making store creation optional for operations that do not require it. The existing API required providing a store name for aggregation KTables but silently ignored it for filter/map KTables, causing confusion and preventing Interactive Queries on derived tables. |
Streams | Accepted | Eno Thereska | 2017-01-16 | 2017-04-27 | KAFKA-5045 | 1.0 | |
| 113 | Support replicas movement between log directories TLDR: Adds intra-broker replica directory movement support, allowing partition replicas to be migrated between log directories on the same broker (JBOD disks) via a new `AlterReplicaLogDirs` Admin API, with `DescribeLogDirs` for inspecting replica placement. On JBOD setups, a failed disk takes all partitions hosted on it offline; without cross-directory movement, the only recovery option is inter-broker replica reassignment, which is far more disruptive and slower. |
Broker Admin | Accepted | Dong Lin | 2017-01-12 | 2019-02-28 | KAFKA-5163 | 1.0 | LeaderAndIsr DescribeLogDirs |
| 112 | Handle disk failure for JBOD TLDR: Enables Kafka brokers to handle individual disk failures in a JBOD configuration by taking the failed disk's log directories offline and continuing to serve partitions on healthy disks, with controllers reassigning replicas away from the failed disk. Without this, a single disk failure on a multi-disk broker kills all partitions on that broker, requiring 4x storage overhead via RAID-10 to achieve the same fault tolerance at higher cost. |
Broker | Accepted | Dong Lin | 2017-01-12 | 2020-01-16 | KAFKA-4763 | 1.0 | Produce Fetch Metadata LeaderAndIsr StopReplica UpdateMetadata |
| 110 | Add Codec for ZStandard Compression TLDR: Adds Zstandard (zstd) as a supported compression codec for Kafka message batches. Zstd offers significantly better compression ratios than LZ4/Snappy at comparable or faster speeds, and major big data frameworks (Hadoop, Spark, HBase) had already adopted it, making its absence a gap for Kafka deployments that prioritize storage efficiency. |
Producer Broker | Accepted | Dongjin Lee | 2017-01-06 | 2018-11-18 | KAFKA-4514 | 2.2 | Produce v7 Fetch v10 |
| 109 | Old Consumer Deprecation TLDR: KIP-109 formally deprecates the old Scala-based Kafka consumer (`kafka.consumer.Consumer`) in favor of the new Java consumer introduced in 0.9.0. Maintaining two consumer implementations with different feature sets and behavior divides engineering effort and creates confusion for users about which consumer to use. |
Consumer | Accepted | Vahid Hashemian | 2017-01-05 | 2017-02-07 | KAFKA-3264 | 1.0 | |
| 108 | Create Topic Policy TLDR: KIP-108 introduces a `CreateTopicPolicy` plugin interface that brokers invoke before creating a topic, allowing administrators to enforce naming conventions, minimum replication factors, partition counts, or other cluster-specific constraints. Without this policy hook, `AdminClient.createTopics()` and auto-topic creation bypass all administrative controls, making it impossible to enforce governance rules without ZooKeeper-level intervention. |
Admin Broker | Accepted | Ismael Juma | 2017-01-04 | 2017-01-26 | KAFKA-4591 | 1.0 | |
| 107 | Add deleteRecordsBefore() API in AdminClient TLDR: Adds a deleteRecords(Map<TopicPartition, RecordsToDelete>) API to AdminClient that issues DeleteRecordsRequest to leaders, truncating partition logs before a given offset. Kafka only offered time-based and size-based log retention, which are consumer-agnostic and unsuitable for stream-processing pipelines that need to reclaim disk space immediately after downstream consumption. |
Admin | Accepted | Dong Lin | 2017-01-03 | 2018-03-26 | KAFKA-4586 | 1.0 | Produce Fetch DeleteRecords |
| 106 | Change Default unclean.leader.election.enabled from True to False TLDR: Changes the default value of `unclean.leader.election.enabled` from `true` to `false`, making durability the default and requiring an explicit opt-in for availability-over-durability mode. With the old default, a broker with a stale replica could silently become leader on ISR failure, causing data loss without operators realizing this was the configured behavior. |
Broker | Accepted | Ben Stopford | 2017-01-03 | 2017-05-16 | KAFKA-5257 | 1.0 | |
| 105 | Addition of Record Level for Sensors TLDR: Introduces a RecordingLevel (INFO vs DEBUG) per Sensor in Kafka's metrics framework and a metrics.recording.level client configuration to filter which sensors are active at runtime. Recording all metrics unconditionally imposed up to 50% CPU overhead in Kafka Streams microbenchmarks due to the cost of computing and storing fine-grained debug metrics in production. |
Metrics | Accepted | aarti gupta | 2016-12-31 | 2017-01-10 | KAFKA-3715 | ||
| 104 | Granular Sensors for Streams TLDR: KIP-104 adds per-processor-node latency and throughput metrics and per-task record count metrics to Kafka Streams at `DEBUG` recording level, togglable via `metrics.recording.level`. Existing Streams metrics only provided aggregate global rates, making it impossible to identify which processor node in a topology was a performance bottleneck. |
Streams Metrics | Accepted | aarti gupta | 2016-12-31 | 2017-01-24 | KAFKA-3715 | 1.0 | |
| 103 | Separation of Internal and External traffic TLDR: Introduces named listener configurations (listener.security.protocol.map) that decouple listener names from security protocols, allowing multiple listeners with the same security protocol and enabling separate named listeners for client, internal, and replication traffic. The existing model allowed at most one listener per security protocol, preventing operators from isolating inter-broker replication traffic onto a dedicated network interface. |
Broker Security | Accepted | Ismael Juma | 2016-12-21 | 2017-06-05 | KAFKA-1809 | 1.0 | Metadata UpdateMetadata |
| 102 | Add close with timeout for consumers TLDR: Adds a close(long timeout, TimeUnit unit) overload to the Consumer interface, giving callers a bounded shutdown window during which pending commits and group leave requests are attempted before forcible closure. The existing close() had a hard-coded 5-second graceful shutdown timeout with no way for callers to extend or reduce it, and was inconsistent with the producer, which already offers both close() and close(timeout, unit). |
Consumer | Accepted | Rajini Sivaram | 2016-12-15 | 2017-01-10 | KAFKA-4426 | 1.0 | |
| 101 | Alter Replication Protocol to use Leader Epoch rather than High Watermark for Truncation TLDR: KIP-101 replaces the high watermark (HWM) as the truncation reference point for follower log recovery with a per-epoch `OffsetForLeaderEpoch` RPC, so followers truncate to the first offset of the next leader epoch rather than their locally cached HWM. The HWM-based truncation allowed data loss and log divergence in scenarios where a follower became leader before it had received the HWM update (e.g. during rapid double failovers). |
Broker Protocol | Accepted | Ben Stopford | 2016-12-09 | 2021-04-30 | KAFKA-1211 | 1.0 | Produce Fetch LeaderAndIsr OffsetForLeaderEpoch |
| 100 | Relax Type constraints in Kafka Streams API TLDR: Relaxes generic type bounds in Kafka Streams API methods (e.g., `KStream.filter`, `KStream.map`, `KTable.join`) to use use-site variance (`? super K`, `? super V`, `? extends R`) so that functions typed on supertypes or subtypes are accepted without unchecked casting. The previous invariant bounds forced callers to write explicit type casts or duplicate lambda logic when working with type hierarchies. |
Streams | Accepted | Xavier Léauté | 2016-12-08 | 2017-01-19 | KAFKA-4481 | 1.0 | |
| 98 | Exactly Once Delivery and Transactional Messaging TLDR: Introduces exactly-once semantics (EOS) to Kafka via idempotent producers (deduplication via producer ID + sequence number) and multi-partition atomic transactions (committed via a two-phase protocol through the transaction coordinator). Kafka previously guaranteed at-least-once delivery, meaning producer retries on broker failures could produce duplicate records—a correctness problem for stream processing pipelines requiring exactly-once processing guarantees. |
Transactions Producer | Accepted | Apurva Mehta | 2016-11-30 | 2026-03-04 | KAFKA-1639 | 1.0 | Produce Fetch OffsetCommit FindCoordinator AddPartitionsToTxn AddOffsetsToTxn EndTxn WriteTxnMarkers TxnOffsetCommit |
| 97 | Improved Kafka Client RPC Compatibility Policy TLDR: Formalizes a bidirectional client-broker RPC compatibility policy so that new clients can negotiate down to older broker API versions, not just new brokers supporting old clients. The prior one-way policy forced users to upgrade all brokers to a new version before any clients could be upgraded to a matching new version, coupling two independent deployment lifecycles. |
Protocol Client | Accepted | Colin McCabe | 2016-11-29 | 2017-02-01 | KAFKA-4462 | 1.0 | SaslHandshake ApiVersions |
| 96 | Add per partition metrics for in-sync and assigned replica count TLDR: Adds two per-partition JMX gauge metrics exposed by the partition leader: `InSyncReplicasCount` and `ReplicasCount` (total assigned replicas), tagged by topic and partition. The existing metrics only provide cluster-wide or broker-wide replica health summaries; operators cannot identify which specific partition has lost ISR members without querying ZooKeeper or using the AdminClient. |
Metrics Broker | Accepted | Xavier Léauté | 2016-11-28 | 2016-12-07 | KAFKA-4458 | 1.0 | |
| 94 | Session Windows TLDR: Adds session window support to Kafka Streams DSL via `KGroupedStream.windowedBy(SessionWindows.with(...))`, grouping records into variable-length windows that extend whenever a new record arrives within an inactivity gap. Fixed-size (tumbling/hopping) windows cannot model user activity sessions, which have no predetermined duration and are defined by gaps in activity rather than a fixed time interval. |
Streams | Accepted | Damian Guy | 2016-11-21 | 2016-12-09 | KAFKA-3452 | ||
| 93 | Improve invalid timestamp handling in Kafka Streams TLDR: KIP-93 changes the `TimestampExtractor` interface to receive the previously extracted timestamp as a hint and return a non-negative timestamp instead of propagating negative values, and adds a configurable `default.timestamp.extractor` that uses the hint to carry forward the last valid timestamp. Negative timestamps (returned for pre-0.10 message-format records or by custom extractors) caused Kafka Streams to crash with an unhandled exception because time-based operators like windowed aggregates and joins cannot handle them. |
Streams | Accepted | Matthias J. Sax | 2016-11-18 | 2016-12-02 | KAFKA-4393 | ||
| 92 | Add per partition lag metrics to KafkaConsumer TLDR: KIP-92 adds per-partition consumer lag metrics (`records-lag`, `records-lag-avg`, `records-lag-max`) scoped by topic and partition to the `consumer-fetch-manager-metrics` group in `KafkaConsumer`. The existing metric only exposes the maximum lag across all assigned partitions, making it impossible to detect which specific partition is falling behind. |
Consumer Metrics | Accepted | Jiangjie Qin | 2016-11-13 | 2017-01-06 | KAFKA-4381 | ||
| 91 | Provide Intuitive User Timeouts in The Producer TLDR: Introduces delivery.timeout.ms as the single user-visible bound covering the full lifecycle of a produce record—from send() return to callback fire—replacing the confusing interaction between request.timeout.ms and linger.ms. Eliminates the previous non-intuitive behavior where record expiry was anchored to batch readiness rather than the call to send(). |
Producer | Accepted | Sumant Tambe | 2016-11-08 | 2020-05-11 | KAFKA-5886 | 1.1 | |
| 90 | Remove zkClient dependency from Streams TLDR: Removes Kafka Streams' direct ZooKeeper dependency by replacing internal topic creation via ZooKeeper with AdminClient RPC calls to the broker. Streams directly accessed ZooKeeper to create internal topics (repartition, changelog), coupling it to ZK availability and access credentials, and bypassing broker-side topic validation and quota enforcement. |
Streams | Accepted | Hojjat Jafarpour | 2016-11-08 | 2018-05-31 | KAFKA-4060 | ||
| 89 | Allow sink connectors to decouple flush and offset commit TLDR: KIP-89 adds a `SinkTask.preCommit()` method and a `SinkTaskContext.requestCommit()` method to decouple offset commits from periodic buffer flushes in Kafka Connect sink connectors. Previously, the runtime called `flush()` on the fixed `offset.flush.interval.ms` schedule and committed offsets immediately after, forcing connectors that manage their own buffering (size-based, time-based, or not at all) to flush unnecessarily. |
Connect | Accepted | Shikhar Bhushan | 2016-11-04 | 2017-01-02 | KAFKA-4161 | 1.0 | |
| 88 | OffsetFetch Protocol Update TLDR: KIP-88 extends the `OffsetFetchResponse` to return committed offsets for consumer group members even when the group is in the `Empty` state (no active consumers). Previously the CLI and admin tooling returned an error for empty groups, providing no visibility into committed offsets for groups that had previously consumed and then shut down. |
Consumer Protocol | Accepted | Vahid Hashemian | 2016-10-31 | 2018-05-22 | KAFKA-3853 | 1.0 | |
| 86 | Configurable SASL callback handlers TLDR: Introduces pluggable SASL callback handlers for both client and server sides, decoupling credential verification logic from the SASL mechanism implementation via configurable sasl.server.callback.handler.class and sasl.client.callback.handler.class. Kafka's built-in SASL/PLAIN and SASL/SCRAM handlers hard-coded credential lookups (JAAS config or ZooKeeper), making it impossible to integrate alternative credential stores without replacing the entire SaslServer. |
Security | Accepted | Rajini Sivaram | 2016-10-11 | 2018-04-25 | KAFKA-4292 | 2.0 | |
| 85 | Dynamic JAAS configuration for Kafka clients TLDR: Adds a sasl.jaas.config property to Kafka client and broker configs, allowing JAAS login configuration to be specified inline in the Kafka properties file rather than requiring a separate JAAS file. Managing separate JAAS files is cumbersome in containerized and cloud deployments where property-based configuration is standard; also enables different SASL configs per listener without shared JVM state. |
Security | Accepted | Rajini Sivaram | 2016-10-06 | 2019-02-18 | KAFKA-4259 | 1.0 | |
| 84 | Support SASL SCRAM mechanisms TLDR: Adds SASL/SCRAM-SHA-256 and SASL/SCRAM-SHA-512 mechanisms to Kafka brokers and clients, storing hashed credentials in ZooKeeper and allowing dynamic credential rotation without broker restart. Kafka's existing SASL options (GSSAPI and PLAIN) required either a Kerberos infrastructure or storing plaintext passwords in JAAS config, leaving a gap for password-based auth without Kerberos. |
Security | Accepted | Rajini Sivaram | 2016-10-04 | 2017-01-05 | KAFKA-3751 | SaslHandshake | |
| 82 | Add Record Headers TLDR: KIP-82 adds first-class header support to the Kafka record format, allowing producers to attach arbitrary key-value metadata to records separate from the payload. Most messaging systems (JMS, AMQP, HTTP) support headers for transport-level metadata such as routing hints, tracing context, and content type; Kafka had no equivalent, forcing producers to embed this metadata inside the record value. |
Protocol Client | Accepted | Michael André Pearce | 2016-09-22 | 2017-10-19 | KAFKA-4208 | Produce Fetch | |
| 81 | Bound Fetch memory usage in the consumer TLDR: Introduces a `buffer.memory` consumer configuration that bounds the total memory the consumer allocates for fetched data across all partitions, independent of per-partition limits. Without a global cap, a consumer subscribed to many partitions would issue parallel fetches to all brokers owning those partitions, making total memory usage unbounded relative to the number of partitions. |
Consumer | Accepted | Mickael Maison | 2016-09-10 | 2021-07-30 | KAFKA-4133 | Fetch | |
| 78 | Cluster Id TLDR: Introduces a `clusterId` — a base64-encoded UUID stored in ZooKeeper at cluster initialization — exposed to clients via the `Metadata` response. With multiple clusters becoming common, there was no reliable cluster-unique identifier for monitoring, auditing, or preventing a broker/client from accidentally connecting to the wrong cluster. |
Admin Broker | Accepted | Ismael Juma | 2016-08-27 | 2016-11-07 | KAFKA-4093 | 1.0 | Metadata |
| 77 | Improve Kafka Streams Join Semantics TLDR: Revises Kafka Streams KStream-KStream, KStream-KTable, and KTable-KTable join semantics to be more predictable: specifying that KTable-KTable joins produce results only when both sides have a value for a key, and clarifying windowed stream-stream join behavior for out-of-order records. The original join semantics produced counterintuitive results (e.g., joins triggered on tombstones, asymmetric left-join behavior) that diverged from user expectations rooted in SQL semantics. |
Streams | Accepted | Matthias J. Sax | 2016-08-17 | 2017-01-02 | KAFKA-4001 | 1.0 | |
| 75 | Add per-connector Converters TLDR: Allows per-connector overrides of key.converter and value.converter in Kafka Connect, falling back to the worker-level defaults when not specified. The worker-level converter configuration was global and immutable per Connect cluster, forcing operators to run separate clusters when different connectors required different data formats. |
Connect | Accepted | Ewen Cheslack-Postava | 2016-08-12 | 2016-08-18 | KAFKA-3845 | 1.0 | |
| 74 | Add Fetch Response Size Limit in Bytes TLDR: Introduces `fetch.max.bytes`, a total fetch response size limit across all partitions in a single `FetchRequest`, complementing the existing per-partition `max.partition.fetch.bytes`. Without a global fetch size cap, a consumer subscribed to thousands of partitions must allocate `max.partition.fetch.bytes * numPartitions` of heap for fetch responses, which can be gigabytes. |
Consumer Protocol | Accepted | Andrey Neporada | 2016-08-08 | 2017-01-23 | KAFKA-2063 | 1.0 | Fetch |
| 73 | Replication Quotas TLDR: Introduces replication throttling via per-topic `leader.replication.throttled.replicas`/`follower.replication.throttled.replicas` configurations and per-broker `leader.replication.throttled.rate`/`follower.replication.throttled.rate` rate limits. Without throttling, data-intensive administrative operations (partition rebalancing, broker add/remove) saturate inter-broker network links and degrade producer/consumer latency unpredictably. |
Broker Admin | Accepted | Ben Stopford | 2016-08-08 | 2016-11-07 | KAFKA-1464 | 1.0 | Fetch |
| 72 | Allow putting a bound on memory consumed by Incoming request TLDR: Introduces queued.max.request.bytes, a byte-level memory cap on the broker's incoming request queue, complementing the existing queued.max.requests count limit. The count-based limit cannot prevent OOM errors when a burst of large requests (e.g., from Hadoop) simultaneously exhausts JVM heap before the count threshold is reached. |
Broker | Accepted | Radai Rosenblatt | 2016-08-04 | 2022-12-08 | KAFKA-4602 | 1.0 | |
| 71 | Enable log compaction and deletion to co-exist TLDR: KIP-71 extends the `cleanup.policy` topic config to accept a comma-separated list of policies (`compact,delete`), allowing both log compaction and time/size-based segment deletion to run concurrently on the same topic. Previously the policies were mutually exclusive, making it impossible to retain only the latest value per key while still expiring segments older than a configurable window (a key requirement for Kafka Streams join windows). |
Broker | Accepted | Damian Guy | 2016-08-03 | 2016-11-07 | KAFKA-4015 | ||
| 70 | Revise Partition Assignment Semantics on New Consumer's Subscription Change TLDR: Changes `KafkaConsumer` behavior on subscription changes (`subscribe`/`unsubscribe`) so that removed partitions have their offsets committed before the rebalance is triggered, rather than losing uncommitted offsets immediately. When `subscribe()` is called with a new topic list, partitions removed from the subscription lose any pending uncommitted offsets because the rebalance is triggered before the commit can happen. |
Consumer | Accepted | Vahid Hashemian | 2016-06-08 | 2016-11-07 | KAFKA-4033 | 1.0 | |
| 67 | Queryable state for Kafka Streams TLDR: Introduces Interactive Queries (IQ) in Kafka Streams, exposing read-only access to local state stores (key-value, windowed, session) via KafkaStreams.store() and making store host metadata discoverable via KafkaStreams.allMetadataForStore() for cross-instance routing. Application state was previously fully encapsulated inside Streams tasks with no supported API for external querying, forcing developers to duplicate state into external databases. |
Streams | Accepted | Eno Thereska | 2016-06-28 | 2016-11-07 | KAFKA-3909 | ||
| 66 | Single Message Transforms for Kafka Connect TLDR: Introduces a Single Message Transforms (SMT) API for Kafka Connect that allows lightweight, stateless, per-record transformations (InsertField, ReplaceField, MaskField, ValueToKey, Cast, TimestampConverter, etc.) to be chained in a connector configuration. Without SMTs, any record transformation required writing a full custom connector or post-processing outside Connect, adding operational complexity. |
Connect | Accepted | Nisarg Shah | 2016-07-09 | 2017-06-14 | KAFKA-3209 | 1.0 | |
| 65 | Expose timestamps to Connect TLDR: KIP-65 propagates Kafka record timestamps (added in KIP-32) through the Connect framework by exposing them on `SourceRecord` and `SinkRecord`. Connect used its own record wrapper that omitted timestamps, so source connectors could not set record timestamps and sink connectors could not inspect them for time-based processing. |
Connect | Accepted | Shikhar Bhushan | 2016-06-23 | 2016-11-07 | KAFKA-3846 | 1.0 | |
| 63 | Unify store and downstream caching in streams TLDR: Unifies the downstream forwarding cache with the local state store cache in Kafka Streams, so both write paths share the same record.cache.max.bytes.buffering budget and produce a single deduplicated stream of downstream emissions per cache flush. The separate downstream cache resulted in redundant Change-value records being forwarded (e.g., intermediate aggregation results) and caused unnecessary Kafka writes and downstream processing load. |
Streams | Accepted | Eno Thereska | 2016-06-02 | 2016-11-07 | KAFKA-3776 | ||
| 62 | Allow consumer to send heartbeats from a background thread TLDR: Moves consumer group heartbeat sending to a dedicated background `HeartbeatThread`, decoupling heartbeat liveness from the speed of the user's `poll()` processing loop. When message processing in a `poll()` loop takes longer than `session.timeout.ms`, the consumer misses heartbeats, causing a spurious rebalance even if the consumer is healthy and actively processing. |
Consumer | Accepted | Jason Gustafson | 2016-05-25 | 2017-10-19 | KAFKA-3888 | 1.0 | |
| 60 | Make Java client classloading more flexible TLDR: KIP-60 changes Kafka client classloading so that default plugin classes use static classloading and `Type.LIST` configs can accept actual `Class` objects in addition to class-name strings. This allows OSGi and other modular classloading environments to supply pre-loaded class instances directly, bypassing the thread-context classloader (TCCL) that OSGi environments do not set. |
Client | Discussion | Rajini Sivaram | 2016-05-09 | 2016-11-07 | KAFKA-3680 | ||
| 59 | Proposal for a kafka broker command TLDR: Proposes a new kafka-broker.sh command-line tool to query cluster and broker metadata (broker list, rack info, controller) through the AdminClient protocol. Kafka lacked a dedicated CLI for broker-level introspection, forcing operators to rely on JMX or indirect topic metadata commands. |
Admin Broker | Discussion | Jayesh Thakrar | 2016-05-09 | 2017-10-13 | KAFKA-3663 | ||
| 58 | Make Log Compaction Point Configurable TLDR: Adds a min.compaction.lag.ms configuration that guarantees a minimum time window during which messages in a compacted topic's log head are retained uncompacted. Without a lower bound on the uncompacted head, the compactor could compact a topic down to the last message written, causing consumers that fell even slightly behind to read only the final compacted value instead of intermediate updates. |
Broker | Accepted | Eric Wasserman | 2016-05-09 | 2016-11-07 | KAFKA-1981 | ||
| 57 | Interoperable LZ4 Framing TLDR: KIP-57 fixes Kafka's LZ4 compression to conform to the standard LZ4 framing specification (LZ4F), replacing the non-standard framing used since initial implementation. Non-conformant LZ4 framing prevents third-party clients from using standard LZ4 libraries to compress or decompress Kafka messages, breaking interoperability. |
Producer Broker | Accepted | Dana Powers | 2016-04-25 | 2016-05-08 | KAFKA-3160 | Produce Fetch | |
| 56 | Allow cross origin HTTP requests on all HTTP methods TLDR: KIP-56 adds an `access.control.allow.methods` worker config to the Kafka Connect REST server so that CORS preflight responses include the operator-specified HTTP methods beyond the defaults (GET, POST, HEAD). Without it, browser-based tooling and dashboards cannot issue PUT, DELETE, or PATCH requests to the Connect REST API from a different origin. |
Connect | Accepted | Liquan Pei | 2016-04-18 | 2016-04-29 | KAFKA-3578 | ||
| 55 | Secure Quotas for Authenticated Users TLDR: Extends the Kafka quota system to support per-user and per-user-per-client-id quota bindings enforced by the broker based on the authenticated principal, in addition to the existing per-client-id quotas. Prior to this, quotas were enforced by client-id alone, which is an unauthenticated field any client can forge, making it impossible to enforce fair resource allocation in multi-tenant secure clusters. |
Security Broker | Accepted | Rajini Sivaram | 2016-04-18 | 2016-11-01 | KAFKA-3492 | ||
| 54 | Sticky Partition Assignment Strategy TLDR: Introduces the StickyAssignor, a partition assignment strategy that produces balanced assignments while minimizing partition movement across rebalances by preferring to keep existing assignments stable. Standard assignors recompute assignments from scratch on every rebalance, unnecessarily moving partitions between consumers even when the existing assignment was already balanced. |
Consumer | Accepted | Vahid Hashemian | 2016-04-14 | 2018-10-17 | KAFKA-2273 | 1.0 | |
| 52 | Connector Control APIs TLDR: KIP-52 adds pause, resume, and restart control APIs to Kafka Connect, allowing operators to temporarily suspend a connector or one of its tasks and later resume it without removing and re-adding the connector configuration. After submission, a connector can only be reconfigured or deleted; there is no way to pause processing during upstream/downstream maintenance without losing the connector's state. |
Connect Admin | Discussion | Jason Gustafson | 2016-03-30 | 2016-04-07 | KAFKA-2370 | ||
| 51 | List Connectors REST API TLDR: Adds a GET /connector-plugins REST endpoint to Kafka Connect that returns the list of instantiable connector classes available on the worker's classpath. There was no API-based way to discover which connector implementations were loaded in a Connect cluster, requiring manual classpath inspection or documentation lookup. |
Connect Admin | Accepted | Ewen Cheslack-Postava | 2016-03-22 | 2016-03-31 | KAFKA-3316 | ||
| 48 | Delegation token support for Kafka TLDR: Introduces delegation tokens as lightweight shared secrets issued and verified by brokers, enabling clients to authenticate without repeated KDC round-trips via a new CreateDelegationToken / RenewDelegationToken / ExpireDelegationToken / DescribeDelegationToken protocol. Kerberos-only setups forced every client to hold a keytab or TGT, creating performance bottlenecks on the KDC, large blast radius on credential compromise, and high operational overhead for distributed processing frameworks. |
Security | Accepted | Parth | 2016-02-17 | 2018-01-13 | KAFKA-1696 | CreateDelegationToken RenewDelegationToken ExpireDelegationToken DescribeDelegationToken | |
| 43 | Kafka SASL enhancements TLDR: KIP-43 adds a pluggable SASL mechanism framework to Kafka, allowing operators to enable mechanisms beyond GSSAPI/Kerberos (e.g., PLAIN) via the new `sasl.enabled.mechanisms` broker config and `sasl.mechanism` client config, with mechanism negotiation performed through the new SaslHandshake request. Kafka 0.9's SASL support was hardcoded to GSSAPI, preventing integration with non-Kerberos authentication infrastructure. |
Security | Accepted | Rajini Sivaram | 2016-01-25 | 2016-05-03 | KAFKA-3149 | SaslHandshake | |
| 42 | Add Producer and Consumer Interceptors TLDR: Introduces ProducerInterceptor and ConsumerInterceptor plugin interfaces, invoked before a record is sent and before records are returned from poll() respectively, enabling cross-cutting concerns (tracing, monitoring, header injection) to be applied without modifying application code. Before this, end-to-end message-level observability required modifying every application individually, which was impractical for shared infrastructure. |
Client | Accepted | Anna Povzner | 2016-01-23 | 2016-02-29 | KAFKA-2950 | ||
| 41 | Consumer Max Records TLDR: KIP-41 introduces the `max.poll.records` configuration for `KafkaConsumer`, capping the number of records returned by a single `poll()` call. Without this limit, `poll()` returns all buffered records up to `fetch.max.bytes`, which can be arbitrarily large, making it impossible to bound per-poll processing time and causing consumer poll timeout violations when downstream processing is slow. |
Consumer | Accepted | Jason Gustafson | 2015-12-22 | 2016-01-20 | KAFKA-2986 | 1.0 | |
| 40 | ListGroups and DescribeGroup TLDR: Adds `ListGroups` and `DescribeGroup` broker APIs (key=16 and key=15) to expose consumer group membership and partition assignment information for new-consumer groups whose metadata is stored in Kafka rather than ZooKeeper. Without these APIs, tooling and operators had no way to inspect the state of new-consumer groups through the broker protocol, blocking monitoring and administration. |
Consumer Protocol Admin | Accepted | Jason Gustafson | 2015-10-25 | 2016-08-23 | KAFKA-2687 | Metadata ListGroups | |
| 38 | ZooKeeper Authentication TLDR: Adds SASL-based ZooKeeper authentication to Kafka brokers so that the metadata stored in ZooKeeper is only accessible to authenticated clients, using ZooKeeper ACLs to restrict write access. Before this change, ZooKeeper metadata was world-readable and world-writable, allowing any client with access to the ZooKeeper ensemble to corrupt cluster metadata. |
Security | Accepted | Flavio Paiva Junqueira | 2015-10-19 | 2016-01-06 | KAFKA-2639 | ||
| 36 | Rack aware replica assignment TLDR: Adds a broker.rack configuration and rack-aware replica assignment logic that spreads partition replicas across as many distinct racks as possible when creating or reassigning partitions. Without rack awareness, Kafka's replica placement was purely broker-count-balanced and could place all replicas in the same rack, losing fault isolation for a full rack failure (e.g., an AWS availability zone outage). |
Broker | Accepted | Allen Wang | 2015-09-27 | 2016-03-04 | KAFKA-3100 | Metadata UpdateMetadata | |
| 35 | Retrieving protocol version TLDR: KIP-35 introduces the `ApiVersions` request/response RPC, allowing clients to query a broker for the min and max supported version of each API. Without version negotiation, clients must hardcode the broker API version they target, preventing a single client binary from working correctly across multiple Kafka broker versions. |
Protocol Client | Accepted | Magnus Edenhill | 2015-09-25 | 2018-04-07 | KAFKA-3304 | 1.0 | Metadata |
| 33 | Add a time based log index TLDR: Adds a time-based index (.timeindex file) per log segment that maps message timestamps to file offsets, enabling accurate offset-by-timestamp lookups (ListOffsets by timestamp) and time-based log rolling/retention at per-message-timestamp granularity. The previous implementation relied on file modification times for time-based operations, which broke on replica reassignment and provided only segment-level (coarse) timestamp resolution. |
Broker | Accepted | Jiangjie Qin | 2015-09-10 | 2016-09-01 | KAFKA-3163 | 1.0 | |
| 32 | Add timestamps to Kafka message TLDR: Adds a per-message timestamp field to the Kafka message format (both CreateTime and LogAppendTime modes), enabling log retention and rolling to be driven by message timestamps rather than file modification times. File modification time was unreliable after replica reassignment (timestamps reset to current time) and insufficient for streaming processing use cases that require event-time semantics. |
Broker Protocol | Accepted | Jiangjie Qin | 2015-09-09 | 2019-05-23 | KAFKA-2511 | 1.0 | Produce Fetch |
| 31 | Move to relative offsets in compressed message sets TLDR: Changes the message format in compressed message sets to use relative (inner) offsets instead of absolute offsets, so the broker can append the message set without decompressing and re-compressing it just to assign offsets. The broker was forced to decompress every compressed batch, assign per-message absolute offsets, and recompress — a significant CPU cost on the hot write path. |
Broker Protocol | Accepted | Jiangjie Qin | 2015-09-03 | 2017-04-06 | KAFKA-2511 | 1.0 | Produce Fetch |
| 28 | Add a processor client TLDR: Proposes a low-level processor client (the foundation of what became Kafka Streams) and surveys related stream processing frameworks (MillWheel, Spark Streaming, Storm, Samza, Flink) to provide design context. Understanding prior art in watermarking, out-of-order handling, state management, and windowing was necessary before defining Kafka Streams' own approach to these problems. |
Streams | Discussion | Guozhang Wang | 2015-07-28 | 2015-10-24 | 1.0 | ||
| 25 | System test improvements TLDR: KIP-25 overhauls Kafka's system test infrastructure by migrating from the legacy shell-script-based test suite to Ducktape, a Python-based distributed testing framework with service abstraction, failure injection, and cross-version compatibility testing. The existing tests were brittle, hard to maintain, and incapable of systematically testing rolling upgrades or multi-version scenarios. |
Testing | Accepted | Geoff Anderson | 2015-05-21 | 2015-07-18 | KAFKA-2276 | 1.0 | |
| 21 | Dynamic Configuration TLDR: Introduces a unified `DynamicConfigManager` backed by ZooKeeper that handles runtime configuration changes for multiple entity types (topics, clients) without rolling restarts, and adds `AlterConfig`/`DescribeConfig` network APIs. The existing configuration system required a full cluster rolling restart to change broker-level settings; only topic configs had a limited ZooKeeper-based dynamic update path. |
Admin Broker | Accepted | Aditya Auradkar | 2015-04-22 | 2015-07-18 | KAFKA-2204 | ||
| 20 | Enable log preallocate to improve consume performance under windows and some old Linux file system TLDR: KIP-20 adds a `log.preallocate` broker config that pre-allocates log segment files at creation time (using the full `log.segment.bytes` size) rather than growing them incrementally via appends. This reduces filesystem fragmentation on Windows and older Linux filesystems (ext2/ext3), improving consume performance for sequential reads. |
Broker | Accepted | Honghai Chen | 2015-04-21 | 2015-07-18 | KAFKA-1646 | ||
| 19 | Add a request timeout to NetworkClient TLDR: Adds a request.timeout.ms configuration to NetworkClient that fails any in-flight request outstanding longer than the timeout and closes the affected connection. Without a client-side request timeout, a stalled TCP connection could hold pending produce requests indefinitely, blocking KafkaProducer.flush() and close() with no bound. |
Client | Accepted | Jiangjie Qin | 2015-04-13 | 2015-07-21 | KAFKA-2120 | 1.0 | |
| 16 | Automated Replica Lag Tuning TLDR: Replaces replica.lag.max.messages with time-only ISR eligibility via replica.lag.time.max.ms: a replica is dropped from ISR if it has not caught up to the leader's log end offset within the configured time window. Measuring ISR lag in message count caused high-volume topics to experience spurious ISR shrinks on every large batch (e.g., a single batch exceeding the threshold), while low-volume topics detected dead replicas very slowly. |
Broker Producer | Accepted | Aditya Auradkar | 2015-03-12 | 2015-07-18 | KAFKA-1546 | ||
| 15 | Add a close method with a timeout in the producer TLDR: Adds a close(long timeout, TimeUnit unit) method to KafkaProducer that attempts graceful shutdown within the specified bound, aborting unsent records after the timeout expires. The original close() waited indefinitely for all buffered records to be sent, which was unacceptable for deployment tools requiring bounded shutdown or for mirror-maker scenarios where reordering must be avoided after a send failure. |
Producer | Accepted | Jiangjie Qin | 2015-03-07 | 2015-07-18 | KAFKA-1660 | ||
| 13 | Quota Design TLDR: Introduces per-client produce and consume bandwidth quotas enforced by the broker, with throttle time reported back via `ThrottleTimeMs` fields in Fetch and Produce responses. Without quotas, a single producer or consumer can monopolize broker I/O and network capacity, causing latency spikes and resource starvation for all other clients sharing the cluster. |
Broker Admin | Accepted | Aditya Auradkar | 2015-02-24 | 2015-07-18 | KAFKA-2083 | ||
| 11 | Kafka Authorizer design TLDR: Introduces a pluggable `Authorizer` interface for Kafka that brokers call to authorize produce, fetch, and admin operations based on session attributes (user, IP, certificate CN). As enterprise adoption grows, there is demand for fine-grained access control to topics that cannot be met by embedding a single hardcoded authorization implementation in the broker. |
Security | Accepted | Bosco | 2015-01-28 | 2015-10-27 | KAFKA-1688 | Fetch OffsetFetch | |
| 8 | Add a flush method to the producer API TLDR: Adds a KafkaProducer.flush() method that blocks until all buffered records in the RecordAccumulator have been sent and acknowledged. Without flush(), callers had to iterate over all returned Future objects to wait for send completion, and with linger.ms > 0 the last batch was artificially delayed waiting for more records that would never arrive. |
Producer | Accepted | Jay Kreps | 2015-02-08 | 2015-07-18 | KAFKA-1865 | ||
| 4 | Metadata Protocol Changes TLDR: Introduces an AdminClient Java API backed by a wire protocol, providing programmatic operations for topic management, ACL management, and configuration changes without requiring direct ZooKeeper access. Prior to this, all administrative operations (kafka-topics.sh, kafka-acls.sh, etc.) wrote directly to ZooKeeper, coupling admin tooling to ZooKeeper connectivity and making multi-language admin clients impractical. |
Admin Protocol | Accepted | Joe Stein | 2015-01-21 | 2017-03-16 | KAFKA-1912 | Metadata CreateTopics DeleteTopics | |
| 3 | Mirror Maker Enhancement TLDR: KIP-3 redesigns MirrorMaker to use `producer.send()` with a callback and defer consumer offset commits until producer acknowledgment, eliminating the data loss window between consumer commit and producer delivery. The original MirrorMaker committed consumer offsets before confirming producer delivery, so a crash between those two points silently dropped messages. |
MirrorMaker | Accepted | Jiangjie Qin | 2015-01-21 | 2015-07-18 | KAFKA-1997 | 1.0 | |
| 2 | Refactor brokers to allow listening on multiple ports and IPs TLDR: KIP-2 refactors brokers to support multiple listeners by introducing a comma-separated `listeners` config (e.g. `PLAINTEXT://host:9092,SSL://host:9093`) and updating the `UpdateMetadataRequest` wire protocol to carry multiple endpoint entries per broker. This enables different security protocols on different ports, laying the foundation for TLS and SASL support alongside plaintext traffic. |
Broker Protocol Security | Accepted | Gwen Shapira | 2015-01-20 | 2015-07-18 | KAFKA-1809 | Metadata UpdateMetadata | |
| 1 | Remove support of request.required.acks TLDR: KIP-1 removes the legacy `request.required.acks` producer configuration and replaces it with `acks`, unifying the acknowledgment semantics around `acks=-1` (all ISR) in combination with `min.insync.replicas`. The old `request.required.acks=N` (for N > 1) was semantically misleading because it counted raw replicas rather than ISR members, creating a false sense of durability when ISR was smaller than N. |
Producer Broker | Accepted | Gwen Shapira | 2015-01-16 | 2015-07-18 | KAFKA-1697 |
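The durability contract that KIP-1 standardizes, together with the `min.insync.replicas` floor it relies on, can be sketched as a config fragment; the values below are illustrative assumptions, not recommendations:

```properties
# Producer side (KIP-1): acks=all (equivalent to acks=-1) waits for every
# replica currently in the ISR to acknowledge before the send succeeds.
acks=all

# Broker or per-topic side: the minimum ISR size that acks=all is measured
# against. If the ISR shrinks below this, acks=all produce requests fail
# with NotEnoughReplicas instead of silently weakening durability.
min.insync.replicas=2
```

With a replication factor of 3 and `min.insync.replicas=2`, an acknowledged write is guaranteed to reside on at least two brokers, so the cluster tolerates one broker failure without data loss; this is the ISR-based guarantee that the old `request.required.acks=N` raw replica count could not provide.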