KIP-112 — Handle disk failure for JBOD
Accepted · Kafka 1.0 · Broker
Enables a Kafka broker in a JBOD configuration to survive individual disk failures: the broker takes the failed disk's log directories offline and keeps serving partitions on its healthy disks, while the controller elects new leaders for partitions whose leader replica was on the failed disk. Without this, a single disk failure on a multi-disk broker takes every partition on that broker offline. Operators who want disk-level fault tolerance then have to run RAID-10, which mirrors every write and, combined with Kafka's own replication, can push total storage overhead to 4x.
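The difference in behaviour can be illustrated with a small toy model. This is only a sketch of the idea described above, not Kafka's implementation: the `ToyBroker` class, its methods, the `per_disk_handling` flag, and the directory and partition names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ToyBroker:
    # Maps each log directory (one per disk) to the partitions stored there.
    log_dirs: Dict[str, Set[str]]
    # Log directories that have been taken offline after a disk failure.
    offline_dirs: Set[str] = field(default_factory=set)

    def fail_disk(self, log_dir: str) -> Set[str]:
        """Take one log directory offline; return the partitions it held."""
        if log_dir not in self.log_dirs:
            raise KeyError(f"unknown log directory: {log_dir}")
        self.offline_dirs.add(log_dir)
        return set(self.log_dirs[log_dir])

    def online_partitions(self, per_disk_handling: bool = True) -> Set[str]:
        """Partitions this broker can still serve.

        With per_disk_handling=False (assumed pre-KIP behaviour), any
        failed disk takes the whole broker, and all its partitions, down.
        """
        if not per_disk_handling and self.offline_dirs:
            return set()
        online: Set[str] = set()
        for log_dir, partitions in self.log_dirs.items():
            if log_dir not in self.offline_dirs:
                online |= partitions
        return online


broker = ToyBroker(log_dirs={
    "/data/disk1": {"orders-0", "orders-1"},
    "/data/disk2": {"payments-0"},
})
lost = broker.fail_disk("/data/disk1")
print(sorted(lost))                                              # ['orders-0', 'orders-1']
print(sorted(broker.online_partitions()))                        # ['payments-0']
print(sorted(broker.online_partitions(per_disk_handling=False))) # []
```

In this model only the replicas on the failed directory become unavailable; in a real cluster those are the replicas for which the controller would need to pick new leaders on other brokers.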
Protocol Impact
Produce · Fetch · Metadata · LeaderAndIsr · StopReplica · UpdateMetadata
Details
| Author | Dong Lin |
| Status | Accepted |
| Kafka Version | 1.0 |
| JIRA | KAFKA-4763 |
| Wiki | View on Apache Wiki |
| Created | 2017-01-12 |
| Last Modified | 2020-01-16 |