Tags: linkedin/kafka
[LI-HOTFIX] Ensure ReplicationFactor is at least (minIsr + RedundancyFactor + 1) in topic creation (#532) This PR is for "[LIKAFKA-65517] Improve the Kafka broker controlled shutdown". If a topic's replication factor (RF) is smaller than (minIsr + controlledShutdownSafetyCheckRedundancyFactor + 1), controlled shutdown gets stuck with the error NOT_ENOUGH_REPLICAS. For example, a topic with RF=3 and minISR=2 with controlledShutdownSafetyCheckRedundancyFactor=1 needs an RF of at least 4, so a broker hosting it gets stuck in controlled shutdown. This PR automatically raises RF to at least (minIsr + controlledShutdownSafetyCheckRedundancyFactor + 1) during topic creation if replica assignments are not provided; if replica assignments are provided, topic creation fails with an error when the replica set size is smaller than (minIsr + controlledShutdownSafetyCheckRedundancyFactor + 1). It also changes the log level from "info" to "warn" when controlled shutdown cannot proceed due to NOT_ENOUGH_REPLICAS. Added unit tests and fixed the unit tests broken by this change.
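The replication factor rule described for #532 can be sketched as follows. This is a minimal illustration of the arithmetic only, not the code from the PR: the class name, method names, and the use of IllegalArgumentException are assumptions made for the example.

```java
import java.util.List;

// Hypothetical helper; all names here are illustrative, not identifiers from the PR.
final class ReplicationFactorCheck {
    private ReplicationFactorCheck() {}

    // Smallest RF that satisfies the rule in the commit message:
    // minIsr + controlledShutdownSafetyCheckRedundancyFactor + 1.
    static int minimumReplicationFactor(int minIsr, int redundancyFactor) {
        return minIsr + redundancyFactor + 1;
    }

    // No replica assignment supplied: raise the requested RF to the minimum if it is too small.
    static int effectiveReplicationFactor(int requestedRf, int minIsr, int redundancyFactor) {
        return Math.max(requestedRf, minimumReplicationFactor(minIsr, redundancyFactor));
    }

    // Explicit replica assignment supplied: reject it when the replica set is too small.
    // The exception type is an assumption; the commit message only says an error is thrown.
    static void validateAssignment(List<Integer> replicas, int minIsr, int redundancyFactor) {
        int required = minimumReplicationFactor(minIsr, redundancyFactor);
        if (replicas.size() < required) {
            throw new IllegalArgumentException(
                "Replica set size " + replicas.size() + " is smaller than required " + required);
        }
    }
}
```

With the values from the commit message (minISR=2, redundancy factor 1), a requested RF of 3 would be raised to 4, and an explicit three-replica assignment would be rejected.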
[LI-HOTFIX] throw ConfigException when resolved addresses is empty (#527)
* throw RetriableKafkaClientConstructionException when resolved addresses is empty
* adds RetriableKafkaClientConstructionException for Retriable exception
* misc
Co-authored-by: Qi Liu <qliu2+LinkedIn@linkedin.com>
Catch exceptions for LogDirFailureHandler thread during handleLogDirF… (#530) LITICKET=ACTIONITEM-5825 Description: We recently found an issue where Kafka did not shut itself down when a bad disk was detected. The current assumption is that Kafka detected the issue but the LogDirFailureHandler thread was not working correctly. Unfortunately we don't have enough logs or a heap dump to identify exactly what happened, so our best bet is to catch exceptions in that thread to keep this from happening in the future. We also add a new metric that reports when such an exception is thrown, which usually indicates a critical issue. A more aggressive strategy is to exit the program whenever an offline log dir is detected, which might make sense for the LinkedIn setup. I don't see a huge risk in doing so, but we may still want to start out a bit conservative. So I decided to add metrics that indicate when a server detects failed disks, so that at least our monitoring systems can detect it and potentially take further action.
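The approach described for #530 (catch whatever the handler throws and count it in a metric) might look roughly like the sketch below. It is a simplified stand-in, not the broker code: the class name, the counter field, and the use of a plain AtomicLong in place of a real Kafka metric are all assumptions.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper; it only illustrates the catch-and-count pattern.
final class GuardedFailureHandler {
    // Stand-in for the new metric mentioned in the commit message.
    final AtomicLong handlerExceptionCount = new AtomicLong();

    // Runs one unit of handler work. Returns true on success, false if an exception
    // was caught, so the calling thread keeps running instead of dying silently.
    boolean runGuarded(Runnable handleFailure) {
        try {
            handleFailure.run();
            return true;
        } catch (Exception e) {
            handlerExceptionCount.incrementAndGet();
            System.err.println("Exception in log dir failure handler: " + e);
            return false;
        }
    }
}
```

A monitoring system could then alert whenever the counter is non-zero, which matches the conservative option the commit message settles on.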
Fixed logging order issue in GroupCoordinator (#526) Currently the GroupCoordinator always logs "Preparing to rebalance group XXX in state PreparingRebalance", because it sets the state to PreparingRebalance right before logging. As a result, the group's previous state is already lost by the time the message is logged. We need to log first, and then change the state.
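The ordering fix described for #526 can be illustrated with a small sketch. The class, enum, and in-memory log list below are hypothetical and only show why logging before the state transition preserves the previous state in the message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a group; not the GroupCoordinator code.
final class GroupStateLogging {
    enum State { EMPTY, STABLE, PREPARING_REBALANCE }

    State state = State.STABLE;
    // In-memory stand-in for the broker log.
    final List<String> messages = new ArrayList<>();

    void prepareRebalance(String groupId) {
        // Log first, so the message records the state the group is leaving.
        messages.add("Preparing to rebalance group " + groupId + " in state " + state);
        // Only then perform the transition.
        state = State.PREPARING_REBALANCE;
    }
}
```

If the two statements in prepareRebalance were swapped, every message would report PREPARING_REBALANCE, which is the behavior the commit message describes as the bug.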
ACTIONITEM-3738: Improve the log message for the "There are no nodes in the Kafka cluster" error. (#523)
* ACTIONITEM-3738: Improve the log message for the "There are no nodes in the Kafka cluster" error.
* Improve error msg.
* More improvements.
Co-authored-by: yuhayang@linkedin.com <alias@linkedin.com>