Kafka partition leader not updated after broker removed
I have a Kafka cluster (version 0.10.2.1) with 3 brokers, managed by Marathon/Mesos. The Docker images are based on wurstmeister/kafka-docker. With broker.id=-1, broker IDs are assigned automatically and sequentially at start, and leaders are auto-rebalanced (auto.leader.rebalance.enable=true). Clients are on version 0.8.2.1.
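For reference, the relevant broker settings amount to something like this (a sketch of the server.properties fragment; the wurstmeister image actually generates its config from environment variables, so treat this as illustrative):

# server.properties (illustrative fragment)
# -1 tells Kafka to reserve a new sequential broker ID in ZooKeeper at startup
broker.id=-1
# periodically move leadership back to the preferred (first-listed) replica
auto.leader.rebalance.enable=true
listeners=PLAINTEXT://:9092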
The broker registrations in ZooKeeper:
➜ zkcli -server zookeeper.example.com:2181 ls /brokers/ids
[1106, 1105, 1104]

➜ zkcli -server zookeeper.example.com:2181 get /brokers/ids/1104
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},
 "endpoints":["PLAINTEXT://host1.mesos-slave.example.com:9092"],
 "jmx_port":9999,"host":"host1.mesos-slave.example.com",
 "timestamp":"1500987386409","port":9092,"version":4}

➜ zkcli -server zookeeper.example.com:2181 get /brokers/ids/1105
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},
 "endpoints":["PLAINTEXT://host2.mesos-slave.example.com:9092"],
 "jmx_port":9999,"host":"host2.mesos-slave.example.com",
 "timestamp":"1500987390304","port":9092,"version":4}

➜ zkcli -server zookeeper.example.com:2181 get /brokers/ids/1106
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},
 "endpoints":["PLAINTEXT://host3.mesos-slave.example.com:9092"],
 "jmx_port":9999,"host":"host3.mesos-slave.example.com",
 "timestamp":"1500987390447","port":9092,"version":4}

➜ bin/kafka-topics.sh --zookeeper zookeeper.example.com:2181 --create --topic test-topic --partitions 2 --replication-factor 2
Created topic "test-topic".

➜ bin/kafka-topics.sh --zookeeper zookeeper.example.com:2181 --describe --topic test-topic
Topic:test-topic    PartitionCount:2    ReplicationFactor:2    Configs:
    Topic: test-topic    Partition: 0    Leader: 1106    Replicas: 1106,1104    Isr: 1106
    Topic: test-topic    Partition: 1    Leader: 1105    Replicas: 1104,1105    Isr: 1105
Consumers can consume what the producers are outputting:
➜ /opt/kafka_2.10-0.8.2.1 bin/kafka-console-producer.sh --broker-list 10.0.1.3:9092,10.0.1.1:9092 --topic test-topic
[2017-07-25 12:57:17,760] WARN Property topic is not valid (kafka.utils.VerifiableProperties)
hello 1
hello 2
hello 3
...

➜ /opt/kafka_2.10-0.8.2.1 bin/kafka-console-consumer.sh --zookeeper zookeeper.example.com:2181 --topic test-topic --from-beginning
hello 1
hello 2
hello 3
...
Then brokers 1104 and 1105 (host1 and host2) go down, and a new broker, 1107, comes online on host1, started manually through the Marathon interface:
➜ zkcli -server zookeeper.example.com:2181 ls /brokers/ids
[1107, 1106]

➜ zkcli -server zookeeper.example.com:2181 get /brokers/ids/1107
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},
 "endpoints":["PLAINTEXT://host1.mesos-slave.example.com:9092"],
 "jmx_port":9999,"host":"host1.mesos-slave.example.com",
 "timestamp":"1500991298225","port":9092,"version":4}
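(Side note: the broker currently acting as controller, which is the one responsible for electing partition leaders, can be read from the standard /controller znode; the brokerid value below is illustrative, not captured from my cluster:)

➜ zkcli -server zookeeper.example.com:2181 get /controller
{"version":1,"brokerid":1106,"timestamp":"1500991298500"}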
The consumer still gets the messages the producer is outputting, but the topic description looks out of date:
Topic:test-topic    PartitionCount:2    ReplicationFactor:2    Configs:
    Topic: test-topic    Partition: 0    Leader: 1106    Replicas: 1106,1104    Isr: 1106
    Topic: test-topic    Partition: 1    Leader: 1105    Replicas: 1104,1105    Isr: 1105
I tried rebalancing with kafka-preferred-replica-election.sh and kafka-reassign-partitions.sh:
➜ cat all_partitions.json
{
  "version": 1,
  "partitions": [
    {"topic": "test-topic", "partition": 0, "replicas": [1106, 1107]},
    {"topic": "test-topic", "partition": 1, "replicas": [1107, 1106]}
  ]
}

➜ bin/kafka-reassign-partitions.sh --zookeeper zookeeper.example.com:2181 --reassignment-json-file all_partitions.json --execute

➜ bin/kafka-reassign-partitions.sh --zookeeper zookeeper.example.com:2181 --reassignment-json-file all_partitions.json --verify
Status of partition reassignment:
Reassignment of partition [test-topic,0] completed
Reassignment of partition [test-topic,1] is still in progress

➜ cat all_leaders.json
{
  "partitions": [
    {"topic": "test-topic", "partition": 0},
    {"topic": "test-topic", "partition": 1}
  ]
}

➜ bin/kafka-preferred-replica-election.sh --zookeeper zookeeper.example.com:2181 --path-to-json-file all_leaders.json
Created preferred replica election path with {"version":1,"partitions":[{"topic":"test-topic","partition":0},{"topic":"test-topic","partition":1}]}
Successfully started preferred replica election for partitions Set([test-topic,0], [test-topic,1])
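Since --verify reports partition 1 as still in progress, the pending reassignment should still be parked in the standard /admin/reassign_partitions znode; this is a sketch of the check, with expected rather than captured output:

➜ zkcli -server zookeeper.example.com:2181 get /admin/reassign_partitions
{"version":1,"partitions":[{"topic":"test-topic","partition":1,"replicas":[1107,1106]}]}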
But the leader for partition 1 is still 1105, which doesn't make sense:
➜ bin/kafka-topics.sh --zookeeper zookeeper.example.com:2181 --describe --topic test-topic
Topic:test-topic    PartitionCount:2    ReplicationFactor:2    Configs:
    Topic: test-topic    Partition: 0    Leader: 1106    Replicas: 1106,1107    Isr: 1106,1107
    Topic: test-topic    Partition: 1    Leader: 1105    Replicas: 1107,1106,1104,1105    Isr: 1105
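The per-partition leader/ISR bookkeeping can also be read directly from ZooKeeper (standard znode path; the values shown are illustrative of the format, not captured output):

➜ zkcli -server zookeeper.example.com:2181 get /brokers/topics/test-topic/partitions/1/state
{"controller_epoch":1,"leader":1105,"version":1,"leader_epoch":2,"isr":[1105]}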
Why does partition 1 still think its leader is 1105, even though host2 is not alive?