Bootstrap server vs zookeeper in kafka?

A Kafka consumer needs to commit its offsets to Kafka and fetch them back from Kafka. Since Kafka moved offset storage from ZooKeeper to the Kafka brokers, a consumer no longer needs to communicate with ZooKeeper directly, so the new Kafka consumer does not need any ZooKeeper configuration.

However, a Kafka consumer always needs to connect to the Kafka brokers (the cluster) to send its requests. The bootstrap servers are simply some of the brokers in that cluster; by contacting any one of them, the consumer can discover all the other brokers.
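
As a rough illustration of the bootstrap step, here is a toy Python model. It is not the real Kafka wire protocol: the names `parse_bootstrap_servers`, `discover_brokers`, `ALL_BROKERS` and `FAKE_CLUSTER` are hypothetical and exist only for this sketch. The assumption is that any reachable broker can report the full broker list, which is what a real client learns from a metadata request.

```python
def parse_bootstrap_servers(value):
    """Split a 'host1:port1,host2:port2' string into (host, port) tuples."""
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        servers.append((host, int(port)))
    return servers


# Hypothetical three-broker cluster: every live broker knows the full list.
ALL_BROKERS = [("broker1", 9092), ("broker2", 9092), ("broker3", 9092)]
FAKE_CLUSTER = {broker: ALL_BROKERS for broker in ALL_BROKERS}


def discover_brokers(bootstrap, cluster=FAKE_CLUSTER):
    """Return the full broker list from the first bootstrap entry that answers."""
    for server in parse_bootstrap_servers(bootstrap):
        if server in cluster:  # "reachable" in this toy model
            return list(cluster[server])
    raise ConnectionError("no bootstrap server reachable")
```

In this model, passing only `"broker1:9092"` is enough to learn about all three brokers. Listing two or three entries simply guards against one of them being down at startup.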

In older versions of Kafka (before the new consumer was introduced in 0.9.0), Kafka stored the message data on the Kafka brokers, while all offset-related information (such as the current offset of each partition) was stored in ZooKeeper. A consumer needs both data and metadata to run, so to get the metadata it had to call ZooKeeper. That is why the old consumer used both ZooKeeper and Kafka.

With the new consumer (available from 0.9.0 and standard in 0.10.0 and above), committed offsets are stored in the internal __consumer_offsets topic on the Kafka brokers, and topic metadata (such as the partitions of a topic) is also served by the brokers. So now only the Kafka brokers need to communicate with ZooKeeper; the consumer gets all data and metadata from the brokers and no longer needs to communicate with ZooKeeper.
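
The difference shows up directly in the consumer configuration. The sketch below holds the properties in plain Python dicts. The keys `zookeeper.connect`, `bootstrap.servers`, `group.id` and `enable.auto.commit` are standard Kafka property names; the host names, the group name and the helper `needs_zookeeper` are made up for illustration.

```python
# Old (ZooKeeper-based) consumer: points at the ZooKeeper ensemble.
OLD_CONSUMER_CONFIG = {
    "zookeeper.connect": "zk1:2181,zk2:2181",  # placeholder hosts
    "group.id": "example-group",               # placeholder group name
}

# New consumer: points only at Kafka brokers; offsets go to __consumer_offsets.
NEW_CONSUMER_CONFIG = {
    "bootstrap.servers": "broker1:9092,broker2:9092",  # placeholder hosts
    "group.id": "example-group",
    "enable.auto.commit": "true",
}


def needs_zookeeper(config):
    """Hypothetical helper: True if the config is for the old consumer."""
    return "zookeeper.connect" in config
```

Note that the new configuration has no ZooKeeper entry at all; the brokers listed under `bootstrap.servers` are the consumer's only point of contact.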

Advantage of the current architecture: data and metadata are easier to manage when they live in the same place.

In the kafka-console-consumer tool, the --zookeeper and --bootstrap-server arguments select between the old and the new consumer. The old consumer needs a ZooKeeper connection because its offsets are saved there. The new consumer doesn't need ZooKeeper anymore because its offsets are saved to the __consumer_offsets topic on the Kafka brokers. Using the old consumer is discouraged today, so new applications should use the new implementation.
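
For illustration, the two invocation styles can be sketched as argument lists built in Python. The `--zookeeper`, `--bootstrap-server`, `--topic` and `--from-beginning` options are the console consumer's own flags. The script name assumes the shell script shipped in Kafka's bin directory, and the addresses, the topic name and the helper `console_consumer_args` are placeholders invented for this sketch.

```python
def console_consumer_args(topic, bootstrap_server=None, zookeeper=None):
    """Build the argument list for the console consumer (hypothetical helper).

    Exactly one of bootstrap_server (new consumer) or zookeeper (old consumer)
    must be given.
    """
    if (bootstrap_server is None) == (zookeeper is None):
        raise ValueError("pass exactly one of bootstrap_server or zookeeper")
    args = ["kafka-console-consumer.sh"]
    if bootstrap_server is not None:
        args += ["--bootstrap-server", bootstrap_server]  # new consumer
    else:
        args += ["--zookeeper", zookeeper]                # old consumer
    args += ["--topic", topic, "--from-beginning"]
    return args


new_style = console_consumer_args("test", bootstrap_server="localhost:9092")
old_style = console_consumer_args("test", zookeeper="localhost:2181")
```

The only difference between `new_style` and `old_style` is which address the tool is pointed at, and that choice determines which consumer implementation is used.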
