Offsets stored in Zookeeper or Kafka?
Older versions of Kafka (pre 0.9) store offsets in ZK only, while newer versions of Kafka store offsets by default in an internal Kafka topic called __consumer_offsets (newer versions might still commit to ZK, though).
The advantage of committing offsets to the broker is that the consumer does not depend on ZK, so clients only need to talk to brokers, which simplifies the overall architecture. Also, for large deployments with a lot of consumers, ZK can become a bottleneck, while Kafka can handle this load easily. Committing offsets is the same thing as writing to a topic, and Kafka scales very well here (in fact, by default __consumer_offsets is created with 50 partitions, IIRC).
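To illustrate how commits from many groups spread over those partitions: as far as I know, the broker chooses the __consumer_offsets partition for a group by hashing the group id modulo the partition count. Below is a minimal sketch under that assumption, with the default of 50 partitions. The class name, method name and group ids are made up for illustration; this is not the broker's actual code.

```java
public class OffsetsPartitionSketch {
    // Assumed default partition count of __consumer_offsets
    // (broker setting offsets.topic.num.partitions).
    static final int DEFAULT_OFFSETS_PARTITIONS = 50;

    // Maps a consumer group id to a partition of __consumer_offsets,
    // assuming a "non-negative hash modulo partition count" scheme.
    static int partitionFor(String groupId, int numPartitions) {
        int hash = groupId.hashCode();
        // Math.abs(Integer.MIN_VALUE) is still negative, so map that case to 0.
        int nonNegative = (hash == Integer.MIN_VALUE) ? 0 : Math.abs(hash);
        return nonNegative % numPartitions;
    }

    public static void main(String[] args) {
        // Hypothetical group ids, only used to show the mapping.
        for (String group : new String[] {"billing", "search-indexer", "audit"}) {
            System.out.println(group + " -> partition "
                    + partitionFor(group, DEFAULT_OFFSETS_PARTITIONS));
        }
    }
}
```

All commits of one group land in the same partition, so different groups can be served by different brokers.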
I am not familiar with NodeJS or kafka-node. How offsets are committed depends on the client implementation.
Long story short: if you use brokers 0.10.1.0, you could commit offsets to the topic __consumer_offsets. But it depends on your client and whether it implements this protocol.
In more detail, it depends on your broker and client versions (and on which consumer API you are using), because older clients can talk to newer brokers. First, you need broker and client version 0.9 or later to be able to write offsets into the Kafka topic. But if an older client connects to a 0.9 broker, it will still commit offsets to ZK.
For Java consumers:
It depends on which consumer you are using. Before 0.9 there are two "old consumers", namely the "high level consumer" and the "low level consumer". Both commit offsets directly to ZK. Since 0.9, both consumers got merged into a single consumer, called the "new consumer" (it basically unifies the low level and high level APIs of both old consumers; this means that in 0.9 there are three types of consumers). The new consumer commits offsets to the brokers (i.e., the internal Kafka topic).
To make upgrading easier, there is also the possibility to "double commit" offsets using the old consumer (as of 0.9). If you enable this via dual.commit.enabled, offsets are committed to both ZK and the __consumer_offsets topic. This allows you to switch from the old consumer API to the new consumer API while moving your offsets from ZK to the __consumer_offsets topic.
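As a rough sketch of what the double-commit setup could look like for the old consumer: the property keys below (zookeeper.connect, group.id, offsets.storage, dual.commit.enabled) are old-consumer settings as I understand them, while the class name, host and group id are placeholders I made up. The block only builds the configuration and does not start a consumer.

```java
import java.util.Properties;

public class DualCommitConfigSketch {
    // Builds old-consumer settings for migrating offsets from ZK to Kafka.
    // dualCommit = true  -> offsets go to both ZK and __consumer_offsets
    // dualCommit = false -> offsets go to __consumer_offsets only
    static Properties migrationProps(String zkConnect, String groupId, boolean dualCommit) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zkConnect);
        props.setProperty("group.id", groupId);
        // Store offsets in Kafka instead of ZK.
        props.setProperty("offsets.storage", "kafka");
        props.setProperty("dual.commit.enabled", Boolean.toString(dualCommit));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder connection string and group id.
        Properties duringMigration = migrationProps("zk1:2181", "my-group", true);
        Properties afterMigration = migrationProps("zk1:2181", "my-group", false);
        duringMigration.list(System.out);
        afterMigration.list(System.out);
    }
}
```

The idea would be to roll out the first configuration to all consumers in the group, and to switch dual.commit.enabled off only once every instance commits to Kafka.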
It all depends on which consumer you're using. You should choose the right consumer based on your Kafka version.
For version 0.8 brokers, use the HighLevelConsumer. The offsets for your groups are stored in ZooKeeper.
For brokers 0.9 and higher you should use the new ConsumerGroup. The offsets are stored with the Kafka brokers.
Keep in mind that HighLevelConsumer will still work with versions past 0.8, but it has been deprecated in 0.10.1 and support will probably go away soon. The ConsumerGroup has rolling migration options to help you move from HighLevelConsumer if you were committed to using it.
Offsets in Kafka are stored as messages in a separate topic named '__consumer_offsets'. In recent versions of Kafka, each consumer commits its offsets to this topic as messages at periodic intervals.
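For the Java "new consumer", the periodic commit is controlled by configuration. Here is a minimal sketch, assuming the standard consumer settings enable.auto.commit and auto.commit.interval.ms; the class name, broker address and group id are placeholders. It only builds the properties. A real consumer would also need key and value deserializers before it could be constructed.

```java
import java.util.Properties;

public class AutoCommitConfigSketch {
    // Builds settings that make the consumer commit its offsets to
    // __consumer_offsets automatically every intervalMs milliseconds.
    static Properties autoCommitProps(String bootstrapServers, String groupId, long intervalMs) {
        Properties props = new Properties();
        // The consumer talks to brokers only; no ZK address is needed.
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("auto.commit.interval.ms", Long.toString(intervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder broker address and group id; 5000 ms commit interval.
        autoCommitProps("broker1:9092", "my-group", 5000L).list(System.out);
    }
}
```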