From the course: Apache Kafka Essential Training: Building Scalable Applications

A Kafka cluster

- [Instructor] Central to Kafka's scale and resilience is its ability to create and manage clusters of brokers. Let's explore cluster management in this video. What is a Kafka cluster? A Kafka cluster is a group of Kafka brokers working together to receive, store, and deliver data. Each Kafka cluster has a unique cluster ID, which every broker in the cluster knows. Each broker additionally has a unique node ID within the cluster. Brokers inside the cluster share work based on topic partitions, and they collaborate with each other to manage the cluster.

Management and control information about the cluster is stored in a dataset called metadata. This metadata includes the members of the cluster and their roles, the topics in the cluster and their configurations, the current status of the brokers and topics, and the current status of consumers and consumer groups. The cluster keeps this information up to date. Each broker node in the cluster holds its own cached, in-memory copy of the metadata, and changes to the metadata are continuously communicated between the nodes.

Each node in the Kafka cluster can play one or more roles. The first role is that of a broker. The broker is the worker in the cluster: it is responsible for receiving data for the partitions it manages, storing that data, and delivering it to subscribers. Brokers also handle the required replication. A controller, on the other hand, manages the cluster and is responsible for administration activities. A Kafka node can be a broker only, a controller only, or both a broker and a controller. Typically, there are multiple controllers in the cluster, and one of them becomes the active controller that manages the cluster. We will discuss more on this in the next video.
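In KRaft-mode Kafka, a node's role is declared in its `server.properties`. A minimal sketch of the relevant keys follows; the IDs, hostnames, and ports are placeholder values for illustration:

```properties
# This node acts as both a broker and a controller
process.roles=broker,controller
# Unique node ID within the cluster
node.id=1
# The controller quorum: nodeId@host:port for each controller (placeholder addresses)
controller.quorum.voters=1@localhost:9093,2@localhost:9094,3@localhost:9095
```

The cluster ID itself is typically generated once with `kafka-storage.sh random-uuid` and written to each node's log directories with `kafka-storage.sh format`, which is how every broker ends up knowing the same cluster ID.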
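The relationship between the cluster ID, node IDs, and roles described above can be sketched as a small data model. This is purely illustrative, assumed structure for this course's narrative, not Kafka's actual API; the class and field names here are invented:

```python
from dataclasses import dataclass, field

# Illustrative model of cluster membership metadata -- NOT Kafka's real API.
# "ClusterNode" and "Cluster" are hypothetical names for this sketch.

@dataclass(frozen=True)
class ClusterNode:
    node_id: int            # unique node ID within the cluster
    roles: frozenset        # subset of {"broker", "controller"}

@dataclass
class Cluster:
    cluster_id: str                       # unique ID shared by all nodes
    nodes: list = field(default_factory=list)

    def brokers(self):
        # Nodes that receive, store, and deliver partition data
        return [n for n in self.nodes if "broker" in n.roles]

    def controllers(self):
        # Nodes eligible to administer the cluster
        return [n for n in self.nodes if "controller" in n.roles]

# Example: a three-node cluster with mixed roles (made-up cluster ID)
cluster = Cluster(
    cluster_id="example-cluster-id",
    nodes=[
        ClusterNode(1, frozenset({"broker", "controller"})),  # both roles
        ClusterNode(2, frozenset({"broker"})),                # broker only
        ClusterNode(3, frozenset({"controller"})),            # controller only
    ],
)
```

Note that nodes 1 and 2 serve traffic as brokers, while nodes 1 and 3 form the controller pool from which one active controller is chosen.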
