This article shows how to install Apache Kafka on 4 nodes. Four Ubuntu nodes are used for the Apache Kafka cluster.
- ▼1. What is Apache Kafka?
- ▼2. Installing Apache Kafka with 4 nodes
- 2-1. Preparing 4 Ubuntu 20.04.1 LTS x64 nodes
- 2-2. Updating hosts file on 4 nodes
- 2-3. Installing Java 8 JDK on 4 nodes
- 2-4. Installing Apache Kafka on 4 nodes
- 2-5. Updating environment variables on 4 nodes
- 2-6. Setting zookeeper.properties on the Zookeeper node only
- 2-7. Starting Zookeeper
- 2-8. Updating rc.local so that Zookeeper starts automatically at OS boot
- 2-9. Configuring Kafka Brokers
- 2-10. Updating rc.local on each broker (the steps are the same as in 2-8 above)
- ▼3. Creating a topic, writing and reading data
- ▼4. Reference
▼1. What is Apache Kafka?
Kafka combines three key capabilities:
- To publish (write) and subscribe to (read) streams of events, including continuous import/export of your data from other systems.
- To store streams of events durably and reliably for as long as you want.
- To process streams of events as they occur or retrospectively.
Ref: Apache Kafka
▼2. Installing Apache Kafka with 4 nodes
2-1. Preparing 4 Ubuntu 20.04.1 LTS x64 nodes
One node is used as the Zookeeper server; the other 3 nodes provide Kafka broker services.
2-2. Updating hosts file on 4 nodes
e.g.) /etc/hosts
...
10.0.0.1 zookeeper
10.0.0.2 kafkabroker1
10.0.0.3 kafkabroker2
10.0.0.4 kafkabroker3
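To check that each host name resolves as expected, the entries can be queried with getent (a quick sanity check; run it on any node):
getent hosts zookeeper kafkabroker1 kafkabroker2 kafkabroker3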
2-3. Installing Java 8 JDK on 4 nodes
Azul Zulu OpenJDK 8 (LTS) is used for this Kafka cluster. Download it from: Download Azul Zulu Builds of OpenJDK | Azul
e.g.)
sudo apt install ./zulu8.58.0.13-ca-jdk8.0.312-linux_amd64.deb
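To verify the installation, check the Java version (the exact output varies by build):
java -version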
2-4. Installing Apache Kafka on 4 nodes
Download Apache Kafka 3.0.0 (Scala 2.13) from the official download site: https://www.apache.org/dyn/closer.cgi?path=/kafka/3.0.0/kafka_2.13-3.0.0.tgz
e.g.) Extract the archive under $HOME so that the PATH set in 2-5 resolves:
tar -xzf kafka_2.13-3.0.0.tgz
cd kafka_2.13-3.0.0
2-5. Updating environment variables on 4 nodes
/home/hadoop/.profile is updated as below. ${HOME} is /home/hadoop in this case.
e.g.)
PATH=$PATH:$HOME/.local/bin:$HOME/bin:$HOME/kafka_2.13-3.0.0/bin
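Reload the profile so the change takes effect in the current shell, then confirm that the Kafka scripts are on the PATH:
source $HOME/.profile
which kafka-topics.sh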
2-6. Setting zookeeper.properties on the Zookeeper node only
e.g.)
mkdir -p /home/hadoop/kafka_data/zookeeper_data
vi $HOME/kafka_2.13-3.0.0/config/zookeeper.properties
...
dataDir=/home/hadoop/kafka_data/zookeeper_data
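For reference, a minimal sketch of the resulting zookeeper.properties; clientPort=2181 and maxClientCnxns=0 are the defaults shipped with Kafka, and only dataDir is changed here:
dataDir=/home/hadoop/kafka_data/zookeeper_data
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections
maxClientCnxns=0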
2-7. Starting Zookeeper
/home/hadoop/kafka_2.13-3.0.0/bin/zookeeper-server-start.sh $HOME/kafka_2.13-3.0.0/config/zookeeper.properties
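To keep the terminal free, the start script can also run Zookeeper in the background with its -daemon option:
/home/hadoop/kafka_2.13-3.0.0/bin/zookeeper-server-start.sh -daemon $HOME/kafka_2.13-3.0.0/config/zookeeper.properties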
2-8. Updating rc.local so that Zookeeper starts automatically at OS boot
a. Creating the rc.local file
sudo vi /etc/rc.local
b. Creating rc-local.service with the content below
sudo vi /etc/systemd/system/rc-local.service
[Unit]
Description=/etc/rc.local Compatibility
ConditionPathExists=/etc/rc.local
[Service]
Type=forking
ExecStart=/etc/rc.local start
TimeoutSec=0
StandardOutput=tty
RemainAfterExit=yes
SysVStartPriority=99
[Install]
WantedBy=multi-user.target
c. Setting execute permission on /etc/rc.local.
sudo chmod +x /etc/rc.local
d. Updating the /etc/rc.local file
printf '%s\n' '#!/bin/bash' 'exit 0' | sudo tee -a /etc/rc.local
e. Adding the line below above the exit 0 line, so that Zookeeper starts at boot
/home/hadoop/kafka_2.13-3.0.0/bin/zookeeper-server-start.sh /home/hadoop/kafka_2.13-3.0.0/config/zookeeper.properties > /dev/null 2>&1 &
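After steps d and e, the complete /etc/rc.local should look like this:
#!/bin/bash
/home/hadoop/kafka_2.13-3.0.0/bin/zookeeper-server-start.sh /home/hadoop/kafka_2.13-3.0.0/config/zookeeper.properties > /dev/null 2>&1 &
exit 0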
f. Enabling and starting rc-local.service
sudo systemctl enable rc-local
sudo systemctl start rc-local.service
sudo systemctl status rc-local.service
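If systemd does not pick up the newly created unit, reload the unit definitions first:
sudo systemctl daemon-reload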
g. Running the Zookeeper shell and confirming that the Zookeeper server is online. At this point it is fine that no Kafka broker exists yet and no broker ids are returned.
$HOME/kafka_2.13-3.0.0/bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids
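Before any broker has started, the shell should still connect; the listing may simply be empty or report that the node does not exist yet, roughly as follows (illustrative output):
Connecting to localhost:2181

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
Node does not exist: /brokers/ids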

2-9. Configuring Kafka Brokers
2-9-1. Creating a folder named kafka_data on each of the 3 Kafka brokers
e.g.)
mkdir /home/hadoop/kafka_data
2-9-2. Updating server.properties on 3 Kafka Brokers
The location of the file is /home/hadoop/kafka_2.13-3.0.0/config/server.properties in this case. server.properties is updated on each of the 3 Kafka brokers as below.
| Setting | kafkabroker1 | kafkabroker2 | kafkabroker3 |
| --- | --- | --- | --- |
| broker.id | 0 | 1 | 2 |
| broker.rack | RAC1 | RAC2 | RAC3 |
| log.dirs | /home/hadoop/kafka_data | /home/hadoop/kafka_data | /home/hadoop/kafka_data |
| offsets.topic.num.partitions | 3 | 3 | 3 |
| offsets.topic.replication.factor | 2 | 2 | 2 |
| min.insync.replicas | 2 | 2 | 2 |
| default.replication.factor | 2 | 2 | 2 |
| zookeeper.connect | 10.0.0.1:2181 | 10.0.0.1:2181 | 10.0.0.1:2181 |
(e.g.)
# kafkabroker1
broker.id=0
broker.rack=RAC1
log.dirs=/home/hadoop/kafka_data
offsets.topic.num.partitions=3
offsets.topic.replication.factor=2
min.insync.replicas=2
default.replication.factor=2
zookeeper.connect=10.0.0.1:2181
# kafkabroker2
broker.id=1
broker.rack=RAC2
log.dirs=/home/hadoop/kafka_data
offsets.topic.num.partitions=3
offsets.topic.replication.factor=2
min.insync.replicas=2
default.replication.factor=2
zookeeper.connect=10.0.0.1:2181
# kafkabroker3
broker.id=2
broker.rack=RAC3
log.dirs=/home/hadoop/kafka_data
offsets.topic.num.partitions=3
offsets.topic.replication.factor=2
min.insync.replicas=2
default.replication.factor=2
zookeeper.connect=10.0.0.1:2181
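To avoid editing three files by hand, the per-broker settings can also be appended with a small script. This is only a sketch, relying on the fact that when a key appears twice in a properties file the last occurrence wins; BROKER_ID and RACK must be adjusted on each node:
#!/bin/bash
# Sketch: append the per-broker settings from the table above to server.properties.
# Adjust per node: kafkabroker1 -> 0/RAC1, kafkabroker2 -> 1/RAC2, kafkabroker3 -> 2/RAC3.
BROKER_ID=0
RACK=RAC1
CONFIG=$HOME/kafka_2.13-3.0.0/config/server.properties

cat >> "$CONFIG" <<EOF
broker.id=$BROKER_ID
broker.rack=$RACK
log.dirs=/home/hadoop/kafka_data
offsets.topic.num.partitions=3
offsets.topic.replication.factor=2
min.insync.replicas=2
default.replication.factor=2
zookeeper.connect=10.0.0.1:2181
EOF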
2-10. Updating rc.local on each of the 3 brokers. The steps are the same as in 2-8 above, except that the line added above exit 0 starts the Kafka broker instead of Zookeeper:
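/home/hadoop/kafka_2.13-3.0.0/bin/kafka-server-start.sh /home/hadoop/kafka_2.13-3.0.0/config/server.properties > /dev/null 2>&1 &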
▼3. Creating a topic, writing and reading data
(e.g.)
# Setting the environment variable for the Kafka brokers according to the hosts file
export KAFKABROKERS=10.0.0.2:9092,10.0.0.3:9092,10.0.0.4:9092
echo $KAFKABROKERS
(output)
10.0.0.2:9092,10.0.0.3:9092,10.0.0.4:9092

# Setting the environment variable for Zookeeper according to the hosts file
export KAFKAZOOKEEPER=10.0.0.1:2181
echo $KAFKAZOOKEEPER
(output)
10.0.0.1:2181

# Running the Zookeeper shell and confirming the broker ids
zookeeper-shell.sh $KAFKAZOOKEEPER ls /brokers/ids
(output)
Connecting to 10.0.0.1:2181

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[0,1,2]

# Creating a topic named test
kafka-topics.sh --create --replication-factor 3 --partitions 3 --topic test --bootstrap-server $KAFKABROKERS
(output)
Created topic test

# Listing the topics
kafka-topics.sh --list --bootstrap-server $KAFKABROKERS
(output)
test

# Showing the details of this topic
kafka-topics.sh --describe --bootstrap-server $KAFKABROKERS --topic test
(output)
Topic: test  TopicId: C6c29UKLQXxxXx2G6EmuyA  PartitionCount: 3  ReplicationFactor: 3  Configs: min.insync.replicas=2,segment.bytes=1073741824
  Topic: test  Partition: 0  Leader: 1  Replicas: 1,2,0  Isr: 1,2,0
  Topic: test  Partition: 1  Leader: 2  Replicas: 2,0,1  Isr: 2,0,1
  Topic: test  Partition: 2  Leader: 0  Replicas: 0,1,2  Isr: 0,1,2

# Producing data
kafka-console-producer.sh --bootstrap-server $KAFKABROKERS --topic test
(input)
>1
>2
>3
>a
>b
>c

# Reading the produced data
kafka-console-consumer.sh --bootstrap-server $KAFKABROKERS --topic test --from-beginning
(output)
1
2
b
c
3
a
Processed a total of 6 messages
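When the test is finished, the topic can be deleted again (assuming delete.topic.enable has not been changed from its default of true):
kafka-topics.sh --delete --bootstrap-server $KAFKABROKERS --topic test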
▼4. Reference
- Apache Kafka
- Installing Multi-node Kafka Cluster – Learning Journal
- Apache Kafka/quickstart
- How to Enable /etc/rc.local with Systemd – LinuxBabe
- Quickstart: Set up Apache Kafka on HDInsight using Azure portal | Microsoft Docs
That’s all. Have a nice day ahead!!!