
How to Install Apache Kafka on RHEL 8 / 9 in Disconnected Environments

A complete, production-grade setup guide — from pre-flight checks to a running, secured, firewall-ready distributed event streaming platform

Most Kafka installation guides hand you a tar -xzf and a kafka-server-start.sh command run in the foreground as root, with no systemd service, no ZooKeeper-vs-KRaft decision explained, no firewall rules, and no idea why producers can’t connect from another machine.

This guide doesn’t do that.

We’ll go through every step — choosing between ZooKeeper mode and KRaft mode, installing the correct Java version, configuring Kafka as a proper systemd service running under a dedicated non-root user, tuning the critical broker parameters, setting up firewall rules, creating topics, and verifying end-to-end message flow with producers and consumers. By the end, you’ll understand not just what to run, but exactly why each step exists.


What Is Apache Kafka?

Apache Kafka is an open-source distributed event streaming platform. It was originally built at LinkedIn to handle real-time activity tracking at massive scale. Today it’s the backbone of modern data pipelines — handling everything from application logs and metrics to financial transactions and user behavior streams.

Kafka stores streams of records in categories called topics. Producers write records to topics. Consumers read records from topics. Kafka retains records for a configurable period — unlike traditional message queues that delete messages after delivery, Kafka keeps them, enabling consumers to replay, rewind, and process the same data independently.

Kafka vs traditional message queues:

| Feature | Kafka | RabbitMQ | ActiveMQ |
|---|---|---|---|
| Message retention | Configurable (days/weeks) | Deleted after consumption | Deleted after consumption |
| Throughput | Millions of messages/sec | Thousands/sec | Thousands/sec |
| Consumer model | Pull-based, offset tracking | Push-based | Push-based |
| Ordering | Per-partition guaranteed | Per-queue | Per-queue |
| Replay capability | Yes (any offset) | No | No |
| Horizontal scale | Native partitioning | Plugin-based | Limited |
| Persistence | Disk-first (always) | Memory-first (optional disk) | Memory-first |
| Use case | Event streaming, log aggregation | Task queues, RPC | Enterprise messaging |

Core Kafka concepts:

| Concept | Definition |
|---|---|
| Topic | A named stream of records, like a table in a database but append-only |
| Partition | A topic is split into partitions for parallelism and scalability |
| Offset | A sequential ID for each record within a partition; consumers use it to track their position |
| Producer | An application that writes records to a topic |
| Consumer | An application that reads records from a topic |
| Consumer Group | Multiple consumers sharing topic partitions, which enables parallel processing |
| Broker | A single Kafka server that stores partitions and serves producer/consumer requests |
| Cluster | Multiple brokers working together |
| Replication Factor | Number of brokers that hold a copy of each partition |
| Leader | The broker handling reads and writes for a partition |
| ZooKeeper / KRaft | Cluster coordination: ZooKeeper (legacy) or KRaft (built-in, modern) |

What we’re building:

  • Java 17 (OpenJDK) installed as Kafka’s runtime
  • Apache Kafka installed manually from a local binary file (air-gapped environment)
  • KRaft mode — no ZooKeeper dependency (Kafka 3.x+ recommended approach)
  • Running as a dedicated kafka system user (non-root)
  • Kafka broker managed by systemd — starts on boot, restarts on failure
  • Logs stored in /var/log/kafka/
  • Data stored in /var/lib/kafka/
  • Ports 9092 (broker) and 9093 (controller) open through firewalld
  • Producer and consumer verified with end-to-end message flow

ZooKeeper Mode vs KRaft Mode — Which Should You Use?

Before installation, you need to make a decision. Kafka has two cluster coordination modes:

ZooKeeper Mode (legacy)

  • Requires running Apache ZooKeeper as a separate service
  • ZooKeeper manages broker metadata, leader election, and cluster state
  • Every Kafka deployment prior to 2.8 used this mode
  • Deprecated since Kafka 3.5 and removed entirely in Kafka 4.0, so it is only an option on 3.x and earlier

KRaft Mode (modern — recommended)

  • Kafka manages its own metadata using a Raft consensus protocol built into the broker
  • No ZooKeeper process to install, configure, or operate
  • Available as preview since Kafka 2.8, production-ready since Kafka 3.3
  • Required for Kafka 4.x and beyond
  • Simpler architecture — fewer processes, fewer failure points, faster controller failover

This guide uses KRaft mode. If you’re starting a new deployment on Kafka 3.x or later, KRaft is the right choice. ZooKeeper mode instructions are included at the end for environments that specifically require it.


Before You Install — Run These Checks First

Kafka has more system-level dependencies than most services. Every check here maps to a real failure mode.

1. Confirm your OS version

cat /etc/redhat-release

You need Red Hat Enterprise Linux release 8.x or 9.x. AlmaLinux and Rocky Linux follow the same steps.

2. Check your system architecture

uname -m
  • x86_64 → fully supported
  • aarch64 → fully supported — Kafka is JVM-based and architecture-portable

3. Check Java version (critical)

Kafka is a JVM application. The version of Java you install matters:

java -version 2>&1
  • Kafka 3.x runs on Java 8 or later, but Java 17 LTS is strongly recommended
  • Kafka 4.x brokers require Java 17 or later (clients and Kafka Streams still run on Java 11)
  • Avoid Java 8 for new deployments: Kafka deprecated Java 8 support in 3.0 and dropped it in 4.0

If Java isn’t installed or the wrong version is present, we’ll install it in Step 2.
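The version check above can be scripted. The following is a minimal sketch, not part of the original guide: the function names java_major and java_at_least are hypothetical helpers, and the sketch assumes the version string uses either the legacy 1.8.x scheme or the modern 17.x.y scheme.

```shell
# Extract the major Java version from a version string.
# Handles both the legacy "1.8.0_392" scheme and the modern "17.0.9" scheme.
java_major() {
  v=$1
  case $v in
    1.*) v=${v#1.} ;;   # legacy scheme: 1.8.x means Java 8
  esac
  # Keep only the leading run of digits.
  printf '%s\n' "${v%%[!0-9]*}"
}

# Succeed if the version string ($1) meets the minimum major version ($2).
java_at_least() {
  [ "$(java_major "$1")" -ge "$2" ]
}

# Usage: java_at_least "$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')" 17
```

A non-zero exit status from java_at_least means the installed Java is older than the requested major version and should be replaced in Step 2.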

4. Verify ports are free

sudo ss -tulnp | grep -E '9092|9093|2181'

Kafka uses:

  • 9092 — Broker listener (producers and consumers connect here)
  • 9093 — Controller listener (KRaft mode — inter-broker controller communication)
  • 2181 — ZooKeeper port (ZooKeeper mode only — not needed for KRaft)

No output means all ports are available.
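If you want the port check to report exactly which ports are taken, a small filter over the ss output can help. This is a sketch under stated assumptions: busy_ports is a hypothetical helper name, and it expects the column layout of ss -tln (TCP only, so the local address is the fourth column).

```shell
# Print each port from the argument list that appears as a local listening
# port in `ss -tln` style output read from stdin.
busy_ports() {
  awk -v want="$*" '
    BEGIN { n = split(want, p, " "); for (i = 1; i <= n; i++) ports[p[i]] = 1 }
    NR > 1 {
      # Column 4 is Local Address:Port; the port follows the last colon.
      m = split($4, a, ":")
      if (a[m] in ports) print a[m]
    }
  ' | sort -un
}

# Usage: ss -tln | busy_ports 9092 9093 2181
```

Empty output means none of the listed ports has a listener.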

5. Check your firewall status

sudo systemctl status firewalld

Note whether it’s active. You’ll need to open ports 9092 and 9093 after installation.

6. Check SELinux mode

getenforce

Note: Enforcing, Permissive, or Disabled. Kafka binds network ports and writes to several directories, and SELinux can block these operations if it is not configured correctly.

7. Check available disk space

df -h /var /opt

Kafka stores all message data on disk. Log retention means data accumulates:

| Environment | Minimum Free Space | Notes |
|---|---|---|
| Development | 10 GB | Low throughput, short retention |
| Production (small) | 100 GB | Moderate throughput, 7-day retention |
| Production (large) | 500 GB+ | High throughput, depends on retention policy |

Kafka data lives in /var/lib/kafka/ by default in this guide. Disk I/O speed directly impacts throughput — SSDs are strongly recommended for production.
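To relate the table to your own workload, a rough estimate is write rate × retention time × replication factor. The helper below is only a sketch: estimate_disk_gib is a hypothetical name, and it assumes a steady write rate while ignoring compression, index files, and free-space headroom.

```shell
# Estimate retained data in GiB for a steady write rate.
# Arguments: ingest rate in KiB/s, retention in hours, replication factor.
estimate_disk_gib() {
  rate_kib=$1 hours=$2 rf=$3
  # KiB/s * seconds per hour * hours * copies, converted KiB -> MiB -> GiB
  echo $(( rate_kib * 3600 * hours * rf / 1024 / 1024 ))
}

# Usage: estimate_disk_gib 100 168 1   (100 KiB/s, 7 days, single copy)
```

Treat the result as a lower bound and add headroom on top, since the broker needs free space for new segments while old ones wait for deletion.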

8. Check available RAM

free -h

Kafka is a JVM application. Memory requirements:

| Deployment | JVM Heap | OS Page Cache | Total RAM |
|---|---|---|---|
| Development (single broker) | 1–2 GB | 2–4 GB | 4 GB minimum |
| Production (single broker) | 4–6 GB | 10–20 GB | 16 GB recommended |
| Production (high throughput) | 6–8 GB | 20–40 GB | 32–64 GB |

Kafka is deliberately designed to rely on the OS page cache more than JVM heap. Don’t set the JVM heap larger than 6–8 GB — beyond that, JVM garbage collection pauses become longer than the time saved by caching more data in the heap. The OS page cache is more efficient for Kafka’s sequential read pattern.

9. Check CPU count

nproc

Kafka is I/O-bound, not CPU-bound. Even 2 CPUs handle significant throughput. More CPUs help with many concurrent connections and heavy compression.

10. Check the hostname

hostname -f

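Kafka brokers advertise their hostname to clients. A client’s first (bootstrap) connection can succeed against any reachable address, but every later request goes to the advertised hostname. If that name doesn’t resolve from your client machines, producers and consumers fail right after bootstrapping. Verify your hostname is resolvable:

getent hosts "$(hostname -f)"

Unlike nslookup, getent follows the system resolver order, so it also sees /etc/hosts entries. If the name is not resolvable, add it to /etc/hosts on both the Kafka host and client hosts, or configure proper DNS before proceeding.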

11. Verify the Kafka binary file is available locally

In an air-gapped environment, you need to have the Kafka binary tarball available on the server. The filename format is:

kafka_<scala_version>-<kafka_version>.tgz

For example: kafka_2.13-3.8.0.tgz
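Because the filename encodes both versions, you can derive them instead of typing them by hand. A minimal sketch, assuming the filename follows the format shown above; the function names are hypothetical.

```shell
# Split a Kafka tarball name of the form kafka_<scala_version>-<kafka_version>.tgz.
kafka_scala_version() {
  name=${1##*/}          # drop any leading directory
  name=${name#kafka_}    # drop the "kafka_" prefix
  printf '%s\n' "${name%%-*}"
}

kafka_release_version() {
  name=${1##*/}          # drop any leading directory
  name=${name%.tgz}      # drop the extension
  printf '%s\n' "${name#*-}"
}

# Usage: SCALA_VERSION=$(kafka_scala_version /tmp/kafka_2.13-3.8.0.tgz)
#        KAFKA_VERSION=$(kafka_release_version /tmp/kafka_2.13-3.8.0.tgz)
```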

How to obtain the file:
On a machine with internet access, download it from kafka.apache.org/downloads, then transfer it via secure methods (SCP, SFTP, USB drive, etc.) to your RHEL server.

Place the file in a known location, for example /tmp/ or /opt/.
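Since the tarball travels across a network boundary or on removable media, it is worth checking its integrity after the transfer. The sketch below compares the file’s SHA-512 digest with the value published next to the download. verify_sha512 is a hypothetical helper, and it assumes you pass only the hex digits of the published digest; spaces and letter case are normalized because the published file may group the digits in blocks.

```shell
# Compare a file's SHA-512 digest with an expected hex string.
# Spaces, newlines, and letter case in the expected value are ignored.
verify_sha512() {
  file=$1
  expected=$(printf '%s' "$2" | tr -d ' \n' | tr 'A-F' 'a-f')
  actual=$(sha512sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Usage: verify_sha512 /tmp/kafka_2.13-3.8.0.tgz "<digest copied from the download page>" \
#          && echo "checksum OK"
```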


The Installation

Step 1 — Update your system

sudo dnf update -y

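Always update before installing. JVM behavior on RHEL is tied to glibc and OpenSSL versions, and stale system libraries cause subtle JVM failures. In a disconnected environment, dnf can only reach packages from a local source such as an internal mirror, Red Hat Satellite, or a mounted installation ISO, so make sure one is configured before running this and the following dnf commands.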

Step 2 — Install Java 17 (OpenJDK)

# Install OpenJDK 17
sudo dnf install -y java-17-openjdk java-17-openjdk-devel

# Verify the installation
java -version

Expected output:

openjdk version "17.x.x" 202x-xx-xx LTS
OpenJDK Runtime Environment (Red Hat) ...
OpenJDK 64-Bit Server VM ...

If multiple Java versions are installed, set Java 17 as the default:

sudo alternatives --config java

Select the Java 17 option from the menu.

Add JAVA_HOME to the system environment:

JAVA_HOME=$(dirname $(dirname $(readlink -f $(which java))))
echo "export JAVA_HOME=${JAVA_HOME}" | sudo tee /etc/profile.d/java.sh
echo "export PATH=\$PATH:\$JAVA_HOME/bin" | sudo tee -a /etc/profile.d/java.sh
source /etc/profile.d/java.sh
echo $JAVA_HOME

Step 3 — Install required tools

sudo dnf install -y tar net-tools
  • tar — extracting the release archive
  • net-tools: provides netstat as an optional alternative to ss, which this guide uses for port checks

Step 4 — Create a dedicated service account

sudo useradd \
  --no-create-home \
  --shell /bin/false \
  --system \
  --comment "Apache Kafka Service Account" \
  kafka

Running Kafka as root means any vulnerability in the broker or a malicious message payload could give an attacker full system access. The --shell /bin/false prevents anyone from ever logging in as this user. The --no-create-home skips creating a home directory — we create the Kafka directories explicitly.

Step 5 — Create the Kafka directory structure

# Installation directory
sudo mkdir -p /opt/kafka

# Data directory — where Kafka stores message logs
sudo mkdir -p /var/lib/kafka/data

# Log directory — where Kafka stores its own application logs
sudo mkdir -p /var/log/kafka

# Set ownership
sudo chown -R kafka:kafka /opt/kafka
sudo chown -R kafka:kafka /var/lib/kafka
sudo chown -R kafka:kafka /var/log/kafka

Why this structure?

| Path | Purpose |
|---|---|
| /opt/kafka/ | Kafka binaries, scripts, and configuration |
| /var/lib/kafka/data/ | Message log data; this is where your actual messages live |
| /var/log/kafka/ | Kafka broker logs, GC logs, controller logs |

This follows the Linux Filesystem Hierarchy Standard: /opt for third-party applications, /var/lib for persistent application data, /var/log for logs.

Step 6 — Extract Apache Kafka from the local binary file

Assumption: You have placed the Kafka binary tarball in /tmp/ or another accessible location. If you placed it elsewhere, adjust the path accordingly.

# Set the version to match your file
KAFKA_VERSION=3.8.0
SCALA_VERSION=2.13

# Navigate to where the file is located
cd /tmp

# Extract the archive
tar -xzf kafka_${SCALA_VERSION}-${KAFKA_VERSION}.tgz

# Copy the extracted files into the installation directory
sudo cp -r kafka_${SCALA_VERSION}-${KAFKA_VERSION}/* /opt/kafka/

# Set correct ownership
sudo chown -R kafka:kafka /opt/kafka/

# Clean up the extracted directory (the tarball itself is kept for reinstalls)
cd ~
rm -rf "/tmp/kafka_${SCALA_VERSION}-${KAFKA_VERSION}"

Verify the installation:

ls /opt/kafka/

You should see: bin/, config/, libs/, licenses/, site-docs/

/opt/kafka/bin/kafka-topics.sh --version

Step 7 — Configure Kafka in KRaft Mode

KRaft mode eliminates ZooKeeper. The broker manages its own metadata using the Raft consensus protocol — Kafka nodes elect a controller from among themselves.

Generate a unique Cluster UUID:

KAFKA_CLUSTER_UUID=$(/opt/kafka/bin/kafka-storage.sh random-uuid)
echo "Cluster UUID: $KAFKA_CLUSTER_UUID"

Save this UUID — you’ll need it in the next step and for adding additional brokers to the cluster.
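One way to save the UUID is to write it to a file that only the current user can read, and to refuse to overwrite an earlier record. This is a sketch only: save_cluster_id is a hypothetical helper, and the destination path in the usage line is an assumption you should adapt.

```shell
# Write the cluster UUID to a file, refusing to overwrite an existing record.
save_cluster_id() {
  uuid=$1 dest=$2
  if [ -e "$dest" ]; then
    echo "refusing to overwrite $dest" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077; printf '%s\n' "$uuid" > "$dest" )
}

# Usage (path is an example): save_cluster_id "$KAFKA_CLUSTER_UUID" "$HOME/kafka-cluster-id"
```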

Create the KRaft configuration file:

sudo tee /opt/kafka/config/kraft/server.properties <<EOF
#---------------------------------------------------------------------------
# KRaft Mode Configuration
# This broker acts as both a controller and a broker (combined mode)
# For large clusters, separate controller and broker nodes
#---------------------------------------------------------------------------

# The role of this server — 'broker', 'controller', or 'broker,controller'
# 'broker,controller' is combined mode — suitable for single-node and small clusters
process.roles=broker,controller

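# Unique node ID; must be unique across the entire cluster
node.id=1

# Cluster ID placeholder, filled in by the sed command later in this step.
# Kafka itself reads the cluster ID from meta.properties, which
# kafka-storage.sh format writes in Step 8; this entry is a record for operators.
cluster.id=REPLACE_WITH_YOUR_CLUSTER_UUID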

#---------------------------------------------------------------------------
# LISTENERS
#---------------------------------------------------------------------------

# Listener names, protocols, and ports
# PLAINTEXT — unencrypted broker traffic (producers/consumers)
# CONTROLLER — inter-broker controller traffic (KRaft)
listeners=PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093

# What address the broker advertises to clients
# CRITICAL: Replace <your-server-ip-or-hostname> with your actual IP or hostname
# Clients use this address to connect — must be reachable from all clients
advertised.listeners=PLAINTEXT://<your-server-ip-or-hostname>:9092

# Listener security protocol mapping
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT

# Controller listener name
controller.listener.names=CONTROLLER

# KRaft controller quorum — list of controller nodes
# For single-node: node.id@host:controller_port
# For multi-node: comma-separated list of all controller nodes
controller.quorum.voters=1@localhost:9093

#---------------------------------------------------------------------------
# DATA AND LOG DIRECTORIES
#---------------------------------------------------------------------------

# Message log data directory — where actual messages are stored
log.dirs=/var/lib/kafka/data

#---------------------------------------------------------------------------
# TOPIC DEFAULTS
#---------------------------------------------------------------------------

# Default number of partitions for new topics
num.partitions=3

# Default replication factor for new topics
# Set to 3 for production clusters with 3+ brokers
# Must be <= number of brokers in the cluster
default.replication.factor=1

# Minimum number of in-sync replicas required for writes to succeed
min.insync.replicas=1

# Automatically create topics when a producer or consumer references them
# Set to false in production — create topics explicitly
auto.create.topics.enable=false

# Allow topic deletion via admin commands
delete.topic.enable=true

#---------------------------------------------------------------------------
# NETWORK AND PERFORMANCE
#---------------------------------------------------------------------------

# Number of threads handling network requests
num.network.threads=3

# Number of threads doing I/O
num.io.threads=8

# Send and receive buffer sizes
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600

# Number of threads for log recovery at startup and flushing at shutdown
num.recovery.threads.per.data.dir=1

#---------------------------------------------------------------------------
# LOG RETENTION
#---------------------------------------------------------------------------

# Keep messages for 7 days (168 hours)
log.retention.hours=168

# Maximum size per log segment file (1 GB)
log.segment.bytes=1073741824

# How often to check for logs eligible for deletion (5 minutes)
log.retention.check.interval.ms=300000

# Maximum total log size per partition (disabled = -1, use time-based only)
# Uncomment and set for size-based retention:
# log.retention.bytes=107374182400

#---------------------------------------------------------------------------
# ZOOKEEPER — Not used in KRaft mode, kept for reference
#---------------------------------------------------------------------------
# zookeeper.connect=localhost:2181
# zookeeper.connection.timeout.ms=18000

#---------------------------------------------------------------------------
# GROUP COORDINATOR
#---------------------------------------------------------------------------

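# Delay before the first consumer group rebalance, in milliseconds
# 0 rebalances immediately (convenient for development); the Kafka default of
# 3000 gives more consumers time to join before the first rebalance
group.initial.rebalance.delay.ms=0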

# Offsets topic replication (set to 3 for production)
offsets.topic.replication.factor=1

# Transaction log settings
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
EOF

# Set ownership
sudo chown kafka:kafka /opt/kafka/config/kraft/server.properties
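For a multi-node cluster, the controller.quorum.voters value follows the node.id@host:controller_port pattern described in the config comments. The sketch below builds that string from a list of hosts. build_quorum_voters is a hypothetical helper and the hostnames in the usage line are placeholders; it assumes node IDs are assigned 1, 2, 3, ... in argument order and that every controller listens on 9093.

```shell
# Build a controller.quorum.voters value from an ordered list of hosts.
# Node IDs are assigned 1, 2, 3, ... in argument order; the port is fixed at 9093.
build_quorum_voters() {
  id=1 voters=
  for host in "$@"; do
    # Prefix a comma only when the list already has an entry.
    voters="${voters:+$voters,}${id}@${host}:9093"
    id=$((id + 1))
  done
  printf '%s\n' "$voters"
}

# Usage (example hostnames): build_quorum_voters kafka1 kafka2 kafka3
```

Each node’s own node.id in its server.properties must match the ID it is given in this list.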

Replace the placeholder in advertised.listeners:

# Get your server IP
ip addr show | grep 'inet ' | grep -v '127.0.0.1' | awk '{print $2}' | cut -d/ -f1

Edit the config file and replace <your-server-ip-or-hostname> with your actual server IP or hostname:

sudo sed -i "s/<your-server-ip-or-hostname>/$(hostname -f)/g" \
  /opt/kafka/config/kraft/server.properties

Insert the cluster UUID into the config:

sudo sed -i "s/REPLACE_WITH_YOUR_CLUSTER_UUID/${KAFKA_CLUSTER_UUID}/g" \
  /opt/kafka/config/kraft/server.properties

Verify the config looks correct:

grep -E 'advertised.listeners|cluster.id|node.id|process.roles' \
  /opt/kafka/config/kraft/server.properties
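A quick automated check can catch a placeholder that was never replaced. The following is a sketch; check_placeholders is a hypothetical helper that looks for the two placeholder styles used in the config template above.

```shell
# Fail if a properties file still contains unreplaced template placeholders.
# Matching lines are printed with their line numbers.
check_placeholders() {
  if grep -nE 'REPLACE_WITH_|<your-[^>]*>' "$1"; then
    echo "unreplaced placeholders found in $1" >&2
    return 1
  fi
}

# Usage: check_placeholders /opt/kafka/config/kraft/server.properties && echo "config looks complete"
```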

Step 8 — Format the storage directory

Before KRaft mode can start, the storage directory must be formatted with the cluster UUID. This is analogous to initdb in PostgreSQL — it initializes the metadata storage.

sudo -u kafka /opt/kafka/bin/kafka-storage.sh format \
  --config /opt/kafka/config/kraft/server.properties \
  --cluster-id ${KAFKA_CLUSTER_UUID}

Expected output:

Formatting /var/lib/kafka/data with metadata.version x.x-IVx

If you see Directory /var/lib/kafka/data is already formatted, the directory was already initialized — this is fine if you’re rerunning the command intentionally.

Warning: Never re-format a directory that contains production data. The kafka-storage.sh format command overwrites the cluster metadata. Running it on a directory with existing messages will destroy all data.
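To make scripted installs safer, you can test for an existing format before calling kafka-storage.sh. This sketch relies on the fact that formatting writes a meta.properties file into each log directory; is_formatted is a hypothetical helper name.

```shell
# Report whether a Kafka log directory has already been formatted.
# kafka-storage.sh format writes a meta.properties file into each log directory.
is_formatted() {
  [ -f "$1/meta.properties" ]
}

# Usage:
#   if is_formatted /var/lib/kafka/data; then
#     echo "already formatted, skipping"
#   else
#     sudo -u kafka /opt/kafka/bin/kafka-storage.sh format \
#       --config /opt/kafka/config/kraft/server.properties \
#       --cluster-id "$KAFKA_CLUSTER_UUID"
#   fi
```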

Step 9 — Configure JVM settings and environment

Kafka’s JVM heap, GC algorithm, and performance flags have a significant impact on broker stability and throughput.

sudo tee /etc/kafka.env <<EOF
# Java home
JAVA_HOME=/usr/lib/jvm/java-17-openjdk

# Kafka home
KAFKA_HOME=/opt/kafka

# Kafka heap settings
# For development: 512m-1g. For production: 4g-6g
# DO NOT set above 6g — GC pauses dominate above this threshold
KAFKA_HEAP_OPTS="-Xms1g -Xmx1g"

# JVM performance flags
KAFKA_JVM_PERFORMANCE_OPTS="-server \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=20 \
  -XX:InitiatingHeapOccupancyPercent=35 \
  -XX:+ExplicitGCInvokesConcurrent \
  -XX:MaxInlineLevel=15 \
  -Djava.awt.headless=true"

# GC log settings — essential for diagnosing memory issues
KAFKA_GC_LOG_OPTS="-Xlog:gc*:file=/var/log/kafka/kafka-gc.log:time,tags:filecount=10,filesize=100m"

# Log4j settings
LOG_DIR=/var/log/kafka
KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:/opt/kafka/config/log4j.properties"
EOF

# Restrict permissions (environment file)
sudo chmod 640 /etc/kafka.env
sudo chown root:kafka /etc/kafka.env

Update the Log4j configuration to write to our log directory:

sudo sed -i 's|${kafka.logs.dir}|/var/log/kafka|g' /opt/kafka/config/log4j.properties
sudo chown kafka:kafka /opt/kafka/config/log4j.properties

Step 10 — Create the systemd service file

sudo tee /etc/systemd/system/kafka.service <<EOF
[Unit]
Description=Apache Kafka Broker (KRaft Mode)
Documentation=https://kafka.apache.org/documentation/
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=kafka
Group=kafka

# Load environment variables
EnvironmentFile=/etc/kafka.env

# Working directory
WorkingDirectory=/opt/kafka

# Start command
ExecStart=/opt/kafka/bin/kafka-server-start.sh \
  /opt/kafka/config/kraft/server.properties

# Stop command — graceful shutdown
ExecStop=/opt/kafka/bin/kafka-server-stop.sh

# Restart on failure
Restart=on-failure
RestartSec=10s

# Give Kafka time to shut down gracefully (important for large log segments)
TimeoutStopSec=120

# File descriptor limits — Kafka opens many log segment files
LimitNOFILE=100000

# Standard output and error logging
StandardOutput=append:/var/log/kafka/kafka-server.log
StandardError=append:/var/log/kafka/kafka-server-error.log

[Install]
WantedBy=multi-user.target
EOF

Key service file settings explained:

| Setting | Purpose |
|---|---|
| EnvironmentFile=/etc/kafka.env | Loads JVM heap and performance settings |
| Restart=on-failure | Automatically recovers from crashes |
| RestartSec=10s | Wait 10 seconds before restart; prevents rapid crash loops |
| TimeoutStopSec=120 | Allows 2 minutes for graceful shutdown while Kafka flushes pending data |
| LimitNOFILE=100000 | Each topic partition has open file handles; the default limit (1024) is too low |

LimitNOFILE=100000 is not optional. Kafka opens file descriptors for every log segment across every partition. With hundreds of topics and partitions, you’ll hit the default OS limit of 1024 very quickly — causing “Too many open files” errors that are difficult to diagnose. Always set this.

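Also raise the per-user limit for the kafka account. The systemd service takes its limit from LimitNOFILE; this file covers PAM login sessions, for example when you run Kafka’s command-line tools as the kafka user: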

sudo tee /etc/security/limits.d/kafka.conf <<EOF
kafka soft nofile 100000
kafka hard nofile 100000
EOF
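Once the broker is running (Step 11), you can confirm which limit it actually received by reading its entry under /proc. A minimal sketch: soft_nofile_limit is a hypothetical helper that parses the contents of a /proc/<pid>/limits file passed on stdin.

```shell
# Print the soft "Max open files" limit from /proc/<pid>/limits content on stdin.
# The line has the form: Max open files  <soft>  <hard>  files
soft_nofile_limit() {
  awk '/^Max open files/ { print $4 }'
}

# Usage: soft_nofile_limit < "/proc/$(systemctl show -p MainPID --value kafka)/limits"
```

A value of 100000 confirms that the LimitNOFILE setting from the service file is in effect.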

Step 11 — Start and enable Kafka

sudo systemctl daemon-reload
sudo systemctl start kafka
sudo systemctl enable kafka
sudo systemctl status kafka

Run them in this exact order. daemon-reload picks up the new service file. start launches the broker now. enable ensures Kafka starts automatically after reboots.

In the status output, you’re looking for:

Active: active (running)

If you see Active: failed instead, check the logs immediately:

# systemd journal
sudo journalctl -u kafka -n 100

# Kafka-specific logs
sudo tail -100 /var/log/kafka/kafka-server.log
sudo tail -50 /var/log/kafka/kafka-server-error.log

Wait 20–30 seconds after starting, then verify the broker is ready:

# Check the broker is listening on port 9092
sudo ss -tulnp | grep 9092

You should see a process listening on 0.0.0.0:9092.

# Check cluster metadata (KRaft)
/opt/kafka/bin/kafka-metadata-quorum.sh \
  --bootstrap-server localhost:9092 \
  describe --status

You should see a LeaderId matching this node’s node.id (1) and the node listed under CurrentVoters.
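Instead of waiting a fixed 20–30 seconds, a script can poll until the broker answers. The sketch below is a generic retry loop; retry_until is a hypothetical helper, and the usage line assumes the kafka-broker-api-versions.sh tool that ships in Kafka’s bin directory.

```shell
# Run a command once per second until it succeeds or the timeout (seconds) expires.
# Returns 0 on success, 1 on timeout. Command output is discarded.
retry_until() {
  timeout=$1; shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    elapsed=$((elapsed + 1))
    [ "$elapsed" -ge "$timeout" ] && return 1
    sleep 1
  done
}

# Usage: retry_until 60 /opt/kafka/bin/kafka-broker-api-versions.sh \
#          --bootstrap-server localhost:9092 && echo "broker is ready"
```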

Step 12 — Open the firewall

# Broker port — producers and consumers connect here
sudo firewall-cmd --permanent --add-port=9092/tcp

# Controller port — KRaft inter-broker controller communication
sudo firewall-cmd --permanent --add-port=9093/tcp

# Apply changes
sudo firewall-cmd --reload

Verify the rules were applied:

sudo firewall-cmd --list-ports

You should see both 9092/tcp and 9093/tcp.

Security consideration: Port 9093 (controller) handles internal cluster coordination. For single-broker setups, you can restrict 9093 to localhost only. For multi-broker clusters, 9093 must be reachable between all broker nodes but does not need to be accessible to clients.
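For a multi-broker cluster, one way to limit port 9093 to the broker network is a firewalld rich rule with a source address in place of the plain port rule. This is a sketch under assumptions: controller_rich_rule is a hypothetical helper, and the 10.0.0.0/24 network in the usage lines is an example you must replace with your own broker subnet.

```shell
# Print a firewalld rich rule that accepts TCP traffic to a port from one
# IPv4 source network. The port defaults to 9093 (the controller listener).
controller_rich_rule() {
  cidr=$1 port=${2:-9093}
  printf 'rule family="ipv4" source address="%s" port port="%s" protocol="tcp" accept\n' "$cidr" "$port"
}

# Usage (example subnet):
#   sudo firewall-cmd --permanent --remove-port=9093/tcp
#   sudo firewall-cmd --permanent --add-rich-rule="$(controller_rich_rule 10.0.0.0/24)"
#   sudo firewall-cmd --reload
```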


Create and Verify Topics

With Kafka running, verify the complete pipeline by creating a topic, producing messages, and consuming them.

Create a test topic

/opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --create \
  --topic test-topic \
  --partitions 3 \
  --replication-factor 1

Expected output: Created topic test-topic.

List all topics:

/opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --list

Describe the topic (see partition and replication details):

/opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --describe \
  --topic test-topic

You should see 3 partitions, each with Leader assigned and Isr (in-sync replicas) showing your broker ID.

Produce messages

Open a terminal and start a producer session:

/opt/kafka/bin/kafka-console-producer.sh \
  --bootstrap-server localhost:9092 \
  --topic test-topic

Type messages and press Enter after each one:

> Hello from Apache Kafka on RHEL!
> This is message 2
> Installation complete and working

Press Ctrl+C to exit the producer.

Consume messages

In a second terminal, start a consumer:

/opt/kafka/bin/kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic test-topic \
  --from-beginning

You should see all three messages you produced:

Hello from Apache Kafka on RHEL!
This is message 2
Installation complete and working

Press Ctrl+C to exit the consumer.

--from-beginning starts reading at the earliest offset still retained in each partition (offset 0 on a fresh topic). Without this flag, a consumer with no committed offsets starts at the latest offset and only sees messages written after it connects. This is a common source of confusion for new Kafka users.

Verify consumer groups

Consumer groups allow multiple consumers to share partition processing. Check group offset tracking:

# List all consumer groups
/opt/kafka/bin/kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 \
  --list

# Describe a specific group
# Replace console-consumer-xxxxx with a group name printed by the --list command above
/opt/kafka/bin/kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 \
  --describe \
  --group console-consumer-xxxxx
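The describe output reports how far each partition’s consumer is behind the end of the log. To reduce that to a single number, you can sum the lag column. This is a sketch that assumes the output has a header row containing a column named LAG; total_lag is a hypothetical helper, and rows without a numeric lag (shown as -) are skipped.

```shell
# Sum the LAG column from `kafka-consumer-groups.sh --describe` output on stdin.
# The column is located by its header name, so the column order does not matter.
total_lag() {
  awk '
    !col { for (i = 1; i <= NF; i++) if ($i == "LAG") col = i; next }
    $col ~ /^[0-9]+$/ { sum += $col }
    END { print sum + 0 }
  '
}

# Usage: /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
#          --describe --group console-consumer-xxxxx | total_lag
```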

Delete the test topic when done:

/opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --delete \
  --topic test-topic

Create a Production Topic

For real workloads, create topics with production settings:

# Example: application event stream with 6 partitions
/opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --create \
  --topic app-events \
  --partitions 6 \
  --replication-factor 1 \
  --config retention.ms=604800000 \
  --config retention.bytes=10737418240 \
  --config cleanup.policy=delete \
  --config min.insync.replicas=1 \
  --config compression.type=lz4

Topic configuration options explained:

| Config | Value | Purpose |
|---|---|---|
| retention.ms | 604800000 (7 days) | How long to keep messages |
| retention.bytes | 10737418240 (10 GB) | Max total size per partition |
| cleanup.policy | delete | Delete old messages (vs compact for log compaction) |
| min.insync.replicas | 1 | Minimum replicas that must acknowledge a write |
| compression.type | lz4 | Compress messages to reduce disk and network usage |
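The raw millisecond and byte values are easier to get right when they are computed rather than typed. The helpers below are a sketch with hypothetical names. They reproduce the two values used above and give a rough ceiling on disk use for a topic, assuming retention.bytes applies per partition and ignoring the active segment, which is never deleted.

```shell
# Convert days to milliseconds (for retention.ms).
days_to_ms()   { echo $(( $1 * 24 * 60 * 60 * 1000 )); }

# Convert GiB to bytes (for retention.bytes).
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }

# Rough ceiling on disk for one topic:
# partitions x retention.bytes x replication factor.
topic_max_bytes() { echo $(( $1 * $2 * $3 )); }

# Usage: --config retention.ms="$(days_to_ms 7)" --config retention.bytes="$(gib_to_bytes 10)"
```

For the app-events topic above (6 partitions, 10 GiB each, replication factor 1), topic_max_bytes 6 "$(gib_to_bytes 10)" 1 gives roughly 60 GiB.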

Performance Tuning

OS-level tuning for Kafka

These OS parameters significantly impact Kafka throughput and latency:

sudo tee /etc/sysctl.d/kafka-tuning.conf <<EOF
# Virtual memory — Kafka relies heavily on page cache
vm.swappiness=1
vm.dirty_ratio=80
vm.dirty_background_ratio=5

# Network buffer sizes — critical for high-throughput Kafka
net.core.wmem_default=131072
net.core.rmem_default=131072
net.core.wmem_max=2097152
net.core.rmem_max=2097152

# TCP settings for Kafka network performance
net.ipv4.tcp_wmem=4096 65536 2048000
net.ipv4.tcp_rmem=4096 65536 2048000

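# File descriptors: the system-wide ceiling must stay above the LimitNOFILE
# value used for Kafka (100000). RHEL 8/9 defaults are normally far higher,
# so check the output of sysctl fs.file-max first and uncomment the line
# below only if your current value is lower.
# fs.file-max=1000000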
EOF

sudo sysctl --system

Disable swap for Kafka (production):

# Check current swap
free -h

# Disable swap temporarily
sudo swapoff -a

# Disable swap permanently (comment out active swap entries in /etc/fstab, keeping a backup)
sudo sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab

Why disable swap for Kafka? Kafka is heavily page-cache-dependent. If the kernel starts swapping, Kafka’s performance degrades catastrophically — not gradually. A Kafka broker that starts swapping is effectively dead from a latency perspective. It’s better to let the broker OOM-kill than to let it limp along on swap.

Mount data directory with optimal options

If /var/lib/kafka/ is on a dedicated disk:

# Add to /etc/fstab for the Kafka data disk:
# /dev/sdb1  /var/lib/kafka  ext4  defaults,noatime,nodiratime  0 2

The noatime option prevents the kernel from updating file access timestamps on every read, which removes a metadata write from each read of Kafka’s log segments. On Linux, noatime already implies nodiratime, so listing both is harmless.

SELinux Considerations

If getenforce returns Enforcing, Kafka may encounter issues binding to ports, writing to data directories, or spawning child processes.

Check for SELinux denials:

sudo ausearch -m avc -ts recent | grep kafka

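Confirm which SELinux domain the broker runs in:

ps -eZ | grep java

A service started by systemd from /opt normally runs as unconfined_service_t, which is not restricted from binding to ports 9092 and 9093, so no port labeling is needed. Do not assign these ports to an unrelated type such as syslogd_port_t. If you run Kafka under a confined domain, check the current port labels before changing anything:

sudo semanage port -l | grep -E '9092|9093'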

Restore the correct SELinux context on Kafka directories:

sudo restorecon -Rv /opt/kafka/
sudo restorecon -Rv /var/lib/kafka/
sudo restorecon -Rv /var/log/kafka/

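Handle any remaining denials with a local policy module:

The httpd_can_network_connect boolean applies only to the httpd_t domain and has no effect on Kafka. If ausearch reports denials for the broker, generate a local policy module from them, review the generated kafka_local.te file, and then load it (audit2allow and semanage come from the policycoreutils-python-utils package):

sudo ausearch -c java --raw | audit2allow -M kafka_local
sudo semodule -i kafka_local.pp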

ZooKeeper Mode — Reference Installation

If your environment requires ZooKeeper mode (e.g., connecting to an existing ZooKeeper cluster, or using a Kafka version before 3.3):

Install and configure ZooKeeper

Kafka ships with a bundled ZooKeeper for development. For production, run a separate ZooKeeper ensemble (3 or 5 nodes).

# Use the bundled ZooKeeper for single-node development
sudo tee /opt/kafka/config/zookeeper.properties <<EOF
dataDir=/var/lib/kafka/zookeeper
clientPort=2181
maxClientCnxns=60
admin.enableServer=false
EOF

# Create the ZooKeeper data directory
sudo -u kafka mkdir -p /var/lib/kafka/zookeeper

Create ZooKeeper systemd service:

sudo tee /etc/systemd/system/zookeeper.service <<EOF
[Unit]
Description=Apache ZooKeeper (Kafka dependency)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=kafka
Group=kafka
EnvironmentFile=/etc/kafka.env
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh \
  /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-failure
RestartSec=10s
LimitNOFILE=100000
StandardOutput=append:/var/log/kafka/zookeeper.log
StandardError=append:/var/log/kafka/zookeeper-error.log

[Install]
WantedBy=multi-user.target
EOF

Configure Kafka for ZooKeeper mode:

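Point ExecStart in /etc/systemd/system/kafka.service at /opt/kafka/config/server.properties instead of the KRaft file, then edit /opt/kafka/config/server.properties and set: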

broker.id=0
zookeeper.connect=localhost:2181
log.dirs=/var/lib/kafka/data

Start ZooKeeper first, then Kafka:

sudo systemctl daemon-reload
sudo systemctl start zookeeper
sudo systemctl enable zookeeper

# Wait 10 seconds for ZooKeeper to initialize
sleep 10

sudo systemctl start kafka
sudo systemctl enable kafka

Open ZooKeeper port if needed:

sudo firewall-cmd --permanent --add-port=2181/tcp
sudo firewall-cmd --reload

Quick Troubleshooting Reference

Error: kafka.common.InconsistentClusterIdException on startup
Solution: The cluster UUID in the storage directory doesn’t match the config file. Either re-format (losing all data) or restore the original cluster UUID. Always save the UUID when setting up.

Error: “Too many open files” in Kafka logs
Solution: LimitNOFILE isn’t applied or the system limit overrides it. Verify: cat /proc/$(pgrep -f kafka)/limits | grep 'open files'. Should show 100000. If not, check /etc/security/limits.d/kafka.conf and restart.

Error: Connection refused on port 9092 from remote clients
Solution: Either
