Installing and Starting the Apache Kafka Event Streaming Platform
Apache Kafka is used for building real-time data pipelines and streaming applications. It is horizontally scalable, fault-tolerant, extremely fast, and runs in production at thousands of companies.
- Publish and subscribe: read and write streams of data like a messaging system.
- Process: write scalable stream-processing applications that react to events in real time.
- Store: store streams of data safely in a distributed, replicated, fault-tolerant cluster.
Kafka is:
- Global
- Real-time
- Event-oriented

It powers streaming platforms in industries such as:
- Automotive
- Financial banking (Global bank of Canada)
- Retail
LET’S BEGIN WITH KAFKA:
1. Install Java (version 8 or higher).
2. Download the Apache Kafka binaries from https://kafka.apache.org/downloads (latest binary download).
3. Unzip/extract the downloaded archive at your preferred location.
4. Open the extracted Kafka folder and create a folder named ‘data’ inside it.
5. Inside the ‘data’ folder, create two more empty folders, ‘kafka’ and ‘zookeeper’.
6. Update the ZooKeeper data directory path (the ‘data/zookeeper’ folder created earlier) in the `dataDir` property of the “config/zookeeper.properties” configuration file inside the Kafka installation folder.
7. Update the Kafka log directory path (the ‘data/kafka’ folder) in the `log.dirs` property of the “config/server.properties” configuration file inside the Kafka installation folder.
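Assuming Kafka was extracted to D:\kafka_2.12-2.5.0 (the path used in the commands below; adjust to your own location), the two edits would look roughly like this. Note that forward slashes are the safest choice in these properties files on Windows, since a backslash is treated as an escape character in Java properties files:

```
# config/zookeeper.properties
dataDir=D:/kafka_2.12-2.5.0/data/zookeeper

# config/server.properties
log.dirs=D:/kafka_2.12-2.5.0/data/kafka
```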
8. Open a command prompt in the ‘bin\windows’ folder of Kafka and start ZooKeeper with the command:
D:\kafka_2.12-2.5.0\bin\windows>
zookeeper-server-start.bat ../../config/zookeeper.properties
9. Open a new command prompt, again in the ‘bin\windows’ folder of Kafka, and start the Kafka broker with the command:
D:\kafka_2.12-2.5.0\bin\windows>
kafka-server-start.bat ../../config/server.properties
Well, Apache Kafka and ZooKeeper are finally up and running. You can now start a producer and a consumer and move your data through the cluster. We will discuss the ‘how to’ later.
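As a quick smoke test of the setup above (assuming the broker is running on its default address, localhost:9092, and using a hypothetical topic name ‘test’), you can create a topic and exchange a few messages with the console producer and consumer scripts that ship in the same ‘bin\windows’ folder:

```
:: Create a topic named "test" (hypothetical name) with one partition and one replica.
kafka-topics.bat --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test

:: Start a console producer; each line you type is sent as a message.
kafka-console-producer.bat --bootstrap-server localhost:9092 --topic test

:: In another command prompt, start a console consumer to read the messages back.
kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic test --from-beginning
```

Messages typed into the producer window should appear in the consumer window, confirming the broker is working end to end.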
Here, ZooKeeper is required to start the Kafka cluster: it stores metadata, elects partition leaders among the brokers, and so on. We will discuss this briefly in the next blog.