Introduction to Confluent Kafka Python Producer
Apache Kafka is a publish-subscribe messaging system used for real-time streams of data. It lets you send and receive messages between various microservices. In this article, we will see how to send JSON messages using Python and the Confluent-Kafka library. JavaScript Object Notation (JSON) is a standard text-based format for representing structured data. It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers.
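As a quick illustration, a JSON payload for Kafka is typically built by serializing a Python dict with the standard json module and encoding it to bytes (the employee fields below are made-up sample data):

```python
import json

# A hypothetical employee record; Kafka messages are sent as bytes,
# so we serialize the dict to a JSON string and then encode it.
employee = {"empid": 101, "name": "Arjun", "dept": "HR"}

json_string = json.dumps(employee)
json_bytes = json_string.encode("utf-8")

print(json_string)                      # {"empid": 101, "name": "Arjun", "dept": "HR"}
print(json.loads(json_string)["dept"])  # HR — the string round-trips back to a dict
```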
Prerequisites:
- Good knowledge of Kafka basic concepts (e.g. Kafka topics, brokers, partitions, offsets, producers, consumers, etc.).
- Good knowledge of Python basics (pip install <package>, writing Python methods).
Solution:
A Kafka Python producer has different syntax and behavior depending on the Kafka library we use, so the first step is choosing the right Kafka library for our Python program.
Popular Kafka Libraries for Python:
While working on Kafka automation with Python, we have three popular library choices:
- PyKafka
- Kafka-python
- Confluent Kafka
Each of these libraries has its own pros and cons, so we will have to choose based on our project requirements.
Step 1: Choosing the right Kafka Library
If we are using Amazon MSK clusters, we can build our Kafka framework using PyKafka or Kafka-python (both are open source and among the most popular libraries for Apache Kafka). If we are using Confluent Kafka clusters, we have to use the Confluent Kafka library, as it gives us library support for Confluent-specific features like ksqlDB, REST Proxy, and Schema Registry.
We will use the Confluent Kafka library for our Python Kafka producer, as it can handle both Apache Kafka clusters and Confluent Kafka clusters.
We need Python 3.x and pip already installed. We can execute the below command to install the library on our system.
pip install confluent-kafka
Step 2: Kafka Authentication Setup
Unlike most Kafka Python tutorials available on the Internet, we will not work on localhost. Instead, we will connect to a remote Kafka cluster with SSL authentication. To connect to a Kafka cluster, we generally get one JKS file and a password for it from the Infra Support team. This JKS file works fine with Java/Spring, but not with Python.
So our job is to convert this JKS file into the format expected by the Python Kafka library.
For the Confluent Kafka library, we need to convert the JKS file into PKCS12 format in order to connect to remote Kafka clusters.
To learn more visit the below pages:
- How to convert JKS to PKCS12?
- How to receive messages using Confluent Kafka Python Consumer
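As a sketch, this conversion is typically done with the JDK's keytool utility; the file names and passwords below are placeholders for your own values:

```shell
# Convert the JKS keystore to PKCS12 format
# (kafka.jks, certkey.p12, and yourjksPassword are placeholders)
keytool -importkeystore \
  -srckeystore kafka.jks -srcstoretype JKS -srcstorepass yourjksPassword \
  -destkeystore certkey.p12 -deststoretype PKCS12 -deststorepass yourjksPassword
```

The resulting certkey.p12 file is what we will point the producer's ssl.keystore.location setting at in the next step.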
Step 3: Confluent Kafka Python Producer with SSL Authentication
We will use the same PKCS12 file that was generated during the JKS-to-PKCS12 conversion step mentioned above.
Python3
import json
from uuid import uuid4

from confluent_kafka import Producer

# Placeholder JSON payloads (the original sample records were omitted;
# replace these with your own data)
jsonString1 = json.dumps({"empid": 101, "name": "Arjun", "dept": "HR"})
jsonString2 = json.dumps({"empid": 102, "name": "Meera", "dept": "Finance"})
jsonString3 = json.dumps({"empid": 103, "name": "Ravi", "dept": "IT"})

# Kafka message values are bytes, so encode the JSON strings
jsonv1 = jsonString1.encode()
jsonv2 = jsonString2.encode()
jsonv3 = jsonString3.encode()


def delivery_report(errmsg, msg):
    """Delivery callback: reports the success or failure of each message.
    Triggered by poll() or flush()."""
    if errmsg is not None:
        print("Delivery failed for Message: {} : {}".format(msg.key(), errmsg))
        return
    print('Message: {} successfully produced to Topic: {} Partition: [{}] at offset {}'.format(
        msg.key(), msg.topic(), msg.partition(), msg.offset()))


kafka_topic_name = "kf.topic.empdev"
mysecret = "yourjksPassword"

print("Starting Kafka Producer")
conf = {
    'bootstrap.servers': 'm1.msk.us-east.aws.com:9094,m2.msk.us-east.aws.com:9094,m3.msk.us-east.aws.com:9094',
    'security.protocol': 'SSL',
    'ssl.keystore.password': mysecret,
    'ssl.keystore.location': './certkey.p12'
}

print("connecting to Kafka topic...")
producer1 = Producer(conf)

# Serve delivery callbacks from previous produce() calls, if any
producer1.poll(0)

try:
    producer1.produce(topic=kafka_topic_name, key=str(uuid4()), value=jsonv1, on_delivery=delivery_report)
    producer1.produce(topic=kafka_topic_name, key=str(uuid4()), value=jsonv2, on_delivery=delivery_report)
    producer1.produce(topic=kafka_topic_name, key=str(uuid4()), value=jsonv3, on_delivery=delivery_report)

    # Wait for all outstanding messages and delivery callbacks to complete
    producer1.flush()
except Exception as ex:
    print("Exception happened :", ex)

print("\nStopping Kafka Producer")
Sample output of the above code:
Starting Kafka Producer
connecting to Kafka topic...
Message: b'4acef7b3-dx55-5f89-b69r-18b3188f919z' successfully produced to Topic: kf.topic.empdev Partition: [1] at offset 43211
Message: b'98xff6y4-crl5-gfgx-dq1r-k3z5122h611v' successfully produced to Topic: kf.topic.empdev Partition: [2] at offset 43210
Message: b'rus3v9xx-0bd9-astn-mrtn-yyz1920evl6r' successfully produced to Topic: kf.topic.empdev Partition: [0] at offset 43211
Stopping Kafka Producer
Conclusion:
We now have some idea of how to publish JSON messages to a Kafka topic using Python. We can extend this code as per our project needs and continue developing our Kafka automation framework. We can also route messages to a specific Kafka partition based on some condition, instead of distributing them evenly across all partitions. To explore more on the Confluent Kafka Python library, we can visit: Confluent Docs
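The conditional-partition idea can be sketched with a small helper that deterministically maps a message key to a partition number, which could then be passed to produce() via its partition parameter. Note that choose_partition is a hypothetical helper written for this sketch, not part of the confluent-kafka API, and the partition count is an assumption about the topic:

```python
import zlib

NUM_PARTITIONS = 3  # assumed partition count of the target topic


def choose_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hypothetical helper: map a key to a partition using CRC32,
    so the same key always lands on the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# The same key is always routed to the same partition:
p1 = choose_partition("emp-101")
p2 = choose_partition("emp-101")
assert p1 == p2

# In the producer code above, this could be used as, e.g.:
# producer1.produce(topic=kafka_topic_name, key="emp-101", value=jsonv1,
#                   partition=choose_partition("emp-101"),
#                   on_delivery=delivery_report)
print("emp-101 ->", p1)
```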
Last Updated :
01 Dec, 2022