In the previous tutorial (A basic Kafka program), we wrote two Java classes: SimpleProducer and SimpleConsumer. SimpleProducer is a Kafka Producer that sends messages (the numbers 0 to 9) to the ‘test’ topic, and SimpleConsumer is a Kafka Consumer that subscribes to the ‘test’ topic. In this tutorial, we will write a program that uses Kafka to publish Twitter data by combining the code of SimpleProducer with the code from the tutorial Analyzing Posts on Twitter.

First, we create a Kafka topic named TwitterData with the following command (see Install Apache Kafka for more details):

Next, we create a Java class named KafkaTwitter and add the twitter4j dependency to the pom.xml file (as described in the tutorial Analyzing Posts on Twitter). We also reuse the getTweets(String topic) method to collect data about a particular topic as follows:

In the main method of the KafkaTwitter class, we create a Producer with the same properties as in SimpleProducer and set the Kafka topic to TwitterData:
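
As a rough illustration of this step, the sketch below builds a producer configuration with plain java.util.Properties so that it runs without Kafka on the classpath. The broker address localhost:9092, the acks setting, and the class and method names are assumptions made for the example; the actual values should match whatever SimpleProducer used.

```java
import java.util.Properties;

class ProducerConfigSketch {

    // Builds the configuration a Kafka producer would be created from.
    // Assumed for illustration: String keys and values, and acks=all.
    static Properties buildProducerProperties(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("acks", "all");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Assumed: a single broker running locally on the default port.
        Properties props = buildProducerProperties("localhost:9092");
        String topicName = "TwitterData";

        // With kafka-clients on the classpath, the producer would be created as:
        // Producer<String, String> producer = new KafkaProducer<>(props);
        System.out.println("Topic: " + topicName);
        props.stringPropertyNames().stream().sorted()
                .forEach(k -> System.out.println(k + "=" + props.getProperty(k)));
    }
}
```

Keeping the configuration in its own method makes it easy to share the same settings between SimpleProducer and KafkaTwitter.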

Next, we use the getTweets(String topic) method to collect tweets related to “Natural Language Processing” as follows:

Finally, we publish the tweets to Kafka’s TwitterData topic as follows:
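
A minimal sketch of this publishing step is given here, under a few assumptions: each tweet is sent as one message, the tweet's index in the list serves as the message key (mirroring how SimpleProducer keyed its numbers), and the producer call is replaced by a hypothetical stand-in so the example runs without a broker. In the real program the stand-in would be a call such as producer.send(new ProducerRecord<>("TwitterData", key, value)), followed by producer.close() once the loop ends. The tweet texts are placeholders, not real data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;

class PublishTweetsSketch {

    // Hands every tweet to `send` as a (key, value) pair and returns the count.
    // `send` is a hypothetical stand-in for the Kafka producer's send call.
    static int publishAll(List<String> tweets, BiConsumer<String, String> send) {
        int sent = 0;
        for (int i = 0; i < tweets.size(); i++) {
            // Key: position of the tweet in the list; value: the tweet text.
            send.accept(Integer.toString(i), tweets.get(i));
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) {
        // Placeholder texts standing in for the result of getTweets(...).
        List<String> tweets = Arrays.asList("placeholder tweet one",
                                            "placeholder tweet two");
        List<String> log = new ArrayList<>();

        // Record what would be sent instead of contacting a broker.
        int sent = publishAll(tweets,
                (key, value) -> log.add("TwitterData <- [" + key + "] " + value));

        log.forEach(System.out::println);
        System.out.println(sent + " messages handed to the sender");
    }
}
```

Passing the send operation in as a parameter keeps the loop easy to test; swapping in the real producer only changes the lambda.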

After running the program (press Ctrl+Shift+F10), we get the message “Message sent successfully”, indicating that the program has run successfully.

To check the received tweets, we open a Consumer terminal with the following command:

The tweets published to Kafka’s TwitterData topic will be displayed as follows:

So, we have finished writing a program that uses Kafka to publish tweets related to “Natural Language Processing” to a Kafka topic. The full code of this tutorial is provided below.

December 5, 2018
ITechSeeker