Real-time Twitter Analysis

In this tutorial, we will use the Twitter Streaming API to get tweets in real time and publish them to a Kafka topic. We then use Spark as a consumer to read the Twitter data from Kafka and analyse it. To

Integrating Kafka with Spark using Structured Streaming

In the previous tutorial (Integrating Kafka with Spark using DStream), we learned how to integrate Kafka with Spark using an older Spark API, Spark Streaming (DStream). In this tutorial, we will use a newer API of Spark,

A WordCount program

In this tutorial, we will write a WordCount program that counts the occurrences of each word in a stream of data received from a data server. We will use Netcat to simulate the data server and the WordCount program will use

Spark Structured Streaming

In this tutorial series, we will learn how to use Structured Streaming, Spark’s stream processing engine built on Spark SQL. In Structured Streaming, stream data is treated as an unbounded table where arriving data is like a new