In this tutorial, we will learn how to write a program that uses Akka Streams, a module of the Akka toolkit that addresses the problems of using plain Akka Actors for stream processing (e.g., buffers overflowing when messages arrive faster than they can be processed, or having to retransmit messages that were lost).

First, we add the Akka Streams dependency to the project's pom.xml file.
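
A minimal sketch of such a dependency entry, to be placed inside the <dependencies> element of pom.xml. The Scala suffix (_2.12) and the version (2.5.19) are assumptions and are not taken from the original tutorial; adjust both to match your project.

```xml
<!-- Akka Streams for Scala 2.12; suffix and version are assumed, not prescribed -->
<dependency>
  <groupId>com.typesafe.akka</groupId>
  <artifactId>akka-stream_2.12</artifactId>
  <version>2.5.19</version>
</dependency>
```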

Akka Streams consists of five main components [1] [2]:

– Source: has exactly one output stream. Source takes two type parameters: the type of the elements it emits and the type of its materialized value, an auxiliary value that becomes available when the stream is run (for example, information about a bound port or IP address). For example, Source[Int, NotUsed] emits elements of type Int, and NotUsed indicates that the source produces no meaningful auxiliary value.

– Sink: has exactly one input stream (it can apply backpressure to slow the producer down to the speed of the consumer)

– Flow: has exactly one input and one output stream (to process stream data)

– BidiFlow: behaves like two Flows running in opposite directions (two input streams and two output streams)

– Graph: defines the pathways through which elements shall flow when the stream is running.
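
To make the first three components concrete, here is a small sketch that connects a Source, a Flow and a Sink over plain integers. It assumes Akka 2.5.x (where ActorMaterializer is the usual way to run a stream); the object and value names are hypothetical and chosen only for illustration.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Sink, Source}

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object ComponentsSketch {
  // Source: one output, emits the integers 1 to 5.
  val numbers: Source[Int, NotUsed] = Source(1 to 5)

  // Flow: one input and one output, doubles every element.
  val double: Flow[Int, Int, NotUsed] = Flow[Int].map(_ * 2)

  // Sink: one input, collects everything it receives.
  // Its materialized value is a Future holding the collected elements.
  val collect: Sink[Int, Future[Seq[Int]]] = Sink.seq[Int]

  def main(args: Array[String]): Unit = {
    // Assumed Akka 2.5.x setup needed to run any stream.
    implicit val system: ActorSystem = ActorSystem("components-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()

    val result = Await.result(numbers.via(double).runWith(collect), 3.seconds)
    println(result.mkString(", ")) // prints "2, 4, 6, 8, 10"
    system.terminate()
  }
}
```

Note that nothing runs until runWith is called: the three values are only descriptions of the stream, which is why they can be defined before the ActorSystem exists.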

In this tutorial, we will use Source, Sink and Flow to write a program that extracts the authors and hashtags of Twitter tweets. First, we define three classes. We use the regex [^#\w] to match every character that is neither “#” nor a word character (letter, digit or underscore), such as @, & or $, and remove it from the hashtag. You can find more details about regex in the tutorial Regular expression.
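
A possible shape for these three classes is sketched below in plain Scala (no Akka needed yet). The class and field names (Author.handle, Hashtag.name, Tweet.timestamp, Tweet.body) and the sample tweet are assumptions made for illustration; only the regex comes from the text above.

```scala
object TweetModel {
  final case class Author(handle: String)
  final case class Hashtag(name: String)

  final case class Tweet(author: Author, timestamp: Long, body: String) {
    // Split the body on spaces, keep the tokens that start with '#',
    // and strip every character that is neither '#' nor a word character.
    def hashtags: Set[Hashtag] =
      body.split(" ").collect {
        case token if token.startsWith("#") =>
          Hashtag(token.replaceAll("[^#\\w]", ""))
      }.toSet
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample tweet.
    val tweet = Tweet(Author("alice"), System.currentTimeMillis, "#kafka** and #akka rock!")
    tweet.hashtags.foreach(println)
  }
}
```

Returning a Set means that a hashtag repeated inside one tweet is reported only once for that tweet.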

Next, we create an environment to run Akka Streams by defining an ActorSystem and a Materializer (the Materializer is the engine that actually runs a stream). We also create an output stream of tweets by using a Source.
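
The setup could look roughly like the sketch below. It assumes Akka 2.5.x, where ActorMaterializer() builds the Materializer from an implicit ActorSystem. The system name, the sample tweets and the trimmed-down classes are all made up for illustration.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Materializer}
import akka.stream.scaladsl.Source

object TweetSource {
  // Minimal stand-ins for the tutorial's classes; field names are assumptions.
  final case class Author(handle: String)
  final case class Tweet(author: Author, timestamp: Long, body: String)

  // The ActorSystem provides the actors and threads that execute the stream.
  implicit val system: ActorSystem = ActorSystem("reactive-tweets")

  // The Materializer turns a stream description into a running stream
  // (ActorMaterializer is the Akka 2.5.x API, an assumed version).
  implicit val materializer: Materializer = ActorMaterializer()

  // Made-up sample data; a Source built from a List emits each element once.
  val tweets: Source[Tweet, NotUsed] = Source(List(
    Tweet(Author("alice"), 1L, "#akka rocks!"),
    Tweet(Author("bob"), 2L, "trying #kafka** with #akka"),
    Tweet(Author("carol"), 3L, "only #kafka today")
  ))

  def main(args: Array[String]): Unit = {
    // Print every tweet, then shut the ActorSystem down so the JVM can exit.
    tweets.runForeach(println).onComplete(_ => system.terminate())(system.dispatcher)
  }
}
```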

In Akka, a stream always starts from a Source[Out, M1], passes through zero or more Flow[In, Out, M2] elements and is finally consumed by a Sink[In, M3] (M1, M2 and M3 are the types of the materialized values). We use this pattern to extract all hashtags from the stream of tweets.
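
One way to write this Source, Flow and Sink chain is sketched below, again assuming Akka 2.5.x. The classes, sample tweets and the names extractHashtags and printAll are hypothetical; the original code may have used different operators.

```scala
import akka.{Done, NotUsed}
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Materializer}
import akka.stream.scaladsl.{Flow, Sink, Source}

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object HashtagPipeline {
  // Assumed data model (field names are illustrative).
  final case class Author(handle: String)
  final case class Hashtag(name: String)
  final case class Tweet(author: Author, timestamp: Long, body: String) {
    def hashtags: Set[Hashtag] =
      body.split(" ").collect {
        case t if t.startsWith("#") => Hashtag(t.replaceAll("[^#\\w]", ""))
      }.toSet
  }

  // Source: made-up sample tweets.
  val tweets: Source[Tweet, NotUsed] = Source(List(
    Tweet(Author("alice"), 1L, "#akka rocks!"),
    Tweet(Author("bob"), 2L, "trying #kafka** with #akka")
  ))

  // Flow: one Tweet in, zero or more Hashtags out.
  val extractHashtags: Flow[Tweet, Hashtag, NotUsed] =
    Flow[Tweet].mapConcat(_.hashtags.toList)

  // Sink: prints each hashtag; its Future completes when the stream ends.
  val printAll: Sink[Hashtag, Future[Done]] = Sink.foreach[Hashtag](println)

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("reactive-tweets")
    implicit val materializer: Materializer = ActorMaterializer()

    // Source -> Flow -> Sink; runWith materializes and starts the stream.
    Await.result(tweets.via(extractHashtags).runWith(printAll), 3.seconds)
    system.terminate()
  }
}
```

In this sketch a hashtag that appears in several tweets is printed once per tweet; deduplicating across the whole stream would need an extra step.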

Running the code, we get the following results (the original hashtag “#kafka**” has been shortened to “#kafka” with the help of the regex explained above):

Similarly, we can list all authors whose tweets contain the hashtag #akka.
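
A sketch of that author listing follows, under the same assumptions as before (Akka 2.5.x, illustrative class names and made-up tweets). It uses filter to keep only tweets tagged #akka and map to project each tweet onto its author.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Materializer}
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object AkkaAuthors {
  // Assumed data model (field names are illustrative).
  final case class Author(handle: String)
  final case class Hashtag(name: String)
  final case class Tweet(author: Author, timestamp: Long, body: String) {
    def hashtags: Set[Hashtag] =
      body.split(" ").collect {
        case t if t.startsWith("#") => Hashtag(t.replaceAll("[^#\\w]", ""))
      }.toSet
  }

  val akkaTag: Hashtag = Hashtag("#akka")

  // Made-up sample tweets.
  val tweets: Source[Tweet, NotUsed] = Source(List(
    Tweet(Author("alice"), 1L, "#akka rocks!"),
    Tweet(Author("bob"), 2L, "trying #kafka** with #akka"),
    Tweet(Author("carol"), 3L, "only #kafka today")
  ))

  // Keep the tweets tagged #akka, then keep only their authors.
  val authors: Source[Author, NotUsed] =
    tweets.filter(_.hashtags.contains(akkaTag)).map(_.author)

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("reactive-tweets")
    implicit val materializer: Materializer = ActorMaterializer()

    Await.result(authors.runWith(Sink.foreach[Author](println)), 3.seconds)
    system.terminate()
  }
}
```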

So, we have finished writing a basic program that demonstrates how Akka Streams works. The full code of this tutorial is provided below.




January 30, 2019