Demo of Apache Kafka on Heroku

Data processing pipeline architecture using Apache Kafka on Heroku

Get Started

Overview

This system consumes data from the Twitter Streaming API, transforms that data through a series of Heroku apps, and generates a dynamic visualization of the results.

The architecture uses five Heroku apps, each serving a different role in the data pipeline.

  • Data Ingest: Reads from the Twitter Streaming API and produces messages to a Kafka topic built for high-volume ingest
  • Data Fanout: Consumes the ingested messages and fans them out to discrete per-keyword Kafka topics
  • Aggregate Statistic Calculation: Consumes messages from the keyword Kafka topics, calculates aggregate mention counts, and produces them to a topic
  • Related Terms Generation: Consumes messages from the keyword Kafka topics and produces related words and related-word counts to a topic
  • Visualization: Consumes messages from the aggregate and related-words Kafka topics and generates the dynamic stream visualizations in a web application
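
At a glance, data flows through the pipeline as sketched below. Topic names in brackets are illustrative: the per-keyword naming convention is described under Configuration Options (shown here with the example keyword 'dog'), and the ingest topic's actual name is set during deploy.

    Twitter Streaming API → Data Ingest → [ingest] → Data Fanout → [dog-keyword]

    [dog-keyword] → Aggregate Statistic Calculation → [dog-aggregate]    → Visualization
    [dog-keyword] → Related Terms Generation        → [dog-relatedwords] → Visualization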

The instructions below deploy the full system. See the Architecture section further below for a diagram of how data flows through it.

Prerequisites

Deploy

  1. Set up Data Ingest by following the instructions in the kafka-tweet-producer README. You will...
    • Deploy the code to Heroku
    • Configure the Heroku app
    • Create the Apache Kafka on Heroku cluster that will be used by all apps in the data processing pipeline
    • Create a Twitter App so you can connect to the Twitter Streaming API with valid credentials
  2. Set up Data Fanout by deploying the kafka-twitter-fanout repo to a Heroku app. Follow the instructions in the repo's README. You will...
    • Deploy the code to Heroku
    • Configure the Heroku app
    • Attach the previously created Kafka on Heroku add-on to this app (a CLI sketch follows this list)
  3. Set up Aggregate Statistics Calculation by deploying the kafka-twitter-aggregate repo to a Heroku app. Follow the instructions in the repo's README. You will...
    • Deploy the code to Heroku
    • Configure the Heroku app
    • Attach the previously created Kafka on Heroku add-on to this app
  4. Set up Related Terms Generation by deploying the kafka-twitter-relatedwords repo to a Heroku app. Follow the instructions in the repo's README. You will...
    • Deploy the code to Heroku
    • Configure the Heroku app
    • Attach the previously created Kafka on Heroku add-on to this app
  5. Set up the Visualization web app by deploying the kafka-demo repo to a Heroku app. Follow the instructions in the repo's README. You will...
    • Deploy the code to Heroku
    • Configure the Heroku app
    • Attach the previously created Kafka on Heroku add-on to this app
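
As a concrete sketch of the recurring add-on steps above, using the Heroku CLI: the cluster is provisioned once (on the Data Ingest app) and then attached to each downstream app. The app names, the basic-0 plan, and the generated add-on name shown here are examples; substitute your own values.

    # Provision the Kafka cluster once, on the Data Ingest app
    # (app names and the basic-0 plan are examples)
    heroku addons:create heroku-kafka:basic-0 -a my-kafka-ingest

    # Look up the add-on name Heroku generated (e.g. kafka-lively-12345)
    heroku addons -a my-kafka-ingest

    # Attach the same cluster to each of the other four apps
    heroku addons:attach kafka-lively-12345 -a my-kafka-fanout
    heroku addons:attach kafka-lively-12345 -a my-kafka-aggregate
    heroku addons:attach kafka-lively-12345 -a my-kafka-relatedwords
    heroku addons:attach kafka-lively-12345 -a my-kafka-visualization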

Configuration Options

Switch palette between Heroku and Salesforce colors

By default, the Visualization web app uses a Heroku-inspired color palette. To switch to a Salesforce-inspired palette, set the SALESFORCE_THEME environment variable to true.
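
For example, with the Heroku CLI (the app name is a placeholder for whatever you named your Visualization app):

    heroku config:set SALESFORCE_THEME=true -a my-kafka-visualization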

Change or add tracked keywords

The data visualization has been tested with up to five keywords and looks best with three to five. You may use more or fewer, but the visualization may not behave as expected.

  1. Create the necessary Kafka topics. This may be done with the Heroku CLI or from the Kafka on Heroku dashboard. Three topics are required for each keyword, named <keyword>-keyword, <keyword>-aggregate, and <keyword>-relatedwords (for example, dog-keyword, dog-aggregate, and dog-relatedwords if the keyword is 'dog'). A CLI sketch follows this list.
  2. Update the configuration of the Data Ingest app. The TWITTER_TRACK_TERMS environment variable must be a comma-separated list of all keywords you want to track, e.g. dog,cat,fish
  3. Update the configuration of the Data Fanout app in the same way, setting TWITTER_TRACK_TERMS to the same comma-separated keyword list.
  4. Update the Procfile for the Aggregate Statistics Calculation app to include a process for each keyword (see the note after this list).
  5. Update the Procfile for the Related Terms Generation app to include a process for each keyword.
  6. Update the configuration of the Visualization app in the same way as the Data Ingest and Data Fanout apps, setting TWITTER_TRACK_TERMS to the same comma-separated keyword list.
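
As a concrete sketch of steps 1-3 and 6, adding a keyword 'bird' alongside dog, cat, and fish: the app names below are placeholders, and kafka:topics:create (from the heroku-kafka CLI plugin) is shown with its defaults; see the plugin's help for partition and retention flags, or create the topics from the dashboard instead.

    # Create the three topics the pipeline expects for the new keyword
    heroku kafka:topics:create bird-keyword -a my-kafka-ingest
    heroku kafka:topics:create bird-aggregate -a my-kafka-ingest
    heroku kafka:topics:create bird-relatedwords -a my-kafka-ingest

    # Set the same keyword list on the Data Ingest, Data Fanout,
    # and Visualization apps
    heroku config:set TWITTER_TRACK_TERMS=dog,cat,fish,bird -a my-kafka-ingest
    heroku config:set TWITTER_TRACK_TERMS=dog,cat,fish,bird -a my-kafka-fanout
    heroku config:set TWITTER_TRACK_TERMS=dog,cat,fish,bird -a my-kafka-visualization

For steps 4 and 5, duplicate an existing process line in each app's Procfile and change only the keyword argument; the exact command depends on each repo, so copy whatever pattern its Procfile already uses.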

Architecture

In addition to showing how the system is architected, this diagram also shows how data moves through the system.

The Kafka cluster is represented by the large light purple rectangle. Within that, each named rectangle represents a Kafka topic. The hexagons are Heroku apps that manipulate data. They produce data to and/or consume data from Kafka topics.