Custom Spark Kafka consumer based on Kafka SimpleConsumer API.

zhiwuya/spark-kafka-streaming

 
 

spark-kafka-streaming

Custom Spark Kafka consumer based on Kafka SimpleConsumer API.

Features

  • discovers Kafka metadata from ZooKeeper (more reliable than querying brokers, and independent of broker list changes)
  • reads from multiple topics
  • reliably handles leader election and topic reassignment
  • saves offsets and stream metadata in HBase (more robust than ZooKeeper)
  • supports metrics via the Spark metrics mechanism (JMX, Graphite, etc.)
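
As a rough illustration of ZooKeeper-based metadata discovery, the sketch below resolves the leader broker of a partition from the znodes that Kafka brokers publish (`/brokers/ids/<id>` and `/brokers/topics/<topic>/partitions/<n>/state`). It is not taken from this repository: `LeaderLookup`, `Broker`, and `leaderFor` are hypothetical names, and a plain `Map[String, String]` stands in for a real ZooKeeper client.

```scala
// Hypothetical sketch: resolve partition leaders from ZooKeeper-style znodes.
// `znodes` is an in-memory stand-in for a ZooKeeper client; none of these
// names come from the spark-kafka-streaming sources.
object LeaderLookup {
  final case class Broker(id: Int, host: String, port: Int)

  // Paths under which Kafka brokers publish their metadata in ZooKeeper.
  def brokerPath(id: Int): String = s"/brokers/ids/$id"
  def statePath(topic: String, partition: Int): String =
    s"/brokers/topics/$topic/partitions/$partition/state"

  // Minimal field extraction so the sketch needs no JSON library.
  private def intField(json: String, name: String): Option[Int] =
    ("\"" + name + "\"\\s*:\\s*(-?\\d+)").r
      .findFirstMatchIn(json).map(_.group(1).toInt)

  private def stringField(json: String, name: String): Option[String] =
    ("\"" + name + "\"\\s*:\\s*\"([^\"]*)\"").r
      .findFirstMatchIn(json).map(_.group(1))

  // Returns None if the partition is unknown or currently has no leader (-1).
  def leaderFor(znodes: Map[String, String],
                topic: String,
                partition: Int): Option[Broker] =
    for {
      state  <- znodes.get(statePath(topic, partition))
      leader <- intField(state, "leader") if leader >= 0
      info   <- znodes.get(brokerPath(leader))
      host   <- stringField(info, "host")
      port   <- intField(info, "port")
    } yield Broker(leader, host, port)
}
```

With a lookup of this shape, a consumer that loses its connection can re-read the partition state znode and reconnect to whichever broker was elected leader, without relying on a static broker list.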

Todo

  • abstract offset storage
  • time-controlled offset commits
  • refactor the transformation of Kafka messages into RDD elements (flatmapper method)
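
One possible shape for the first two todo items is sketched below: a small storage trait that an HBase-backed (or any other) implementation could satisfy, plus a committer that writes buffered offsets at most once per interval. This is only an assumption about how such an abstraction might look; `OffsetStore`, `InMemoryOffsetStore`, and `TimedCommitter` are hypothetical names and do not exist in this repository.

```scala
// Hypothetical sketch of a pluggable offset store with time-based commits.
object OffsetCommit {
  final case class TopicPartition(topic: String, partition: Int)

  // Storage abstraction: an HBase- or ZooKeeper-backed class could implement it.
  trait OffsetStore {
    def load(): Map[TopicPartition, Long]
    def save(offsets: Map[TopicPartition, Long]): Unit
  }

  // In-memory implementation, useful for tests and for this illustration.
  final class InMemoryOffsetStore extends OffsetStore {
    private var stored = Map.empty[TopicPartition, Long]
    var saves = 0 // number of save() calls, exposed for inspection
    def load(): Map[TopicPartition, Long] = stored
    def save(offsets: Map[TopicPartition, Long]): Unit = {
      stored = stored ++ offsets
      saves += 1
    }
  }

  // Buffers processed offsets and writes them once the interval has elapsed.
  // The clock is injectable so that the timing can be tested deterministically.
  final class TimedCommitter(store: OffsetStore,
                             intervalMs: Long,
                             clock: () => Long = () => System.currentTimeMillis()) {
    private var pending = Map.empty[TopicPartition, Long]
    private var lastCommit = clock()

    def record(tp: TopicPartition, nextOffset: Long): Unit = {
      pending = pending.updated(tp, nextOffset)
      if (clock() - lastCommit >= intervalMs) flush()
    }

    // Writes any buffered offsets immediately and restarts the interval.
    def flush(): Unit = if (pending.nonEmpty) {
      store.save(pending)
      pending = Map.empty
      lastCommit = clock()
    }
  }
}
```

Committing on a timer trades a bounded amount of reprocessing after a failure for fewer writes to the backing store; calling `flush()` at shutdown keeps that window small.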

A usage example is in ./examples.

Languages

  • Scala 100.0%