Course Outline
Introduction
Overview of Spark Streaming Features and Architecture
- Supported data sources
- Core APIs
Preparing the Environment
- Dependencies
- Spark and streaming context
- Connecting to Kafka
Processing Messages
- Parsing inbound messages as JSON
- ETL processes
- Starting the streaming context
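
As a rough illustration of the parsing and ETL topics above, the sketch below decodes a message value as JSON and normalizes two fields. It uses only the Python standard library. The field names `user` and `amount` and the helper names `parse_message` and `transform` are assumptions made for this example, not part of the course material.

```python
import json


def parse_message(raw):
    """Decode one message value (bytes or str) into a dict.

    Returns None when the payload is not valid JSON or is not a JSON object,
    so that malformed messages can be filtered out downstream.
    """
    try:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        record = json.loads(raw)
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    return record if isinstance(record, dict) else None


def transform(record):
    """Minimal ETL step: normalize the hypothetical `user` and `amount` fields.

    Assumes `amount`, when present, is numeric or a numeric string.
    """
    return {
        "user": str(record.get("user", "")).strip().lower(),
        "amount": float(record.get("amount", 0)),
    }
```

In a PySpark DStream job, functions like these would typically be applied with `map` and `filter`, for example `stream.map(parse_message).filter(lambda r: r is not None).map(transform)`, where `stream` stands for a DStream of message values.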
Performing Windowed Stream Processing
- Slide interval
- Checkpoint delivery configuration
- Launching the environment
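
To make the relationship between window length and slide interval concrete, here is a simplified pure-Python model, not Spark code. It measures both values in batches rather than seconds, and the function name `sliding_windows` is an assumption for this example. In Spark Streaming the corresponding operation is `DStream.window(windowDuration, slideDuration)`, where both durations must be multiples of the batch interval.

```python
def sliding_windows(batches, window_len, slide):
    """Yield the merged contents of the most recent `window_len` batches,
    emitting one window every `slide` batches.

    `batches` is a list of lists, one inner list per micro-batch.
    Early windows hold fewer batches because the stream has just started.
    """
    for end in range(slide, len(batches) + 1, slide):
        start = max(0, end - window_len)
        yield [item for batch in batches[start:end] for item in batch]


# Six micro-batches, a window of 3 batches, sliding forward every 2 batches.
windows = list(sliding_windows([[1], [2], [3], [4], [5], [6]], window_len=3, slide=2))
```

With these inputs, consecutive windows overlap by one batch, because the window length exceeds the slide interval.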
Prototyping the Processing Code
- Connecting to a Kafka topic
- Retrieving JSON from a data source using Paw
- Variations and additional processing
Streaming the Code
- Job control variables
- Defining values to match
- Functions and conditions
Acquiring Stream Output
- Counters
- Kafka output (matched and non-matched)
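
The matching and output topics above can be sketched in plain Python as follows. The `status` field, the `MATCH_VALUES` set, and the function names are hypothetical and chosen only for illustration. In a real job the two result lists would be published to separate Kafka topics instead of being returned.

```python
from collections import Counter

# Hypothetical job control variable: the values that count as a match.
MATCH_VALUES = {"error", "timeout"}


def is_match(record):
    """Condition function: does the record's `status` field hold a value to match?"""
    return record.get("status") in MATCH_VALUES


def route(records):
    """Split records into matched and non-matched lists and count each group."""
    counters = Counter()
    matched, unmatched = [], []
    for record in records:
        if is_match(record):
            matched.append(record)
            counters["matched"] += 1
        else:
            unmatched.append(record)
            counters["unmatched"] += 1
    return matched, unmatched, counters


matched, unmatched, counters = route(
    [{"status": "error"}, {"status": "ok"}, {"status": "timeout"}]
)
```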
Troubleshooting
Summary and Conclusion
Requirements
- Experience with Python and Apache Kafka
- Familiarity with stream-processing platforms
Audience
- Data engineers
- Data scientists
- Programmers
Testimonials (5)
Engagement with the trainer, a number of relevant exercises and labs, and practical exams.
Salim - SICPA SA
Course - Administration of Kafka Message Queue
Interactive approach of the teacher: not a straight story, but acting on questions from the audience.
Rens - Canon Medical Informatics Europe B.V.
Course - Administration of Kafka Topic
The labs and the slides combine well with Jorge's knowledge and love for Kafka.
Willem - BMW SA
Course - Apache Kafka for Developers
Sufficient hands-on, trainer is knowledgeable
Chris Tan
Course - A Practical Introduction to Stream Processing
Great skills, examples, very good exercises