Course Information
Course Name
CTAK: Cloudera Training for Apache Kafka
Exam code
CDP-3003
Duration
4 Days
Certification
Cloudera Data Operator
Overview
This four-day instructor-led course begins by introducing Apache Kafka, explaining its key concepts and architecture, and discussing several common use cases. Building on this foundation, you will learn how to plan a Kafka deployment, and then gain hands-on experience by installing and configuring your own cloud-based, multi-node cluster running Kafka on the Cloudera Data Platform (CDP).
You will then use this cluster in more than 20 hands-on exercises covering a range of essential skills. These start with creating Kafka topics, producers, and consumers, and continue through progressively more challenging aspects of Kafka operations and development, such as scalability, reliability, and performance problems. Throughout the course, you will learn and use Cloudera’s recommended tools for working with Kafka, including Cloudera Manager, Schema Registry, Streams Messaging Manager, and Cruise Control.
Audience Profile
This course is designed for system administrators, data engineers, and developers.
Prerequisites
All students are expected to have basic Linux experience, and basic proficiency with the Java programming language is recommended. No prior experience with Apache Kafka is necessary.
At Course Completion
By the end of this course, you will know how to:
Plan, deploy, and operate Kafka clusters
Create and manage topics
Develop producers and consumers
Use replication to improve fault tolerance
Use partitioning to improve scalability
Troubleshoot common problems and performance issues
Module 1: Kafka Overview
· High-Level Architecture
· Common Use Cases
· Cloudera’s Distribution of Apache Kafka
Module 2: Deploying Apache Kafka
· System Requirements and Dependencies
· Service Roles
· Planning Your Deployment
· Deploying Kafka Services
· Exercise: Preparing the Exercise Environment
· Exercise: Installing the Kafka Service with Cloudera Manager
· Exercise (optional): Create Metrics Dashboards
· Exercise (optional): Using the CM API
Module 3: Kafka Command Line Basics
· Create and Manage Topics
· Running Producers and Consumers
Module 4: Using Streams Messaging Manager (SMM)
· Streams Messaging Manager Overview
· Producers, Topics, and Consumers
· Data Explorer
· Brokers
· Topic Management
· Exercise: Managing Topics using the CLI
· Exercise: Connecting Producers and Consumers from the Command Line
Module 5: Kafka Java API Basics
· Overview of Kafka’s APIs
· Topic Management from the Java API
· Exercise (optional): Managing Kafka Topics Using the Java API
· Using Producers and Consumers from the Java API
· Exercise: Developing Producers and Consumers with the Java API
Module 6: Improving Availability through Replication
· Replication
· Exercise: Observing Downtime Due to Broker Failure
· Considerations for the Replication Factor
· Exercise: Adding Replicas to Improve Availability
Module 7: Improving Application Scalability
· Partitioning
· How Messages are Partitioned
· Exercise: Observing How Partitioning Affects Performance
· Consumer Groups
· Exercise: Implementing Consumer Groups
· Consumer Rebalancing
· Exercise: Using a Key to Control Partition Assignment
Module 8: Improving Application Reliability
· Delivery Semantics
· Demonstration (optional): ISRs vs. ACKs
· Producer Delivery
· Exercise: Idempotent Producer
· Transactions
· Exercise: Transactional Producers and Consumers
· Handling Consumer Failure
· Offset Management
· Exercise: Detecting and Suppressing Duplicate Messages
· Exercise: Handling Invalid Records
· Handling Producer Failure
Module 9: Analyzing Kafka Clusters with SMM
· End-to-End Latency
· Notifiers
· Alert Policies
· Use Cases
Module 10: Monitoring Kafka
· Monitoring Overview
· Monitoring using Cloudera Manager
· Charts and Reports in CM
· Monitoring Recommendations
· Metrics for Troubleshooting
· Diagnosing Service Failure
· Exercise: Monitoring Kafka
Module 11: Managing Kafka
· Managing Kafka Topic Storage
· Demonstration (optional): Message Retention Period
· Log Cleanup and Collection
· Rebalancing Partitions
· Cruise Control
· Exercise: Installing Cruise Control
· Exercise: Troubleshooting Kafka Topics
· Unclean Leader Election
· Exercise: Unclean Leader Election
· Adding and Removing Brokers
· Exercise: Adding and Removing Brokers
· Best Practices
Module 12: Message Structure, Format, and Versioning
· Message Structure
· Schema Registry
· Defining Schemas
· Schema Evolution and Versioning
· Schema Registry Client
· Exercise: Using an Avro Schema
Module 13: Improving Application Performance
· Message Size
· Batching
· Compression
· Exercise: Observing How Compression Affects Performance
Module 14: Improving Kafka Service Performance
· Performance Tuning Strategies for the Administrator
· Cluster Sizing
· Exercise: Planning Capacity Needed for a Use Case
Module 15: Securing the Kafka Cluster
· Encryption
· Authentication
· Authorization
· Auditing
All Cloudera certification courses are conducted by certified trainers from Iverson.
Digital Methods is the official training partner and assists with program consultation, registration, coordination, scheduling, and administrative arrangements.