Advanced Hadoop - Hortonworks

Course Code

BD71

Duration

2 Days

Participants should have completed the Hadoop Essentials course before taking this one. Familiarity with Java, Hadoop concepts, HBase basics, and general Hortonworks concepts is beneficial.
This course gives Hadoop developers a deep dive into Hadoop application development. Participants will learn how to design and develop programs using multiple ecosystem components, and how to analyze and perform computations on their Big Data.
This course is designed for developers and engineers who have Hadoop experience.

In this course, participants will:

  • Understand how NoSQL Databases work in detail
  • Look at Spark at a high level
  • Learn Scala programming basics
  • Understand the various streaming solutions available for Big Data
  • Get hands-on experience with Spark Streaming and Kafka

Module 1: Deep Dive into HBase
HBase Architecture
HBase Data Model
Differentiate between HBase and RDBMS
HBase Schema Design Elements
Understand Schema Design Options for a Specific Use Case
Hot Spotting and its Solution
Table Shapes
Designing 1:N and M:M Relationships
Designing for Hierarchical Data
Designing for Inheritance Mapping Data
Schema Design
Import/Export Data
Data Operations
CRUD Operations and Bulk Loading HBase
Create Table using Trade Stocks Data
Create Table with Different Schema Design Options on Airline Data
Lab Exercises

Module 2: Spark
Introduction to Apache Spark
Apache Spark Architecture
Spark on Hadoop
Interact with Spark using Java, Python and Scala
Scala Programming Basics

  • Overview
  • Discussion of Functions
  • Lab Exercises

Load and Inspect Data on Spark RDD
Learning Different Modes of Running Spark Jobs
Understanding the Spark UI
Operations on RDD
Build a Simple Spark Application
Launching Spark Programs
Work with Pair RDD
Work with Apache Spark DataFrames, Datasets and Spark SQL
Build and Monitor Apache Spark Applications
Lab Exercises
Introduction to Big Data Streaming
Delve into Spark Streaming
Introduction to Apache Spark Data Pipelines
Create an Apache Spark Streaming Application
Introduction to Structured Streaming
Discussion of Real-World Examples
Lab Exercises

Module 3: Kafka
Introduction to Apache Kafka
What is a Messaging System?
Why Apache Kafka
Apache Kafka - Fundamentals & Architecture
Apache Kafka - Workflow
Apache Kafka - ZooKeeper Role
Apache Kafka - Basic Operations
Console Producer & Consumer
Apache Kafka - Simple Producer
Apache Kafka - Simple Consumer
Apache Kafka - Use Kafka Connect
Introduction to Kafka Streams
Kafka APIs
Lab Exercises
