Advanced Hadoop - MapR

3 Days

Participants should complete the Hadoop Essentials course before taking this one. Familiarity with Java, Hadoop concepts, HBase basics and general MapR distribution concepts is beneficial.
This course gives Hadoop developers a deep dive into Hadoop application development. Participants will learn how to design and develop programs using multiple ecosystem components, and how to analyze and perform computations on their Big Data.
This course is designed for developers and engineers who have Hadoop experience.

In this course, participants will:

  • Understand how NoSQL Databases work (HBase and MapR-DB) in detail.
  • Explore Spark at a high level.
  • Learn Scala programming basics.
  • Understand the various streaming solutions available for Big Data.
  • Get hands-on experience with Spark Streaming, Kafka and MapR Streams.

Module 1: Deep Dive into HBase
HBase Architecture
HBase Data Model
Differentiate between HBase and RDBMS
HBase Schema Design Elements
Understand Schema Design Options for a Specific Use Case
Hot Spotting and its Solution
Table Shapes
Designing 1:N and M:M Relationships
Designing for Hierarchical Data
Designing for Inheritance Mapping Data
Schema Design
Import/Export Data
Data Operations
CRUD Operations and Bulk Loading in HBase
Create Table using Trade Stocks Data
Create Table with Different Schema Design Options on Airline Data
Lab Exercises

Module 2: MapR-DB
MapR-DB Overview
MapR-DB Architecture
MapR-DB on MapR Architecture
Advantages of MapR-DB
MapR-DB vs. HBase
MapR-DB Schema Design
Binary Table
JSON Table
MCS (MapR Control System)
Lab Exercises

Module 3: Spark
Introduction to Apache Spark
Apache Spark Architecture
Spark on Hadoop
Interact with Spark using Java, Python and Scala
Scala Programming Basics

  • Overview
  • Discussion of Functions
  • Lab Exercises 

Load and Inspect Data on Spark RDD
Learning Different Modes of Running Spark Jobs
Understanding the Spark UI
Operations on RDD
Build a Simple Spark Application
Launching Spark Programs
Work with Pair RDD
Work with Apache Spark DataFrames, Datasets and Spark SQL
Build and Monitor Apache Spark Applications
Lab Exercises
Introduction to Big Data Streaming
Delve into Spark Streaming
Introduction to Apache Spark Data Pipelines
Create an Apache Spark Streaming Application
Introduction to Structured Streaming
Discussion of Real-World Examples
Lab Exercises

Module 4: Kafka
Introduction to Apache Kafka
What is a Messaging System?
Why Apache Kafka
Apache Kafka - Fundamentals & Architecture
Apache Kafka - Work Flow
Apache Kafka - Zookeeper Role
Apache Kafka - Basic Operations
Console Producer & Consumer
Apache Kafka - Simple Producer
Apache Kafka - Simple Consumer
Apache Kafka - Use Kafka Connect
Introduction to Kafka Streams
Kafka APIs
Lab Exercises

Module 5: MapR Streams
Introduction to MapR Streams
Why MapR Streams
Event processing with MapR Streams
Streams and Replication
MapR Streams Architecture
Discussion of Partitions
Introduction to Producers and Consumers
Create a Java Producer
Create a Java Consumer
Spark Streaming vs. Kafka vs. MapR Streams
Lab Exercises
