Course Outline

Introduction

  • Why and how project teams adopt Hadoop
  • How it all started
  • The Project Manager's role in Hadoop projects

Understanding Hadoop's Architecture and Key Concepts

  • HDFS
  • MapReduce
  • Other pieces of the Hadoop ecosystem

What Constitutes Big Data?

Different Approaches to Storing Big Data

HDFS (Hadoop Distributed File System) as the Foundation

How Big Data is Processed

  • The power of distributed processing

Processing Data with MapReduce

  • How data is split, mapped, and reduced step by step (see the sketch below)
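
A minimal sketch of the map, shuffle, and reduce steps this module walks through, written in plain Python rather than actual Hadoop code; the names map_fn and reduce_fn and the sample input lines are illustrative assumptions, not Hadoop APIs or course material:

    from collections import defaultdict

    def map_fn(line):
        """Map step: emit a (word, 1) pair for every word in one input line."""
        return [(word.lower(), 1) for word in line.split()]

    def reduce_fn(word, counts):
        """Reduce step: combine every count emitted for a single word."""
        return word, sum(counts)

    lines = ["Hadoop stores big data", "Hadoop processes big data"]

    # Shuffle step: group the emitted pairs by key (the word).
    grouped = defaultdict(list)
    for line in lines:
        for word, count in map_fn(line):
            grouped[word].append(count)

    # Reduce each group independently -- on a real cluster these run in parallel.
    print(dict(reduce_fn(w, c) for w, c in grouped.items()))
    # {'hadoop': 2, 'stores': 1, 'big': 2, 'data': 2, 'processes': 1}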

The Role of Clustering in Large-Scale Distributed Processing

  • Architectural overview
  • Clustering approaches

Clustering Your Data and Processes with YARN

The Role of Non-Relational Databases in Big Data Storage

Working with Hadoop's Non-Relational Database: HBase

Data Warehousing Architectural Overview

Managing Your Data Warehouse with Hive

Running Hadoop from Shell Scripts

Working with Hadoop Streaming
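
To illustrate the streaming model covered here, a word-count mapper and reducer sketch in Python; the file names mapper.py and reducer.py, the HDFS paths, and the hadoop-streaming jar location are assumptions for this example and will vary by installation:

    # mapper.py -- reads raw text on stdin, emits tab-separated "word<TAB>1" pairs.
    import sys

    for line in sys.stdin:
        for word in line.split():
            print(f"{word.lower()}\t1")

    # ---------------------------------------------------------------------------
    # reducer.py -- Hadoop Streaming sorts mapper output by key before the reducer
    # sees it, so counts for the same word arrive adjacent and can be summed in
    # one pass over stdin.
    import sys

    current_word, current_count = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word == current_word:
            current_count += int(count)
        else:
            if current_word is not None:
                print(f"{current_word}\t{current_count}")
            current_word, current_count = word, int(count)
    if current_word is not None:
        print(f"{current_word}\t{current_count}")

    # Typical invocation (jar path and HDFS paths are assumptions):
    #   hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    #     -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py \
    #     -input /user/demo/input -output /user/demo/output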

Other Hadoop Tools and Utilities

Getting Started on a Hadoop Project

  • Demystifying complexity

Migrating an Existing Project to Hadoop

  • Infrastructure considerations
  • Scaling beyond your allocated resources

Hadoop Project Stakeholders and Their Toolkits

  • Developers, data scientists, business analysts and project managers

Hadoop as a Foundation for New Technologies and Approaches

Closing Remarks

Requirements

  • A general understanding of programming
  • An understanding of databases
  • Basic knowledge of Linux
Duration: 14 hours
