Hadoop

1. Hadoop

  • Why Big Data and Hadoop?
  • Problems in Data-Driven Businesses
  • How Hadoop Solves Them and Why Big Data Solutions
  • Hadoop Fundamentals
  • What Hadoop Comprises: Subprojects and Ecosystem
  • Core Hadoop Components
  • Apache Subprojects
  • Hadoop Ecosystem

2. HDFS

  • HDFS
  • HDFS Features
  • HDFS Architecture – Non HA
  • HDFS Architecture – HA
  • Writing and Reading Files in HDFS
  • NameNode Memory and Load Handling
  • Basic HDFS Security
  • HDFS commands
  • Hands-on: writing and reading files in HDFS, permissions, viewing blocks, and other basic HDFS operations

3. YARN

  • MapReduce and YARN – Basics
  • Why a Computational Framework?
  • YARN Architecture
  • MapReduce Architecture and Hands-on
  • Spark Architecture
  • How YARN executes MR and Spark jobs
  • How to see YARN Applications in Web UIs and Shell
  • YARN Application Logs

4. Sqoop

  • Importing RDBMS Data to Hadoop
  • Introduction to Apache Sqoop
  • Sqoop Architecture
  • Using Sqoop to import RDBMS Table to HDFS
  • Change the Delimiter and File Format of imported Tables
  • Controlling which columns are imported
  • Sqoop Performance improvement
  • Import and Export using Sqoop
  • Incremental Data Load using Sqoop

5. Hive

  • Hive Architecture and Data model
  • How to query Hive and Impala/Tez
  • How Hive and Impala/Tez differ from an RDBMS
  • Usage of Hive Metastore by Hive and Impala
  • HiveQL and Impala SQL for query operations
  • Managed and External Tables
  • Introduction to Hue
  • Create Tables using Hue
  • Load Data using Hive, Impala, and Sqoop import to Hive tables
  • Overview of Partitions
  • Partitions in Hive and Impala
  • Dealing with Hive Partition Tables

6. Hadoop Data Formats

  • Introduction to Data Formats
  • Various Data Formats
  • Introduction to Avro
  • Parquet
  • Evolution of Avro Schema – Compatibility
  • Extracting Metadata and Data from an Avro data file
  • Using Avro with Hive and Sqoop
  • Using Parquet with Hive and Sqoop

7. Spark

  • What is Spark?
  • Spark Architecture
  • RDD Intro
  • Transformations & Actions
  • DataFrame API
  • Spark Execution framework

About the Author: ABrilliants