Course Name: Big Data/Hadoop Development & Administration
Course Duration: 3 months
Course Fees: 30,000
Syllabus:
The Big Data Economy
- Data! Data! Data!
- Data Economy
- Data Analytics
- Data Science
- Traditional Data Processing Technologies
Apache Hadoop Architecture and Ecosystem
- Hadoop Background
- Hadoop Architecture
- Hadoop and RDBMS
- Hadoop Subprojects
- Hadoop Distributions
- Hadoop Documentation
Setting up Hadoop
- Installing Hadoop
- Configuring Hadoop
- Starting Hadoop
- Running Hadoop Clients
- Browsing Hadoop UI Consoles
HDFS Architecture
- Hadoop 1.0 HDFS Architecture
- Hadoop 1.0 HDFS Architectural Capabilities – Performance, Scalability, Availability, Installability, Configurability, Operability, Usability, Security
- Hadoop 2.0 HDFS Architecture
HDFS Programming Basics
- Hadoop Configuration API
- HDFS API Overview
- HDFS File CRUD API
- HDFS Directory CRUD API
HDFS Programming Advanced
- File Compression and Decompression
- Type Serialization and Deserialization
- Sequence Files
MapReduce Architecture
- Hadoop 1.0 MapReduce Architecture
- Hadoop 1.0 MapReduce Architectural Capabilities – Performance, Scalability, Availability, Installability, Configurability, Operability, Usability, Programmability
- Hadoop 2.0 MapReduce Architecture
MapReduce Programming Basics
- MapReduce Programming Concepts – Map Phase and Reduce Phase
- MapReduce API – Key Java Classes and their Hierarchy
- Steps to Write a MapReduce Program
MapReduce Programming Intermediate
- Setting Mapper Counts and Reducer Counts
- MapReduce Configuration
- Combiners
- Partitioners
- Speculative Execution
- Task JVM Reuse
- Compression
MapReduce Programming Advanced
- Output Format
- Custom Data Format
- Input Format
- Built-in Mappers and Reducers
- Counters
- Multithreading
- Distributed Cache
MapReduce Streaming and Pipes
- MapReduce using Hadoop Streaming
- MapReduce using Hadoop Pipes
MapReduce Development Best Practices
- Logging in Hadoop
- Exception Handling
- Running Jobs Locally
- Unit Testing with MRUnit
- Top 10 Hadoop Anti-Patterns
Querying Data using Hive
- Hive Background
- Hive Architecture
- Downloading, Installing and Configuring Hive
- Simple Hive Example
- Loading Data into Hive
- Hive Query Statements
- Hive Schema Violations
- Using Built-in Hive Functions
- Partitioning Data using Hive
- Joining Data
Querying Data using Pig
- Pig Background
- Pig Architecture
- Downloading, Installing and Configuring Pig
- Running Pig
- Pig Latin Language Basics
- Core Relational Operators – DISTINCT, FILTER, SPLIT, ORDER BY, LIMIT, GROUP, FOREACH
- Built-in Functions
- Relational Join Operators
- Debug Operators
Realtime Database using HBase
- HBase Overview
- HBase Data Model
- HBase Architecture
- Downloading, Installing and Configuring HBase
- HBase Shell
- HBase Java API for CRUD Operations