Apache Hive Training Course

Apache Hive is a data warehouse system built on top of Hadoop for querying big data. Hive originated at Facebook and was open sourced in August 2008. The challenge Facebook had to address is one that many companies have faced since. Eventually, data growth in a company exceeds the capabilities of the deployed RDBMS or NoSQL systems. Reports and analytics start to take minutes, then hours, and eventually overlap with other queries until the whole system grinds to a halt. In another common scenario, a company that starts processing big data with Hadoop discovers the value of making the data accessible beyond the development team capable of writing complex MapReduce jobs.

Day 1

  • Introducing Hive
  • Getting Started with Hive
  • Data Types
  • HiveQL - Data Definition, Data Manipulation, Queries, Views, Indexes
  • Schema Design
  • Development with Hive

Day 2

  • Tuning
  • MapReduce Scripts
  • Partitions and Buckets
  • Storage Formats
  • Joins
  • Hive & AWS

© 2016 Laliwala IT. All rights reserved.