Building Hadoop-based Apps on YARN

Apache Hadoop YARN changes the game for Hadoop applications, turning Hadoop into a multi-application, multi-workload, general-purpose data operating system. YARN is:

  1. Flexible

    Store data once and interact with it in multiple ways from batch to interactive to real time and streaming.

    Architected to enable new workloads.

  2. Shared

    Re-use key platform services for reliability, redundancy and security across multiple workloads.

    Multi-tenant architecture shares core resources while isolating services and data.

  3. Efficient

    Do more with less: 30%+ better utilization of existing resources.

    Share and segment applications based on cluster resource management.

This set of resources is intended to get you up and running developing apps for YARN.

STEP 1. Understand the motivations and architecture for YARN.

Apache Hadoop YARN is the data operating system for Hadoop 2.0. YARN enables a user to interact with all data in multiple ways simultaneously, making Hadoop a true multi-use data platform and allowing it to take its place in a modern data architecture. Find out more about the concepts and specifics of YARN.

Get an overview of Apache Hadoop YARN concepts in this slide deck.

Concepts

  • Introducing Apache Hadoop YARN
  • Apache Hadoop YARN – Background and an Overview
  • Apache Hadoop YARN – Concepts and Applications
  • Apache Hadoop YARN – ResourceManager
  • Apache Hadoop YARN – NodeManager

Building Apps

  • Running existing applications on Hadoop 2 YARN
  • Stabilizing YARN APIs for Apache Hadoop 2
  • Management of Application Dependencies
  • Resource Localization in YARN: Deep Dive
  • Simplifying user-logs management and access in YARN

STEP 2. Explore example applications on YARN.

The applications in this section show how to build and deploy apps against the YARN APIs and are an easy way to get started. They are straightforward to reproduce in the Hortonworks Sandbox VM environment.

  • Simple YARN App. This ‘Hello World’ app for YARN runs n copies of a Unix command.
  • Distributed Shell. This fuller example implements a distributed shell on YARN.
  • MemcacheD on YARN. A tutorial showing how to deploy the very popular MemcacheD framework on YARN.

STEP 3. Examine real-world applications on YARN.

These richer applications are built on YARN and demonstrate real-world use and deployment.

  • MapReduce on YARN. The official codebase for Apache Hadoop MapReduce on YARN (MR2).
  • HBase on YARN. Efforts to deploy HBase on YARN.

FURTHER RESOURCES

The following resources can also assist with developing Hadoop-based Apps on YARN.

  • Apache Hadoop YARN – Enabling Next Generation Applications
    Get started with Hadoop 2.0 using this reference presentation.
  • Sample eBook Chapters – Apache Hadoop YARN
  • Find more blog posts related to Apache Hadoop YARN

TRAINING

Hortonworks also provides training and certification for Hadoop.

  • Hadoop Essentials 1-Day Class
  • Hadoop Training for Java Developers – 4-Day Class
  • Hadoop on Windows – 4-Day Class
  • Hadoop for Data Analysts – 4-Day Class

Ref: http://hortonworks.com/get-started/yarn/