HDP Study Notes: YARN Resource Management (00)

Source: Internet  Editor: 程序博客网  Date: 2024/05/17 05:14

1. Overview

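  YARN (Yet Another Resource Negotiator) is Hadoop's compute framework. If HDFS is regarded as the file system of a Hadoop cluster, then YARN is the cluster's operating system. YARN is the architectural center of Hadoop.
  Just as an operating system such as Windows or Linux manages installed programs' access to resources (CPU, memory, and disk), YARN provides a management framework that lets many types of applications (batch, interactive, online, streaming, ...) run across the whole cluster and operate on its data. YARN manages resource allocation for the various types of data processing workloads, prioritizes and schedules jobs, and enables authentication and multitenancy.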


  Multitenancy: Software multitenancy is achieved when a single instance of an application serves multiple groups of users, or “tenants.” Each tenant shares common access to an application, hardware, and underlying resources (including data), but with specific and potentially unique privileges granted by the application based on their identification. This is in contrast with multi-instance architectures, where each user gets a unique instance of an application, and the application then competes for resources on behalf of its tenant. A typical example of a multitenant application architecture would be SaaS cloud computing, where multiple users and even multiple companies are accessing the same instance of an application at the same time (for example, Salesforce CRM). A typical example of a multi-instance architecture would be applications running in virtualized or IaaS environments (for example, applications running in KVM virtual machines).


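Note:
In earlier Hadoop versions, resource management was part of MapReduce, so a single program handled both task scheduling and job processing. Starting with Hadoop 2.0, MapReduce was simplified and data processing runs on top of the YARN framework.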


2. Architecture

At a high level, YARN consists of two kinds of nodes, Master and Worker (Slave):

  • Master node
    The ResourceManager component runs on a master node and manages resources globally for all YARN applications.
  • Worker node
    The NodeManager component runs on each worker node in the cluster, and executes all tasks as directed by the global ResourceManager component.

[Figure: simplified view of the YARN architecture]

2.1 NodeManager

The NodeManager is a daemon/service that runs on every worker node. Its functions:

  • It manages local resources on behalf of the requesting services (such as the ResourceManager and ApplicationMasters).
  • It tracks the health of the node and communicates its status to the ResourceManager.
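
As a rough illustration of the status reporting described above, the sketch below models the ResourceManager side of NodeManager heartbeats. The class `LivenessMonitor` and its methods are hypothetical names, not YARN APIs; the one-second interval and ten-minute expiry are the defaults this article quotes in the ResourceManager section, treated here as assumptions.

```python
from dataclasses import dataclass, field

HEARTBEAT_INTERVAL_S = 1   # assumed default interval between heartbeats
EXPIRY_S = 600             # assumed expiry window (10 minutes)


@dataclass
class LivenessMonitor:
    """Hypothetical ResourceManager-side tracker of NodeManager heartbeats."""
    expiry_s: float = EXPIRY_S
    last_seen: dict = field(default_factory=dict)  # node id -> time of last heartbeat

    def heartbeat(self, node_id: str, now: float) -> None:
        # Record the most recent time this node reported in.
        self.last_seen[node_id] = now

    def expired_nodes(self, now: float) -> list:
        # A node counts as lost when no heartbeat arrived within the expiry window.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.expiry_s)


monitor = LivenessMonitor()
monitor.heartbeat("nm2", now=0)                 # nm2 reports once, then goes silent
for t in range(0, 501, HEARTBEAT_INTERVAL_S):   # nm1 keeps reporting every second
    monitor.heartbeat("nm1", now=t)

lost = monitor.expired_nodes(now=700)
print(lost)
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the sketch deterministic and easy to test.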

2.2 NodeManager: Container

  When the ResourceManager sends an ApplicationMaster request (a request to launch an application and run the work it requires), the NodeManager starts allocating resources (CPU and memory).

2.3 Container definition

  A container is a unit of work within a YARN application that is allocated specific CPU and memory resources by the NodeManager on behalf of the ResourceManager. The container is the component that performs the work of the specific YARN application. A container is launched each time a new ApplicationMaster request is made by the ResourceManager. When a job is executed, the ApplicationMaster requests additional resources from the ResourceManager (via the NodeManager on which it is running). If additional resources can be allotted, the ResourceManager can then request additional containers to run that task from across the cluster.
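
To make the idea of a container as a fixed slice of CPU and memory concrete, here is a minimal sketch of a NodeManager carving containers out of its local capacity. The `Container` and `NodeManager` classes, their fields, and the numbers are all illustrative assumptions, not the real YARN classes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Container:
    container_id: int
    vcores: int
    memory_mb: int


@dataclass
class NodeManager:
    """Hypothetical model of one worker node's capacity."""
    total_vcores: int
    total_memory_mb: int
    containers: list = field(default_factory=list)

    def free(self):
        # Remaining capacity is the total minus everything already allocated.
        used_v = sum(c.vcores for c in self.containers)
        used_m = sum(c.memory_mb for c in self.containers)
        return self.total_vcores - used_v, self.total_memory_mb - used_m

    def launch(self, vcores, memory_mb):
        # Refuse the request when it does not fit in the remaining capacity.
        free_v, free_m = self.free()
        if vcores > free_v or memory_mb > free_m:
            return None
        container = Container(len(self.containers) + 1, vcores, memory_mb)
        self.containers.append(container)
        return container


nm = NodeManager(total_vcores=4, total_memory_mb=8192)
am_container = nm.launch(vcores=1, memory_mb=1024)    # container for the ApplicationMaster
task_container = nm.launch(vcores=2, memory_mb=4096)  # container for a job task
rejected = nm.launch(vcores=2, memory_mb=4096)        # does not fit in what is left
print(nm.free(), rejected)
```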

[Figure: functions of a Container]

2.4 NodeManager: ApplicationMaster

2.4.1 Functions of the ApplicationMaster

[Figure: functions of the ApplicationMaster]

2.4.2 ApplicationMaster and Container

  Once the NodeManager has created a Container, the ApplicationMaster is started using the resources in that Container.

2.4.3 Job scheduling by the ApplicationMaster

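  The ApplicationMaster submits a request to the ResourceManager to query the cluster's capacity, and receives a grant stating which resources are required or permitted. The ApplicationMaster then communicates with the NodeManagers to have those resources allocated to Containers. Once the Containers have been created, it configures them to run the tasks.
Note:
  Although the NodeManager manages and monitors each Container's resource usage, it has no view of the application's work (job tasks are invisible to it).
  The ApplicationMaster tracks and monitors the application's resource usage and progress. If a task fails, the ApplicationMaster handles the recovery; failures at the Container level are handled by the NodeManager. Across the cluster, NodeManagers and ApplicationMasters communicate frequently.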

[Figure: the ApplicationMaster job scheduling flow]
Containers keep being created until the ApplicationMaster's resources are exhausted or all tasks have been assigned.
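
The request-and-grant loop described above can be sketched as follows. This is a simplified model under stated assumptions: the `ResourceManager` class, its all-or-nothing `grant` method, and the `schedule` function are hypothetical, only memory is tracked, and each task gets exactly one container.

```python
from dataclasses import dataclass


@dataclass
class ResourceManager:
    """Hypothetical model: the cluster is reduced to one pool of free memory."""
    free_memory_mb: int

    def grant(self, requested_mb: int) -> int:
        # All-or-nothing grant: return the amount granted, or 0 if it does not fit.
        if requested_mb <= self.free_memory_mb:
            self.free_memory_mb -= requested_mb
            return requested_mb
        return 0


def schedule(tasks, rm, container_mb):
    """Hypothetical ApplicationMaster loop: request one container per task."""
    assigned, pending = {}, list(tasks)
    while pending:
        if rm.grant(container_mb) == 0:
            break  # resources exhausted, remaining tasks stay pending
        task = pending.pop(0)
        assigned[task] = f"container_{len(assigned) + 1}"
    return assigned, pending


rm = ResourceManager(free_memory_mb=3072)
assigned, pending = schedule(["task1", "task2", "task3", "task4"], rm, container_mb=1024)
print(assigned, pending)
```

The loop ends on either of the two conditions named in the text: every task has a container, or the ResourceManager has nothing left to grant.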
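The following example shows Containers, ApplicationMasters, and job tasks in a three-node cluster:
[Figure: Containers, ApplicationMasters, and job tasks in a three-node cluster]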
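  In this example, job1's ApplicationMaster runs on NodeManager2. Job1's first task, job task1, runs on NM1 and job task2 runs on NM2, which covers all the tasks job1 needs. Job2 is likewise spread across different NodeManagers.
  Most importantly, the ApplicationMaster can create Containers on any available NodeManager in the cluster. The default behavior is to try to collocate a task with the data block it works on; if that node is too busy, the task is placed on a node with more available compute capacity, even if that node does not hold the data.
Note: it is the computation, not the data, that is moved to a node with processing capacity.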
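
A minimal sketch of the placement idea, assuming a policy that first tries a node holding the task's data block and otherwise falls back to the node with the most free capacity. The function `place_task`, the node names, and the vcore counts are hypothetical; real YARN schedulers apply more elaborate locality rules.

```python
def place_task(block_hosts, free_vcores, needed_vcores=1):
    """Pick a node for a task: data-local if possible, else the least loaded node.

    Returns (node, is_data_local), or (None, False) when nothing fits.
    """
    # Nodes that hold the data block and still have enough free capacity.
    local = [n for n in block_hosts if free_vcores.get(n, 0) >= needed_vcores]
    if local:
        chosen = max(local, key=lambda n: free_vcores[n])
    else:
        # No data-local node has room: fall back to any node with capacity.
        candidates = [n for n, v in free_vcores.items() if v >= needed_vcores]
        if not candidates:
            return None, False
        chosen = max(candidates, key=lambda n: free_vcores[n])
    free_vcores[chosen] -= needed_vcores
    return chosen, chosen in block_hosts


free = {"nm1": 0, "nm2": 1, "nm3": 4}
first = place_task(block_hosts=["nm1", "nm2"], free_vcores=free)   # nm2 holds the block and has room
second = place_task(block_hosts=["nm1", "nm2"], free_vcores=free)  # both local nodes are now full
print(first, second)
```

The second call lands on `nm3`, a node without the data, which is the fallback case described in the text.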

2.5 YARN ResourceManager(Master Node)

[Figure: functions of the ResourceManager]
YARN offers two Scheduler options:
  FairScheduler and CapacityScheduler

The ResourceManager component is composed of a number of services that perform three main duties: Scheduling, Node Management, and Security.

The YARN Scheduler is a single component that controls resource usage according to parameters set by the Hadoop Administrator. This allows for greater efficiency by letting different organizations use a centrally pooled set of cluster resources (multitenancy) while at the same time controlling each tenant's access to those resources. Each organization can thus be guaranteed the minimum resources it needs to meet its SLAs. At the same time, organizations can access excess capacity not being used by the others, which provides elasticity and lowers the overall cost of deployment. The scheduling mechanism used and its specific settings are under the control of the Hadoop Administrator.

Side note: There are two YARN Scheduler options: FairScheduler and CapacityScheduler. These will be discussed in more detail elsewhere.

Node and ApplicationMaster management in the ResourceManager is accomplished via a number of services that perform a variety of tasks:

  • Monitor NodeManagers for heartbeat (sent by each NodeManager every second by default, expected within 10 minutes).
  • Submit ApplicationMaster launch requests to appropriate NodeManagers.
  • Verify that resource container components were actually launched on appropriate NodeManagers (within 10 minutes) and attempt a restart if required.
  • Monitor ApplicationMasters running in containers for heartbeat (one is expected every 10 minutes) and attempt a restart if required. Note: Only the ApplicationMaster is monitored. Job task monitoring is the responsibility of the ApplicationMaster itself.
  • Maintain a list of submitted ApplicationMasters across the cluster and their current state.

The ResourceManager serves as a web application proxy and controls access to resources via ACLs. It manages resource and application security via token-based systems that verify that all container requests are valid. The ApplicationMaster must pass a verified containerToken to the NodeManager that contains information about the resources that should be allocated to that container. This checking mechanism prohibits a rogue ApplicationMaster from allocating more resources than it has been allotted by the ResourceManager.
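
The token check described above can be illustrated with a generic signed-token sketch. This is not YARN's actual containerToken format: the shared `SECRET`, the payload layout, and the functions `issue_token` and `verify_token` are assumptions for illustration, using an HMAC from the Python standard library.

```python
import hashlib
import hmac

# Hypothetical key known to the ResourceManager and NodeManagers, but not to applications.
SECRET = b"rm-nm-shared-secret"


def issue_token(container_id: str, memory_mb: int, vcores: int) -> dict:
    """ResourceManager side: sign the resources granted to one container."""
    payload = f"{container_id}:{memory_mb}:{vcores}"
    signature = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_token(token: dict) -> bool:
    """NodeManager side: recompute the signature and compare in constant time."""
    expected = hmac.new(SECRET, token["payload"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["signature"])


token = issue_token("container_01", memory_mb=1024, vcores=1)
valid = verify_token(token)

# A rogue ApplicationMaster inflates the memory in the payload but cannot re-sign it.
forged = {"payload": "container_01:8192:1", "signature": token["signature"]}
forged_valid = verify_token(forged)
print(valid, forged_valid)
```

Because the resource amounts are part of the signed payload, a NodeManager in this sketch rejects any request whose amounts differ from what was granted.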