HDFS ---- Services startup
Source: Internet  Editor: 程序博客网  Time: 2024/05/22 14:35
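Overview
HDFS exposes its internal functionality to the outside as a set of services. At startup, HDFS mainly brings up the following services:
- HTTPServer, used to inspect the current state of the system at runtime;
- JVMPauseMonitor, which records whether the running JVM has ever paused;
- NameNodeResourceChecker, which periodically checks the free space of the local directories the system uses;
- BlockManager, which manages all block-related information in the system;
- RPC Server, mainly used to communicate with DataNodes, the SecondaryNameNode, the backup NameNode, the checkpoint NameNode, the HA node (ZooKeeper), FSClient, and so on;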
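As an illustration of the JVMPauseMonitor item, a common way to notice JVM pauses is to sleep for a fixed interval and measure how much longer than requested the sleep actually took. The sketch below shows only that idea; the class name `PauseDetector`, its methods, and the thresholds are hypothetical and are not taken from the Hadoop source.

```java
/** Hypothetical sketch of sleep-and-measure pause detection; not Hadoop code. */
final class PauseDetector {
    private final long sleepMillis;          // how long each round asks to sleep
    private final long warnThresholdMillis;  // overshoot that counts as a pause

    PauseDetector(long sleepMillis, long warnThresholdMillis) {
        this.sleepMillis = sleepMillis;
        this.warnThresholdMillis = warnThresholdMillis;
    }

    /** Time spent beyond the requested sleep, never negative. */
    long extraPauseMillis(long startNanos, long endNanos) {
        long elapsedMillis = (endNanos - startNanos) / 1_000_000L;
        return Math.max(0L, elapsedMillis - sleepMillis);
    }

    /** True when the overshoot exceeds the warning threshold. */
    boolean isPause(long startNanos, long endNanos) {
        return extraPauseMillis(startNanos, endNanos) > warnThresholdMillis;
    }

    /** One monitoring round: sleep, then return the measured overshoot. */
    long sampleOnce() throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(sleepMillis);
        return extraPauseMillis(start, System.nanoTime());
    }
}
```

A monitor thread would call `sampleOnce()` in a loop and log whenever the overshoot is above the threshold.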
BlockManager and the RPC Server are the two most important modules in the whole system, so only the initialization and startup of these two are examined here.
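BlockManager startup
BlockManager instantiation
FSNamesystem calls the BlockManager constructor. Instantiation mainly initializes the following:
- stores a reference to FSNamesystem and one to FSClusterStats; in the implementation both references point to the same object, the FSNamesystem;
- instantiates the DatanodeManager;
- initializes the HeartbeatManager;
- sizes the BlocksMap object at 2% of the memory available to the JVM; internally it is a hash map with linked entries, and it mainly holds block metadata and DataNode information;
- initializes the BlockPlacementPolicy object, which is mainly used to choose suitable DataNodes for storing blocks and their replicas;
- initializes the PendingReplicationBlocks object, which mainly holds information about blocks whose replication has not yet reached the configured level;
- instantiates the BlockTokenSecretManager;
- reads all block-related configuration, including:
  - defaultReplication
  - maxReplication
  - minReplication
  - maxReplicationStreams
  - shouldCheckForEnoughRacks
  - replicationRecheckInterval
  - encryptDataTransfer
  - maxNumBlocksToLog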
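The configuration step can be pictured as reading a handful of keys with defaults and sanity-checking them. The sketch below is a minimal stand-in that uses `java.util.Properties` instead of Hadoop's `Configuration` so that it runs on its own. The class `ReplicationSettings` is hypothetical, and the key names and default values are assumptions written from memory of `hdfs-default.xml`; check them against the Hadoop version in use.

```java
import java.util.Properties;

/** Hypothetical holder for a few replication settings; not the BlockManager source. */
final class ReplicationSettings {
    final int defaultReplication;
    final int maxReplication;
    final int minReplication;
    final int maxReplicationStreams;
    final boolean encryptDataTransfer;

    private ReplicationSettings(int def, int max, int min, int streams, boolean encrypt) {
        this.defaultReplication = def;
        this.maxReplication = max;
        this.minReplication = min;
        this.maxReplicationStreams = streams;
        this.encryptDataTransfer = encrypt;
    }

    /** Reads each value with a fallback default, then validates the replication bounds. */
    static ReplicationSettings from(Properties conf) {
        // Key names and defaults below are assumptions, not taken from the article.
        int def = intOf(conf, "dfs.replication", 3);
        int max = intOf(conf, "dfs.replication.max", 512);
        int min = intOf(conf, "dfs.namenode.replication.min", 1);
        int streams = intOf(conf, "dfs.namenode.replication.max-streams", 2);
        boolean encrypt =
            Boolean.parseBoolean(conf.getProperty("dfs.encrypt.data.transfer", "false"));
        if (min <= 0) {
            throw new IllegalArgumentException("minReplication must be positive: " + min);
        }
        if (min > max) {
            throw new IllegalArgumentException("minReplication " + min + " exceeds maxReplication " + max);
        }
        return new ReplicationSettings(def, max, min, streams, encrypt);
    }

    private static int intOf(Properties p, String key, int dflt) {
        String v = p.getProperty(key);
        return v == null ? dflt : Integer.parseInt(v.trim());
    }
}
```

Rejecting inconsistent bounds at construction time means a misconfigured NameNode fails at startup rather than later, while it is already placing blocks.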
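BlockManager initialization
During initialization, BlockManager mainly performs the following tasks:
- starts the pendingReplications thread;
- starts the datanodeManager;
- starts the ReplicationThread, which at a fixed interval computes DataNode load and processes all blocks waiting to be replicated;
The most important of these is starting the DatanodeManager.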
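The "run at a fixed interval and drain pending work" pattern described for the replication thread can be sketched as a daemon thread around a queue. This is only an illustration of the pattern; `ReplicationWorkLoop`, its queue of block ids, and the per-round limit are hypothetical names, not Hadoop's implementation.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical periodic worker that drains blocks waiting for replication. */
final class ReplicationWorkLoop implements Runnable {
    private final Queue<String> neededReplications = new ArrayDeque<>();
    private final long recheckIntervalMillis;
    private int processed;
    private volatile boolean running = true;

    ReplicationWorkLoop(long recheckIntervalMillis) {
        this.recheckIntervalMillis = recheckIntervalMillis;
    }

    synchronized void enqueue(String blockId) {
        neededReplications.add(blockId);
    }

    /** Handles at most maxPerRound queued blocks and returns how many were handled. */
    synchronized int processPending(int maxPerRound) {
        int done = 0;
        while (done < maxPerRound && !neededReplications.isEmpty()) {
            neededReplications.poll(); // a real system would schedule a copy here
            done++;
        }
        processed += done;
        return done;
    }

    synchronized int processedCount() {
        return processed;
    }

    void stop() {
        running = false;
    }

    @Override
    public void run() {
        while (running) {
            processPending(Integer.MAX_VALUE);
            try {
                Thread.sleep(recheckIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and exit
                return;
            }
        }
    }

    /** Starts the loop on a daemon thread so it never blocks JVM shutdown. */
    Thread startAsDaemon() {
        Thread t = new Thread(this, "ReplicationWorkLoop");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Keeping the per-round work in `processPending` separate from the sleeping loop makes the logic testable without waiting on timers.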
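DatanodeManager startup
DatanodeManager instantiation
Instantiating the DatanodeManager mainly initializes the following resources:
- the NetworkTopology object;
- the heartbeatManager object;
- the HostFileManager;
- the DNSToSwitchMapping object;
- all DataNode-related configuration, which is read at this point;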
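To make the role of the DNSToSwitchMapping object more concrete, the sketch below resolves host names to rack paths from a static table and falls back to a default rack for unknown hosts. `StaticRackMapping` is a hypothetical stand-in, not Hadoop's interface, and the `/default-rack` constant is an assumption about the conventional default.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical table-based host-to-rack resolver. */
final class StaticRackMapping {
    // Assumed fallback rack for hosts that are not in the table.
    static final String DEFAULT_RACK = "/default-rack";

    private final Map<String, String> table = new HashMap<>();

    void put(String host, String rack) {
        table.put(host, rack);
    }

    /** Returns one rack path per input host, in the same order. */
    List<String> resolve(List<String> hosts) {
        List<String> racks = new ArrayList<>(hosts.size());
        for (String host : hosts) {
            racks.add(table.getOrDefault(host, DEFAULT_RACK));
        }
        return racks;
    }
}
```

A topology structure can then group DataNodes by the returned rack path when replicas are placed.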
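DatanodeManager initialization
Initializing the DatanodeManager mainly performs the following tasks:
- starts the DecommissionManager, which monitors the DataNodes being decommissioned and records them in the log;
- starts the heartbeatManager, which handles requests coming from the DataNodes;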
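Heartbeat handling usually comes down to remembering when each DataNode was last heard from and treating it as dead once an expiry interval has passed. The sketch below shows that bookkeeping with an injected clock value. `HeartbeatTracker` is hypothetical, and the expiry formula (twice the recheck interval plus ten heartbeat intervals) is an assumption about how Hadoop derives its default, so verify it before relying on it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical last-seen bookkeeping for DataNode heartbeats. */
final class HeartbeatTracker {
    private final long expireIntervalMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatTracker(long expireIntervalMillis) {
        this.expireIntervalMillis = expireIntervalMillis;
    }

    /** Assumed formula: 2 * recheck interval + 10 * heartbeat interval. */
    static long expireInterval(long recheckMillis, long heartbeatMillis) {
        return 2 * recheckMillis + 10 * heartbeatMillis;
    }

    /** Records a heartbeat from the given node at time nowMillis. */
    synchronized void onHeartbeat(String datanodeId, long nowMillis) {
        lastSeen.put(datanodeId, nowMillis);
    }

    /** A node is dead if it never reported or its last report is too old. */
    synchronized boolean isDead(String datanodeId, long nowMillis) {
        Long seen = lastSeen.get(datanodeId);
        return seen == null || nowMillis - seen > expireIntervalMillis;
    }
}
```

Passing the current time in as a parameter, rather than reading the system clock inside the class, keeps the expiry rule easy to test.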
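RPC Server startup
The RPC Server implements several important communication protocols:
- ClientProtocol, used to communicate with clients;
- DatanodeProtocol, used to communicate with DataNodes;
- NamenodeProtocol, used to communicate with the backup NameNode, the SecondaryNameNode, and the checkpoint NameNode;
- RefreshAuthorizationPolicyProtocol, used to interact with administration tools;
- RefreshUserMappingsProtocol, used to refresh user information;
- GetUserMappingsProtocol, used to obtain information about the current user;
- HAServiceProtocol, which the ZooKeeper-based failover mechanism uses to switch the NameNode's state.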
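One server object answering several protocols can be pictured as a single class that implements several interfaces and is registered once per interface. The sketch below illustrates only that shape. `ClientOps`, `DatanodeOps`, `NameNodeRpcSketch`, and `ProtocolRegistry` are hypothetical and far simpler than Hadoop's RPC layer.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a client-facing protocol. */
interface ClientOps {
    String getFileInfo(String path);
}

/** Hypothetical stand-in for a DataNode-facing protocol. */
interface DatanodeOps {
    String sendHeartbeat(String datanodeId);
}

/** One object implements every protocol the server speaks. */
final class NameNodeRpcSketch implements ClientOps, DatanodeOps {
    @Override
    public String getFileInfo(String path) {
        return "info:" + path;
    }

    @Override
    public String sendHeartbeat(String datanodeId) {
        return "ack:" + datanodeId;
    }
}

/** Maps each protocol interface to the object that serves it. */
final class ProtocolRegistry {
    private final Map<Class<?>, Object> impls = new HashMap<>();

    <T> void register(Class<T> protocol, T impl) {
        if (!protocol.isInterface()) {
            throw new IllegalArgumentException(protocol.getName() + " is not an interface");
        }
        impls.put(protocol, impl);
    }

    <T> T lookup(Class<T> protocol) {
        Object impl = impls.get(protocol);
        if (impl == null) {
            throw new IllegalStateException("Unknown protocol: " + protocol.getName());
        }
        return protocol.cast(impl);
    }
}
```

Registering the same instance under each interface lets the dispatcher pick a handler by protocol name while all handlers share one set of NameNode state.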
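Once the RPC Server has started, all of the NameNode's common services are ready. The NameNode then decides, based on whether HA is enabled in the configuration, whether to enter the Standby or the Active state, and starts the corresponding services:
If HA is enabled, the NameNode enters the standby state and starts the standby services; ZooKeeper later elects the active NameNode.
Otherwise, the NameNode enters the active state directly, starts the active services, and begins serving requests.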
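The decision described above can be sketched as a small state holder: the initial state depends on whether HA is enabled, and a later transition stops the standby services before starting the active ones. `HaState` and `NameNodeStateSketch` are hypothetical names, and the recorded event strings stand in for real service startup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

enum HaState { ACTIVE, STANDBY }

/** Hypothetical model of choosing and switching the NameNode HA state. */
final class NameNodeStateSketch {
    private HaState state;
    private final List<String> events = new ArrayList<>();

    /** With HA enabled the node starts in standby; otherwise it goes straight to active. */
    NameNodeStateSketch(boolean haEnabled) {
        this.state = haEnabled ? HaState.STANDBY : HaState.ACTIVE;
        startServicesFor(state);
    }

    /** Invoked when this node has been chosen as the active one. */
    void transitionToActive() {
        if (state == HaState.ACTIVE) {
            return; // already active, nothing to do
        }
        events.add("stop standby services");
        state = HaState.ACTIVE;
        startServicesFor(state);
    }

    private void startServicesFor(HaState s) {
        events.add(s == HaState.ACTIVE ? "start active services" : "start standby services");
    }

    HaState state() {
        return state;
    }

    List<String> events() {
        return Collections.unmodifiableList(events);
    }
}
```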
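HAContext Service startup
Standby Service startup
Starting the standby services is fairly simple: it mainly opens FSEditLog in read-only mode and reads the latest edit log records from the active NameNode: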
    LOG.info("Starting services required for standby state");
    if (!dir.fsImage.editLog.isOpenForRead()) {
      // During startup, we're already open for read.
      dir.fsImage.editLog.initSharedJournalsForRead();
    }
    blockManager.setPostponeBlocksFromFuture(true);
    editLogTailer = new EditLogTailer(this, conf);
    editLogTailer.start();
    if (standbyShouldCheckpoint) {
      standbyCheckpointer = new StandbyCheckpointer(conf, this);
      standbyCheckpointer.start();
    }
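Conceptually, tailing the edit log means applying, in order, every transaction newer than the last one already applied. The sketch below models that with plain transaction ids. `EditTailerSketch` is hypothetical, assumes the ids arrive in ascending order, and stops at the first gap; it does not reflect how `EditLogTailer` is actually implemented.

```java
/** Hypothetical model of a standby catching up on edit log transactions. */
final class EditTailerSketch {
    private long lastAppliedTxId;

    EditTailerSketch(long lastAppliedTxId) {
        this.lastAppliedTxId = lastAppliedTxId;
    }

    /**
     * Applies transactions newer than lastAppliedTxId, in order.
     * Assumes ids are ascending; stops at the first gap so nothing is applied out of order.
     * Returns the number of transactions applied.
     */
    int catchUp(long[] journalTxIds) {
        int applied = 0;
        for (long txId : journalTxIds) {
            if (txId <= lastAppliedTxId) {
                continue; // already applied earlier
            }
            if (txId != lastAppliedTxId + 1) {
                break; // gap: wait until the missing transaction shows up
            }
            lastAppliedTxId = txId; // a real tailer would replay the edit here
            applied++;
        }
        return applied;
    }

    long lastAppliedTxId() {
        return lastAppliedTxId;
    }
}
```

Calling `catchUp` repeatedly is safe because transactions at or below `lastAppliedTxId` are skipped.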
Active Service startup
Entering the active state mainly initializes the services set up earlier and starts two monitors:
    FSEditLog editLog = dir.fsImage.getEditLog();
    if (!editLog.isOpenForWrite()) {
      // During startup, we're already open for write during initialization.
      editLog.initJournalsForWrite();
      // May need to recover
      editLog.recoverUnclosedStreams();
      LOG.info("Catching up to latest edits from old active before " +
          "taking over writer role in edits logs");
      editLogTailer.catchupDuringFailover();
      blockManager.setPostponeBlocksFromFuture(false);
      blockManager.getDatanodeManager().markAllDatanodesStale();
      blockManager.clearQueues();
      blockManager.processAllPendingDNMessages();
      if (!isInSafeMode() ||
          (isInSafeMode() && safeMode.isPopulatingReplQueues())) {
        LOG.info("Reprocessing replication and invalidation queues");
        blockManager.processMisReplicatedBlocks();
      }
      if (LOG.isDebugEnabled()) {
        LOG.debug("NameNode metadata after re-processing " +
            "replication and invalidation queues during failover:\n" +
            metaSaveAsString());
      }
      long nextTxId = dir.fsImage.getLastAppliedTxId() + 1;
      LOG.info("Will take over writing edit logs at txnid " + nextTxId);
      editLog.setNextTxId(nextTxId);
      dir.fsImage.editLog.openForWrite();
    }
    if (haEnabled) {
      // Renew all of the leases before becoming active.
      // This is because, while we were in standby mode,
      // the leases weren't getting renewed on this NN.
      // Give them all a fresh start here.
      leaseManager.renewAllLeases();
    }
    leaseManager.startMonitor();
    startSecretManagerIfNecessary();
    //ResourceMonitor required only at ActiveNN. See HDFS-2914
    this.nnrmthread = new Daemon(new NameNodeResourceMonitor());
    nnrmthread.start();
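As the code above shows, the active phase mainly makes sure that FSEditLog is open for writing, sets the BlockManager's working mode, and updates the state of all DataNodes; it then starts the leaseManager monitor and the NameNodeResourceMonitor.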
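The comment in the code about renewing all leases before becoming active can be illustrated with a small lease table: without the renewal, leases that went untouched during standby would look expired as soon as the monitor starts. `LeaseTableSketch` and its hard-limit parameter are hypothetical and only model the idea.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lease table showing why leases are renewed before going active. */
final class LeaseTableSketch {
    private final Map<String, Long> lastRenewed = new HashMap<>();
    private final long hardLimitMillis;

    LeaseTableSketch(long hardLimitMillis) {
        this.hardLimitMillis = hardLimitMillis;
    }

    void addLease(String holder, long nowMillis) {
        lastRenewed.put(holder, nowMillis);
    }

    /** Gives every lease a fresh start at nowMillis. */
    void renewAllLeases(long nowMillis) {
        lastRenewed.replaceAll((holder, previous) -> nowMillis);
    }

    /** Holders whose lease is older than the hard limit, sorted by name. */
    List<String> expired(long nowMillis) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastRenewed.entrySet()) {
            if (nowMillis - e.getValue() > hardLimitMillis) {
                out.add(e.getKey());
            }
        }
        Collections.sort(out);
        return out;
    }
}
```

In this model the monitor would call `expired` periodically; calling `renewAllLeases` just before the monitor starts prevents a wave of false expirations right after failover.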
At this point all services have been started; the main thread then enters a wait, and NameNode startup is complete.