Hadoop Source Code Analysis 27: JobTracker Heartbeat Handling with No Jobs Running
This post traces how the JobTracker handles TaskTracker heartbeats when no job is running.
Heartbeat format:
{restarted=true, initialContact=true, acceptNewTasks=true, responseId=-1,
 status=TaskTrackerStatus {failures=0, trackerName="tracker_server3:localhost.localdomain/127.0.0.1:57441" (id=2249),
  healthStatus=TaskTrackerStatus$TaskTrackerHealthStatus,
  resStatus=TaskTrackerStatus$ResourceStatus {availablePhysicalMemory=601034752, availableSpace=32463671296, availableVirtualMemory=2705653760, cpuFrequency=2195079, cpuUsage=-1.0, cumulativeCpuTime=1227000}
 }
}
Whether the tracker should be accepted is decided by: (inHostsList(status) && !inExcludedHostsList(status))
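The accept check above can be sketched as a standalone snippet. The class and field names below are hypothetical simplifications; in the real JobTracker the include/exclude lists come from the configured hosts files, and an empty include list admits every host.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical standalone sketch of the admission check quoted above.
public class TrackerAdmission {
    private final Set<String> hosts;         // allowed hosts (empty = allow all)
    private final Set<String> excludedHosts; // decommissioned hosts

    public TrackerAdmission(Set<String> hosts, Set<String> excludedHosts) {
        this.hosts = hosts;
        this.excludedHosts = excludedHosts;
    }

    // Mirrors inHostsList(): an empty include list admits everyone.
    public boolean inHostsList(String host) {
        return hosts.isEmpty() || hosts.contains(host);
    }

    public boolean inExcludedHostsList(String host) {
        return excludedHosts.contains(host);
    }

    // The check from the text: accepted iff included and not excluded.
    public boolean acceptTaskTracker(String host) {
        return inHostsList(host) && !inExcludedHostsList(host);
    }

    public static void main(String[] args) {
        TrackerAdmission a = new TrackerAdmission(new HashSet<>(), Set.of("badhost"));
        System.out.println(a.acceptTaskTracker("server3")); // true
        System.out.println(a.acceptTaskTracker("badhost")); // false
    }
}
```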
Main methods involved:
public synchronized HeartbeatResponse heartbeat(TaskTrackerStatus status, boolean restarted, boolean initialContact, boolean acceptNewTasks, short responseId)
private synchronized boolean processHeartbeat(TaskTrackerStatus trackerStatus, boolean initialContact, long timeStamp) throws UnknownHostException
private boolean updateTaskTrackerStatus(String trackerName, TaskTrackerStatus status)
private void addNewTracker(TaskTracker taskTracker) throws UnknownHostException
public Node resolveAndAddToTopology(String name) throws UnknownHostException
private Node addHostToNodeMapping(String host, String networkLoc)
void updateTaskStatuses(TaskTrackerStatus status)
private void updateNodeHealthStatus(TaskTrackerStatus trackerStatus, long timeStamp)
synchronized List<Task> getSetupAndCleanupTasks(TaskTrackerStatus taskTracker) throws IOException
public synchronized ClusterStatus getClusterStatus(boolean detailed)
private synchronized List<TaskTrackerAction> getTasksToKill(String taskTracker)
private List<TaskTrackerAction> getJobsForCleanup(String taskTracker)
private synchronized List<TaskTrackerAction> getTasksToSave(TaskTrackerStatus ts)
public int getNextHeartbeatInterval()
private void removeMarkedTasks(String taskTracker)
void org.apache.hadoop.mapred.JobTracker.FaultyTrackersInfo.markTrackerHealthy(String hostName)
boolean org.apache.hadoop.mapred.JobTracker.FaultyTrackersInfo.isBlacklisted(String hostName)
void org.apache.hadoop.mapred.JobTracker.FaultyTrackersInfo.setNodeHealthStatus(String hostName, boolean isHealthy, String reason, long timeStamp)
List<String> org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(List<String> names)
void org.apache.hadoop.net.CachedDNSToSwitchMapping.cacheResolvedHosts(List<String> uncachedHosts, List<String> resolvedHosts)
List<String> org.apache.hadoop.net.CachedDNSToSwitchMapping.getCachedHosts(List<String> names)
List<Task> org.apache.hadoop.mapred.JobQueueTaskScheduler.assignTasks(TaskTracker taskTracker) throws IOException
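One method in this list, getNextHeartbeatInterval(), is worth a sketch: in Hadoop 1.x the heartbeat interval grows with cluster size so that the JobTracker sees a bounded number of heartbeats per second. The constants below are assumed stock defaults, not this cluster's configuration (the trace later in this post shows heartbeatInterval=240000, so the setup observed here clearly overrides them), and the optional scaling factor is omitted.

```java
// Sketch of heartbeat-interval scaling, under assumed Hadoop 1.x defaults.
public class HeartbeatInterval {
    static final int HEARTBEAT_INTERVAL_MIN = 3 * 1000; // assumed 3 s floor
    static final int NUM_HEARTBEATS_IN_SECOND = 100;    // assumed default rate cap

    // One extra second of interval per NUM_HEARTBEATS_IN_SECOND trackers,
    // never dropping below the minimum.
    public static int getNextHeartbeatInterval(int clusterSize) {
        return Math.max(
            (int) (1000 * Math.ceil((double) clusterSize / NUM_HEARTBEATS_IN_SECOND)),
            HEARTBEAT_INTERVAL_MIN);
    }

    public static void main(String[] args) {
        System.out.println(getNextHeartbeatInterval(2));    // 3000
        System.out.println(getNextHeartbeatInterval(1000)); // 10000
    }
}
```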
1. First heartbeat after Server3's TaskTracker starts
Check FaultyTrackersInfo.potentiallyFaultyTrackers to see whether the host is blacklisted.
Look up the previous HeartbeatResponse for this tracker in trackerToHeartbeatResponseMap; none exists yet.
Fetch the previous TaskTracker.TaskTrackerStatus from taskTrackers; it is null (first contact).
Update the JobTracker counters: totalMaps=0, totalReduces=0, occupiedMapSlots=0, occupiedReduceSlots=0; FaultyTrackersInfo.potentiallyFaultyTrackers is consulted again here, since a blacklisted tracker's slots would not count toward cluster capacity.
The tracker is added to taskTrackers, which now contains:
{tracker_server3:localhost.localdomain/127.0.0.1:43336=org.apache.hadoop.mapreduce.server.jobtracker.TaskTracker@4fd86469}
It is added to uniqueHostsMap, which now contains:
{server3=1}
It is added to trackerExpiryQueue, which now contains:
[org.apache.hadoop.mapred.TaskTrackerStatus@4e048dc6]
It is added to dnsToSwitchMapping.cache, which now contains:
{10.1.1.103=/default-rack}
It is added to clusterMap, which now contains:
Number of racks: 1
Expected number of leaves: 1
/default-rack/server3
It is added to hostnameToNodeMap, which now contains:
{server3=/default-rack/server3}
It is added to nodesAtMaxLevel, which now contains:
[/default-rack]
It is added to hostnameToTaskTracker, which now contains:
{server3=[org.apache.hadoop.mapreduce.server.jobtracker.TaskTracker@4fd86469]}
Check status.getTaskReports(); if it is non-empty, update expireLaunchingTasks, trackerToJobsToCleanup, trackerToTasksToCleanup, and taskidToTIPMap (empty here, since no tasks are running).
Increment responseId. jobs and the scheduler's jobQueueJobInProgressListener are consulted for runnable work; both are empty, so no tasks are assigned.
Obtain nextInterval from getNextHeartbeatInterval().
Build the HeartbeatResponse:
{actions=[],conf=null,heartbeatInterval=240000,recoveredJobs=[],responseId=0}
Store it in trackerToHeartbeatResponseMap, which now contains:
{tracker_server3:localhost.localdomain/127.0.0.1:43336=org.apache.hadoop.mapred.HeartbeatResponse@25a78661}
Send the HeartbeatResponse back to the TaskTracker.
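The bookkeeping in step 1 can be condensed into a toy sketch. TrackerRegistry and its String-valued status are hypothetical simplifications; the real addNewTracker() additionally touches trackerExpiryQueue, clusterMap, nodesAtMaxLevel, and hostnameToTaskTracker.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical condensed sketch of first-contact registration; field
// names follow the trace above.
public class TrackerRegistry {
    // trackerName -> status object (simplified to a String here)
    public final Map<String, String> taskTrackers = new TreeMap<>();
    // host -> number of trackers running on that host
    public final Map<String, Integer> uniqueHostsMap = new HashMap<>();
    // host -> its path in the network topology
    public final Map<String, String> hostnameToNodeMap = new HashMap<>();

    public void register(String trackerName, String host, String rack) {
        taskTrackers.put(trackerName, "status:" + trackerName);
        uniqueHostsMap.merge(host, 1, Integer::sum); // starts at 1, then increments
        hostnameToNodeMap.put(host, rack + "/" + host);
    }

    public static void main(String[] args) {
        TrackerRegistry r = new TrackerRegistry();
        r.register("tracker_server3", "server3", "/default-rack");
        System.out.println(r.uniqueHostsMap);    // {server3=1}
        System.out.println(r.hostnameToNodeMap); // {server3=/default-rack/server3}
    }
}
```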
2. First heartbeat after Server2's TaskTracker starts
As before, first check FaultyTrackersInfo.potentiallyFaultyTrackers.
Look up the previous HeartbeatResponse in trackerToHeartbeatResponseMap; none exists for this tracker yet.
Fetch the previous TaskTracker.TaskTrackerStatus from taskTrackers; it is null (first contact).
Update the JobTracker counters: totalMaps=0, totalReduces=0, occupiedMapSlots=0, occupiedReduceSlots=0, totalMapTaskCapacity=4, totalReduceTaskCapacity=4.
The tracker is added to taskTrackers, which now contains:
{tracker_server2:localhost.localdomain/127.0.0.1:34381=org.apache.hadoop.mapreduce.server.jobtracker.TaskTracker@412eb15f,tracker_server3:localhost.localdomain/127.0.0.1:45605=org.apache.hadoop.mapreduce.server.jobtracker.TaskTracker@2634d0e2}
It is added to uniqueHostsMap, which now contains:
{server2=1, server3=1}
It is added to trackerExpiryQueue, which now contains:
[org.apache.hadoop.mapred.TaskTrackerStatus@4444ad54,org.apache.hadoop.mapred.TaskTrackerStatus@2ea31991]
It is added to dnsToSwitchMapping.cache, which now contains:
{10.1.1.102=/default-rack,10.1.1.103=/default-rack}
It is added to clusterMap, which now contains:
Number of racks: 1
Expected number of leaves: 2
/default-rack/server3
/default-rack/server2
It is added to hostnameToNodeMap, which now contains:
{server2=/default-rack/server2,server3=/default-rack/server3}
It is added to nodesAtMaxLevel, which now contains:
[/default-rack]
It is added to hostnameToTaskTracker, which now contains:
{server2=[org.apache.hadoop.mapreduce.server.jobtracker.TaskTracker@412eb15f],server3=[org.apache.hadoop.mapreduce.server.jobtracker.TaskTracker@2634d0e2]}
Check status.getTaskReports(); if it is non-empty, update expireLaunchingTasks, trackerToJobsToCleanup, trackerToTasksToCleanup, and taskidToTIPMap (empty here, since no tasks are running).
Increment responseId. jobs and the scheduler's jobQueueJobInProgressListener are consulted for runnable work; both are empty, so no tasks are assigned.
Obtain nextInterval from getNextHeartbeatInterval().
Build the HeartbeatResponse:
{actions=[],conf=null,heartbeatInterval=240000,recoveredJobs=[],responseId=0}
Store it in trackerToHeartbeatResponseMap, which now contains:
{tracker_server2:localhost.localdomain/127.0.0.1:34381=org.apache.hadoop.mapred.HeartbeatResponse@2f4dd8ae,tracker_server3:localhost.localdomain/127.0.0.1:45605=org.apache.hadoop.mapred.HeartbeatResponse@16bd1f19}
Send the HeartbeatResponse back to the TaskTracker.
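The dnsToSwitchMapping cache seen in both registrations follows the CachedDNSToSwitchMapping pattern: resolve only the hosts missing from the cache, store the results, then answer entirely from the cache. The sketch below is a hypothetical standalone version whose raw resolver pins everything to /default-rack, as in the trace; real Hadoop delegates to a pluggable DNSToSwitchMapping such as a topology script.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the caching rack-resolution pattern.
public class CachedRackResolver {
    private final Map<String, String> cache = new HashMap<>();

    // Stand-in for the expensive raw mapping (topology script, DNS, ...).
    private List<String> rawResolve(List<String> names) {
        List<String> racks = new ArrayList<>();
        for (String n : names) racks.add("/default-rack"); // as in the trace
        return racks;
    }

    public List<String> resolve(List<String> names) {
        // Find the names not yet cached.
        List<String> uncached = new ArrayList<>();
        for (String n : names) if (!cache.containsKey(n)) uncached.add(n);
        // cacheResolvedHosts(): pair each uncached name with its rack.
        List<String> resolved = rawResolve(uncached);
        for (int i = 0; i < uncached.size(); i++) cache.put(uncached.get(i), resolved.get(i));
        // getCachedHosts(): answer entirely from the cache.
        List<String> out = new ArrayList<>();
        for (String n : names) out.add(cache.get(n));
        return out;
    }

    public static void main(String[] args) {
        CachedRackResolver r = new CachedRackResolver();
        System.out.println(r.resolve(List.of("10.1.1.103", "10.1.1.102")));
        // prints [/default-rack, /default-rack]
    }
}
```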
3. Server3's next heartbeat
Check potentiallyFaultyTrackers as before.
Retrieve the previous HeartbeatResponse from trackerToHeartbeatResponseMap:
org.apache.hadoop.mapred.HeartbeatResponse@16bd1f19
Compare the responseId of the stored response with the responseId received in this heartbeat; if they differ, the tracker never received the last response, and the stored response is simply resent.
Update the JobTracker counters totalMaps, totalReduces, occupiedMapSlots, occupiedReduceSlots, totalMapTaskCapacity, and totalReduceTaskCapacity: first roll back the contribution of the previous TaskTracker.TaskTrackerStatus taken from taskTrackers, then apply this heartbeat's TaskTrackerStatus; FaultyTrackersInfo.potentiallyFaultyTrackers is consulted again so that a blacklisted tracker's slots are not counted.
taskTrackers is updated and now contains:
{tracker_server2:localhost.localdomain/127.0.0.1:52688=org.apache.hadoop.mapreduce.server.jobtracker.TaskTracker@1cdc471a, tracker_server3:localhost.localdomain/127.0.0.1:40286=org.apache.hadoop.mapreduce.server.jobtracker.TaskTracker@665755f5}
Check status.getTaskReports(); if it is non-empty, update expireLaunchingTasks, trackerToJobsToCleanup, trackerToTasksToCleanup, and taskidToTIPMap (empty here, since no tasks are running).
Increment responseId. jobs and the scheduler's jobQueueJobInProgressListener are consulted for runnable work; both are empty, so no tasks are assigned.
Obtain nextInterval from getNextHeartbeatInterval().
Build the HeartbeatResponse:
{actions=[],conf=null,heartbeatInterval=240000,recoveredJobs=[],responseId=1}
Update trackerToHeartbeatResponseMap, which now contains:
{tracker_server2:localhost.localdomain/127.0.0.1:52688=org.apache.hadoop.mapred.HeartbeatResponse@1500df0b,tracker_server3:localhost.localdomain/127.0.0.1:40286=org.apache.hadoop.mapred.HeartbeatResponse@6c3355f2}
Send the HeartbeatResponse back to the TaskTracker.
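The responseId comparison in step 3 is what makes heartbeats idempotent: a lost reply is resent instead of being reprocessed. A minimal sketch of the handshake (class names hypothetical; the real heartbeat() also handles restarted and initialContact):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the responseId handshake.
public class HeartbeatDedup {
    public static class Response {
        public final short responseId;
        Response(short id) { responseId = id; }
    }

    private final Map<String, Response> trackerToResponseMap = new HashMap<>();

    // The tracker reports the id of the last response it received.
    public Response heartbeat(String trackerName, short responseId) {
        Response prev = trackerToResponseMap.get(trackerName);
        if (prev != null && prev.responseId != responseId) {
            // The tracker never saw our last response: resend it rather
            // than processing the same heartbeat state twice.
            return prev;
        }
        // Normal case: build the next response with responseId + 1.
        Response next = new Response((short) (responseId + 1));
        trackerToResponseMap.put(trackerName, next);
        return next;
    }

    public static void main(String[] args) {
        HeartbeatDedup jt = new HeartbeatDedup();
        System.out.println(jt.heartbeat("tracker_server3", (short) -1).responseId); // 0
        System.out.println(jt.heartbeat("tracker_server3", (short) 0).responseId);  // 1
    }
}
```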
4. ExpireTrackers removing stale trackers
The ExpireTrackers thread takes a TaskTrackerStatus from trackerExpiryQueue and uses its lastSeen timestamp to decide whether the tracker should be expired or its entry refreshed.
It then fetches the live TaskTracker.TaskTrackerStatus from taskTrackers and re-checks lastSeen, since a fresh heartbeat may have arrived in the meantime.
If the tracker is still alive, its entry in trackerExpiryQueue is updated.
If it has expired, it is removed from trackerExpiryQueue and cleared out of trackerToJobsToCleanup, trackerToTasksToCleanup, recoveredTrackers, and trackerToTaskMap;
the counters totalMaps, totalReduces, occupiedMapSlots, occupiedReduceSlots, totalMapTaskCapacity, and totalReduceTaskCapacity are rolled back;
and the tracker is removed from taskTrackers, uniqueHostsMap, and hostnameToTaskTracker.
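The expiry scan can be sketched as follows. The 10-minute TRACKER_EXPIRY_INTERVAL is an assumed default, and Status is a stand-in for TaskTrackerStatus; the key point is the double check against the live entry in taskTrackers, since a heartbeat may have refreshed lastSeen after the queue entry was made.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch of the ExpireTrackers scan.
public class ExpireTrackers {
    public static final long TRACKER_EXPIRY_INTERVAL = 10 * 60 * 1000L; // assumed

    public static class Status {
        public final String trackerName;
        public final long lastSeen;
        public Status(String trackerName, long lastSeen) {
            this.trackerName = trackerName;
            this.lastSeen = lastSeen;
        }
    }

    // Ordered by lastSeen so the stalest tracker is inspected first.
    public final TreeSet<Status> trackerExpiryQueue = new TreeSet<>(
        Comparator.comparingLong((Status s) -> s.lastSeen)
                  .thenComparing(s -> s.trackerName));
    public final Map<String, Status> taskTrackers = new HashMap<>();

    public void heartbeat(Status s) {
        taskTrackers.put(s.trackerName, s);
        trackerExpiryQueue.add(s);
    }

    public List<String> expire(long now) {
        List<String> removed = new ArrayList<>();
        while (!trackerExpiryQueue.isEmpty()) {
            Status oldest = trackerExpiryQueue.first();
            if (now - oldest.lastSeen <= TRACKER_EXPIRY_INTERVAL) {
                break; // everything behind it in the queue is newer still
            }
            trackerExpiryQueue.pollFirst();
            // Double-check against the live status: a later heartbeat may
            // have replaced it with a fresher lastSeen.
            Status live = taskTrackers.get(oldest.trackerName);
            if (live != null && now - live.lastSeen > TRACKER_EXPIRY_INTERVAL) {
                taskTrackers.remove(oldest.trackerName);
                removed.add(oldest.trackerName);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        ExpireTrackers e = new ExpireTrackers();
        e.heartbeat(new Status("tracker_old", 0L));
        e.heartbeat(new Status("tracker_fresh", 1_000_000L));
        System.out.println(e.expire(1_000_000L)); // [tracker_old]
    }
}
```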