First Experience Installing and Configuring YARN
This installation is for a development/test environment and covers only YARN, the global resource management and scheduling system. HDFS is still first-generation: HDFS Federation and HDFS HA are not deployed here and will be added later.
OS: CentOS Linux release 6.0 (Final) x86_64
Deployment machines:
dev80.hadoop 192.168.7.80
dev81.hadoop 192.168.7.81
dev82.hadoop 192.168.7.82
dev83.hadoop 192.168.7.83
dev80 serves as the ResourceManager, NameNode, and SecondaryNameNode; the slave nodes (each running a DataNode and a NodeManager) are dev80, dev81, dev82, and dev83.
First, install the JDK and make sure passwordless SSH works from the master to every slave node.
Download the 2.0.5-alpha release from the Hadoop website (the latest packaged release at the time of writing; the beta has already been branched off trunk, but you would have to build it yourself).
wget http://apache.fayea.com/apache-mirror/hadoop/common/hadoop-2.0.5-alpha/hadoop-2.0.5-alpha.tar.gz
tar xzvf hadoop-2.0.5-alpha.tar.gz
After unpacking, you will notice the directory layout has changed a lot from Hadoop 1.0 and now resembles the Linux root filesystem: client commands live under bin/, administrator (server-side) start scripts under sbin/ ("super bin"), and all configuration files are consolidated under etc/hadoop, which gains yarn-site.xml and yarn-env.sh on top of the familiar set. YARN daemons are started with sbin/yarn-daemon.sh (single node) or sbin/yarn-daemons.sh (starts the service on multiple slaves).
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 18:18 bin
drwxr-xr-x 3 hadoop hadoop 4096 Aug 14 10:27 etc
drwxr-xr-x 2 hadoop hadoop 4096 Aug 14 10:27 include
drwxr-xr-x 3 hadoop hadoop 4096 Aug 14 10:27 lib
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 15:58 libexec
drwxrwxr-x 3 hadoop hadoop 4096 Aug 14 18:15 logs
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 18:25 sbin
drwxr-xr-x 4 hadoop hadoop 4096 Aug 14 10:27 share
Configuration
Add export HADOOP_HOME=/usr/local/hadoop/hadoop-2.0.5-alpha to /etc/profile so that it is loaded into the environment at login.
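A minimal sketch of that step. To keep the snippet safe to run anywhere, it appends to a local demo file instead of /etc/profile (swap in the real path on the cluster):

```shell
# Sketch: append HADOOP_HOME to the shell profile and load it.
PROFILE=./profile.demo          # in practice: /etc/profile
cat >> "$PROFILE" <<'EOF'
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.0.5-alpha
EOF
. "$PROFILE"                    # or log out and back in
echo "$HADOOP_HOME"
```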
In hadoop-env.sh, set JAVA_HOME and the SSH options:
export JAVA_HOME=/usr/local/jdk
export HADOOP_SSH_OPTS="-p 58422"

Add the following nodes to the slaves file:
dev80.hadoop
dev81.hadoop
dev82.hadoop
dev83.hadoop

core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://dev80.hadoop:8020</value>
    <final>true</final>
  </property>
</configuration>

In hdfs-site.xml, configure where the NameNode stores the edit log and fsimage, and where the DataNodes store their blocks:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/yarn/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/yarn/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>

yarn-site.xml: in YARN, the shuffle phase has been split out into its own service, which must be started alongside the NodeManager as an auxiliary service. This makes it possible to plug in a third-party shuffle provider and ShuffleConsumer, e.g. replacing the current HTTP shuffle with an RDMA shuffle, or adopting a better-suited merge strategy for intermediate results to improve performance:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>dev80.hadoop:9080</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>dev80.hadoop:9081</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>dev80.hadoop:9082</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>

In mapred-site.xml, set mapreduce.framework.name to yarn so that MR jobs are submitted to the ResourceManager:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Then rsync these conf files to every slave node.
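One way to do that sync, sketched with the install path and SSH port used in this deployment. The commands are echoed rather than executed so the loop is safe to run anywhere; drop the `echo` to actually sync:

```shell
# Hedged sketch: push etc/hadoop to each slave over the non-default SSH port.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop/hadoop-2.0.5-alpha}
for host in dev81.hadoop dev82.hadoop dev83.hadoop; do
  # echoed for safety; remove `echo` on the real cluster
  echo rsync -az -e "ssh -p 58422" "$HADOOP_HOME/etc/hadoop/" "$host:$HADOOP_HOME/etc/hadoop/"
done
```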
Starting the Services
Start HDFS first.
bin/hdfs namenode -format
After this command runs, /data/yarn/name has been formatted.
Start the NameNode:
sbin/hadoop-daemon.sh start namenode
Start the DataNodes:
sbin/hadoop-daemons.sh start datanode (note the plural hadoop-daemons.sh: it invokes slaves.sh, which reads the slaves file and ssh-es to each slave node to start the service there)
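Roughly, the plural script behaves like the loop below. This is an illustrative sketch, not the real script; the hostnames are inlined from the slaves file above and the ssh commands are echoed rather than executed:

```shell
# Sketch of hadoop-daemons.sh: run the single-node hadoop-daemon.sh on every
# slave over ssh (port 58422 per HADOOP_SSH_OPTS). Echoed for safety.
printf '%s\n' dev80.hadoop dev81.hadoop dev82.hadoop dev83.hadoop |
while read -r host; do
  echo ssh -p 58422 "$host" '$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode'
done
```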
At this point the NameNode and DataNodes are up.
The HDFS web UI is available at http://192.168.7.80:50070.
Start the ResourceManager:
sbin/yarn-daemon.sh start resourcemanager
Start the NodeManagers:
sbin/yarn-daemons.sh start nodemanager
Check the YARN web UI at http://192.168.7.80:8088/cluster.
Start the history server:
sbin/mr-jobhistory-daemon.sh start historyserver
Its page is at http://192.168.7.80:19888.
Run a simple example:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar pi 30 30
Number of Maps = 30
Samples per Map = 30
13/08/19 12:03:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Wrote input for Map #16
Wrote input for Map #17
Wrote input for Map #18
Wrote input for Map #19
Wrote input for Map #20
Wrote input for Map #21
Wrote input for Map #22
Wrote input for Map #23
Wrote input for Map #24
Wrote input for Map #25
Wrote input for Map #26
Wrote input for Map #27
Wrote input for Map #28
Wrote input for Map #29
Starting Job
13/08/19 12:03:52 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
13/08/19 12:03:52 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
13/08/19 12:03:53 INFO input.FileInputFormat: Total input paths to process : 30
13/08/19 12:03:53 INFO mapreduce.JobSubmitter: number of splits:30
13/08/19 12:03:53 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
13/08/19 12:03:53 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
13/08/19 12:03:53 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
13/08/19 12:03:53 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
13/08/19 12:03:53 WARN conf.Configuration: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
13/08/19 12:03:53 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
13/08/19 12:03:53 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
13/08/19 12:03:53 WARN conf.Configuration: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
13/08/19 12:03:53 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
13/08/19 12:03:53 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
13/08/19 12:03:53 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/08/19 12:03:53 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
13/08/19 12:03:53 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
13/08/19 12:03:53 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
13/08/19 12:03:53 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
13/08/19 12:03:53 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1376884226092_0001
13/08/19 12:03:53 INFO client.YarnClientImpl: Submitted application application_1376884226092_0001 to ResourceManager at dev80.hadoop/192.168.7.80:9080
13/08/19 12:03:53 INFO mapreduce.Job: The url to track the job: http://dev80.hadoop:8088/proxy/application_1376884226092_0001/
13/08/19 12:03:53 INFO mapreduce.Job: Running job: job_1376884226092_0001
13/08/19 12:04:00 INFO mapreduce.Job: Job job_1376884226092_0001 running in uber mode : false
13/08/19 12:04:00 INFO mapreduce.Job: map 0% reduce 0%
13/08/19 12:04:10 INFO mapreduce.Job: map 3% reduce 0%
13/08/19 12:04:11 INFO mapreduce.Job: map 23% reduce 0%
13/08/19 12:04:13 INFO mapreduce.Job: map 27% reduce 0%
13/08/19 12:04:14 INFO mapreduce.Job: map 43% reduce 0%
13/08/19 12:04:15 INFO mapreduce.Job: map 73% reduce 0%
13/08/19 12:04:16 INFO mapreduce.Job: map 100% reduce 0%
13/08/19 12:04:17 INFO mapreduce.Job: map 100% reduce 100%
13/08/19 12:04:17 INFO mapreduce.Job: Job job_1376884226092_0001 completed successfully
13/08/19 12:04:17 INFO mapreduce.Job: Counters: 44
    File System Counters
        FILE: Number of bytes read=666
        FILE: Number of bytes written=2258578
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=8060
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=123
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters
        Launched map tasks=30
        Launched reduce tasks=1
        Data-local map tasks=27
        Rack-local map tasks=3
        Total time spent by all maps in occupied slots (ms)=358664
        Total time spent by all reduces in occupied slots (ms)=5182
    Map-Reduce Framework
        Map input records=30
        Map output records=60
        Map output bytes=540
        Map output materialized bytes=840
        Input split bytes=4520
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=840
        Reduce input records=60
        Reduce output records=0
        Spilled Records=120
        Shuffled Maps =30
        Failed Shuffles=0
        Merged Map outputs=30
        GC time elapsed (ms)=942
        CPU time spent (ms)=14180
        Physical memory (bytes) snapshot=6924914688
        Virtual memory (bytes) snapshot=22422675456
        Total committed heap usage (bytes)=5318574080
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=3540
    File Output Format Counters
        Bytes Written=97
Job Finished in 24.677 seconds
Estimated value of Pi is 3.13777777777777777778

The completed job also shows up on the job history page.
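For context, the pi example estimates π by sampling points in the unit square and computing 4 × (points inside the quarter circle) / (total points). With 30 maps × 30 samples = 900 points, the printed estimate corresponds to 706 points landing inside; note that 706 is inferred from the estimate here, not reported directly by the job:

```shell
# 4 * 706 / 900 reproduces the job's estimate to four decimal places.
awk 'BEGIN { printf "%.4f\n", 4 * 706 / 900 }'
# prints 3.1378
```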
Java processes reported by jps on dev80:
27172 JobHistoryServer
28627 Jps
26699 ResourceManager
26283 NameNode
26507 DataNode
27014 NodeManager
Java processes reported by jps on dev81:
3232 Jps
1858 NodeManager
1709 DataNode
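A quick way to sanity-check a node is to look for the expected daemon names in the jps output. This sketch checks against the dev80 listing above; on a live node you would replace the hard-coded text with `jps_out=$(jps)`:

```shell
# Hedged sketch: verify expected daemons appear in jps output.
# Sample text is the dev80 listing; substitute jps_out=$(jps) on a real node.
jps_out='27172 JobHistoryServer
26699 ResourceManager
26283 NameNode
26507 DataNode
27014 NodeManager'
for d in NameNode DataNode ResourceManager NodeManager JobHistoryServer; do
  if printf '%s\n' "$jps_out" | grep -q "$d"; then
    echo "$d OK"
  else
    echo "$d MISSING"
  fi
done
```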
With that, the YARN cluster setup is complete.
References:
http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-install/
Original post: http://blog.csdn.net/lalaguozhe/article/details/10062619 (please credit the source when reposting).