Setting up a hadoop-2.0.2-alpha (YARN) cluster


1. Machine IPs and root credentials

     10.28.168.157 root/cdyjs

     10.28.168.158 root/cdyjs

     10.28.168.159 root/cdyjs

/etc/hosts entries:

     10.28.168.157 nn nn.360buy.com n1

     10.28.168.158 slave1 slave1.360buy.com n2

     10.28.168.159 slave2 slave2.360buy.com n3

 

2. As root, log in to 157, 158, and 159 to create the hadoop2 user and set its password

(1) Create the hadoop2 user with its home directory:

useradd -d /home/hadoop2 -m hadoop2

(2) Set the password to 123456:

passwd hadoop2

 

3. As hadoop2, upload hadoop-2.0.2-alpha.tar.gz to /home/hadoop2

 

4. As hadoop2, extract hadoop-2.0.2-alpha.tar.gz:

tar zxvf hadoop-2.0.2-alpha.tar.gz

 

5. Enable passwordless SSH between the hadoop2 accounts on 157, 158, and 159. All of the following commands run as hadoop2.

(1) On 157, 158, and 159, run the following command, pressing Enter at every prompt:

 ssh-keygen -t rsa

(2) On 157, run:

    scp ~/.ssh/id_rsa.pub hadoop2@slave1:/home/hadoop2/.ssh/id_rsa.pub_m

    scp ~/.ssh/id_rsa.pub hadoop2@slave2:/home/hadoop2/.ssh/id_rsa.pub_m

    cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys

    chmod 600 ~/.ssh/authorized_keys

(3) On 158 and 159, run:

    cp ~/.ssh/id_rsa.pub_m ~/.ssh/authorized_keys

    chmod 600 ~/.ssh/authorized_keys
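Passwordless login also depends on permissions: sshd ignores ~/.ssh and authorized_keys when they are group- or world-writable, which is why the chmod 600 step matters. A minimal sketch of the required layout, run here in a scratch directory so it is safe to try anywhere (on the real hosts the target is /home/hadoop2/.ssh):

```shell
# Scratch directory standing in for /home/hadoop2/.ssh on a real node.
SSH_DIR="$(mktemp -d)/.ssh"

# sshd requires the directory to be 700 and the key file 600;
# looser permissions make it silently fall back to password login.
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"
touch "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"

# Show the effective permission bits.
stat -c '%a %n' "$SSH_DIR" "$SSH_DIR/authorized_keys"
```

After copying the keys, `ssh hadoop2@slave1` from 157 should log in without a password prompt; if it still asks for one, check these permission bits first.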

 

6. As hadoop2, log in to 157, 158, and 159 and update the environment variables

Add the following lines to .bash_profile:

export JAVA_HOME=/usr/local/jdk1.6.0_30

export JAVA_BIN=/usr/local/jdk1.6.0_30/bin

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export JAVA_OPTS="-Djava.library.path=/usr/local/lib -server -Xms1024m -Xmx2048m -XX:MaxPermSize=256m -Djava.awt.headless=true -Dsun.net.client.defaultReadTimeout=60000 -Djmagick.systemclassloader=no -Dnetworkaddress.cache.ttl=300 -Dsun.net.inetaddr.ttl=300"

export HADOOP_HOME=/home/hadoop2/hadoop-2.0.2-alpha

export HADOOP_MAPRED_HOME=${HADOOP_HOME}

export HADOOP_COMMON_HOME=${HADOOP_HOME}

export HADOOP_HDFS_HOME=${HADOOP_HOME}

export HADOOP_YARN_HOME=${HADOOP_HOME}

export PATH=$PATH:${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin

export JAVA_HOME JAVA_BIN PATH CLASSPATH JAVA_OPTS

export HADOOP_LIB=${HADOOP_HOME}/lib

export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
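The new variables only take effect on the next login; to pick them up immediately, source the profile in the current shell. A quick sanity check, using the same paths as above (the exports are repeated here only so the snippet is self-contained):

```shell
# On a cluster node you would run:  source ~/.bash_profile
# The two exports below mirror the profile for a self-contained check.
export HADOOP_HOME=/home/hadoop2/hadoop-2.0.2-alpha
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

# HADOOP_CONF_DIR should resolve under HADOOP_HOME.
echo "$HADOOP_CONF_DIR"
```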

 

7. Configure hadoop (hadoop-2.0.2-alpha/etc/hadoop)

(1) Edit the configuration files on 157

core-site.xml

<configuration>

<property>

 <name>io.native.lib.available</name>

 <value>true</value>

</property>

 

<property>

 <name>fs.default.name</name>

 <value>hdfs://nn:9000</value>

 <description>The name of the default file system. Either the literal string "local" or a host:port for NDFS.</description>

 <final>true</final>

</property>

</configuration>

 

hdfs-site.xml:

<configuration>

<property>

 <name>dfs.namenode.name.dir</name>

 <value>file:/home/hadoop2/dfsdata/name</value>

 <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories, then the name table is replicated in all of the directories, for redundancy.</description>

 <final>true</final>

</property>

 

<property>

 <name>dfs.datanode.data.dir</name>

 <value>file:/home/hadoop2/dfsdata/data</value>

 <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>

 <final>true</final>

</property>

 

<property>

 <name>dfs.replication</name>

 <value>2</value>

</property>

 

<property>

 <name>dfs.permissions.enabled</name>

 <value>false</value>

</property>

</configuration>

 

mapred-site.xml

<configuration>

<property>

 <name>mapreduce.framework.name</name>

 <value>yarn</value>

</property>

 

<property>

 <name>mapreduce.job.tracker</name>

 <value>hdfs://nn:9001</value>

 <final>true</final>

</property>

 

<property>

 <name>mapreduce.map.memory.mb</name>

 <value>1536</value>

</property>

 

<property>

 <name>mapreduce.map.java.opts</name>

 <value>-Xmx1024M</value>

</property>

 

<property>

 <name>mapreduce.reduce.memory.mb</name>

 <value>3072</value>

</property>

 

<property>

 <name>mapreduce.reduce.java.opts</name>

 <value>-Xmx2560M</value>

</property>

 

<property>

 <name>mapreduce.task.io.sort.mb</name>

 <value>512</value>

</property>

 

<property>

 <name>mapreduce.task.io.sort.factor</name>

 <value>100</value>

</property>

 

<property>

 <name>mapreduce.reduce.shuffle.parallelcopies</name>

 <value>50</value>

</property>

 

<property>

 <name>mapred.system.dir</name>

 <value>file:/home/hadoop2/mapreddata/system</value>

 <final>true</final>

</property>

 

<property>

 <name>mapred.local.dir</name>

 <value>file:/home/hadoop2/mapreddata/local</value>

 <final>true</final>

</property>

</configuration>

 

yarn-site.xml

<configuration>

 

<!-- Site specific YARN configuration properties -->

<property>

 <name>yarn.resourcemanager.address</name>

 <value>nn:8080</value>

</property>

 

<property>

 <name>yarn.resourcemanager.scheduler.address</name>

 <value>nn:8081</value>

</property>

 

<property>

 <name>yarn.resourcemanager.resource-tracker.address</name>

 <value>nn:8082</value>

</property>

 

<property>

 <name>yarn.nodemanager.aux-services</name>

 <value>mapreduce.shuffle</value>

</property>

 

<property>

 <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

 <value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

 

<property>

  <name>yarn.nodemanager.local-dirs</name>

 <value>file:/home/hadoop2/nmdata/local</value>

 <description>the local directories used by the NodeManager</description>

</property>

 

<property>

 <name>yarn.nodemanager.log-dirs</name>

 <value>file:/home/hadoop2/nmdata/log</value>

 <description>the directories used by NodeManagers as log directories</description>

</property>

</configuration>

 

hadoop-env.sh

Add JAVA_HOME to this file.
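Hadoop's launcher scripts do not reliably inherit JAVA_HOME from the login environment, so it is set explicitly in hadoop-env.sh. A sketch of the added line, assuming the same JDK path as in .bash_profile:

```shell
# hadoop-env.sh: point the Hadoop scripts at the JDK explicitly.
export JAVA_HOME=/usr/local/jdk1.6.0_30
```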

 

The slaves file:

slave1

slave2

 

(2) Then copy the configuration files to 158 and 159:

scp /home/hadoop2/hadoop-2.0.2-alpha/etc/hadoop/* hadoop2@slave1:/home/hadoop2/hadoop-2.0.2-alpha/etc/hadoop/

scp /home/hadoop2/hadoop-2.0.2-alpha/etc/hadoop/* hadoop2@slave2:/home/hadoop2/hadoop-2.0.2-alpha/etc/hadoop/

 

8. On 157, format the distributed filesystem and start HDFS and YARN:

hdfs namenode -format

start-dfs.sh

start-yarn.sh

 

9. Run one of the bundled example jobs:

hadoop jar hadoop-2.0.2-alpha/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.2-alpha.jar wordcount /temp/input /temp/output
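The wordcount job assumes /temp/input already exists in HDFS and contains text files. A sketch of preparing it; the sample file name is illustrative, and the hdfs commands (shown as comments) run on the cluster, not here:

```shell
# Create a small sample input file locally.
printf 'hello hadoop\nhello yarn\n' > /tmp/wc_sample.txt

# On the cluster, load it into HDFS before submitting the job:
#   hdfs dfs -mkdir -p /temp/input
#   hdfs dfs -put /tmp/wc_sample.txt /temp/input
# and read the result after the job finishes:
#   hdfs dfs -cat /temp/output/part-r-00000

# The sample file has two lines of input.
wc -l < /tmp/wc_sample.txt
```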

 

10. Check cluster status and job progress:

http://nn:50070/dfshealth.jsp

http://nn:8088/cluster

 

Notes:

(1) The yarn-site.xml parameters yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs must both be set. If they are not, the NodeManager is marked Unhealthy and cannot serve requests; the symptom is that submitted jobs sit in the pending state and never make progress.
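A related pitfall: every directory named in hdfs-site.xml and yarn-site.xml must be writable by hadoop2 on its node, and pre-creating the layout avoids permission surprises at startup. A sketch using a scratch base directory so it runs anywhere (on the real nodes the base is /home/hadoop2):

```shell
# Stand-in for /home/hadoop2 on a real node.
BASE="$(mktemp -d)"

# Directory layout referenced by the configuration files above.
mkdir -p "$BASE/dfsdata/name" "$BASE/dfsdata/data" \
         "$BASE/nmdata/local" "$BASE/nmdata/log"

# List what was created.
ls "$BASE/dfsdata" "$BASE/nmdata"
```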

 

Reference links:

http://www.cnblogs.com/aniuer/archive/2012/07/16/2594448.html

http://www.cnblogs.com/scotoma/archive/2012/09/18/2689902.html

http://hadoop.apache.org/docs/r2.0.2-alpha/hadoop-yarn/hadoop-yarn-site/ClusterSetup.html



 
