Hadoop YARN Cluster Setup
1. Server layout
192.168.0.101 primary NameNode
192.168.0.101 secondary NameNode
192.168.0.101 DataNode 1
192.168.0.102 DataNode 2
2. /etc/hosts settings
192.168.0.101 namenode
192.168.0.102 datanode
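Both hosts need these name mappings. A minimal sketch of applying them, writing to a temp file so it can run without root (on a real node you would append to /etc/hosts itself):

```shell
# Append the cluster hostnames (temp copy here; on a real node,
# edit /etc/hosts itself as root).
HOSTS=$(mktemp)
cat >> "$HOSTS" <<'EOF'
192.168.0.101 namenode
192.168.0.102 datanode
EOF
awk '{print $2}' "$HOSTS"   # prints the two hostnames
```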
3. Passwordless SSH login
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh localhost
The first connection asks for a password; from then on login is automatic.
In addition, since the namenode must reach the datanode over SSH, the datanode's authorized_keys must also contain the namenode's public key.
On the namenode, copy the key over with scp, e.g.:
scp id_dsa.pub tizen@datanode:share
Then on the datanode:
cat ~/share/id_dsa.pub >> ~/.ssh/authorized_keys
4. Install a JDK, e.g. openjdk-7-jdk
Symlink the installation to /usr/local/jdk and set:
export JAVA_HOME=/usr/local/jdk
export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$JAVA_HOME/bin
5. Download Hadoop 2.7.1
Symlink the unpacked directory to /usr/local/hadoop
6. Set the environment variables in .bashrc
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_PID_DIR=/data/hadoop/pids
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
7. Create the runtime directories
sudo mkdir -p /data/hadoop/pids
sudo mkdir -p /data/hadoop/storage/tmp
sudo mkdir -p /data/hadoop/storage/hdfs/name
sudo mkdir -p /data/hadoop/storage/hdfs/data
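The same layout can be scripted with a configurable prefix. A sketch using a temp directory so it runs without root; in the real setup the prefix is /data, and since sudo mkdir leaves the directories owned by root, a chown to the Hadoop user is also needed:

```shell
# Create the Hadoop runtime directory layout under $PREFIX.
PREFIX=${PREFIX:-$(mktemp -d)}
mkdir -p "$PREFIX/hadoop/pids" \
         "$PREFIX/hadoop/storage/tmp" \
         "$PREFIX/hadoop/storage/hdfs/name" \
         "$PREFIX/hadoop/storage/hdfs/data"
# In the real setup, hand the tree to the Hadoop user, e.g.:
#   sudo chown -R tizen:tizen /data/hadoop
ls "$PREFIX/hadoop/storage/hdfs"   # prints: data  name
```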
8. Configure core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/data/hadoop/storage/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.native.lib</name>
<value>true</value>
</property>
</configuration>
9. Configure hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>namenode:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/data/hadoop/storage/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data/hadoop/storage/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
10. Configure mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>namenode:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>namenode:19888</value>
</property>
</configuration>
11. Configure yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>namenode:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>namenode:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>namenode:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>namenode:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>namenode:8034</value>
</property>
</configuration>
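Each of the config files above can be installed non-interactively with a here-doc. A sketch for one property of yarn-site.xml, using a temp conf dir so it runs anywhere; in the real setup the target is $HADOOP_CONF_DIR, i.e. /usr/local/hadoop/etc/hadoop:

```shell
# Write a (truncated, single-property) yarn-site.xml into the conf directory.
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-$(mktemp -d)}
cat > "$HADOOP_CONF_DIR/yarn-site.xml" <<'EOF'
<configuration>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>namenode:8034</value>
</property>
</configuration>
EOF
grep -o 'namenode:[0-9]*' "$HADOOP_CONF_DIR/yarn-site.xml"   # prints: namenode:8034
```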
12. Add the following environment variables at the top of these three files:
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
/usr/local/hadoop/etc/hadoop/mapred-env.sh
/usr/local/hadoop/etc/hadoop/yarn-env.sh
export JAVA_HOME=/usr/local/jdk
export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_PID_DIR=/data/hadoop/pids
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
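Prepending these exports to each *-env.sh can be scripted too. A sketch for one file and one variable; a temp file stands in for hadoop-env.sh so the snippet can run anywhere:

```shell
# Prepend an export line to an env script without losing its contents.
ENV_FILE=$(mktemp)                       # stands in for hadoop-env.sh
echo '# original hadoop-env.sh contents' > "$ENV_FILE"
TMP=$(mktemp)
{ printf 'export JAVA_HOME=/usr/local/jdk\n'; cat "$ENV_FILE"; } > "$TMP"
mv "$TMP" "$ENV_FILE"
head -n 1 "$ENV_FILE"   # prints: export JAVA_HOME=/usr/local/jdk
```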
13. List the worker hostnames
In this limited two-machine setup, one host is a dedicated datanode while the namenode doubles as a datanode, so both hostnames go into the slaves file:
# vim /usr/local/hadoop/etc/hadoop/slaves
namenode
datanode
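The slaves file can also be generated from a hostname list; a temp file stands in for /usr/local/hadoop/etc/hadoop/slaves here:

```shell
# Generate the slaves file: one worker hostname per line.
SLAVES_FILE=$(mktemp)    # real path: /usr/local/hadoop/etc/hadoop/slaves
printf '%s\n' namenode datanode > "$SLAVES_FILE"
cat "$SLAVES_FILE"       # prints the two hostnames, one per line
```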
14. Testing the Hadoop cluster
Format HDFS first (run once, on the namenode):
# hdfs namenode -format
tizen@samsung:~$ start-dfs.sh
Starting namenodes on [namenode]
namenode: starting namenode, logging to /usr/local/hadoop/logs/hadoop-tizen-namenode-samsung.out
namenode: starting datanode, logging to /usr/local/hadoop/logs/hadoop-tizen-datanode-samsung.out
datanode: starting datanode, logging to /usr/local/hadoop/logs/hadoop-tizen-datanode-hp.out
Starting secondary namenodes [namenode]
namenode: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-tizen-secondarynamenode-samsung.out
tizen@samsung:~$ jps
12307 DataNode
12490 SecondaryNameNode
12597 Jps
12148 NameNode
If the versions are consistent, the output should match the above; otherwise a long stream of SSH and similar warnings appears (the fix is described in a separate article). The web interfaces are then reachable at:
http://192.168.0.101:8034 (YARN ResourceManager web UI, as configured above)
http://192.168.0.101:9001 (secondary NameNode web UI, as configured above)
http://192.168.0.101:50070 (NameNode web UI)