Installing Hadoop 2.7.2


I. Prepare the installation environment:

1. Install VMware Workstation 12

2. Virtual machines running Red Hat RHEL 6.6

[hadoop@master ~]$ more /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

 

192.168.150.30 master TST-RHEL66-00
192.168.150.31 slave1 TST-RHEL66-01
192.168.150.32 slave2 TST-RHEL66-02

[hadoop@master ~]$

3. The virtual machines need passwordless SSH login between them

## (Note: there is no space between ssh and -keygen.) Just press Enter through all the prompts.

[hadoop@master ~]$ cd

[hadoop@master ~]$ pwd

/home/hadoop

[hadoop@master ~]$ ssh-keygen -t rsa

## Change to the .ssh directory (cd ~/.ssh); you will see two newly generated files, id_rsa and id_rsa.pub

[hadoop@master ~]$ cd .ssh/

[hadoop@master .ssh]$ ls

authorized_keys id_rsa  id_rsa.pub  known_hosts

## Run cp id_rsa.pub authorized_keys

[hadoop@master .ssh]$ cp id_rsa.pub authorized_keys

## Copy the master's authorized_keys file to /home/hadoop/.ssh/ on each slave

[hadoop@master .ssh]$ scp authorized_keys slave1:~/.ssh/

[hadoop@master .ssh]$ scp authorized_keys slave2:~/.ssh/

## Fix the permissions on the .ssh directory and on authorized_keys (this step is mandatory; otherwise SSH will still prompt for a password)

sudo chmod 644 ~/.ssh/authorized_keys

sudo chmod 700 ~/.ssh
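To verify, each of the following should print the remote date without asking for a password (the very first connection may still ask you to confirm the host key):

[hadoop@master ~]$ ssh slave1 date

[hadoop@master ~]$ ssh slave2 date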

II. Hadoop 2.0 stable release media

http://mirrors.cnnic.cn/apache/

http://mirrors.cnnic.cn/apache/hadoop/core/stable/hadoop-2.7.2.tar.gz

1. Upload and extract the archive, and create a symlink

  # tar xzvf hadoop-2.7.2.tar.gz
  # chown -R hadoop:hadoop hadoop-2.7.2    (-R applies the ownership recursively, so all subdirectories are covered)
  $ ln -s hadoop-2.7.2 hadoop2



2. Configure environment variables

Set the environment variables on all three hosts.

Append the following to the hadoop user's .bashrc:

# User specific aliases and functions

export JAVA_HOME=/usr/java/latest
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib

export HADOOP_DEV_HOME=/home/hadoop/hadoop2
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

Send it to the other two hosts:

[hadoop@master ~]$ scp .bashrc slave1:~

[hadoop@master ~]$ scp .bashrc slave2:~
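To make the variables take effect in the current shell without logging out and back in, re-read the file on each host:

[hadoop@master ~]$ source ~/.bashrc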

3. Hadoop configuration files

Edit hadoop-env.sh and mapred-env.sh

Configure hadoop-env.sh
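The original screenshot is missing here. The usual change is to hard-code JAVA_HOME near the top of ~/hadoop2/etc/hadoop/hadoop-env.sh (a minimal sketch, reusing the JDK path from the .bashrc above):

export JAVA_HOME=/usr/java/latest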



Configure mapred-env.sh
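The screenshot is missing again; the typical edit mirrors the one above, setting JAVA_HOME in ~/hadoop2/etc/hadoop/mapred-env.sh:

export JAVA_HOME=/usr/java/latest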


Edit yarn-env.sh and the slaves file

~/hadoop2/etc/hadoop/yarn-env.sh
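The screenshot is missing; as with the other env scripts, the usual change is to set JAVA_HOME explicitly:

export JAVA_HOME=/usr/java/latest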


Configure ~/hadoop2/etc/hadoop/slaves
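The screenshot is missing; given the /etc/hosts above, the slaves file simply lists the DataNode hosts, one per line:

slave1
slave2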


Edit core-site.xml

Create the Hadoop working directory. (The default temporary directory is /tmp, whose contents vanish when the server reboots, so we point Hadoop at a separate directory, /hadoop2.)
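A minimal sketch as root, mirroring the data-directory commands later in this guide:

# mkdir -p /hadoop2/tmp

# chown -R hadoop:hadoop /hadoop2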


Edit ~/hadoop2/etc/hadoop/core-site.xml

fs.defaultFS is the address of the NameNode.

hadoop.tmp.dir is Hadoop's temporary directory, the /hadoop2/tmp we just created as root.

In hadoop.proxyuser.hadoop.hosts, the middle ".hadoop." is the user name; ours is hadoop. If you run Hadoop as a different user, substitute that name, e.g. hadoop.proxyuser.userhadoop.hosts.
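The screenshot is missing; a minimal core-site.xml sketch consistent with the notes above (the 9000 port and the wildcard proxyuser values are assumptions, not from the original):

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop2/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>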


Edit hdfs-site.xml

Create the Hadoop data directories. (In production these directories should sit on a dedicated disk or a dedicated array mount.)


Edit ~/hadoop2/etc/hadoop/hdfs-site.xml

dfs.replication is the number of replicas; we set it to 2 here (the default is 3).

dfs.webhdfs.enabled enables monitoring HDFS over the web (the WebHDFS REST interface).
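The screenshot is missing; a minimal sketch, with the name/data directories taken from the cleanup commands in the troubleshooting section below:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/hadoop2/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/hadoop2/dfs/data</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>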


III. Hadoop 2.0 cluster configuration and installation

Edit ~/hadoop2/etc/hadoop/mapred-site.xml

The official Hadoop 2.7.2 documentation has no vcores setting; the original screenshot came from Hadoop 2.2.0.
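The screenshot is missing; in 2.7.2 this file ships only as a template, so copy it first. The one essential property routes MapReduce onto YARN (a sketch; any vcores/memory tuning from the old 2.2.0 screenshot is omitted):

$ cp ~/hadoop2/etc/hadoop/mapred-site.xml.template ~/hadoop2/etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>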


Edit ~/hadoop2/etc/hadoop/yarn-site.xml
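The screenshot is missing; a minimal sketch that points the NodeManagers at the master's ResourceManager and enables the MapReduce shuffle service:

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>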



Copy the configuration to the other nodes

## Copy .bashrc and the Hadoop install directory to slave1 and slave2

$ scp .bashrc slave1:~

$ scp .bashrc slave2:~

$ scp -r hadoop-2.7.2 slave1:~

$ scp -r hadoop-2.7.2 slave2:~

## Create the symlink on slave1 and slave2

$ ln -s hadoop-2.7.2 hadoop2

## Create the Hadoop working directories on slave1 and slave2 (as root)

# mkdir -p /hadoop2/dfs/data

# chown -R hadoop:hadoop /hadoop2/

Start the HDFS cluster (at this point the HDFS file system comes up):

With passwordless SSH configured, you can start the whole distributed file system with start-dfs.sh.

If etc/hadoop/slaves and ssh trusted access is configured (see Single Node Setup), all of the HDFS processes can be started with a utility script.
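Before the very first start, format the NameNode once on the master (only once; repeated formatting causes the clusterID problem described in the troubleshooting section below):

[hadoop@master ~]$ /home/hadoop/hadoop2/bin/hdfs namenode -format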

[hdfs]$ $HADOOP_PREFIX/sbin/start-dfs.sh

Log in to the web console to check the HDFS cluster status:

http://192.168.150.30:50070
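A quick sanity check is jps on each node. On the master you should see something like the following (the PIDs are illustrative), and each slave should show a DataNode, as in the jps output later in this article:

[hadoop@master ~]$ jps
2501 NameNode
2688 SecondaryNameNode
2755 Jps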

IV. Problems you may encounter when starting the HDFS cluster

1. The startup log warns "unable to load native-library"

Hadoop's bundled native library is compiled for 32-bit. Our Red Hat 6.6 is a 64-bit system, so either recompile the library or find a 64-bit build to download.


2. Repeated formatting keeps the DataNode process from starting, with no output in the master's logs

Running ${HADOOP_PREFIX}/bin/hdfs namenode -format more than once makes the NameNode's clusterID inconsistent with the DataNodes'.

[hadoop@master ~]$ head /hadoop2/tmp/dfs/namesecondary/current/VERSION

#Thu Aug 04 18:39:53 PDT 2016

namespaceID=1226758419

clusterID=CID-48e0bfd8-5722-4b9d-9da2-79bc13fd8388

cTime=0

storageType=NAME_NODE

blockpoolID=BP-722559016-192.168.150.30-1470359389680

layoutVersion=-63

[hadoop@master ~]$

On the master node, delete the files generated by the previous format, then format again. (The stale clusterID also lives in each DataNode's VERSION file, so clear /hadoop2/dfs/data on the slaves as well.)

[hadoop@master hadoop2]$ rm -fr /hadoop2/tmp/*

[hadoop@master hadoop2]$ rm -fr /hadoop2/dfs/name/*

[hadoop@master hadoop2]$ rm -fr /hadoop2/dfs/data/*

[hadoop@master hadoop2]$ /home/hadoop/hadoop2/bin/hdfs namenode -format

[hadoop@master hadoop2]$ start-dfs.sh

V. Start the YARN cluster

 

On the master, run the following:

[hadoop@master sbin]$ pwd

/home/hadoop/hadoop2/sbin

[hadoop@master sbin]$ start-yarn.sh

starting yarn daemons

starting resourcemanager, logging to /home/hadoop/hadoop-2.7.2/logs/yarn-hadoop-resourcemanager-master.out

slave2: starting nodemanager, logging to /home/hadoop/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-slave2.out

slave1: starting nodemanager, logging to /home/hadoop/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-slave1.out

[hadoop@master sbin]$

Check the startup logs to confirm the daemons came up successfully:

[hadoop@master logs]$ pwd

/home/hadoop/hadoop2/logs

[hadoop@master logs]$ ll

total 160

-rw-rw-r-- 1 hadoop hadoop 56709 Aug  4 19:15 hadoop-hadoop-namenode-master.log
-rw-rw-r-- 1 hadoop hadoop   718 Aug  4 18:10 hadoop-hadoop-namenode-master.out
-rw-rw-r-- 1 hadoop hadoop   718 Aug  4 18:03 hadoop-hadoop-namenode-master.out.1
-rw-rw-r-- 1 hadoop hadoop 46001 Aug  4 18:39 hadoop-hadoop-secondarynamenode-master.log
-rw-rw-r-- 1 hadoop hadoop   718 Aug  4 18:10 hadoop-hadoop-secondarynamenode-master.out
-rw-rw-r-- 1 hadoop hadoop   718 Aug  4 18:03 hadoop-hadoop-secondarynamenode-master.out.1
-rw-rw-r-- 1 hadoop hadoop     0 Aug  4 18:03 SecurityAuth-hadoop.audit
-rw-rw-r-- 1 hadoop hadoop 34622 Aug  4 19:29 yarn-hadoop-resourcemanager-master.log
-rw-rw-r-- 1 hadoop hadoop  1524 Aug  4 19:29 yarn-hadoop-resourcemanager-master.out

[hadoop@slave1 logs]$ pwd

/home/hadoop/hadoop2/logs

[hadoop@slave1 logs]$ ll yarn-hadoop-nodemanager-slave1.*

-rw-rw-r-- 1 hadoop hadoop 28167 Aug  4 19:29 yarn-hadoop-nodemanager-slave1.log
-rw-rw-r-- 1 hadoop hadoop  1508 Aug  4 19:29 yarn-hadoop-nodemanager-slave1.out

[hadoop@slave1 logs]$ jps

2913 NodeManager

2773 DataNode

3029 Jps

Log in to the web console to check the ResourceManager status:

http://192.168.150.30:8088

Log in to the web console to check NodeManager status:

http://192.168.150.31:8042/node
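As a final smoke test (a sketch, assuming the examples jar that ships with 2.7.2 under share/hadoop/mapreduce), submit a small sample job and watch it appear in the ResourceManager console:

[hadoop@master ~]$ ~/hadoop2/bin/hadoop jar ~/hadoop2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 2 10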

 

