Installing Hadoop in Fully Distributed Mode on CentOS 7 -- Hadoop Installation Series, Part 2
I. Objectives
1. Learn to install Hadoop in fully distributed cluster mode
2. Learn the common pitfalls of the installation and how to climb out of them
II. Environment
The network and hostnames of the three machines are configured as follows:
192.168.10.166 master
192.168.10.167 slave01
192.168.10.168 slave02
A recap of the prerequisites:
1. Install three CentOS 7 servers; the kernel version is Linux master 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
2. Create the hadoop group, then create the hadoop user and add it to the hadoop group
3. Edit the /etc/hosts file and the /etc/hostname file
4. Install the JDK and verify that the installation is correct
5. Set up passwordless SSH within the cluster so that master can log in to slave01 and slave02 without a password
For the detailed environment and prerequisite steps, see the previous post:
http://blog.csdn.net/firehadoop/article/details/68953541
III. Steps
1. Edit the hadoop user's environment variables on master and copy the file to slave01 and slave02
[hadoop@master ~]$ vi .bashrc
export JAVA_HOME=/usr/java/jdk1.8.0_121
export HADOOP_HOME=/home/hadoop/bigdata/hadoop
export HADOOP_USER_NAME=hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
--Copy the configuration file to slave01 and slave02
[hadoop@master ~]$ scp .bashrc hadoop@slave01:/home/hadoop/
.bashrc 100% 418 0.4KB/s 00:00
[hadoop@master ~]$ scp .bashrc hadoop@slave02:/home/hadoop/
.bashrc 100% 418 0.4KB/s 00:00
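After copying .bashrc, it is worth confirming on each node that the variables resolve as expected. A minimal sketch, repeating the paths used in this post (adjust JAVA_HOME and HADOOP_HOME to your own install):

```shell
# Same variables as in the .bashrc above; adjust the paths to your install.
export JAVA_HOME=/usr/java/jdk1.8.0_121
export HADOOP_HOME=/home/hadoop/bigdata/hadoop
export HADOOP_USER_NAME=hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# Print the first three PATH entries; $JAVA_HOME/bin should come first.
echo "$PATH" | tr ':' '\n' | head -n 3
```

On the real machines, `source ~/.bashrc` (or re-login) picks the variables up, and `which hadoop` should then resolve once step 2 below is done.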
2. As user hadoop, extract the Hadoop tarball under /home/hadoop and run the following shell commands to deploy it to /home/hadoop/bigdata/hadoop
[hadoop@master ~]$ tar -zxf hadoop-2.7.3.tar.gz
mkdir bigdata
mv hadoop-2.7.3 bigdata/
cd bigdata/
mv hadoop-2.7.3 hadoop
3. Stop the firewall on all three machines and disable the service so it is not loaded again at boot
--If the firewall is left running, then once the Hadoop cluster is up, hadoop commands will only work on master and will fail on every other node.
--Before CentOS 7 the firewall was stopped with the commands below; CentOS 7 manages services with the systemctl tool, which replaces both service and chkconfig:
service iptables stop
/etc/init.d/iptables stop
--Stop the firewall on master
[hadoop@master ~]$ systemctl stop firewalld.service
[hadoop@master ~]$ systemctl disable firewalld.service
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
--Stop the firewall on slave01
[hadoop@slave01 ~]$ systemctl stop firewalld.service
[hadoop@slave01 ~]$ systemctl disable firewalld.service
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
--Stop the firewall on slave02
[hadoop@slave02 ~]$ systemctl stop firewalld.service
[hadoop@slave02 ~]$ systemctl disable firewalld.service
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
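With passwordless SSH already in place, the three stop/disable commands can be followed by a quick status sweep from master. A sketch, using the hostnames from this post; `unreachable` is printed when a host cannot be contacted:

```shell
# Report firewalld status on every cluster node over SSH.
# "systemctl is-active" prints active/inactive on stdout.
check_firewalld() {
  for host in master slave01 slave02; do
    status=$(ssh -o ConnectTimeout=3 "$host" systemctl is-active firewalld 2>/dev/null)
    echo "$host: ${status:-unreachable}"
  done
}
check_firewalld
```

On a correctly prepared cluster every line should read `inactive`.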
4. On master, edit the Hadoop configuration file core-site.xml
vim /home/hadoop/bigdata/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000/</value>
</property>
<property>
<!--set node temp dir -->
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/bigdata/data/hadoop/tmp</value>
</property>
</configuration>
--The official Hadoop documentation has the full parameter reference for core-site.xml.
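The same settings can be laid down non-interactively, which is handy when repeating the install. A sketch; the demo writes under /tmp so it can be tried anywhere, so point HADOOP_CONF_DIR at $HADOOP_HOME/etc/hadoop on the real cluster:

```shell
# Generate core-site.xml from a heredoc instead of editing it by hand.
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/tmp/hadoop-conf-demo}
mkdir -p "$HADOOP_CONF_DIR"
cat > "$HADOOP_CONF_DIR/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/bigdata/data/hadoop/tmp</value>
  </property>
</configuration>
EOF
grep -c '<property>' "$HADOOP_CONF_DIR/core-site.xml"
```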
5. On master, configure the parameters in hdfs-site.xml
--dfs.namenode.secondary.http-address chooses which machine in the cluster runs the SecondaryNameNode. In this example it shares master with the NameNode; in a production environment this is not recommended, and it should be placed on a different server.
--dfs.datanode.data.dir and dfs.namenode.name.dir set where on the local filesystem the DataNode and the NameNode actually store their data.
--dfs.replication sets the number of replicas kept for each HDFS block; this example uses 3.
vim /home/hadoop/bigdata/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<!-- specify the SecondaryNameNode address -->
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/bigdata/data/hadoop/hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/bigdata/data/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
--The official Hadoop documentation covers further NameNode options, such as the block size and which DataNode hosts to include or exclude.
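It does no harm to pre-create, on every node, the storage directories referenced above. A sketch; DATA_ROOT defaults to $HOME/bigdata/data/hadoop, which for the hadoop user is the /home/hadoop/bigdata/data/hadoop layout used in this post:

```shell
# Pre-create the HDFS storage dirs named in hdfs-site.xml / core-site.xml.
# For user hadoop, $HOME/bigdata/... equals /home/hadoop/bigdata/... above.
DATA_ROOT=${DATA_ROOT:-$HOME/bigdata/data/hadoop}
mkdir -p "$DATA_ROOT/hdfs/namenode" "$DATA_ROOT/hdfs/datanode" "$DATA_ROOT/tmp"
ls "$DATA_ROOT/hdfs"
```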
6. Edit the configuration file mapred-site.xml
--Note: the Hadoop 2.7.3 distribution ships only mapred-site.xml.template; copy it to mapred-site.xml before editing.
vim /home/hadoop/bigdata/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
--mapreduce.framework.name tells Hadoop what schedules its MapReduce jobs; here the YARN resource scheduling framework is plugged in.
--Since YARN is the resource scheduler, all resource tuning for map, reduce, shuffle and task execution (down to JVM-level resource management) is also configured in this file.
7. Edit the yarn-site.xml file
--This step configures the ResourceManager and the NodeManager nodes.
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
--The official Hadoop documentation exposes many more ResourceManager and NodeManager tuning parameters for YARN. The minimal settings above are enough to get a lab cluster running, but in a production environment study these sections closely:
- Configurations for ResourceManager
- Configurations for NodeManager
8. Edit the slaves file
[hadoop@master hadoop]$ vim slaves
slave01
slave02
--The slaves file lets the helper scripts run commands on many hosts at once. It is not used by any of the Java-based Hadoop configuration. To use this feature, SSH trust must be established for the account used to run Hadoop.
9. Copy the whole /home/hadoop/bigdata/hadoop directory on master into the corresponding directory on slave01 and slave02
--Copy the hadoop tree to slave01
--Important: first log in to slave01 and create the bigdata directory by hand under /home/hadoop; otherwise the copy lands as /home/hadoop/bigdata itself and the hadoop directory level under bigdata is lost
[hadoop@master hadoop]$ scp -r /home/hadoop/bigdata/hadoop/ hadoop@slave01:/home/hadoop/bigdata
--Important: likewise, first log in to slave02 and create the bigdata directory by hand under /home/hadoop, or the hadoop directory level is lost in the copy
[hadoop@master hadoop]$ scp -r /home/hadoop/bigdata/hadoop/ hadoop@slave02:/home/hadoop/bigdata
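The manual "create bigdata first, then scp" routine can be scripted. A sketch with a dry-run switch so it can be exercised without a cluster; the hostnames and paths are the ones used in this post, and DRY_RUN=0 performs the real copy on master:

```shell
# Create the target directory on each slave, then copy the hadoop tree.
# DRY_RUN=1 (the default) only prints the commands it would run.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi; }

for host in slave01 slave02; do
  run ssh "hadoop@$host" mkdir -p /home/hadoop/bigdata
  run scp -r /home/hadoop/bigdata/hadoop/ "hadoop@$host:/home/hadoop/bigdata"
done
```

Creating the remote directory over ssh first avoids the lost-directory-level pitfall described above.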
10. Initialize the NameNode filesystem
--The NameNode must be formatted first, otherwise the NameNode will not start successfully.
[hadoop@master sbin]$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
17/04/03 05:44:41 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.10.166
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.7.3
17/04/03 05:44:41 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/04/03 05:44:41 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-a36eb93b-a2f3-482e-b3c8-8507c2aeca07
17/04/03 05:44:42 INFO namenode.FSNamesystem: No KeyProvider found.
17/04/03 05:44:42 INFO namenode.FSNamesystem: fsLock is fair:true
17/04/03 05:44:42 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
17/04/03 05:44:42 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
17/04/03 05:44:42 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
17/04/03 05:44:42 INFO blockmanagement.BlockManager: The block deletion will start around 2017 Apr 03 05:44:42
17/04/03 05:44:42 INFO util.GSet: Computing capacity for map BlocksMap
17/04/03 05:44:42 INFO util.GSet: VM type = 64-bit
17/04/03 05:44:42 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
17/04/03 05:44:42 INFO util.GSet: capacity = 2^21 = 2097152 entries
17/04/03 05:44:42 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
17/04/03 05:44:42 INFO blockmanagement.BlockManager: defaultReplication = 3
17/04/03 05:44:42 INFO blockmanagement.BlockManager: maxReplication = 512
17/04/03 05:44:42 INFO blockmanagement.BlockManager: minReplication = 1
17/04/03 05:44:42 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
17/04/03 05:44:42 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
17/04/03 05:44:42 INFO blockmanagement.BlockManager: encryptDataTransfer = false
17/04/03 05:44:42 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
17/04/03 05:44:42 INFO namenode.FSNamesystem: fsOwner = hadoop (auth:SIMPLE)
17/04/03 05:44:42 INFO namenode.FSNamesystem: supergroup = supergroup
17/04/03 05:44:42 INFO namenode.FSNamesystem: isPermissionEnabled = true
17/04/03 05:44:42 INFO namenode.FSNamesystem: HA Enabled: false
17/04/03 05:44:42 INFO namenode.FSNamesystem: Append Enabled: true
17/04/03 05:44:43 INFO util.GSet: Computing capacity for map INodeMap
17/04/03 05:44:43 INFO util.GSet: VM type = 64-bit
17/04/03 05:44:43 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
17/04/03 05:44:43 INFO util.GSet: capacity = 2^20 = 1048576 entries
17/04/03 05:44:43 INFO namenode.FSDirectory: ACLs enabled? false
17/04/03 05:44:43 INFO namenode.FSDirectory: XAttrs enabled? true
17/04/03 05:44:43 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
17/04/03 05:44:43 INFO namenode.NameNode: Caching file names occuring more than 10 times
17/04/03 05:44:43 INFO util.GSet: Computing capacity for map cachedBlocks
17/04/03 05:44:43 INFO util.GSet: VM type = 64-bit
17/04/03 05:44:43 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
17/04/03 05:44:43 INFO util.GSet: capacity = 2^18 = 262144 entries
17/04/03 05:44:43 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
17/04/03 05:44:43 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
17/04/03 05:44:43 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
17/04/03 05:44:43 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
17/04/03 05:44:43 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
17/04/03 05:44:43 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
17/04/03 05:44:43 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
17/04/03 05:44:43 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
17/04/03 05:44:43 INFO util.GSet: Computing capacity for map NameNodeRetryCache
17/04/03 05:44:43 INFO util.GSet: VM type = 64-bit
17/04/03 05:44:43 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
17/04/03 05:44:43 INFO util.GSet: capacity = 2^15 = 32768 entries
11. Start Hadoop and verify that the installation and configuration succeeded
--Enter the Hadoop startup directory and run the start script
[hadoop@master hadoop]$ cd /home/hadoop/bigdata/hadoop/sbin
[hadoop@master sbin]$ sh start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /home/hadoop/bigdata/hadoop/logs/hadoop-hadoop-namenode-master.out
slave01: starting datanode, logging to /home/hadoop/bigdata/hadoop/logs/hadoop-hadoop-datanode-slave01.out
slave02: starting datanode, logging to /home/hadoop/bigdata/hadoop/logs/hadoop-hadoop-datanode-slave02.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /home/hadoop/bigdata/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/bigdata/hadoop/logs/yarn-hadoop-resourcemanager-master.out
slave01: starting nodemanager, logging to /home/hadoop/bigdata/hadoop/logs/yarn-hadoop-nodemanager-slave01.out
slave02: starting nodemanager, logging to /home/hadoop/bigdata/hadoop/logs/yarn-hadoop-nodemanager-slave02.out
--Verify that the master node is running correctly
[hadoop@master sbin]$ jps -m
22996 SecondaryNameNode
23158 ResourceManager
22778 NameNode
23420 Jps -m
--Verify that slave01 is running correctly
[hadoop@slave01 current]$ jps -m
16377 NodeManager
16250 DataNode
16506 Jps -m
--Verify that slave02 is running correctly
[hadoop@slave02 current]$ jps -m
59138 NodeManager
59011 DataNode
59275 Jps -m
--Verify the HDFS status of all nodes in the cluster
[hadoop@slave01 ~]$ hadoop dfsadmin -report
Configured Capacity: 38002491392 (35.39 GB)
Present Capacity: 27343609856 (25.47 GB)
DFS Remaining: 27343589376 (25.47 GB)
DFS Used: 20480 (20 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):   <-- both DataNodes are clearly alive
Name: 192.168.10.167:50010 (slave01)
Hostname: slave01
Decommission Status : Normal
Configured Capacity: 19001245696 (17.70 GB)
DFS Used: 12288 (12 KB)
Non DFS Used: 5321039872 (4.96 GB)
DFS Remaining: 13680193536 (12.74 GB)
DFS Used%: 0.00%
DFS Remaining%: 72.00%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Apr 03 18:06:47 PDT 2017
Name: 192.168.10.168:50010 (slave02)
Hostname: slave02
Decommission Status : Normal
Configured Capacity: 19001245696 (17.70 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 5337841664 (4.97 GB)
DFS Remaining: 13663395840 (12.73 GB)
DFS Used%: 0.00%
DFS Remaining%: 71.91%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Apr 03 18:06:48 PDT 2017
--Verify through the Hadoop web UIs
ResourceManager: http://master:8088/
HDFS: http://master:50070/
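From any machine that can resolve master, the two UIs can also be probed from the shell. A sketch that prints the HTTP status code, or 000 when a UI is unreachable; it assumes curl is installed:

```shell
# Probe the ResourceManager (8088) and NameNode (50070) web UIs.
check_web_uis() {
  for url in http://master:8088/ http://master:50070/; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --connect-timeout 3 "$url" 2>/dev/null)
    echo "${code:-000} $url"
  done
}
check_web_uis
```

A healthy cluster returns 200 for both URLs.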
IV. Troubleshooting
1. The NameNode on master keeps failing to start
The logs/hadoop-hadoop-namenode-master.log file under the hadoop directory shows the cause: the NameNode was never formatted, so its storage directory does not exist
2017-04-02 14:58:10,052 WARN org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/hadoop/bigdata/data/hadoop/hdfs/namenode does not exist
2017-04-02 14:58:10,053 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/hadoop/bigdata/data/hadoop/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:327)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:215)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:812)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:796)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1493)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1559)
2017-04-02 14:58:10,090 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:50070
2017-04-02 14:58:10,091 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2017-04-02 14:58:10,091 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2017-04-02 14:58:10,091 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2017-04-02 14:58:10,091 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
2. Formatting the NameNode several times leaves the DataNodes' metadata out of sync with the NameNode, and the DataNodes will not start
cat /home/hadoop/bigdata/data/hadoop/hdfs/namenode/current/VERSION
--Note the clusterID line and copy its value
clusterID=CID-a36eb93b-a2f3-482e-b3c8-8507c2aeca07
--Log in to slave01
vim /home/hadoop/bigdata/data/hadoop/hdfs/datanode/current/VERSION
The clusterID here differs from master's value above; paste master's value over it.
--Log in to slave02 and do the same paste-over as on slave01:
vim /home/hadoop/bigdata/data/hadoop/hdfs/datanode/current/VERSION
Root cause:
I hit this pit because, after installing the cluster, I started it without formatting the NameNode, so the NameNode would not start although the DataNodes would. I then formatted the NameNode but forgot to turn off the firewall, so the other machines in the cluster were unusable; I formatted the NameNode twice more and disabled the firewall, after which the NameNode started but the DataNodes refused to start no matter what.
This problem is generally caused by formatting the NameNode two or more times. There are two fixes:
Method 1: wipe the DataNodes' data, i.e. delete /home/hadoop/bigdata/data/hadoop/hdfs/datanode/current/VERSION on every DataNode in the cluster, run hadoop namenode -format again, and restart the cluster.
Method 2: edit /home/hadoop/bigdata/data/hadoop/hdfs/datanode/current/VERSION on every DataNode and change its clusterID to match the one on master.
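Method 2 can be scripted from master. The helper below extracts the clusterID; the commented loop then pushes it to each DataNode over SSH. The paths follow this post's dfs.namenode.name.dir and dfs.datanode.data.dir settings, so treat them as assumptions for your own layout:

```shell
# Pull the clusterID value out of a VERSION file.
extract_cluster_id() { grep '^clusterID=' "$1" | cut -d= -f2; }

# On master, sync the NameNode's clusterID to every DataNode:
#   cid=$(extract_cluster_id /home/hadoop/bigdata/data/hadoop/hdfs/namenode/current/VERSION)
#   for host in slave01 slave02; do
#     ssh "hadoop@$host" "sed -i 's/^clusterID=.*/clusterID=$cid/' \
#       /home/hadoop/bigdata/data/hadoop/hdfs/datanode/current/VERSION"
#   done
```

Stop the cluster before editing the VERSION files, then restart it.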
3. Firewall left running: only master in the cluster can use hadoop
--The firewall on master was not turned off. The firewall on master must be turned off; turning it off on slave01 and slave02 is recommended as well.
[hadoop@slave02 current]$ hadoop fs -ls /
ls: No Route to Host from slave02/192.168.10.168 to master:9000 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
Check for these four error conditions:
- The hostname of the remote machine is wrong in the configuration files
- The client's host table /etc/hosts has an invalid IP address for the target host
- The DNS server's host table has an invalid IP address for the target host
- The client's routing tables (in Linux, iptables) are wrong
V. Summary
Installing Hadoop 2.7.3 on a cluster is a process that must be followed step by step. Many of the problems hit along the way were self-inflicted, caused by not knowing the installation flow well. Looking back, the flow is:
1. Confirm the Java environment across the cluster.
2. Confirm the hadoop group and user, and the edits to /etc/hosts and /etc/hostname, across the cluster.
3. Confirm passwordless login from master to slave01 and slave02.
4. Confirm the hadoop user's .bashrc environment file across the cluster.
5. Confirm the firewall is stopped and its service disabled on all three servers.
6. Extract the Hadoop 2.7.3 archive on master and arrange the installation directory.
7. In the configuration directory /home/hadoop/bigdata/hadoop/etc/hadoop, edit the following five files:
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
slaves
8. Log in to slave01 and slave02 and, as user hadoop, create the bigdata directory under /home/hadoop.
9. Use scp with -r (recursive copy) to copy everything under /home/hadoop/bigdata/hadoop on master into /home/hadoop/bigdata on slave01 and slave02.
10. Format the HDFS filesystem on the NameNode.
11. In /home/hadoop/bigdata/hadoop/sbin, start the Hadoop services with sh start-all.sh.