Hadoop 2.2.0 Installation and Testing


I. Install CentOS 6.3

 


II. Install the JDK

The steps below install the JDK on the Master node; install it on the other nodes the same way, or simply copy the java directory from Master to the corresponding location on the slave nodes. All of the operations below are done as root:
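
If you copy the JDK instead, a minimal sketch (assumption: the JDK ends up under /usr/local/java on Master, and the slave IP is one of the two used later in this guide):

[root@Master ~]# scp -r /usr/local/java root@192.168.137.128:/usr/local/    # repeat for each slave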

 

1. Copy the downloaded jdk-6u27-linux-i586.bin into /usr/local/java on the Master node (create the java directory under /usr/local/ first);

 

2. Extract the JDK into the current directory, then delete the installer

 

[hadoop@Master java]$ chmod u+x jdk-6u27-linux-i586.bin
[hadoop@Master java]$ ./jdk-6u27-linux-i586.bin
[hadoop@Master java]$ rm -rf jdk-6u27-linux-i586.bin

3. Configure the environment variables

 

[hadoop@Master local]$ vi /etc/profile


Add the following to /etc/profile:

#set java path
export JAVA_HOME=/usr/local/java/jdk1.6.0_27
export JRE_HOME=$JAVA_HOME/jre
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH

Reload the profile so the changes take effect:

 

[hadoop@Master local]$ source /etc/profile

or:

[hadoop@Master local]$ . /etc/profile


Verify that the JDK is installed correctly:

[hadoop@Master local]$ java -version
java version "1.6.0_27"
Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
Java HotSpot(TM) Client VM (build 20.2-b06, mixed mode, sharing)
[hadoop@Master local]$

III. Passwordless SSH between nodes
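
A minimal sketch of the usual setup, assuming the hadoop user exists on every node and using the slave IPs configured later in this guide:

[hadoop@Master ~]$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa    # generate a key pair with no passphrase
[hadoop@Master ~]$ ssh-copy-id hadoop@192.168.137.128          # authorize the key on Slave1
[hadoop@Master ~]$ ssh-copy-id hadoop@192.168.137.129          # authorize the key on Slave2
[hadoop@Master ~]$ ssh 192.168.137.128 hostname                # should log in without a password prompt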


IV. Hadoop installation and configuration

First install and configure Hadoop on the Master machine; all of these steps need to be done as root:

 

1. Install Hadoop

1) Copy hadoop-2.2.0.tar.gz into /usr/local on Master.Hadoop;

 

2) Extract hadoop-2.2.0.tar.gz

[root@Master local]# tar -zxvf hadoop-2.2.0.tar.gz

 

3) Rename the extracted hadoop-2.2.0 directory to hadoop

[root@Master local]# mv hadoop-2.2.0 hadoop

 

4) Delete the hadoop-2.2.0.tar.gz archive

[root@Master local]# rm  -rf hadoop-2.2.0.tar.gz

 

5) Assign ownership of the hadoop directory to the hadoop user

[root@Master local]# chown -R hadoop:hadoop hadoop

 

6) Set the Hadoop path (edit /etc/profile)

[root@Master hadoop]# vi /etc/profile

 

Add the following to the file:

#set hadoop path
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

 

7) Reload /etc/profile

Use either the source /etc/profile or the . /etc/profile command:

[root@Master hadoop]# . /etc/profile
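
As a quick sanity check that the new PATH works (output omitted; it should print the Hadoop 2.2.0 version banner):

[root@Master hadoop]# hadoop version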

 

2. Configure Hadoop

All of hadoop-2.2.0's configuration files live under /usr/local/hadoop/etc/hadoop, for example: hadoop-env.sh, yarn-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml.template, yarn-site.xml.
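
A quick way to list them (just a sanity check):

[hadoop@Master hadoop]$ ls /usr/local/hadoop/etc/hadoop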

 

The following operations are performed as the hadoop user:

 

(Step 1) below is actually optional: when we later run hadoop namenode -format during testing, these directories are created automatically.)

Note: the dfs.namenode.name.dir and dfs.datanode.data.dir directories configured later are not placed under /usr/local/hadoop but under /home, because the partition mounted at the root directory is short on disk space; if they were put under /usr/local/hadoop, uploading files to HDFS would fail.
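
A minimal sketch of preparing those /home directories (assumption: the daemons run as the hadoop user, so ${user.name} expands to hadoop; repeat on every node):

[hadoop@Master ~]$ mkdir -p /home/hadoop/dfs/name /home/hadoop/dfs/data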

1) As the hadoop user, create the tmp and dfs directories under /usr/local/hadoop/, and create name and data under dfs

[hadoop@Master hadoop]$ pwd
/usr/local/hadoop

 

[hadoop@Master hadoop]$ mkdir tmp
[hadoop@Master hadoop]$ mkdir dfs
[hadoop@Master hadoop]$ ll
total 60
drwxr-xr-x. 2 hadoop hadoop  4096 Mar 31 04:49 bin
drwxrwxr-x. 2 hadoop hadoop  4096 Aug  2 05:15 dfs
................
drwxr-xr-x. 2 hadoop hadoop  4096 Aug  2 04:34 tmp

 

[hadoop@Master hadoop]$ cd dfs
[hadoop@Master dfs]$ ll
total 0
[hadoop@Master dfs]$ mkdir name
[hadoop@Master dfs]$ mkdir data
[hadoop@Master dfs]$ ll
total 8
drwxrwxr-x. 2 hadoop hadoop 4096 Aug  2 05:17 data
drwxrwxr-x. 2 hadoop hadoop 4096 Aug  2 05:17 name

 

2) Configure hadoop-env.sh

[root@Master hadoop]# vi hadoop-env.sh

Add:

# The java implementation to use.
export JAVA_HOME=/usr/local/java/jdk1.6.0_27

 

3) Configure yarn-env.sh

[hadoop@Master hadoop]$ vi yarn-env.sh 

Add:

# some Java parameters

export JAVA_HOME=/usr/local/java/jdk1.6.0_27

 

4) Configure the slaves file

 [root@Master hadoop]# vi slaves

 

Add:
#localhost
192.168.137.128
192.168.137.129

 

5) Configure core-site.xml

[root@Master hadoop]# vi core-site.xml


Add the following between <configuration> and </configuration>:

<property>
        <name>fs.defaultFS</name>
       <value>hdfs://192.168.137.120:9000/</value>
</property>
<property>
       <name>io.file.buffer.size</name>
       <value>131072</value>

</property>
<property>
       <name>hadoop.tmp.dir</name>
      <value>file:/usr/local/hadoop/tmp</value>
       <description>A base for other temporary directories.</description>
</property>

6) Configure hdfs-site.xml

[hadoop@Master hadoop]$ vi hdfs-site.xml 

 

Add the following between <configuration> and </configuration>:

<property>
       <name>dfs.namenode.secondary.http-address</name>
        <value>Master.Hadoop:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
       <value>/home/${user.name}/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
       <value>/home/${user.name}/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>

 

7) Configure mapred-site.xml

[hadoop@Master hadoop]$ mv mapred-site.xml.template mapred-site.xml
[hadoop@Master hadoop]$ vi mapred-site.xml 



Add the following between <configuration> and </configuration>:

<property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
       <name>mapreduce.jobhistory.address</name>
        <value>Master.Hadoop:10020</value>
    </property>
    <property>
       <name>mapreduce.jobhistory.webapp.address</name>
        <value>Master.Hadoop:19888</value>
    </property>
    <property>
       <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>/mr-history/tmp</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>/mr-history/done</value>
    </property>
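
Note that the JobHistory server configured above is not started by start-all.sh; a sketch of launching it by hand once the cluster is up (the script ships in $HADOOP_HOME/sbin in 2.2.0):

[hadoop@Master hadoop]$ mr-jobhistory-daemon.sh start historyserver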

 

8) Configure yarn-site.xml

[hadoop@Master hadoop]$ vi yarn-site.xml 

 

Add the following between <configuration> and </configuration>:

<property>
       <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
       <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>Master.Hadoop:8032</value>
    </property>
    <property>
       <name>yarn.resourcemanager.scheduler.address</name>
        <value>Master.Hadoop:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>Master.Hadoop:8031</value>
    </property>
    <property>
       <name>yarn.resourcemanager.admin.address</name>
        <value>Master.Hadoop:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>Master.Hadoop:8088</value>
    </property>

 

Configure the other machines:

Copy the configured /usr/local/hadoop directory from Master to /usr/local on every slave. Slave1 is shown below; do Slave2 the same way:

 

1) Copy to the other slave machines

Copy to Slave1:

[hadoop@Master hadoop]$ scp -r /usr/local/hadoop root@192.168.137.128:/usr/local
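
A similar command for Slave2 (IP taken from the slaves file configured earlier):

[hadoop@Master hadoop]$ scp -r /usr/local/hadoop root@192.168.137.129:/usr/local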



 

 

2) As root, change the owner and group of the hadoop directory

[root@Salve1 local]# pwd
/usr/local
[root@Salve1 local]# chown -R hadoop:hadoop hadoop

3) Disable the firewall on every machine in the cluster

Master:

[root@Master local]# chkconfig iptables off
[root@Master local]# service iptables status
iptables: Firewall is not running.




Slave1:

[root@Salve1 local]# chkconfig iptables off
[root@Salve1 local]# service iptables status
iptables: Firewall is not running.




Slave2:

[root@Salve2 local]# chkconfig iptables off
[root@Salve2 local]# service iptables status
iptables: Firewall is not running.



4) Synchronize the system clock and the hardware clock

Check the system time and the hardware time:

[hadoop@Master hadoop]$ date;hwclock -r
Tue Aug 19 20:27:58 CST 2014
Tue 19 Aug 2014 08:20:52 PM CST  -0.286125 seconds

 

Sync the system time from the time server time.nist.gov:

[hadoop@Master hadoop]$ ntpdate time.nist.gov


Write the system time to the hardware clock:

[hadoop@Master hadoop]$ hwclock -w




 

Extra notes:

Turning the firewall on and off in Linux:

1. Persistent (controls whether the firewall starts at boot)

chkconfig iptables off   # disable the firewall at boot

chkconfig iptables on    # enable the firewall at boot

2. Immediate (takes effect now, lost after a reboot)

service iptables start   # start the firewall

service iptables stop    # stop the firewall

3. Check the firewall status

service iptables status

 

 

 

The operations below are performed on Master as the hadoop user:

1) Format the NameNode

[hadoop@Master hadoop]$ hadoop namenode -format
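
This script form is deprecated in Hadoop 2.x; the equivalent command is:

[hadoop@Master hadoop]$ hdfs namenode -format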


2) Start all daemons

[hadoop@Master hadoop]$ start-all.sh
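
start-all.sh is also deprecated in Hadoop 2.x; an equivalent sketch using the split scripts from $HADOOP_HOME/sbin:

[hadoop@Master hadoop]$ start-dfs.sh     # NameNode, SecondaryNameNode and the DataNodes
[hadoop@Master hadoop]$ start-yarn.sh    # ResourceManager and the NodeManagers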


3) Check the running processes

[hadoop@Master hadoop]$ jps
17483 SecondaryNameNode
28569 Jps
17627 ResourceManager
17317 NameNode



[hadoop@Salve1 ~]$ jps
14967 Jps
13251 NodeManager
13160 DataNode



[hadoop@Salve2 ~]$  jps
12020 DataNode
13779 Jps
12113 NodeManager
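
The web UIs give another quick health check; a sketch probing them with curl (port 8088 comes from yarn-site.xml above; 50070 is the default NameNode HTTP port in 2.2.0):

[hadoop@Master hadoop]$ curl -s -o /dev/null -w "%{http_code}\n" http://Master.Hadoop:8088      # YARN ResourceManager UI
[hadoop@Master hadoop]$ curl -s -o /dev/null -w "%{http_code}\n" http://192.168.137.120:50070   # HDFS NameNode UI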

 

4) View the HDFS cluster report

[hadoop@Master hadoop]$ hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
 
14/08/19 20:40:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 31334506496 (29.18 GB)
Present Capacity: 29394505728 (27.38 GB)
DFS Remaining: 29394436096 (27.38 GB)
DFS Used: 69632 (68 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
 
-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)
 
Live datanodes:
Name: 192.168.137.128:50010 (Salve1.Hadoop)
Hostname: Salve1.Hadoop
Decommission Status : Normal
Configured Capacity: 15667253248 (14.59 GB)
DFS Used: 45056 (44 KB)
Non DFS Used: 969990144 (925.05 MB)
DFS Remaining: 14697218048 (13.69 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.81%
Last contact: Tue Aug 19 20:40:56 CST 2014
 
 
Name: 192.168.137.129:50010 (Salve2.Hadoop)
Hostname: Salve2.Hadoop
Decommission Status : Normal
Configured Capacity: 15667253248 (14.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 970010624 (925.07 MB)
DFS Remaining: 14697218048 (13.69 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.81%
Last contact: Tue Aug 19 20:40:57 CST 2014



V. Testing

1. Work on the Master node as the hadoop user

Create two new files, file1.txt and file2.txt, under /opt on the Master node and put some content in them:

[hadoop@Master hadoop]$ cd /opt
[hadoop@Master opt]$ ll
total 24
-rwxrwxrwx 1 hadoop hadoop   66 Aug  8 16:30 file1.txt
-rwxrwxrwx 1 hadoop hadoop   67 Aug  8 16:31 file2.txt
drwxr-xr-x 2 root   root   4096 Aug  4 15:23 tools
[hadoop@Master opt]$ 
[hadoop@Master opt]$ cat file1.txt 
Hello, i love coding
are you ok?
Hello, i love hadoop
are you ok?
[hadoop@Master opt]$ cat file2.txt 
Hello, i love coding
are you ok ?
Hello i love hadoop
are you ok ?
[hadoop@Master opt]$


2. Create a directory in the HDFS file system

[hadoop@Master ~]$ hadoop fs -mkdir -p /home/input


3. Upload the files into the HDFS directory

Upload:

[hadoop@Master ~]$ hadoop fs -put /opt/file*.txt /home/input

 

Check that the upload succeeded:

[hadoop@Master ~]$ hadoop fs -ls /home/input
14/08/19 21:42:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   1 hadoop supergroup         66 2014-08-19 15:27 /home/input/file1.txt
-rw-r--r--   1 hadoop supergroup         67 2014-08-19 15:27 /home/input/file2.txt
[hadoop@Master ~]$
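
To exercise MapReduce on the uploaded files, a sketch that runs the wordcount example bundled with the 2.2.0 distribution (the output directory /home/output is an assumption and must not exist beforehand):

[hadoop@Master ~]$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /home/input /home/output
[hadoop@Master ~]$ hadoop fs -cat /home/output/part-r-00000    # print the word counts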




 

That completes the setup. When something goes wrong, the usual places to look are: version mismatches, the firewall, system time skew, and the logs.

 

 

This article is copyrighted; if you repost it, please say so and link to the original.
