Hadoop 1.0.3 Cluster Setup

Source: Internet | Published by: king最新域名 | Editor: 程序博客网 | Time: 2024/06/06 03:10

After a few days of work I finally got a Hadoop cluster up and running. The cluster layout is as follows:

Hostname  Role    jps output              Install path
centos-1  master  2612 NameNode           /opt/soft/hadoop1.0.3
                  3017 TaskTracker
                  3129 Jps
                  2821 SecondaryNameNode
                  2907 JobTracker
centos-2  slave   4146 DataNode           /opt/soft/hadoop1.0.3
                  4222 Jps
                  3998 TaskTracker
centos-3  slave   3968 TaskTracker        /opt/soft/hadoop1.0.3
                  4208 Jps
                  4132 DataNode

1. Install the Java runtime environment

Download JDK 1.7, extract it to /opt/soft/jdk1.7, and set the environment variables in /etc/profile:

#Java
export JAVA_HOME=/opt/soft/jdk1.7
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

2. Install Hadoop 1.0.3

Download Hadoop 1.0.3, extract it to /opt/soft/hadoop1.0.3, and set the Hadoop environment variables in /etc/profile:

#hadoop
export HADOOP_HOME=/opt/soft/hadoop1.0.3
export PATH=$HADOOP_HOME/bin:$PATH
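To confirm that the Java and Hadoop variables in /etc/profile actually took effect, a quick check along these lines may help. This is only a sketch: check_tool_home is a hypothetical helper name of my own, not something shipped with the JDK or Hadoop.

```shell
# Hypothetical helper: verify that an install directory contains the
# expected executable under bin/.
check_tool_home() {
  # $1 = install directory (e.g. "$JAVA_HOME"), $2 = binary expected in bin/
  if [ -x "$1/bin/$2" ]; then
    echo "ok: $1/bin/$2"
  else
    echo "missing: $1/bin/$2" >&2
    return 1
  fi
}

# Usage, after reloading the profile in the current shell:
#   . /etc/profile
#   check_tool_home "$JAVA_HOME" java
#   check_tool_home "$HADOOP_HOME" hadoop
```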

Open /opt/soft/hadoop1.0.3/conf/hadoop-env.sh, uncomment the JAVA_HOME line, and set JAVA_HOME to /opt/soft/jdk1.7
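If you prefer not to edit the file by hand, the same change can be scripted. The sketch below assumes the stock hadoop-env.sh contains a commented-out "export JAVA_HOME=..." line; set_java_home is a hypothetical helper name, and the JDK path must not contain the characters | or &.

```shell
# Hypothetical helper: uncomment and rewrite the JAVA_HOME line in hadoop-env.sh.
set_java_home() {
  # $1 = path to hadoop-env.sh, $2 = JDK install directory
  tmp="$1.tmp.$$"
  # Match the line whether or not it is still commented out.
  sed "s|^#* *export JAVA_HOME=.*|export JAVA_HOME=$2|" "$1" > "$tmp" &&
    cat "$tmp" > "$1"   # write back in place so file permissions are kept
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage:
#   set_java_home /opt/soft/hadoop1.0.3/conf/hadoop-env.sh /opt/soft/jdk1.7
```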

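At this point the standalone version of Hadoop is installed; its configuration files are all still empty. Next comes the pseudo-distributed setup.

3. Pseudo-distributed installation of Hadoop on centos-1

1) Edit core-site.xml

Edit /opt/soft/hadoop1.0.3/conf/core-site.xml and add the following inside the <configuration> element: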

<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop</value>
<description>A base for other temporary directories.</description>
</property>

<property>
<name>fs.default.name</name>
<value>hdfs://centos-1:9000</value>
<description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>

<property>
<name>dfs.name.dir</name>
<value>/hadoop/name</value>
<description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
</property>

2) Edit hdfs-site.xml

Edit /opt/soft/hadoop1.0.3/conf/hdfs-site.xml and add the following inside the <configuration> element:

<property>
<name>dfs.data.dir</name>
<value>/hadoop/data</value>
<description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
</property>
<property>
<name>dfs.replication</name> <!-- default number of block replicas -->
<value>1</value>
<description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
</property>

3) Edit mapred-site.xml

Edit /opt/soft/hadoop1.0.3/conf/mapred-site.xml and add the following inside the <configuration> element:

<property>
<name>mapred.job.tracker</name>
<value>centos-1:9001</value>
<description>The host and port that the MapReduce job tracker runs at. If "local",then jobs are run in-process as a single map and reduce task.</description>
</property>
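The snippets above show only the <property> elements; in the actual files they have to sit inside a single <configuration> root element. As a rough illustration of the complete file shape, here is a small shell sketch that writes a minimal site file with one property. write_site_file is a hypothetical helper name, and note that it overwrites the target file, so it is only suitable for a fresh install.

```shell
# Hypothetical helper: write a minimal Hadoop site file containing one property.
# WARNING: this overwrites the target file.
write_site_file() {
  # $1 = output file, $2 = property name, $3 = property value
  cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# Usage:
#   write_site_file /opt/soft/hadoop1.0.3/conf/mapred-site.xml \
#       mapred.job.tracker centos-1:9001
```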

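4) Edit the masters file in the conf folder and write the address of the master node into it: centos-1
5) Edit the slaves file in the conf folder and add all of the slave nodes, one per line: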

centos-1
centos-2
centos-3

6) Edit /etc/hosts and add the IP mappings for the master node and the slave nodes:

192.168.1.20 centos-1
192.168.1.21 centos-2
192.168.1.22 centos-3
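Since /etc/hosts has to carry the same mappings on every node, it can be handy to add the entries with a small script that is safe to run more than once. The following is a sketch under that assumption; add_host is a hypothetical helper name, and it only checks whether the hostname is already present, not whether the IP matches.

```shell
# Hypothetical helper: append "IP hostname" to a hosts file unless the
# hostname is already mapped there.
add_host() {
  # $1 = hosts file, $2 = IP address, $3 = hostname
  if awk -v h="$3" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                    END { exit !found }' "$1" 2>/dev/null; then
    return 0   # hostname already listed; leave the file alone
  fi
  printf '%s %s\n' "$2" "$3" >> "$1"
}

# Usage (as root, on each node):
#   add_host /etc/hosts 192.168.1.20 centos-1
#   add_host /etc/hosts 192.168.1.21 centos-2
#   add_host /etc/hosts 192.168.1.22 centos-3
```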

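If you do not add the slave nodes in steps 5 and 6 above, so that only centos-1 is listed, the result is a pseudo-distributed installation.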

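4. Distributed installation

Copy the files modified above to the corresponding directories on the slave nodes. After that, passwordless SSH login still needs to be set up.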
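One possible way to copy the Hadoop conf directory to the slaves is a short loop over the hostnames. This sketch assumes Hadoop is already extracted to the same path on every slave; push_conf is a hypothetical helper name, and the SCP variable exists only so the command can be swapped out for a dry run. It does not cover /etc/profile or /etc/hosts, which still need the same edits on each node.

```shell
# Hypothetical helper: copy the local conf directory to the same path on each slave.
push_conf() {
  # $1 = Hadoop install directory; remaining arguments = slave hostnames
  home="$1"
  shift
  for host in "$@"; do
    # Set SCP=echo to print the arguments instead of copying.
    ${SCP:-scp} -r "$home/conf" "$host:$home/"
  done
}

# Usage:
#   push_conf /opt/soft/hadoop1.0.3 centos-2 centos-3
```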

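First go to the /root directory and run ssh-keygen -t rsa. Then run cd .ssh to enter the hidden .ssh directory, and run cp id_rsa.pub authorized_keys to make a copy of the public key. Finally, copy that file to the corresponding directory on each slave node: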

scp authorized_keys centos-2:/root/.ssh

scp authorized_keys centos-3:/root/.ssh

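Then test the login with ssh centos-2 and ssh centos-3. The scp commands above ask for each node's password; once the key is in place, logging in should no longer require one (the first connection may still ask you to confirm the host key).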

Note: the .ssh directory must have permission 700, and the files inside .ssh should preferably be 600.
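The key installation and the permission fix can be combined in one step. The sketch below uses a hypothetical helper name, authorize_key, and differs from the cp command above in one respect: it appends the key instead of overwriting authorized_keys, so keys that are already there are kept.

```shell
# Hypothetical helper: add a public key to authorized_keys with the
# permissions sshd expects (700 on the directory, 600 on the file).
authorize_key() {
  # $1 = public key file, $2 = target .ssh directory
  mkdir -p "$2" && chmod 700 "$2" || return 1
  touch "$2/authorized_keys" && chmod 600 "$2/authorized_keys" || return 1
  # Append only if this exact key line is not present yet.
  grep -qxF -e "$(cat "$1")" "$2/authorized_keys" ||
    cat "$1" >> "$2/authorized_keys"
}

# Usage on the master, as root:
#   authorize_key /root/.ssh/id_rsa.pub /root/.ssh
```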

5. The distributed Hadoop environment is now set up

Before the first use, format the NameNode: hadoop namenode -format

start-all.sh    # start the cluster
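After starting the cluster, the jps output on each node can be compared against the daemons you expect there. The following sketch reads jps output on stdin and prints any expected daemon that is not running; missing_daemons is a hypothetical helper name, and which daemons to expect on which node is up to your own layout.

```shell
# Hypothetical helper: print every expected daemon name that does not
# appear in the jps output supplied on stdin.
missing_daemons() {
  # arguments = daemon names expected on this node
  running=$(awk '{ print $2 }')   # second column of jps output is the process name
  for name in "$@"; do
    printf '%s\n' "$running" | grep -qx "$name" || echo "$name"
  done
}

# Usage on the master:
#   jps | missing_daemons NameNode SecondaryNameNode JobTracker
# Usage on a slave:
#   jps | missing_daemons DataNode TaskTracker
```

No output means every listed daemon was found.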

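##### Note: this Hadoop setup was done as the root user. If you need to use a different user, make sure the same user is used both for installing Hadoop and for configuring passwordless SSH login. #####

##### If the DataNode and TaskTracker on the slave nodes have not started after running start-all.sh, turn off the firewall on every node. #####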

0 0