Setting Up a Hadoop 1.2.0 Cluster on Fedora 18


I. Server Planning
  1. Masters  OS: Fedora Linux 18  RAM: 1 GB  Disk: 12 GB  accounts: root/master, hadoop/hadoop
        master1 hostname: master1.hadoop
        master2 hostname: master2.hadoop
  2. Slaves  OS: Fedora Linux 18  RAM: 512 MB  Disk: 12 GB  accounts: root/slave, hadoop/hadoop
        slave1 hostname: slave1.hadoop
        slave2 hostname: slave2.hadoop
        slave3 hostname: slave3.hadoop
        slave4 hostname: slave4.hadoop
        slave5 hostname: slave5.hadoop
        slave6 hostname: slave6.hadoop
  Note:
        1. Every server in the cluster must have a user with the same name, hadoop, which is used for SSH.
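  A minimal sketch of creating that user on each server (assuming it does not already exist):
        [root@master1 ~]# useradd hadoop
        [root@master1 ~]# passwd hadoop    # the plan above uses the password "hadoop"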

II. System Environment Preparation
 1. Create eight virtual servers in VMware and configure each of them as follows:
      1) Set the hostname
          [root@slave1 hadoop]# vim /etc/hostname
      2) Update the hosts list (a sample is sketched below)
          [root@slave1 hadoop]# vim /etc/hosts
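          A sketch of the /etc/hosts entries; the 192.168.1.x addresses are placeholders, so substitute the real IPs of your VMs:
              192.168.1.10    master1.hadoop    master1
              192.168.1.11    master2.hadoop    master2
              192.168.1.21    slave1.hadoop     slave1
              192.168.1.22    slave2.hadoop     slave2
              192.168.1.23    slave3.hadoop     slave3
              192.168.1.24    slave4.hadoop     slave4
              192.168.1.25    slave5.hadoop     slave5
              192.168.1.26    slave6.hadoop     slave6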
      3) Update the installed packages
          yum update
      4) Install basic software
          yum install nginx axel vim gvim gedit wget
      5) Uninstall OpenJDK
          rpm -qa | grep java
          yum remove <the OpenJDK packages listed by the query above>
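          A sketch of what this typically looks like; the package name below is an example only, so remove whatever the rpm query actually reported:
              [root@slave1 hadoop]# rpm -qa | grep java
              [root@slave1 hadoop]# yum remove java-1.7.0-openjdk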
      6) Configure passwordless login from the Masters to every Slave
          a) Install SSH
             Make sure every server runs the same OpenSSH version:
               rpm -qa | grep openssh
               yum install openssh
    
          b) Make sure rsync is installed
              rpm -qa | grep rsync
               yum install rsync
          c) Generate a key pair on the Master
              ssh-keygen -t rsa -P ''    (generates id_rsa and id_rsa.pub, by default under ~/.ssh/)
              cat .ssh/id_rsa.pub >> .ssh/authorized_keys    (creates the authorization file authorized_keys)
              chmod 600 .ssh/authorized_keys
              Edit the sshd configuration file:
                  [root@slave1 hadoop]# vim /etc/ssh/sshd_config
                  RSAAuthentication yes # enable RSA authentication
                  PubkeyAuthentication yes # enable public/private key authentication
                  AuthorizedKeysFile .ssh/authorized_keys # public key file path (the same file generated above)
               Distribute authorized_keys to the slave nodes:
                   scp .ssh/authorized_keys hadoop@slave1:~/.ssh
              On each slave, generate its own key pair with ssh-keygen as above, then append its public key id_rsa.pub to authorized_keys:
                    [hadoop@slave1 .ssh]$ cat id_rsa.pub >> authorized_keys
              Copy the completed authorized_keys back to master1:
                    [hadoop@slave1 .ssh]$ scp authorized_keys hadoop@master1:~/.ssh
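              After editing sshd_config, restart sshd and sanity-check the passwordless login (Fedora 18 uses systemd, so systemctl applies here):
                    [root@master1 ~]# systemctl restart sshd.service
                    [hadoop@master1 ~]$ ssh slave1.hadoop    # should log in without prompting for a password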
        7) Install JDK 1.6.0_45
              Distribute /opt/java from master1 to /opt/ on the slaves via scp (a sketch follows this step). First give the hadoop user ownership of /opt on every server: chown hadoop /opt
              [root@slave1 hadoop]# vim /etc/profile    and add the following:
                   export JAVA_HOME=/opt/java/jdk1.6.0_45
                   export CLASSPATH=$JAVA_HOME/lib/dt.jar:.:$JAVA_HOME/lib/tools.jar
                   export PATH=$JAVA_HOME/bin:.:$PATH
              [root@slave1 hadoop]# source /etc/profile
              [hadoop@slave1 ~]$ vim .bash_profile    and add the following:
                   export JAVA_HOME=/opt/java/jdk1.6.0_45
                   export CLASSPATH=$JAVA_HOME/lib/dt.jar:.:$JAVA_HOME/lib/tools.jar
                   export PATH=$JAVA_HOME/bin:.:$PATH
              [hadoop@slave1 ~]$ source .bash_profile
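              A minimal sketch of the distribution mentioned above, run as hadoop on master1; the host list comes from the plan in section I:
                   for host in master2 slave1 slave2 slave3 slave4 slave5 slave6; do
                       scp -r /opt/java hadoop@${host}.hadoop:/opt/
                   done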
  2. Hadoop configuration on the masters and slaves
   1) Hadoop storage layout
    Create the /data directory, grant the hadoop user ownership of it, and mount the Hadoop storage on /data. Then log in as hadoop and run:
    [hadoop@slave1 data]$ mkdir logs pids dfs tmp
    [hadoop@slave1 dfs]$ mkdir name data
    Note: do not create the name and data directories under dfs on the slave nodes, or the datanode will fail with an error at startup.
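    Putting that together, a sketch of the whole sequence (run the chown as root first; on slaves, skip the name and data subdirectories per the note above):
    [root@master1 ~]# mkdir /data && chown hadoop:hadoop /data && chmod 775 /data
    [hadoop@master1 ~]$ mkdir -p /data/logs /data/pids /data/tmp /data/dfs/name /data/dfs/data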
   2) Edit /opt/hadoop120/conf/hadoop-env.sh
    [hadoop@slave1 hadoop120]$ vim conf/hadoop-env.sh
     export JAVA_HOME=/opt/java/jdk1.6.0_45
     export HADOOP_HOME=/opt/hadoop120
    
     export PATH=$HADOOP_HOME/bin:$PATH
     export HADOOP_LOG_DIR=/data/logs
     export HADOOP_PID_DIR=/data/pids
     Then add the following line to the hadoop user's .bash_profile:
     source /opt/hadoop120/conf/hadoop-env.sh
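     With that in place, the install can be sanity-checked as the hadoop user (this assumes the Hadoop 1.2.0 tarball has already been unpacked to /opt/hadoop120):
     [hadoop@slave1 ~]$ source ~/.bash_profile
     [hadoop@slave1 ~]$ hadoop version    # should report Hadoop 1.2.0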
    
   3) Edit /opt/hadoop120/conf/core-site.xml and add the following properties inside the <configuration> element:
    [hadoop@slave1 hadoop120]$ vim conf/core-site.xml
     <property>
      <name>hadoop.tmp.dir</name>
      <value>/data/tmp/hadoop-${user.name}</value>
     </property>
     <property>
      <name>fs.default.name</name>
      <value>hdfs://master1:9000</value>
     </property>
    
   4) Edit /opt/hadoop120/conf/mapred-site.xml and add the following inside <configuration>:
    [hadoop@slave1 hadoop120]$ vim conf/mapred-site.xml
    <property>
     <name>mapred.job.tracker</name>
     <value>master1:9001</value>
    </property>
    
   5) Edit /opt/hadoop120/conf/hdfs-site.xml and add the following inside <configuration>:
    [hadoop@slave1 hadoop120]$ vim conf/hdfs-site.xml
    <property>
     <name>dfs.replication</name>
     <value>4</value>
    </property>
    <property>
     <name>dfs.data.dir</name>
     <value>/data/dfs/data</value>
    </property>
    <property>
     <name>dfs.name.dir</name>
     <value>/data/dfs/name</value>
    </property>
   
   6) Edit /opt/hadoop120/conf/masters
    [hadoop@slave1 hadoop120]$ vim conf/masters
    master1
    master2
    
   7) Edit /opt/hadoop120/conf/slaves
    [hadoop@slave1 hadoop120]$ vim conf/slaves
    slave1
    slave2
    slave3
    slave4
    slave5
    slave6
   Notes: 1. Fix the permissions on /data: chown hadoop:hadoop /data && chmod 775 /data
      2. Open ports 9000 and 9001 on every server in the cluster; those are the two ports Hadoop is configured to use above (one way to do this is sketched below).
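      A sketch of opening the ports with firewalld, Fedora 18's default firewall; exact flags vary across firewalld versions, and ports 50030/50070 for the web UIs used in section III are included too:
          [root@master1 ~]# firewall-cmd --add-port=9000/tcp
          [root@master1 ~]# firewall-cmd --add-port=9001/tcp
          [root@master1 ~]# firewall-cmd --add-port=50030/tcp
          [root@master1 ~]# firewall-cmd --add-port=50070/tcp
      On a throwaway lab cluster, some simply stop the firewall instead: systemctl stop firewalld.service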
III. Verification
   hadoop namenode -format    (run this only before the first startup; once the namenode and datanodes have started successfully, do not run it again, or it will report pid errors)
   start-all.sh
   hadoop dfsadmin -report
   http://127.0.0.1:50030    (JobTracker web UI)
   http://127.0.0.1:50070    (NameNode web UI)
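   As a final check, a sketch of running one of the example jobs bundled with Hadoop 1.2.0; the jar path assumes the /opt/hadoop120 install directory used throughout:
   [hadoop@master1 ~]$ hadoop jar /opt/hadoop120/hadoop-examples-1.2.0.jar pi 2 10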
    
 References:
    Hadoop集群(第5期)_Hadoop安装配置: http://www.cnblogs.com/xia520pi/archive/2012/05/16/2503949.html
    Hadoop1.2.0开发笔记(三): http://www.cnblogs.com/chenying99/archive/2013/05/31/3109566.html