Part 1: Distributed Deployment of Hadoop 2.x

Source: Internet | Editor: 程序博客网 | Time: 2024/05/19 02:26

Distributed cluster installation


step1: Start from the existing pseudo-distributed installation

step2: Plan the machines and services

  • HDFS, the file system
  • YARN, the "cloud operating system"
  • JobHistoryServer, the job history and monitoring service
               hadoop-senior      hadoop-senior02     hadoop-senior03
HDFS           NameNode
               DataNode           DataNode            DataNode
                                                      SecondaryNameNode
YARN                              ResourceManager
               NodeManager        NodeManager         NodeManager
MapReduce      JobHistoryServer
Delete the old app directory:
[root@hadoop-senior01 opt]# rm -rf app
Give the xiangkun user ownership of the app directory:
[root@hadoop-senior01 opt]# chown -R xiangkun:xiangkun /opt/app/

step3: Edit the configuration files to set which machine runs each service

Configuration files
    * hdfs
        * hadoop-env.sh
        * core-site.xml
        * hdfs-site.xml
        * slaves
    * yarn
        * yarn-env.sh
        * yarn-site.xml
        * slaves
    * mapreduce
        * mapred-env.sh
        * mapred-site.xml

hadoop-env.sh

[Screenshot: hadoop-env.sh settings (image not preserved)]
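Since the screenshot is gone, here is a minimal sketch of the change that usually matters in hadoop-env.sh (and likewise in yarn-env.sh and mapred-env.sh): pinning JAVA_HOME to an explicit path. The JDK path and the `set_java_home` helper are hypothetical, not taken from the original post; the demo edits a scratch copy rather than the real file.

```shell
#!/bin/sh
# Sketch: pin JAVA_HOME in a Hadoop env script.
# JAVA_HOME_PATH is a hypothetical JDK location; substitute your own.
JAVA_HOME_PATH="/opt/modules/jdk1.7.0_67"

set_java_home() {
    # $1 = env script to edit, $2 = JDK directory
    # Drop any existing (possibly commented-out) JAVA_HOME export, then append ours.
    grep -v -E '^[[:space:]]*#?[[:space:]]*export[[:space:]]+JAVA_HOME=' "$1" > "$1.tmp" || true
    printf 'export JAVA_HOME=%s\n' "$2" >> "$1.tmp"
    mv "$1.tmp" "$1"
}

# Demo against a scratch copy so nothing real is modified.
demo_dir="$(mktemp -d)"
printf '%s\n' '# The java implementation to use.' 'export JAVA_HOME=${JAVA_HOME}' > "$demo_dir/hadoop-env.sh"
set_java_home "$demo_dir/hadoop-env.sh" "$JAVA_HOME_PATH"
cat "$demo_dir/hadoop-env.sh"
```

To apply it for real, call `set_java_home` on each of the three env scripts under etc/hadoop.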

core-site.xml

[Screenshot: core-site.xml settings (image not preserved)]
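The original settings are not recoverable from the page, so the following is only a sketch of what core-site.xml typically needs for this layout: `fs.defaultFS` pointing at the NameNode host and a `hadoop.tmp.dir`. The port 8020 and the temp directory path are assumptions; the host name comes from the shell prompts in this post.

```shell
#!/bin/sh
# Sketch: generate a minimal core-site.xml for this cluster layout.
# Assumptions (not from the original post): RPC port 8020 and the
# hadoop.tmp.dir path below.
NAMENODE_HOST="hadoop-senior01.xiangkun"
NAMENODE_PORT="8020"
HADOOP_TMP_DIR="/opt/app/hadoop-2.5.0/data/tmp"

write_core_site() {
    # $1 = target configuration directory
    cat > "$1/core-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://${NAMENODE_HOST}:${NAMENODE_PORT}</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>${HADOOP_TMP_DIR}</value>
    </property>
</configuration>
EOF
}

# Demo: write into a scratch directory and show the result.
conf_dir="$(mktemp -d)"
write_core_site "$conf_dir"
cat "$conf_dir/core-site.xml"
```

Once the output looks right, point `write_core_site` at etc/hadoop inside the Hadoop installation.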

hdfs-site.xml

[Screenshot: hdfs-site.xml settings (image not preserved)]
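For hdfs-site.xml, the setting that moves a service to another machine is `dfs.namenode.secondary.http-address`. A hedged sketch follows: placing the SecondaryNameNode on hadoop-senior03 is my reading of the planning table, 50090 is the Hadoop 2.x default port, and the `property` helper is a hypothetical convenience function.

```shell
#!/bin/sh
# Sketch: hdfs-site.xml that places the SecondaryNameNode on another host.
SECONDARY_HOST="hadoop-senior03.xiangkun"   # assumed from the planning table
SECONDARY_HTTP_PORT="50090"                 # Hadoop 2.x default

# Print one <property> element: $1 = name, $2 = value.
property() {
    printf '    <property>\n        <name>%s</name>\n        <value>%s</value>\n    </property>\n' "$1" "$2"
}

write_hdfs_site() {
    # $1 = target configuration directory
    {
        echo '<?xml version="1.0" encoding="UTF-8"?>'
        echo '<configuration>'
        property dfs.namenode.secondary.http-address "${SECONDARY_HOST}:${SECONDARY_HTTP_PORT}"
        property dfs.replication 3
        echo '</configuration>'
    } > "$1/hdfs-site.xml"
}

# Demo: write into a scratch directory and show the result.
conf_dir="$(mktemp -d)"
write_hdfs_site "$conf_dir"
cat "$conf_dir/hdfs-site.xml"
```

With three DataNodes, a replication factor of 3 matches the HDFS default; a pseudo-distributed setup would usually have had it set to 1.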

slaves

[Screenshot: slaves file contents (image not preserved)]
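The slaves file is just one worker host name per line. Going by the planning table, all three machines run a DataNode and a NodeManager, so a sketch of generating it could look like this (the `write_slaves` helper is hypothetical; the host names are the ones used in the shell prompts):

```shell
#!/bin/sh
# Sketch: write etc/hadoop/slaves with one worker host per line.
WORKERS="hadoop-senior01.xiangkun hadoop-senior02.xiangkun hadoop-senior03.xiangkun"

write_slaves() {
    # $1 = target configuration directory
    : > "$1/slaves"                  # start from an empty file
    for host in $WORKERS; do
        echo "$host" >> "$1/slaves"
    done
}

# Demo: write into a scratch directory and show the result.
conf_dir="$(mktemp -d)"
write_slaves "$conf_dir"
cat "$conf_dir/slaves"
```

start-dfs.sh and start-yarn.sh both read this file to decide where to launch DataNodes and NodeManagers.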

yarn-env.sh

[Screenshot: yarn-env.sh settings (image not preserved)]

yarn-site.xml

[Screenshot: yarn-site.xml settings (image not preserved)]
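For yarn-site.xml, a rough sketch of the two properties this layout depends on: `yarn.resourcemanager.hostname` to move the ResourceManager off the first machine, and `yarn.nodemanager.aux-services` so MapReduce shuffle works. Placing the ResourceManager on hadoop-senior02 is inferred from the planning table, not confirmed by the lost screenshot.

```shell
#!/bin/sh
# Sketch: yarn-site.xml with the ResourceManager on a dedicated host.
RM_HOST="hadoop-senior02.xiangkun"   # assumed from the planning table

write_yarn_site() {
    # $1 = target configuration directory
    cat > "$1/yarn-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>${RM_HOST}</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
EOF
}

# Demo: write into a scratch directory and show the result.
conf_dir="$(mktemp -d)"
write_yarn_site "$conf_dir"
cat "$conf_dir/yarn-site.xml"
```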

mapred-env.sh

[Screenshot: mapred-env.sh settings (image not preserved)]

mapred-site.xml

[Screenshot: mapred-site.xml settings (image not preserved)]
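A sketch of mapred-site.xml under the same caveat: the values below are typical for this layout rather than recovered from the screenshot. It runs MapReduce on YARN and puts the JobHistoryServer on the first machine, using the Hadoop 2.x default ports 10020 and 19888. The `property` helper is hypothetical.

```shell
#!/bin/sh
# Sketch: mapred-site.xml for MapReduce on YARN plus the JobHistoryServer.
JHS_HOST="hadoop-senior01.xiangkun"   # assumed from the planning table

# Print one <property> element: $1 = name, $2 = value.
property() {
    printf '    <property>\n        <name>%s</name>\n        <value>%s</value>\n    </property>\n' "$1" "$2"
}

write_mapred_site() {
    # $1 = target configuration directory
    {
        echo '<?xml version="1.0" encoding="UTF-8"?>'
        echo '<configuration>'
        property mapreduce.framework.name yarn
        property mapreduce.jobhistory.address "${JHS_HOST}:10020"
        property mapreduce.jobhistory.webapp.address "${JHS_HOST}:19888"
        echo '</configuration>'
    } > "$1/mapred-site.xml"
}

# Demo: write into a scratch directory and show the result.
conf_dir="$(mktemp -d)"
write_mapred_site "$conf_dir"
cat "$conf_dir/mapred-site.xml"
```

Hadoop 2.5.0 ships this file as mapred-site.xml.template, so the real file has to be created (or copied from the template) before it takes effect.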

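step4: Distribute the Hadoop installation to every node

The doc directory is of little use and takes up space, so remove it first (rm -rf ./doc/). In the session below this frees about 1.5 GB.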

[xiangkun@hadoop-senior01 hadoop-2.5.0]$ cd share
[xiangkun@hadoop-senior01 share]$ ls
doc  hadoop
[xiangkun@hadoop-senior01 share]$ df -lh
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/vg_xiangkunqin-lv_root   18G   13G  3.6G  79% /
tmpfs                               940M   80K  940M   1% /dev/shm
/dev/sda1                           485M   39M  421M   9% /boot
[xiangkun@hadoop-senior01 share]$ rm -rf ./doc/
[xiangkun@hadoop-senior01 share]$ df -lh
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/vg_xiangkunqin-lv_root   18G   12G  5.1G  69% /
tmpfs                               940M   80K  940M   1% /dev/shm
/dev/sda1                           485M   39M  421M   9% /boot

Distribution uses scp, which runs over SSH, so set up passwordless (key-based) SSH login first

[xiangkun@hadoop-senior01 ~]$ cd .ssh
[xiangkun@hadoop-senior01 .ssh]$ ls
[xiangkun@hadoop-senior01 .ssh]$ ls
[xiangkun@hadoop-senior01 .ssh]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/xiangkun/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/xiangkun/.ssh/id_rsa.
Your public key has been saved in /home/xiangkun/.ssh/id_rsa.pub.
The key fingerprint is:
de:76:78:38:e6:84:74:f1:2c:f8:da:50:9a:be:a4:33 xiangkun@hadoop-senior01.xiangkun
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|          .      |
|         . +     |
|        S + o    |
|       o O +     |
|        B X o    |
|      E+ O +     |
|      .o+.o      |
+-----------------+
[xiangkun@hadoop-senior01 .ssh]$ 
[xiangkun@hadoop-senior01 .ssh]$ ll
total 8
-rw-------. 1 xiangkun xiangkun 1675 Jul  4 14:05 id_rsa
-rw-r--r--. 1 xiangkun xiangkun  415 Jul  4 14:05 id_rsa.pub
[xiangkun@hadoop-senior01 .ssh]$ hostname
hadoop-senior01.xiangkun
[xiangkun@hadoop-senior01 .ssh]$ ssh-copy-id hadoop-senior01.xiangkun
The authenticity of host 'hadoop-senior01.xiangkun (192.168.111.106)' can't be established.
RSA key fingerprint is da:12:42:76:de:23:3a:01:48:18:cd:9e:60:d6:83:b8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop-senior01.xiangkun,192.168.111.106' (RSA) to the list of known hosts.
xiangkun@hadoop-senior01.xiangkun's password: 
Now try logging into the machine, with "ssh 'hadoop-senior01.xiangkun'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

[xiangkun@hadoop-senior01 .ssh]$ ll
total 16
-rw-------. 1 xiangkun xiangkun  415 Jul  4 14:08 authorized_keys
-rw-------. 1 xiangkun xiangkun 1675 Jul  4 14:05 id_rsa
-rw-r--r--. 1 xiangkun xiangkun  415 Jul  4 14:05 id_rsa.pub
-rw-r--r--. 1 xiangkun xiangkun  422 Jul  4 14:08 known_hosts

Set up passwordless login on each of the three machines

Machine 1:
[xiangkun@hadoop-senior01 .ssh]$ ssh-copy-id hadoop-senior01.xiangkun
[xiangkun@hadoop-senior01 .ssh]$ ssh-copy-id hadoop-senior02.xiangkun
[xiangkun@hadoop-senior01 .ssh]$ ssh-copy-id hadoop-senior03.xiangkun
Machine 2: same as above
Machine 3: same as above
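The three ssh-copy-id calls can be wrapped in a loop that is then run once per machine. This is only a sketch: the `run` wrapper and the `DRY_RUN` switch are hypothetical conveniences that print the commands by default, so nothing is executed until you opt in.

```shell
#!/bin/sh
# Sketch: push this machine's public key to every node in the cluster.
HOSTS="hadoop-senior01.xiangkun hadoop-senior02.xiangkun hadoop-senior03.xiangkun"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

copy_key_to_all() {
    for host in $HOSTS; do
        run ssh-copy-id "$host"
    done
}

copy_key_to_all
```

Run it with `DRY_RUN=0` on each of the three machines after generating a key pair there with `ssh-keygen -t rsa`; each call still prompts once for the remote password.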

Copy the Hadoop directory to each of the other two machines

[xiangkun@hadoop-senior01 app]$ scp -r ./hadoop-2.5.0/ xiangkun@hadoop-senior02.xiangkun:/opt/app/
[xiangkun@hadoop-senior01 app]$ scp -r ./hadoop-2.5.0/ xiangkun@hadoop-senior03.xiangkun:/opt/app/
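With more than a couple of nodes, typing each scp by hand gets error-prone (note that there must be no space between the colon and the remote path). A small sketch that generates the commands for review; the `distribute_commands` function and variable names are hypothetical, while the paths and hosts are the ones used in this post.

```shell
#!/bin/sh
# Sketch: generate one scp command per target node.
SRC_DIR="/opt/app/hadoop-2.5.0"
DEST_DIR="/opt/app/"
REMOTE_USER="xiangkun"
TARGETS="hadoop-senior02.xiangkun hadoop-senior03.xiangkun"

distribute_commands() {
    for host in $TARGETS; do
        echo "scp -r ${SRC_DIR} ${REMOTE_USER}@${host}:${DEST_DIR}"
    done
}

# Review the commands first...
distribute_commands
# ...then execute them by piping to a shell:
# distribute_commands | sh
```

The target directory /opt/app/ must already exist on each node and be writable by the xiangkun user, which is what the chown in step2 takes care of on the first machine.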

step5: Following the official cluster setup guide, start the services that belong on each node

Format the NameNode on hadoop-senior01:
[Screenshot: NameNode format output (image not preserved)]

Start HDFS (in Hadoop 2.x the start scripts live under sbin/, not bin/):
sbin/start-dfs.sh
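Putting the whole start-up order in one place may help. The sketch below only prints the commands; which host runs what is my reading of the planning table, and the `startup_plan` and `print_startup_commands` helpers are hypothetical. The scripts themselves (`bin/hdfs namenode -format`, `sbin/start-dfs.sh`, `sbin/start-yarn.sh`, `sbin/mr-jobhistory-daemon.sh`) are the standard Hadoop 2.x ones.

```shell
#!/bin/sh
# Sketch: print the start-up sequence for the planned layout.
HADOOP_HOME="/opt/app/hadoop-2.5.0"

# One line per step: host, then the command to run from HADOOP_HOME.
# Format the NameNode only once, on a brand-new cluster: it wipes HDFS metadata.
startup_plan() {
    cat <<'EOF'
hadoop-senior01.xiangkun bin/hdfs namenode -format
hadoop-senior01.xiangkun sbin/start-dfs.sh
hadoop-senior02.xiangkun sbin/start-yarn.sh
hadoop-senior01.xiangkun sbin/mr-jobhistory-daemon.sh start historyserver
EOF
}

print_startup_commands() {
    startup_plan | while read -r host cmd; do
        echo "ssh $host 'cd $HADOOP_HOME && $cmd'"
    done
}

print_startup_commands
```

start-yarn.sh starts the ResourceManager on the machine where it is invoked, which is why that step targets hadoop-senior02. Running `jps` on each node afterwards shows whether the expected daemons came up.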

step6: Test HDFS, YARN, and MapReduce, and monitor the cluster through the web UIs
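A quick way to confirm the web UIs are reachable is to probe them from the command line. This sketch assumes the Hadoop 2.x default ports (50070 for the NameNode, 8088 for the ResourceManager, 19888 for the JobHistoryServer) and the host placement from the planning table; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: list the cluster web UIs and optionally probe them with curl.
web_ui_urls() {
    echo "http://hadoop-senior01.xiangkun:50070"   # HDFS NameNode
    echo "http://hadoop-senior02.xiangkun:8088"    # YARN ResourceManager
    echo "http://hadoop-senior01.xiangkun:19888"   # MapReduce JobHistoryServer
}

# Print the HTTP status code returned by each UI (000 means unreachable).
check_web_uis() {
    for url in $(web_ui_urls); do
        code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")"
        echo "$url -> $code"
    done
}

web_ui_urls
# Run this from a machine that can resolve the host names:
# check_web_uis
```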

step7: Configure passwordless SSH login from the master nodes

step8: Benchmark the cluster (mandatory in real-world environments; a common interview question)

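    ** Basic tests
        & Services start, are available, and can run a simple application
        & HDFS: creating and deleting succeed; read and write operations
                bin/hdfs dfs -mkdir -p /user/xiangkun/tmp/conf
                bin/hdfs dfs -put etc/hadoop/*-site.xml /user/xiangkun/tmp/conf
                bin/hdfs dfs -text /user/xiangkun/tmp/conf/core-site.xml
        & YARN
                run a jar
        & MapReduce
                bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount /user/xiangkun/mapreduce/wordcount/input /user/xiangkun/mapreduce/wordcount/output
    ** Benchmarks
        Measure the cluster's performance
            & HDFS: writing data, reading data
    ** Cluster monitoring
        & Cloudera
        & Cloudera Manager
            Deploy and install the cluster
            Monitor the cluster
            Synchronize configuration across the cluster
            Alerting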
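The checks above can be collected into a small script that prints the commands to run from the Hadoop installation directory. Treat it as a sketch: the function names are hypothetical, the wordcount input is just a convenient sample file, and the TestDFSIO jar name and `-size` option are what I would expect for Hadoop 2.5.0, so check the tool's usage output before relying on them.

```shell
#!/bin/sh
# Sketch: print the smoke-test and benchmark commands for this cluster.
HDFS_USER="xiangkun"
EXAMPLES_JAR="share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar"
# Assumed location of the TestDFSIO benchmark in a 2.5.0 distribution.
JOBCLIENT_TESTS_JAR="share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.5.0-tests.jar"

smoke_test_commands() {
    base="/user/${HDFS_USER}"
    cat <<EOF
bin/hdfs dfs -mkdir -p ${base}/tmp/conf
bin/hdfs dfs -put etc/hadoop/*-site.xml ${base}/tmp/conf
bin/hdfs dfs -text ${base}/tmp/conf/core-site.xml
bin/hdfs dfs -mkdir -p ${base}/mapreduce/wordcount/input
bin/hdfs dfs -put etc/hadoop/core-site.xml ${base}/mapreduce/wordcount/input
bin/yarn jar ${EXAMPLES_JAR} wordcount ${base}/mapreduce/wordcount/input ${base}/mapreduce/wordcount/output
EOF
}

benchmark_commands() {
    cat <<EOF
bin/hadoop jar ${JOBCLIENT_TESTS_JAR} TestDFSIO -write -nrFiles 10 -size 10MB
bin/hadoop jar ${JOBCLIENT_TESTS_JAR} TestDFSIO -read -nrFiles 10 -size 10MB
bin/hadoop jar ${JOBCLIENT_TESTS_JAR} TestDFSIO -clean
EOF
}

smoke_test_commands
benchmark_commands
```

The wordcount output directory must not exist before the job runs, so remove it with `bin/hdfs dfs -rm -r` between repeated runs.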