Hadoop Cluster Development



1.  Configure the network environment

1.1. Configure the IP address

[root@localhost85 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0

TYPE=Ethernet

UUID=c1fa9edb-773e-48d2-83cf-82e8b01ffbb0

ONBOOT=yes

NM_CONTROLLED=yes

BOOTPROTO=none

HWADDR=00:0C:29:4E:A6:48

IPADDR=192.168.1.85

PREFIX=24

GATEWAY=192.168.1.1

DNS1=8.8.8.8

DEFROUTE=yes

IPV4_FAILURE_FATAL=yes

IPV6INIT=no

NAME=eth0

[root@localhost85 ~]#

 

1.2. Set up the IP address-to-hostname mapping

[root@localhost85 ~]# vim /etc/hosts

 

127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4

::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.1.85 localhost85

~                            

[root@localhost85 ~]#

 

1.3. Set the hostname

[root@localhost85 ~]# vim /etc/sysconfig/network

NETWORKING=yes

HOSTNAME=localhost85

GATEWAY=192.168.1.1

~                                                                                                  

[root@localhost85 ~]#

 

1.4. Disable SELinux

[root@localhost85 ~]# vim /etc/selinux/config

 

 

# This file controls the state of SELinux on the system.

# SELINUX= can take one of these three values:

#    enforcing - SELinux security policy is enforced.

#    permissive - SELinux prints warnings instead of enforcing.

#    disabled - No SELinux policy is loaded.

SELINUX=disabled

# SELINUXTYPE= can take one of these two values:

#    targeted - Targeted processes are protected,

#    mls - Multi Level Security protection.

SELINUXTYPE=targeted

[root@localhost85 ~]#
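The change above takes effect after a reboot. As a quick sketch (standard RHEL 6 commands), the current mode can be checked and relaxed for the running session without waiting for the reboot:

getenforce       # show the current SELinux mode (Enforcing / Permissive / Disabled)
setenforce 0     # switch the running system to permissive; the config file change applies from the next boot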

 

1.5. Disable the firewall

[root@localhost85 ~]#

[root@localhost85 ~]# iptables -F

[root@localhost85 ~]# /etc/init.d/iptables save  # save the rules

iptables:将防火墙规则保存到/etc/sysconfig/iptables:     [确定]

[root@localhost85 ~]#

[root@localhost85 ~]# chkconfig iptables off  # do not start at boot

[root@localhost85 ~]# chkconfig --list

[root@localhost85 ~]# reboot  # restart the machine
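After the reboot, the firewall state can be double-checked; a small sketch using the same tools:

service iptables status       # should report that the firewall is not running
chkconfig --list iptables     # every runlevel should show "off"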

 

1.6. Generate keys for passwordless SSH login

[root@localhost85 ~]#

[root@localhost85 ~]# ssh-keygen

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

13:f1:e5:a3:62:ac:ce:ad:63:72:d1:6f:4a:89:de:5b root@localhost85

The key's randomart image is:

+--[ RSA 2048]----+

|       .   .    |

|        o o     |

|       . . o    |

|      . . . .   |

|      .S .      |

|     .+.+       |

|     o.o.E      |

|   .+++ .o      |

|    +=o=+       |

+-----------------+

[root@localhost85 ~]#

[root@localhost85 ~]# ls /root/.ssh/

id_rsa  id_rsa.pub known_hosts

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]# ssh-copy-id -i /root/.ssh/id_rsa root@192.168.1.85

root@192.168.1.85's password:

Now try logging into the machine, with "ssh 'root@192.168.1.85'", and check in:

 

  .ssh/authorized_keys

 

to make sure we haven't added extra keys that you weren't expecting.

 

[root@localhost85 ~]#

[root@localhost85 ~]# ls /root/.ssh/

authorized_keys  id_rsa id_rsa.pub  known_hosts

[root@localhost85 ~]#

With the configuration above, you can now log in directly over SSH without a password prompt.
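A quick sanity check (a sketch; substitute your own host) is to run a command over SSH and confirm that no password prompt appears:

ssh root@192.168.1.85 hostname    # should print localhost85 without asking for a password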

 

2.  Install the JDK

2.1. Download the JDK

1. Official JDK download page:

http://www.oracle.com/technetwork/java/javase/downloads/index.html

Oracle China site:

http://www.oracle.com/cn/customers/index.html

 

JDK 6 download:

http://www.oracle.com/technetwork/cn/java/javase/downloads/java-ee-sdk-6u3-jdk-7u1-downloads-523391-zhs.html

JDK 7 download:

http://www.oracle.com/technetwork/cn/java/javase/downloads/java-se-jdk-7-download-432154-zhs.html

 

2.2. Uninstall existing JDKs

Remove any JDKs that are already installed:

[root@localhost85 ~]#

[root@localhost85 ~]# rpm -qa | grep gcj

libgcj-4.4.7-4.el6.x86_64

[root@localhost85 ~]# rpm -qa | grep jdk

java-1.7.0-openjdk-devel-1.7.0.45-2.4.3.3.el6.x86_64

java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64

java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64

java-1.6.0-openjdk-devel-1.6.0.0-1.66.1.13.0.el6.x86_64

[root@localhost85 ~]#

[root@localhost85 ~]# rpm -e --nodeps java-1.6.0-openjdk-devel-1.6.0.0-1.66.1.13.0.el6.x86_64
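The remaining OpenJDK packages listed by rpm -qa above can be removed in the same way; a sketch (the exact package versions will differ between systems):

rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
rpm -e --nodeps java-1.7.0-openjdk-devel-1.7.0.45-2.4.3.3.el6.x86_64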

 

2.3. Install the JDK

[root@localhost85 download]# ls

hadoop-1.1.2.tar.gz           mysql-connector-java-5.1.10.jar

hbase-0.94.7-security.tar.gz  pig-0.11.1.tar.gz

hive-0.9.0.tar.gz            sqoop-1.4.3.bin__hadoop-1.0.0.tar.gz

jdk-7u15-linux-x64.gz         zookeeper-3.4.5.tar.gz

[root@localhost85 download]# tar -xvf jdk-7u15-linux-x64.gz -C /usr/local/

[root@localhost85 download]# cd /usr/local/

[root@localhost85 local]# ls

bin games         include      lib   libexec  share  VMwareTools-9.6.2-1688356.tar.gz

etc hadoop-1.1.2  jdk1.7.0_15  lib64 sbin     src    vmware-tools-distrib

[root@localhost85 local]# cd jdk1.7.0_15/

[root@localhost85 jdk1.7.0_15]# pwd

/usr/local/jdk1.7.0_15

[root@localhost85 jdk1.7.0_15]#

 

Configure the JDK path:

[root@localhost85 jdk1.7.0_15]# vim /etc/profile

unset i

unset -f pathmunge

export JAVA_HOME=/usr/local/jdk1.7.0_15

export JRE_HOME=/usr/local/jdk1.7.0_15/jre

export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

export HADOOP_HOME=/usr/local/hadoop-1.1.2

export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/lib:$PATH

[root@localhost85 jdk1.7.0_15]#

[root@localhost85 jdk1.7.0_15]# source /etc/profile  # make the changes take effect

 

Test the JDK installation:

[root@localhost85 jdk1.7.0_15]# java -version

java version "1.7.0_15"

Java(TM) SE Runtime Environment (build 1.7.0_15-b03)

Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

[root@localhost85 jdk1.7.0_15]#

# The output above indicates a successful installation.

 

3.  Hadoop 1.x pseudo-distributed installation


3.1. Unpack the archive

[root@localhost85 ~]# ls

anaconda-ks.cfg  install.log         公共的  视频  文档  音乐

download         install.log.syslog  模板    图片  下载  桌面

[root@localhost85 ~]# cd download/

[root@localhost85 download]# ls

hadoop-1.1.2.tar.gz           mysql-connector-java-5.1.10.jar

hbase-0.94.7-security.tar.gz  pig-0.11.1.tar.gz

hive-0.9.0.tar.gz            sqoop-1.4.3.bin__hadoop-1.0.0.tar.gz

jdk-7u15-linux-x64.gz         zookeeper-3.4.5.tar.gz

[root@localhost85 download]#

[root@localhost85 download]# tar -xvf hadoop-1.1.2.tar.gz -C /usr/local/

[root@localhost85 download]# cd /usr/local/

[root@localhost85 local]# ls

bin games         include      lib   libexec  share  VMwareTools-9.6.2-1688356.tar.gz

etc  hadoop-1.1.2  jdk1.7.0_15 lib64  sbin     src   vmware-tools-distrib

[root@localhost85 local]# cd hadoop-1.1.2/

[root@localhost85 hadoop-1.1.2]# ls

bin         hadoop-ant-1.1.2.jar         ivy          sbin

build.xml   hadoop-client-1.1.2.jar      ivy.xml      share

c++         hadoop-core-1.1.2.jar        lib          src

CHANGES.txt hadoop-examples-1.1.2.jar    libexec      webapps

conf        hadoop-minicluster-1.1.2.jar LICENSE.txt

contrib     hadoop-test-1.1.2.jar        NOTICE.txt

docs        hadoop-tools-1.1.2.jar       README.txt

[root@localhost85 hadoop-1.1.2]#

 

3.2. Configure the Hadoop path

[root@localhost85 hadoop-1.1.2]# pwd

/usr/local/hadoop-1.1.2

[root@localhost85 hadoop-1.1.2]# vim /etc/profile

 

 

for i in /etc/profile.d/*.sh ; do

   if [ -r "$i" ]; then

       if [ "${-#*i}" != "$-" ]; then

           . "$i"

       else

           . "$i" >/dev/null 2>&1

       fi

   fi

done

 

unset i

unset -f pathmunge

export JAVA_HOME=/usr/local/jdk1.7.0_15

export JRE_HOME=/usr/local/jdk1.7.0_15/jre

export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

export HADOOP_HOME=/usr/local/hadoop-1.1.2

export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/lib:$PATH

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# vim /etc/profile

 

 

3.3. Edit the configuration files

1. hadoop-env.sh

[root@localhost85 hadoop-1.1.2]# vim conf/hadoop-env.sh

 

# Set Hadoop-specific environment variables here.

 

# The only required environment variable is JAVA_HOME.  All others are

# remote nodes.

 

# The java implementation to use.  Required.

# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

export JAVA_HOME=/usr/local/jdk1.7.0_15

[root@localhost85 hadoop-1.1.2]#

 

2. core-site.xml

[root@localhost85 hadoop-1.1.2]# vim conf/core-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

    <property>

       <name>fs.default.name</name>

        <value>hdfs://192.168.1.85:9000</value>

    </property>

    <property>

        <name>hadoop.tmp.dir</name>

       <value>/usr/local/hadoop/tmp</value>

    </property> 

</configuration>

[root@localhost85 hadoop-1.1.2]#

 

3. hdfs-site.xml

[root@localhost85 hadoop-1.1.2]# vim conf/hdfs-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

    <property>

       <name>dfs.replication</name>

        <value>1</value>

    </property>

    <property>

       <name>dfs.permissions</name>

        <value>false</value>

    </property>

</configuration>

[root@localhost85 hadoop-1.1.2]#

 

4. mapred-site.xml

[root@localhost85 hadoop-1.1.2]# vim conf/mapred-

mapred-queue-acls.xml  mapred-site.xml       

[root@localhost85 hadoop-1.1.2]# vim conf/mapred-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

    <property>

        <name>mapred.job.tracker</name>

        <value>192.168.1.85:9001</value>

    </property>

</configuration>

[root@localhost85 hadoop-1.1.2]#

 

3.4. Start the Hadoop services

Format the NameNode:

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# hadoop namenode -format

Warning: $HADOOP_HOME is deprecated.

 

17/03/13 22:56:45 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = localhost85/192.168.1.85

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 1.1.2

STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013

************************************************************/

17/03/13 22:56:46 INFO util.GSet: VM type       = 64-bit

17/03/13 22:56:46 INFO util.GSet: 2% max memory = 19.33375 MB

17/03/13 22:56:46 INFO util.GSet: capacity      = 2^21 = 2097152 entries

17/03/13 22:56:46 INFO util.GSet: recommended=2097152, actual=2097152

17/03/13 22:56:46 INFO namenode.FSNamesystem: fsOwner=root

17/03/13 22:56:46 INFO namenode.FSNamesystem: supergroup=supergroup

17/03/13 22:56:46 INFO namenode.FSNamesystem: isPermissionEnabled=false

17/03/13 22:56:46 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100

17/03/13 22:56:46 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)

17/03/13 22:56:46 INFO namenode.NameNode: Caching file names occuring more than 10 times

17/03/13 22:56:47 INFO common.Storage: Image file of size 110 saved in 0 seconds.

17/03/13 22:56:47 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/usr/local/hadoop/tmp/dfs/name/current/edits

17/03/13 22:56:47 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/usr/local/hadoop/tmp/dfs/name/current/edits

17/03/13 22:56:47 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.

17/03/13 22:56:47 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at localhost85/192.168.1.85

************************************************************/

[root@localhost85 hadoop-1.1.2]#

 

To clear the files generated by formatting, delete the /usr/local/hadoop/tmp directory.

 

Start the Hadoop services:

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# start-all.sh

Warning: $HADOOP_HOME is deprecated.

 

starting namenode, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-namenode-localhost85.out

The authenticity of host 'localhost (::1)' can't be established.

RSA key fingerprint is 8e:85:ef:8b:df:ad:9b:e9:47:57:24:0d:60:0c:51:21.

Are you sure you want to continue connecting (yes/no)? yes

localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.

localhost: starting datanode, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-datanode-localhost85.out

localhost: starting secondarynamenode, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-secondarynamenode-localhost85.out

starting jobtracker, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-jobtracker-localhost85.out

localhost: starting tasktracker, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-tasktracker-localhost85.out

[root@localhost85 hadoop-1.1.2]#

 

Check which services are running:

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# jps

4452 Jps

4022 DataNode

3913 NameNode

4203 JobTracker

4125 SecondaryNameNode

4307 TaskTracker

[root@localhost85 hadoop-1.1.2]#

 

 

 

 

3.5. Test the web interfaces

NameNode web UI: http://192.168.1.85:50070

 

 

MapReduce (JobTracker) web UI: http://192.168.1.85:50030
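If no browser is available on the virtual machine, the same pages can be probed from the shell; a sketch using curl (both should normally return HTTP 200 once the daemons are up):

curl -I http://192.168.1.85:50070     # NameNode web UI
curl -I http://192.168.1.85:50030     # JobTracker web UI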

 

 

 

 

 

4.  Common Hadoop commands

[root@localhost85 hadoop-1.1.2]# hadoop

Usage: hadoop [--config confdir] COMMAND

where COMMAND is one of:

  namenode -format     format the DFS filesystem

  secondarynamenode    run the DFS secondary namenode

  namenode             run the DFS namenode

  datanode             run a DFS datanode

  dfsadmin             run a DFS admin client

  mradmin              run a Map-Reduce admin client

  fsck                 run a DFS filesystem checking utility

  fs                   run a generic filesystem user client

  balancer             run a cluster balancing utility

  fetchdt              fetch a delegation token from the NameNode

  jobtracker           run the MapReduce job Tracker node

  pipes                run a Pipes job

  tasktracker          run a MapReduce task Tracker node

  historyserver        run job history servers as a standalone daemon

  job                  manipulate MapReduce jobs

  queue                get information regarding JobQueues

  version              print the version

  jar <jar>            run a jar file

  distcp <srcurl> <desturl> copy file or directories recursively

  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive

  classpath            prints the class path needed to get the

                       Hadoop jar and the required libraries

  daemonlog            get/set the log level for each daemon

 or

  CLASSNAME            run the class named CLASSNAME

Most commands print help when invoked w/o parameters.

[root@localhost85 hadoop-1.1.2]#

 

Command options:

-help [cmd]                        // show help for a command

-ls(r) <path>                      // list all files in the given directory

-du(s) <path>                      // show the size of each file in the directory

-count [-q] <path>                 // show the number of files in the directory

-mv <src> <dst>                    // move files to the target directory

-cp <src> <dst>                    // copy files to the target directory

-rm(r)                             // delete files (directories)

-put <localsrc> <dst>              // copy a local file to HDFS

-copyFromLocal                     // same as put

-moveFromLocal                     // move a local file to HDFS

-get [-ignoreCrc] <src> <localdst> // copy a file to the local filesystem, optionally skipping the CRC check

-getmerge <src> <localdst>         // sort and merge all files in the source directory into one local file

-cat <src>                         // print file contents to the terminal

-text <src>                        // print file contents to the terminal

-copyToLocal [-ignoreCrc] <src> <localdst>   // copy to the local filesystem

-moveToLocal <src> <localdst>

-mkdir <path>                      // create a directory

-touchz <path>                     // create an empty file

 

 

 

[root@localhost85 ~]# hadoop fs

Usage: java FsShell

          [-ls <path>]

          [-lsr <path>]

          [-du <path>]

          [-dus <path>]

          [-count[-q] <path>]

          [-mv <src> <dst>]

          [-cp <src> <dst>]

          [-rm [-skipTrash] <path>]

          [-rmr [-skipTrash] <path>]

          [-expunge]

          [-put <localsrc> ... <dst>]

          [-copyFromLocal <localsrc> ... <dst>]

          [-moveFromLocal <localsrc> ... <dst>]

          [-get [-ignoreCrc] [-crc] <src> <localdst>]

           [-getmerge <src> <localdst> [addnl]]

          [-cat <src>]

          [-text <src>]

          [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]

          [-moveToLocal [-crc] <src> <localdst>]

          [-mkdir <path>]

          [-setrep [-R] [-w] <rep> <path/file>]

          [-touchz <path>]

          [-test -[ezd] <path>]

          [-stat [format] <path>]

          [-tail [-f] <file>]

          [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]

          [-chown [-R] [OWNER][:[GROUP]] PATH...]

          [-chgrp [-R] GROUP PATH...]

          [-help [cmd]]

 

Generic options supported are

-conf <configuration file>     specify an application configuration file

-D <property=value>            use value for given property

-fs <local|namenode:port>      specify a namenode

-jt <local|jobtracker:port>    specify a job tracker

-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster

-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.

-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

 

The general command line syntax is

bin/hadoop command [genericOptions] [commandOptions]

 

[root@localhost85 ~]#

 

View the hadoop fs help:

Usage: hadoop fs [generic options]

         [-appendToFile <localsrc> ...<dst>]

         [-cat [-ignoreCrc] <src> ...]

         [-checksum <src> ...]

         [-chgrp [-R] GROUP PATH...]

         [-chmod [-R] <MODE[,MODE]... |OCTALMODE> PATH...]

         [-chown [-R] [OWNER][:[GROUP]] PATH...]

         [-copyFromLocal [-f] [-p] [-l]<localsrc> ... <dst>]

         [-copyToLocal [-p] [-ignoreCrc] [-crc]<src> ... <localdst>]

         [-count [-q] [-h] <path> ...]

         [-cp [-f] [-p | -p[topax]] <src>... <dst>]

         [-createSnapshot <snapshotDir>[<snapshotName>]]

         [-deleteSnapshot <snapshotDir><snapshotName>]

         [-df [-h] [<path> ...]]

         [-du [-s] [-h] <path> ...]

         [-expunge]

         [-get [-p] [-ignoreCrc] [-crc]<src> ... <localdst>]

         [-getfacl [-R] <path>]

         [-getfattr [-R] {-n name | -d} [-e en]<path>]

         [-getmerge [-nl] <src><localdst>]

         [-help [cmd ...]]

         [-ls [-d] [-h] [-R] [<path> ...]]

         [-mkdir [-p] <path> ...]

         [-moveFromLocal <localsrc> ...<dst>]

         [-moveToLocal <src><localdst>]

         [-mv <src> ... <dst>]

         [-put [-f] [-p] [-l] <localsrc>... <dst>]

         [-renameSnapshot <snapshotDir><oldName> <newName>]

         [-rm [-f] [-r|-R] [-skipTrash]<src> ...]

         [-rmdir [--ignore-fail-on-non-empty]<dir> ...]

         [-setfacl [-R] [{-b|-k} {-m|-x<acl_spec>} <path>]|[--set <acl_spec> <path>]]

         [-setfattr {-n name [-v value] | -xname} <path>]

         [-setrep [-R] [-w] <rep><path> ...]

         [-stat [format] <path> ...]

         [-tail [-f] <file>]

         [-test -[defsz] <path>]

         [-text [-ignoreCrc] <src> ...]

         [-touchz <path> ...]

         [-usage [cmd ...]]

 

 

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -help ls

-ls <path>:   List the contents that match the specified file pattern. If

              path is not specified, the contents of /user/<currentUser>

              will be listed. Directory entries are of the form

                        dirName (full path) <dir>

              and file entries are of the form

                        fileName (full path) <r n> size

              where n is the number of replicas specified for the file

              and size is the size of the file, in bytes.

 

[root@localhost85 ~]#

 

 


4.1. ls: list the contents of the HDFS root directory

Usage: hadoop fs -ls /   (the slash is the root directory)

[root@localhost85 hadoop-1.1.2]# hadoop fs -ls /

Warning: $HADOOP_HOME is deprecated.

 

Found 1 items

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]#

 

 

4.2. lsr: recursively list everything under the HDFS root

Usage: hadoop fs -lsr /   (recursively lists everything under the given HDFS directory)

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# hadoop fs -lsr /

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local/hadoop

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local/hadoop/tmp

drwxr-xr-x   - root supergroup          0 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred

drwx------   - root supergroup          0 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred/system

-rw-------   1 root supergroup          4 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred/system/jobtracker.info

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]#

 

4.3. mkdir: create a directory in HDFS

Usage: hadoop fs -mkdir /test1   (creates the directory in HDFS)

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# hadoop fs -mkdir /test1

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# hadoop fs -ls /

Found 2 items

drwxr-xr-x   - root supergroup          0 2017-03-22 22:12 /test1

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]#

 

4.4. put: upload data from Linux to a given HDFS path

Usage: hadoop fs -put [Linux source] [HDFS destination]

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# hadoop fs -put /root/install.log  /test2

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# hadoop fs -ls /test2

Found 1 items

-rw-r--r--   1 root supergroup      54118 2017-03-22 22:19 /test2/install.log

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]#

 

When uploading a file, if the destination does not exist, the destination is created as a file:

[root@localhost85 hadoop-1.1.2]# hadoop fs -put /root/install.log  /demo

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# hadoop fs -ls /

Found 3 items

drwxr-xr-x   - root supergroup          0 2017-03-22 22:19 /d1

-rw-r--r--   1 root supergroup      54118 2017-03-22 22:22 /demo

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr

[root@localhost85 hadoop-1.1.2]#

 

4.5. copyFromLocal: upload data from Linux to a given HDFS path

Usage: hadoop fs -copyFromLocal [Linux source] [HDFS destination]

[root@localhost85 ~]# ls

anaconda-ks.cfg  download    install.log.syslog  模板  图片  下载  桌面

demo             install.log  公共的              视频  文档  音乐

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -copyFromLocal anaconda-ks.cfg  /test

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -ls /test/

Found 2 items

-rw-r--r--   1 root supergroup       1767 2017-03-30 18:56 /test/anaconda-ks.cfg

-rw-r--r--   1 root supergroup      54118 2017-03-30 18:54 /test/install.log

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]#

 

4.6. moveFromLocal: move files from Linux into HDFS

Usage: hadoop fs -moveFromLocal [Linux source] [HDFS destination]

[root@localhost85 ~]#

[root@localhost85 ~]# ls

anaconda-ks.cfg  download    install.log.syslog  模板  图片  下载  桌面

demo             install.log  公共的              视频  文档  音乐

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -moveFromLocal download/ /test/

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -ls /test/

Found 3 items

-rw-r--r--   1 root supergroup       1767 2017-03-30 18:56 /test/anaconda-ks.cfg

drwxr-xr-x   - root supergroup          0 2017-03-30 18:58 /test/download

-rw-r--r--   1 root supergroup      54118 2017-03-30 18:54 /test/install.log

[root@localhost85 ~]#

[root@localhost85 ~]#

 

 

4.7. get: download data from HDFS to a given Linux path

Usage: hadoop fs -get [HDFS source] [Linux destination]

[root@localhost85 ~]# hadoop fs -ls /test1

Found 1 items

-rw-r--r--   1 root supergroup      54118 2017-03-22 22:19 /test1/install.log

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -get /test1 /root/demo

[root@localhost85 ~]# ls /root/demo/

install.log

[root@localhost85 ~]#

[root@localhost85 ~]#

 

4.8. rm / rmr: delete HDFS files and directories

Usage: hadoop fs -rmr [HDFS directory]

[root@localhost85 ~]# hadoop fs -rm /demo   # delete a file

Deleted hdfs://192.168.1.85:9000/demo

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -rmr /d1  # recursively delete a directory

Deleted hdfs://192.168.1.85:9000/d1

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -ls /d1

ls: Cannot access /d1: No such file or directory.

[root@localhost85 ~]#

[root@localhost85 ~]#

4.9. du: show the size of each file under a directory

Usage: hadoop fs -du [directory]

 

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -du /

Found 3 items

54118      hdfs://192.168.1.85:9000/test

0          hdfs://192.168.1.85:9000/test1

4          hdfs://192.168.1.85:9000/usr

[root@localhost85 ~]#

 

4.10.  dus: show the total size of a directory

Usage: hadoop fs -dus [directory]

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -dus /

hdfs://192.168.1.85:9000/      54122

[root@localhost85 ~]#

[root@localhost85 ~]#

 

4.11.  count: count files and directories

Usage: hadoop fs -count [directory]

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -count /

          9            2              54122 hdfs://192.168.1.85:9000/

[root@localhost85 ~]#

[root@localhost85 ~]#

 

 

4.12.  mv: move an HDFS file to another HDFS directory

Usage: hadoop fs -mv [HDFS source] [HDFS destination]

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -lsr /

drwxr-xr-x   - root supergroup          0 2017-03-30 18:37 /test

-rw-r--r--   1 root supergroup      54118 2017-03-30 18:37 /test/install.log

drwxr-xr-x   - root supergroup          0 2017-03-22 22:54 /test1

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local/hadoop

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local/hadoop/tmp

drwxr-xr-x   - root supergroup          0 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred

drwx------   - root supergroup          0 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred/system

-rw-------   1 root supergroup          4 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred/system/jobtracker.info

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -mv /test/install.log /test1

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -lsr /

drwxr-xr-x   - root supergroup          0 2017-03-30 18:46 /test

drwxr-xr-x   - root supergroup          0 2017-03-30 18:46 /test1

-rw-r--r--   1 root supergroup      54118 2017-03-30 18:37 /test1/install.log

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local/hadoop

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local/hadoop/tmp

drwxr-xr-x   - root supergroup          0 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred

drwx------   - root supergroup          0 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred/system

-rw-------   1 root supergroup          4 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred/system/jobtracker.info

[root@localhost85 ~]#


[root@localhost85 ~]#

 

 

4.13.  cp: copy an HDFS file to another HDFS directory

Usage: hadoop fs -cp [HDFS source] [HDFS destination]

[root@localhost85 ~]# hadoop fs -cp /test1/install.log /test

[root@localhost85 ~]# hadoop fs -lsr /

drwxr-xr-x   - root supergroup          0 2017-03-30 18:47 /test

-rw-r--r--   1 root supergroup      54118 2017-03-30 18:47 /test/install.log

drwxr-xr-x   - root supergroup          0 2017-03-30 18:46 /test1

-rw-r--r--   1 root supergroup      54118 2017-03-30 18:37 /test1/install.log

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local/hadoop

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr/local/hadoop/tmp

drwxr-xr-x   - root supergroup          0 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred

drwx------   - root supergroup          0 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred/system

-rw-------   1 root supergroup          4 2017-03-22 22:01 /usr/local/hadoop/tmp/mapred/system/jobtracker.info

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]#

 

 

4.14.  getmerge: merge all files under an HDFS directory into a single local file

Usage: hadoop fs -getmerge [HDFS source directory] [Linux destination]

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -getmerge /temp/ tmp/

17/03/30 19:05:49 INFO util.NativeCodeLoader: Loaded the native-hadoop library

[root@localhost85 ~]#

[root@localhost85 ~]# ls

anaconda-ks.cfg  install.log         tmp    模板  图片  下载  桌面

demo             install.log.syslog  公共的  视频  文档  音乐

[root@localhost85 ~]#

[root@localhost85 ~]#

 

 

4.15.  cat / text: view the contents of an HDFS file

Usage: hadoop fs -cat [file]

[root@localhost85 ~]# ls

anaconda-ks.cfg  install.log         tmp    模板  图片  下载  桌面

demo             install.log.syslog  公共的  视频  文档  音乐

[root@localhost85 ~]# echo 'sldjfksldjldg' > a.txt

[root@localhost85 ~]# ls

anaconda-ks.cfg  demo        install.log.syslog  公共的  视频  文档  音乐

a.txt            install.log  tmp                 模板    图片  下载  桌面

[root@localhost85 ~]# hadoop fs -put a.txt /temp/

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]# cat a.txt

sldjfksldjldg

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -cat /temp/a.txt

sldjfksldjldg

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -text /temp/a.txt

sldjfksldjldg

[root@localhost85 ~]#

[root@localhost85 ~]#

 

4.16.  setrep: set the replication factor

Usage: hadoop fs -setrep [replication] [file]

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -setrep 2 /temp/a.txt

Replication 2 set: hdfs://192.168.1.85:9000/temp/a.txt

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -ls /temp/

Found 2 items

-rw-r--r--   2 root supergroup         14 2017-03-30 19:09 /temp/a.txt

-rw-r--r--   1 root supergroup      54118 2017-03-30 19:03 /temp/install.log

[root@localhost85 ~]#

 

 

4.17.  touchz: create an empty file in HDFS

Usage: hadoop fs -touchz [file]

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -touchz /emptyfile

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -ls /

Found 4 items

-rw-r--r--   1 root supergroup          0 2017-03-30 19:20 /emptyfile

drwxr-xr-x   - root supergroup          0 2017-03-30 19:09 /temp

drwxr-xr-x   - root supergroup          0 2017-03-30 19:01 /user

drwxr-xr-x   - root supergroup          0 2017-03-13 22:58 /usr

[root@localhost85 ~]#

 

4.18.  tail: view the end of an HDFS file

Usage: hadoop fs -tail [-f] [file]

 

[root@localhost85 ~]# hadoop fs -tail -f /temp/install.log

5.5-1.1.el6.noarch

安装 scenery-backgrounds-6.0.0-1.el6.noarch

安装 gnome-backgrounds-2.28.0-2.el6.noarch

安装 ql2400-firmware-7.00.01-1.el6.noarch

安装 iwl5000-firmware-8.83.5.1_1-1.el6_1.1.noarch

安装 ql2100-firmware-1.19.38-3.1.el6.noarch

 

 

4.19.  chown: change the owner of an HDFS file

Usage: hadoop fs -chown [owner][:group] [file]

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -chown wangshc /emptyfile

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -ls /

Found 4 items

-rw-r--r--   1 wangshc supergroup          0 2017-03-30 19:20 /emptyfile

drwxr-xr-x  - root    supergroup          0 2017-03-30 19:09 /temp

drwxr-xr-x  - root    supergroup          0 2017-03-30 19:01 /user

drwxr-xr-x  - root    supergroup          0 2017-03-13 22:58 /usr

[root@localhost85 ~]#

4.20.  chgrp: change the group of an HDFS file

Usage: hadoop fs -chgrp [group] [file or directory]

[root@localhost85 ~]# hadoop fs -ls /

Found 4 items

-rw-r--r--   1 wangshc supergroup          0 2017-03-30 19:20 /emptyfile

drwxr-xr-x  - root    supergroup          0 2017-03-30 19:09 /temp

drwxr-xr-x  - root    supergroup          0 2017-03-30 19:01 /user

drwxr-xr-x  - root    supergroup          0 2017-03-13 22:58 /usr

[root@localhost85 ~]#

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop fs -chgrp root /emptyfile

[root@localhost85 ~]# hadoop fs -ls /

Found 4 items

-rw-r--r--  1 wangshc root                0 2017-03-30 19:20 /emptyfile

drwxr-xr-x  - root    supergroup          0 2017-03-30 19:09 /temp

drwxr-xr-x  - root    supergroup          0 2017-03-30 19:01 /user

drwxr-xr-x  - root    supergroup          0 2017-03-13 22:58 /usr

[root@localhost85 ~]#

[root@localhost85 ~]#

 

 

4.21.  jar: run a MapReduce job from a jar file

Usage: hadoop jar [jar file]

[root@localhost85 hadoop-1.1.2]#

[root@localhost85 hadoop-1.1.2]# hadoop jar hadoop-examples-1.1.2.jar

An example program must be given as the first argument.

Valid program names are:

  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.

  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.

  dbcount: An example job that count the pageview counts from a database.

  grep: A map/reduce program that counts the matches of a regex in the input.

  join: A job that effects a join over sorted, equally partitioned datasets

  multifilewc: A job that counts words from several files.

  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.

  pi: A map/reduce program that estimates Pi using monte-carlo method.

  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.

  randomwriter: A map/reduce program that writes 10GB of random data per node.

  secondarysort: An example defining a secondary sort to the reduce.

  sleep: A job that sleeps at each map and reduce task.

  sort: A map/reduce program that sorts the data written by the random writer.

  sudoku: A sudoku solver.

  teragen: Generate data for the terasort

  terasort: Run the terasort

  teravalidate: Checking results of terasort

  wordcount: A map/reduce program that counts the words in the input files.

[root@localhost85 hadoop-1.1.2]#
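For instance, the wordcount example can be run against files that are already in HDFS; a sketch assuming /test contains text files and /wc_out does not exist yet:

hadoop jar hadoop-examples-1.1.2.jar wordcount /test /wc_out    # run the word-count MapReduce job
hadoop fs -cat /wc_out/part-*                                   # inspect the result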

 

 

5.  Hadoop 1.x cluster

HDFS consists of the NameNode, DataNodes, and the SecondaryNameNode.

 

You can build on top of the earlier pseudo-distributed installation. On each node, check the following:

1) The firewall is permanently disabled

2) A static IP address is configured

3) The hostname is set

4) /etc/hosts contains the IP-to-hostname mapping for every node

5) Each node can SSH to its own hostname without a password

If any of these checks fail, go back and follow the earlier configuration steps; a quick way to verify them from the shell is sketched below.
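A minimal sketch for spot-checking the list above on each node (hostnames taken from this guide):

service iptables status      # item 1: the firewall should not be running
ifconfig eth0                # item 2: should show the static IP, e.g. 192.168.1.85
hostname                     # item 3: should print this node's name, e.g. localhost85
cat /etc/hosts               # item 4: should list both 192.168.1.85 localhost85 and 192.168.1.86 localhost86
ssh localhost85 date         # item 5: should run without a password prompt (repeat for localhost86)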

 


5.1. IP mapping configuration

Configure the following on both localhost85 and localhost86:

[root@localhost85 ~]#

[root@localhost85 ~]# vim /etc/hosts

 

127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4

::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.1.85 localhost85

192.168.1.86 localhost86

[root@localhost85 ~]#

 

5.2. Passwordless login configuration

Configure the keys on localhost85:

[root@localhost85 ~]# vim /root/.ssh/authorized_keys

 

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAxItvW99zHW1jEpqEedhm1n3cTiH/jRklu5eNp6fJDZmcyx9dCUg/UAiAM5F/yeZuqhtde8YZhoLV6xc94C3dwtrqaodyl3CgTMDp60Eeh3IbO1c3VtbmfYvjrcs7hp5uAKvoC6wvB/+KCddxeAFbGcI088xhoVWJ7pu5OAqgZwwYUUNYAv5b1z0DymWPgS/UG9/ZUkCQR5NofwPoMJVdJU4afSxFxkTetGjfpKuZfCztEE1j3PURjNZ/6VQYvVTnkcjqZIpvYrDNR/6dt/ZCsuo4eihixZmbEmrOj6Y193/tmBEjFk6nxGGuaxzNfYj5y/e0FC02OGDxwOnZDmjpZQ==root@localhost85

ssh-rsaAAAAB3NzaC1yc2EAAAABIwAAAQEAvzi9qXd25VqjaLnK2J0GruwmODTq2IJB1x5srpRURXD1VsecLSbzmycmHfk/gzcEttvbNl3m7DKFRPpdfjNAvqGVZC38EZ8qG+cYANBKlL7XFzp3brgeT8j4sD0jnxvD5izgb1OFZLNnyy70gG46t0jihQLviQvxxJqipmHYpcgTI9aW9krYVQ2kjOVNvHTwnehRUwi8LMBj0Y21IgCMtgtYGGsdidYBcKTn3+A302f0tvclwVj+I1Dv2Q12f8zdFn8W9zLN0YtlQs0QlwzUKbFudlT9TGqGs5sRF70EhxCBA+phUNQuVqtIov1h2IMW3ZCuYSsmTL1RzXq4F85XEQ==root@localhost86

[root@localhost85 ~]#

 

 

Configure the keys on localhost86:

[root@localhost86 ~]# vim /root/.ssh/authorized_keys

 

ssh-rsaAAAAB3NzaC1yc2EAAAABIwAAAQEAxItvW99zHW1jEpqEedhm1n3cTiH/jRklu5eNp6fJDZmcyx9dCUg/UAiAM5F/yeZuqhtde8YZhoLV6xc94C3dwtrqaodyl3CgTMDp60Eeh3IbO1c3VtbmfYvjrcs7hp5uAKvoC6wvB/+KCddxeAFbGcI088xhoVWJ7pu5OAqgZwwYUUNYAv5b1z0DymWPgS/UG9/ZUkCQR5NofwPoMJVdJU4afSxFxkTetGjfpKuZfCztEE1j3PURjNZ/6VQYvVTnkcjqZIpvYrDNR/6dt/ZCsuo4eihixZmbEmrOj6Y193/tmBEjFk6nxGGuaxzNfYj5y/e0FC02OGDxwOnZDmjpZQ==root@localhost85

ssh-rsaAAAAB3NzaC1yc2EAAAABIwAAAQEAvzi9qXd25VqjaLnK2J0GruwmODTq2IJB1x5srpRURXD1VsecLSbzmycmHfk/gzcEttvbNl3m7DKFRPpdfjNAvqGVZC38EZ8qG+cYANBKlL7XFzp3brgeT8j4sD0jnxvD5izgb1OFZLNnyy70gG46t0jihQLviQvxxJqipmHYpcgTI9aW9krYVQ2kjOVNvHTwnehRUwi8LMBj0Y21IgCMtgtYGGsdidYBcKTn3+A302f0tvclwVj+I1Dv2Q12f8zdFn8W9zLN0YtlQs0QlwzUKbFudlT9TGqGs5sRF70EhxCBA+phUNQuVqtIov1h2IMW3ZCuYSsmTL1RzXq4F85XEQ==root@localhost86

[root@localhost86 ~]#
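Instead of pasting the public keys by hand, the same result can be achieved with ssh-copy-id, which was already used in section 1.6; a sketch assuming each host can reach the other by hostname:

ssh-copy-id -i /root/.ssh/id_rsa root@localhost86    # run on localhost85
ssh-copy-id -i /root/.ssh/id_rsa root@localhost85    # run on localhost86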

 

 

5.3. Install the JDK and Hadoop on localhost86

[root@localhost85 ~]#

[root@localhost85 ~]# scp -rq /usr/local/jdk1.7.0_15 localhost86:/usr/local/

The authenticity of host 'localhost86 (192.168.1.86)' can't be established.

RSA key fingerprint is 8e:85:ef:8b:df:ad:9b:e9:47:57:24:0d:60:0c:51:21.

Are you sure you want to continue connecting (yes/no)? yes

 

 

 

[root@localhost85 ~]#

[root@localhost85 ~]# scp -rq /usr/local/hadoop-1.1.2 localhost86:/usr/local/

[root@localhost85 ~]#

[root@localhost85 ~]# scp -rq /etc/profile localhost86:/etc

[root@localhost85 ~]#

[root@localhost85 ~]# scp -rq /etc/hosts localhost86:/etc/

[root@localhost85 ~]#

[root@localhost85 ~]# ssh localhost86

Last login: Fri Mar 31 16:27:33 2017 from localhost85

[root@localhost86 ~]#

[root@localhost86 ~]# source /etc/profile

[root@localhost86 ~]#

[root@localhost86 ~]# exit

logout

Connection to localhost86 closed.

[root@localhost85 ~]#

[root@localhost85 ~]#

 

5.4. Configure the core cluster files on localhost85

In Hadoop's slaves configuration file, change the content from localhost to localhost86, so that the DataNode and TaskTracker daemons run on node localhost86.

 

[root@localhost85 hadoop-1.1.2]# cd conf/

[root@localhost85 conf]# ls

capacity-scheduler.xml      hadoop-policy.xml      slaves

configuration.xsl           hdfs-site.xml          ssl-client.xml.example

core-site.xml               log4j.properties       ssl-server.xml.example

fair-scheduler.xml          mapred-queue-acls.xml  taskcontroller.cfg

hadoop-env.sh               mapred-site.xml

hadoop-metrics2.properties  masters

[root@localhost85 conf]#

[root@localhost85 conf]# vim slaves

[root@localhost85 conf]# vim slaves

 

localhost86

~        

[root@localhost85 conf]#

 

 

 

 

 

 

[root@localhost86 local]#

 

5.5. Format the filesystem on localhost85

[root@localhost85 conf]#

[root@localhost85 conf]# hadoop namenode -format

17/03/31 17:44:10 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = localhost85/192.168.1.85

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 1.1.2

STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013

************************************************************/

Re-format filesystem in /usr/local/hadoop/tmp/dfs/name ? (Y or N) y

Format aborted in /usr/local/hadoop/tmp/dfs/name

17/03/31 17:44:13 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at localhost85/192.168.1.85

************************************************************/

[root@localhost85 conf]#

Note that the re-format prompt only accepts an uppercase Y; answering with a lowercase y, as above, aborts the format and keeps the existing namespace in /usr/local/hadoop/tmp/dfs/name.

 

5.6. Restart the services

[root@localhost85 conf]# /usr/local/hadoop-1.1.2/bin/start-all.sh

starting namenode, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-namenode-localhost85.out

localhost86: starting datanode, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-datanode-localhost86.out

localhost: starting secondarynamenode, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-secondarynamenode-localhost85.out

starting jobtracker, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-jobtracker-localhost85.out

localhost86: starting tasktracker, logging to /usr/local/hadoop-1.1.2/libexec/../logs/hadoop-root-tasktracker-localhost86.out

[root@localhost85 conf]#

[root@localhost85 conf]#

5.7. Check the processes

[root@localhost85 conf]# jps

8634 SecondaryNameNode

5332 TaskTracker

8791 Jps

8487 NameNode

8704 JobTracker

5011 DataNode

[root@localhost85 conf]#

 

5.8. Verify

[root@localhost86 local]#

[root@localhost86 local]# jps

3286 TaskTracker

3342 Jps

3212 DataNode

[root@localhost86 local]#

Although running "start-all.sh" prints a lot of output, that does not mean the Hadoop cluster started successfully; it only means the startup began. Whether it actually succeeded is something we have to verify ourselves.

On node localhost85, running jps should show three Java processes: NameNode, SecondaryNameNode, and JobTracker.

On node localhost86, running jps should show two Java processes: DataNode and TaskTracker.

If you see these, the cluster really did start successfully.
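Besides jps, the NameNode's own view of the cluster can be checked; a sketch (the exact numbers depend on your data):

hadoop dfsadmin -report | head -n 20    # should report 1 available datanode (localhost86)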

 

 

 

6.  HBase distributed installation

Prerequisite: the Hadoop cluster installation is complete.

[root@localhost85 ~]# java -version

java version "1.7.0_15"

Java(TM) SE Runtime Environment (build 1.7.0_15-b03)

Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop version

Hadoop 1.1.2

Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782

Compiled by hortonfo on Thu Jan 31 02:03:24 UTC 2013

From source with checksum c720ddcf4b926991de7467d253a79b8b

[root@localhost85 ~]#

 


6.1.   Unpack into the current directory

[root@localhost85 local]# tar -xvf /root/download/hbase-0.94.7-security.tar.gz -C .

[root@localhost85 local]# ls

bin    hadoop-1.1.2           lib      share

etc    hbase-0.94.7-security  lib64    src

games  include               libexec VMwareTools-9.6.2-1688356.tar.gz

hadoop jdk1.7.0_15            sbin     vmware-tools-distrib

6.2.   Rename the HBase directory

[root@localhost85 local]# mv hbase-0.94.7-security/ hbase-0.94.7  # rename

[root@localhost85 local]# ls

bin    hadoop-1.1.2  lib      share

etc    hbase-0.94.7  lib64    src

games  include       libexec  VMwareTools-9.6.2-1688356.tar.gz

hadoop jdk1.7.0_15   sbin     vmware-tools-distrib

[root@localhost85 local]#

 

6.3.   Edit the hbase-env.sh configuration file

[root@localhost85 hbase-0.94.7]# ls

bin         docs                            hbase-webapps  NOTICE.txt  sbin

CHANGES.txt hbase-0.94.7-security.jar       lib            pom.xml     security

conf        hbase-0.94.7-security-tests.jar LICENSE.txt    README.txt  src

[root@localhost85 hbase-0.94.7]# cd conf/

[root@localhost85 conf]# ls

hadoop-metrics.properties  hbase-env.sh~     hbase-site.xml    regionservers

hbase-env.sh               hbase-policy.xml  log4j.properties

[root@localhost85 conf]# vim hbase-env.sh

# Add the JDK runtime environment

# The java implementation to use.  Java 1.6 required.

export JAVA_HOME=/usr/local/jdk1.7.0_15

 

# Add the Hadoop configuration

# Extra Java CLASSPATH elements.  Optional.

export HBASE_CLASSPATH=/usr/local/hadoop-1.1.2/conf

# Tell HBase whether it should manage it's own instance of Zookeeper or not.

export HBASE_MANAGES_ZK=true

[root@localhost85 conf]#

 

 

6.4.   Edit hbase-site.xml

[root@localhost85 conf]# vim hbase-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!--

/**

 * Copyright 2010 The Apache SoftwareFoundation

 *

 * Licensed to the Apache Software Foundation(ASF) under one

 * or more contributor license agreements.  See the NOTICE file

 * distributed with this work for additionalinformation

 * regarding copyright ownership.  The ASF licenses this file

 * to you under the Apache License, Version 2.0(the

 * "License"); you may not use thisfile except in compliance

 * with the License.  You may obtain a copy of the License at

 *

 *    http://www.apache.org/licenses/LICENSE-2.0

 *

 * Unless required by applicable law or agreedto in writing, software

 * distributed under the License is distributedon an "AS IS" BASIS,

 * WITHOUT WARRANTIES OR CONDITIONS OF ANYKIND, either express or implied.

 * See the License for the specific languagegoverning permissions and

 * limitations under the License.

 */

-->

<configuration>

 <property>

    <name>hbase.rootdir</name>

   <value>hdfs://192.168.1.85:9000/hbase</value>

    <description>The directory shared byregion servers and into

    which HBase persists.  The URL should be 'fully-qualified'

    to include the filesystem scheme.  For example, to specify the

    HDFS directory '/hbase' where the HDFSinstance's namenode is

    running at namenode.example.org on port9000, set this value to:

    hdfs://namenode.example.org:9000/hbase.  By default HBase writes

    into /tmp. Change this configuration else all data will be lost

    on machine restart.

    </description>

  </property>

 

<property>

   <name>hbase.cluster.distributed</name>

    <value>true</value>

    <description>The mode the clusterwill be in. Possible values are

      false for standalone mode and true fordistributed mode.  If

      false, startup will run all HBase andZooKeeper daemons together

      in the one JVM.

    </description>

  </property>

 

<property>

    <name>hbase.tmp.dir</name>

   <value>/usr/local/hbase/tmp</value>

    <description>Temporary directory onthe local filesystem.

    Change this setting to point to a locationmore permanent

    than '/tmp' (The '/tmp' directory is oftencleared on

    machine restart).

    </description>

  </property>

 

 <property>

   <name>hbase.zookeeper.quorum</name>

   <value>localhost85,localhost86</value>

    <description>Comma separated list ofservers in the ZooKeeper Quorum.

    For example,"host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".

    By default this is set to localhost forlocal and pseudo-distributed modes

    of operation. For a fully-distributedsetup, this should be set to a full

    list of ZooKeeper quorum servers. IfHBASE_MANAGES_ZK is set in hbase-env.sh

    this is the list of servers which we willstart/stop ZooKeeper on.

    </description>

  </property>

 

 <property>

   <name>hbase.zookeeper.property.dataDir</name>

    <value>${hbase.tmp.dir}/zookeeper</value>

    <description>Property fromZooKeeper's config zoo.cfg.

    The directory where the snapshot is stored.

    </description>

  </property>

 

</configuration>

 

 

 

6.5.   Edit regionservers

[root@localhost85 conf]# vim regionservers

 

localhost86

[root@localhost85 conf]#

 

 

6.6.   Copy to /usr/local on localhost86

[root@localhost85 local]# scp -r hbase-0.94.7/ root@192.168.1.86:/usr/local/

[root@localhost86 local]# ls

bin games   hadoop-1.1.2  include     lib    libexec  share VMwareTools-9.6.2-1688356.tar.gz

etc hadoop  hbase-0.94.7  jdk1.7.0_15 lib64  sbin     src   vmware-tools-distrib

[root@localhost86 local]#

[root@localhost86 local]#

 

6.7.   Add the environment settings

[root@localhost85 ~]#

[root@localhost85 ~]# vim /etc/profile

 

unset i

unset -f pathmunge

export JAVA_HOME=/usr/local/jdk1.7.0_15

export JRE_HOME=/usr/local/jdk1.7.0_15/jre

export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

export HADOOP_HOME_WARN_SUPPRESS=1

export HADOOP_HOME=/usr/local/hadoop-1.1.2

export HBASE_HOME=/usr/local/hbase-0.94.7

export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/lib:$HBASE_HOME/bin:$PATH

"/etc/profile" 85L, 2144C                           

[root@localhost85 ~]#

 

6.8.   Copy the settings to localhost86

[root@localhost85 ~]# scp /etc/profile root@localhost86:/etc/

[root@localhost86 local]#

[root@localhost86 local]# source /etc/profile

[root@localhost86 local]# jps

2456 DataNode

3888 Jps

2531 TaskTracker

[root@localhost86 local]#

[root@localhost86 local]# cat /etc/profile

# /etc/profile

 

unset i

unset -f pathmunge

export JAVA_HOME=/usr/local/jdk1.7.0_15

export JRE_HOME=/usr/local/jdk1.7.0_15/jre

export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

export HADOOP_HOME_WARN_SUPPRESS=1

export HADOOP_HOME=/usr/local/hadoop-1.1.2

export HBASE_HOME=/usr/local/hbase-0.94.7

export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/lib:$HBASE_HOME/bin:$PATH

[root@localhost86 local]#

 

 

6.9.   Start the HBase service

[root@localhost85 logs]#

[root@localhost85 logs]# hadoop dfs -ls hdfs://localhost85:9000/hbase

[root@localhost85 logs]#

[root@localhost85 logs]# stop-hbase.sh

stopping hbase

192.168.1.85: no zookeeper to stop because no pid file /tmp/hbase-root-zookeeper.pid

192.168.1.86: stopping zookeeper.

[root@localhost85 logs]#

[root@localhost85 logs]#

[root@localhost85 logs]# start-hbase.sh

192.168.1.85: starting zookeeper, logging to /usr/local/hbase-0.94.7/bin/../logs/hbase-root-zookeeper-localhost85.out

192.168.1.86: starting zookeeper, logging to /usr/local/hbase-0.94.7/bin/../logs/hbase-root-zookeeper-localhost86.out

starting master, logging to /usr/local/hbase-0.94.7/logs/hbase-root-master-localhost85.out

localhost86: starting regionserver, logging to /usr/local/hbase-0.94.7/bin/../logs/hbase-root-regionserver-localhost86.out

[root@localhost85 logs]#

 

6.10.       Test whether HBase is working

[root@localhost85 logs]#

[root@localhost85 logs]# hbase shell

HBase Shell; enter 'help<RETURN>' for list of supported commands.

Type "exit<RETURN>" to leave the HBase Shell

Version 0.94.7, r1471806, Wed Apr 24 18:44:36 PDT 2013

 

hbase(main):001:0> list

TABLE                                                                                          

0 row(s) in 1.6050 seconds

 

hbase(main):002:0>

This output indicates that HBase is configured correctly.
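As a further smoke test, a table can be created and dropped from the same hbase shell session; a sketch (the table and column family names are arbitrary):

create 't1', 'cf'     # create table t1 with one column family cf
list                  # t1 should now be listed
disable 't1'
drop 't1'             # clean up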

 

6.11.       Stop the HBase service

[root@localhost85 logs]#

[root@localhost85 logs]# stop-hbase.sh

stopping hbase............

192.168.1.86: stopping zookeeper.

192.168.1.85: stopping zookeeper.

[root@localhost85 logs]#

 

 

 

6.12.       Hadoop error when creating a directory

Error message: mkdir: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /hbase. Name node is in safe mode

Solution:

Go into Hadoop's bin directory and run the command to leave safe mode manually:

[root@localhost85 bin]#

[root@localhost85 bin]# ./hadoop dfsadmin -safemode leave

[root@localhost85 bin]#

Cause of the error:

When the distributed filesystem starts up, it first enters safe mode. While it is in safe mode, the contents of the filesystem can be neither modified nor deleted, until safe mode ends. Safe mode exists mainly so that, at startup, the system can check the validity of the data blocks on each DataNode and replicate or delete blocks as required by policy. Safe mode can also be entered at runtime with a command. In practice, modifying or deleting files right after startup will likewise trigger the safe-mode error; usually waiting a short while is enough. A quick way to check the current safe-mode state is sketched below.
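To see whether the NameNode is still in safe mode before retrying, a quick check (a sketch):

hadoop dfsadmin -safemode get    # prints "Safe mode is ON" or "Safe mode is OFF"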

 

 

 

6.13.       HBase startup error

[root@localhost86 local]# hbase shell

HBase Shell; enter 'help<RETURN>' for list of supported commands.

Type "exit<RETURN>" to leave the HBase Shell

Version 0.94.7, r1471806, Wed Apr 24 18:44:36 PDT 2013

 

hbase(main):001:0> list

TABLE                                                                                          

17/04/01 19:15:06 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries

17/04/01 19:15:06 WARN zookeeper.ZKUtil: hconnection Unable to set watcher on znode (/hbase/hbaseid)

org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid

         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)

         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)

         at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)

         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)

         at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:450)

         at org.apache.hadoop.hbase.zookeeper.ClusterId.readClusterIdZNode(ClusterId.java:61)

         at org.apache.hadoop.hbase.zookeeper.ClusterId.getId(ClusterId.java:50)

         at org.apache.hadoop.hbase.zookeeper.ClusterId.hasId(ClusterId.java:44)

         at

 

 

Log file entries:

[root@localhost85 hbase-0.94.7]#

[root@localhost85 hbase-0.94.7]# ls

bin         docs                            hbase-webapps  logs        README.txt  src

CHANGES.txt hbase-0.94.7-security.jar       lib            NOTICE.txt  sbin

conf        hbase-0.94.7-security-tests.jar LICENSE.txt    pom.xml     security

[root@localhost85 hbase-0.94.7]# cd logs/

[root@localhost85 logs]# ls

hbase-root-master-localhost85.log    hbase-root-master-localhost85.out.2   SecurityAuth.audit

hbase-root-master-localhost85.out    hbase-root-zookeeper-localhost85.log

hbase-root-master-localhost85.out.1  hbase-root-zookeeper-localhost85.out

[root@localhost85 logs]# vim hbase-root-master-localhost85.log

2017-04-01 18:57:30,503 INFOorg.apache.zookeeper.ZooKeeper: Client environment:user.home=/root

2017-04-01 18:57:30,503 INFOorg.apache.zookeeper.ZooKeeper: Client environment:user.dir=/usr/local

2017-04-01 18:57:30,504 INFOorg.apache.zookeeper.ZooKeeper: Initiating client connection,connectString=localhost86:2181,localhost85:2181 sessionTimeout=180000 watcher=master:60000

2017-04-01 18:57:30,566 INFOorg.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of thisprocess is 4818@localhost85

2017-04-01 18:57:30,585 INFOorg.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost85/192.168.1.85:2181.Will not attempt to authenticate using SASL (unknown error)

2017-04-01 18:57:30,597 WARNorg.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error,closing socket connection and attempting reconnect

java.net.ConnectException: 拒绝连接

         atsun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

         atsun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)

         atorg.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)

         atorg.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

2017-04-01 18:57:30,724 INFOorg.apache.zookeeper.ClientCnxn: Opening socket connection to serverlocalhost86/192.168.1.86:2181. Will not attempt to authenticate using SASL(unknown error)

2017-04-01 18:57:30,725 INFOorg.apache.zookeeper.ClientCnxn: Socket connection established tolocalhost86/192.168.1.86:2181, initiating session

2017-04-01 18:57:30,742 WARNorg.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transientZooKeeper exception:org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =ConnectionLoss for /hbase

2017-04-01 18:57:30,742 INFOorg.apache.zookeeper.ClientCnxn: Unable to read additional data from serversessionid 0x0, likely server has closed socket, closing socket connection andattempting reconnect

2017-04-01 18:57:30,743 INFOorg.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...

2017-04-01 18:57:32,616 INFOorg.apache.zookeeper.ClientCnxn: Opening socket connection to serverlocalhost85/192.168.1.85:2181. Will not attempt to authenticate using SASL(unknown error)

2017-04-01 18:57:32,617 WARNorg.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error,closing socket connection and attempting reconnect

java.net.ConnectException: 拒绝连接

         atsun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

 

Cause: the value of hbase.zookeeper.quorum in hbase-site.xml was configured incorrectly.

<property>

<name>hbase.zookeeper.quorum</name>

<!--<value>localhost85,localhost86</value>-->

Change it to the following:

    <value>192.168.1.85,192.168.1.86</value>

 </property>

 

 

 

 

7.  Sqoop installation

Prerequisites: the Hadoop cluster and HBase installations are complete.

 

[root@localhost85 ~]# java -version

java version "1.7.0_15"

Java(TM) SE Runtime Environment (build 1.7.0_15-b03)

Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

[root@localhost85 ~]#

[root@localhost85 ~]# hadoop version

Hadoop 1.1.2

Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782

Compiled by hortonfo on Thu Jan 31 02:03:24 UTC 2013

From source with checksum c720ddcf4b926991de7467d253a79b8b

[root@localhost85 ~]#

[root@localhost85 ~]# hbase version

17/04/01 21:46:09 INFO util.VersionInfo: HBase 0.94.7

17/04/01 21:46:09 INFO util.VersionInfo:Subversion https://svn.apache.org/repos/asf/hbase/tags/0.94.7RC1 -r 1471806

17/04/01 21:46:09 INFO util.VersionInfo:Compiled by lhofhans on Wed Apr 24 18:44:36 PDT 2013

[root@localhost85 ~]#

[root@localhost85 ~]#

 


7.1.   Unpack into the current directory

Unpack the MySQL connector into the current directory:

[root@localhost85 download]# tar -xf mysql-connector-java-5.1.39.tar.gz -C .

[root@localhost85 download]# ls

hbase-0.94.7-security.tar.gz        mysql-connector-java-5.1.39.tar.gz

hive-0.9.0.tar.gz                  MySQL-server-5.5.31-2.el6.i686.rpm

jdk-6u24-linux-i586.bin             pig-0.11.1.tar.gz

MySQL-client-5.5.31-2.el6.i686.rpm  sqoop-1.4.3.bin__hadoop-1.0.0.tar.gz

mysql-connector-java-5.1.10.jar     zookeeper-3.4.5.tar.gz

mysql-connector-java-5.1.39

[root@localhost85 download]#

 

7.2.   Unpack Sqoop into the current directory

[root@localhost85 download]# tar -xf sqoop-1.4.3.bin__hadoop-1.0.0.tar.gz -C .

[root@localhost85 download]# ls

hbase-0.94.7-security.tar.gz        mysql-connector-java-5.1.39.tar.gz

hive-0.9.0.tar.gz                  MySQL-server-5.5.31-2.el6.i686.rpm

jdk-6u24-linux-i586.bin             pig-0.11.1.tar.gz

MySQL-client-5.5.31-2.el6.i686.rpm  sqoop-1.4.3.bin__hadoop-1.0.0

mysql-connector-java-5.1.10.jar     sqoop-1.4.3.bin__hadoop-1.0.0.tar.gz

mysql-connector-java-5.1.39         zookeeper-3.4.5.tar.gz

[root@localhost85 download]# cd sqoop-1.4.3.bin__hadoop-1.0.0
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]# ls

bin            COMPILING.txt  ivy     LICENSE.txt  README.txt            src

build.xml      conf           ivy.xml  NOTICE.txt  sqoop-1.4.3.jar       testdata

CHANGELOG.txt  docs          lib     pom-old.xml  sqoop-test-1.4.3.jar

[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]#
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]# cd lib/

[root@localhost85 lib]# pwd

/root/download/sqoop-1.4.3.bin__hadoop-1.0.0/lib

[root@localhost85 lib]# ls

ant-contrib-1.0b3.jar       avro-mapred-1.5.3.jar       jackson-mapper-asl-1.7.3.jar

ant-eclipse-1.0-jvm1.2.jar  commons-io-1.4.jar          jopt-simple-3.2.jar

avro-1.5.3.jar              hsqldb-1.8.0.10.jar         paranamer-2.3.jar

avro-ipc-1.5.3.jar          jackson-core-asl-1.7.3.jar  snappy-java-1.0.3.2.jar

[root@localhost85 lib]#

 

7.3.   Copy the MySQL connector jar into the lib directory of the sqoop installation

[root@localhost85 lib]# cp /root/download/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar .

[root@localhost85 lib]# ls

ant-contrib-1.0b3.jar       commons-io-1.4.jar            mysql-connector-java-5.1.39-bin.jar

ant-eclipse-1.0-jvm1.2.jar  hsqldb-1.8.0.10.jar           paranamer-2.3.jar

avro-1.5.3.jar              jackson-core-asl-1.7.3.jar    snappy-java-1.0.3.2.jar

avro-ipc-1.5.3.jar          jackson-mapper-asl-1.7.3.jar

avro-mapred-1.5.3.jar       jopt-simple-3.2.jar

[root@localhost85 lib]#

 

7.4.   Copy the extracted sqoop directory to /usr/local/

[root@localhost85 download]#
[root@localhost85 download]# cp -r sqoop-1.4.3.bin__hadoop-1.0.0 /usr/local/

[root@localhost85 download]# cd /usr/local/

[root@localhost85 local]# ls

bin           hbase         lib64                          src

etc           hbase-0.94.7  libexec                       VMwareTools-9.6.2-1688356.tar.gz

games         include       sbin                           vmware-tools-distrib

hadoop        jdk1.7.0_15   share

hadoop-1.1.2  lib          sqoop-1.4.3.bin__hadoop-1.0.0

[root@localhost85 local]#

[root@localhost85 local]#

 

7.5.   Add sqoop to the PATH so its commands take effect

[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]# vim /etc/profile

 

done

 

unset i

unset -f pathmunge

export JAVA_HOME=/usr/local/jdk1.7.0_15

export JRE_HOME=/usr/local/jdk1.7.0_15/jre

export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

export HADOOP_HOME_WARN_SUPPRESS=1
export HADOOP_HOME=/usr/local/hadoop-1.1.2
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/lib:$PATH

export ZOOKEEPER_HOME=/usr/local/hbase-0.94.7/lib
export HBASE_HOME=/usr/local/hbase-0.94.7
export PATH=$HBASE_HOME/bin:$PATH

export SQOOP_HOME=/usr/local/sqoop-1.4.3.bin__hadoop-1.0.0
export PATH=$SQOOP_HOME/bin:$PATH

"/etc/profile" 94L, 2327C
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]# source /etc/profile  # take effect immediately
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]#
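A quick way to confirm that the new PATH entry took effect is to check where the sqoop command resolves; it should point under the SQOOP_HOME set above, i.e. /usr/local/sqoop-1.4.3.bin__hadoop-1.0.0/bin:

which sqoop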

 

7.6.   Copy sqoop to localhost86

 

[root@localhost85 local]# ls

bin           hbase         lib64                          src

etc           hbase-0.94.7  libexec                       VMwareTools-9.6.2-1688356.tar.gz

games         include       sbin                           vmware-tools-distrib

hadoop        jdk1.7.0_15   share

hadoop-1.1.2  lib          sqoop-1.4.3.bin__hadoop-1.0.0

[root@localhost85 local]# scp -r sqoop-1.4.3.bin__hadoop-1.0.0/  root@localhost86:/usr/local/

NOTICE.txt                                                    100%  251    0.3KB/s   00:00   

build.xml                                                    100%   46KB  46.0KB/s  00:00   

sqoop-site-template.xml                                       100% 5064     5.0KB/s   00:00   

sqoop-site.xml                        

[root@localhost85 local]#

 

 

[root@localhost85 local]#
[root@localhost85 local]# scp -r /etc/profile root@localhost86:/etc/
profile                                                      100% 2327     2.3KB/s   00:00   
[root@localhost85 local]#

[root@localhost86 local]#

[root@localhost86 local]#

[root@localhost86 local]#

[root@localhost86 local]# source /etc/profile

[root@localhost86 local]# vim /etc/profile

[root@localhost86 local]# vim /etc/profile

 

unset i

unset -f pathmunge

export JAVA_HOME=/usr/local/jdk1.7.0_15
export JRE_HOME=/usr/local/jdk1.7.0_15/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

export HADOOP_HOME_WARN_SUPPRESS=1
export HADOOP_HOME=/usr/local/hadoop-1.1.2
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/lib:$PATH

export ZOOKEEPER_HOME=/usr/local/hbase-0.94.7/lib
export HBASE_HOME=/usr/local/hbase-0.94.7
export PATH=$HBASE_HOME/bin:$PATH

export SQOOP_HOME=/usr/local/sqoop-1.4.3.bin__hadoop-1.0.0
export PATH=$SQOOP_HOME/bin:$PATH

 

"/etc/profile" 94L, 2327C  

[root@localhost86 local]#

 

      

7.7.   Verify the sqoop installation

[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]# java -version
java version "1.7.0_15"
Java(TM) SE Runtime Environment (build 1.7.0_15-b03)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]#
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]# hadoop version
Hadoop 1.1.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782
Compiled by hortonfo on Thu Jan 31 02:03:24 UTC 2013
From source with checksum c720ddcf4b926991de7467d253a79b8b
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]#
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]# hbase version
17/04/01 22:32:06 INFO util.VersionInfo: HBase 0.94.7
17/04/01 22:32:06 INFO util.VersionInfo: Subversion https://svn.apache.org/repos/asf/hbase/tags/0.94.7RC1 -r 1471806
17/04/01 22:32:06 INFO util.VersionInfo: Compiled by lhofhans on Wed Apr 24 18:44:36 PDT 2013
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]#
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]# sqoop help

usage: sqoop COMMAND [ARGS]

 

Available commands:

  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

 

See 'sqoop help COMMAND' for information on a specific command.
[root@localhost85 sqoop-1.4.3.bin__hadoop-1.0.0]#

The output above shows that sqoop was installed successfully.

 

 

8.  Basic sqoop operations
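As a starting point, here is a minimal sketch of two common operations, assuming a MySQL server is reachable on 192.168.1.85; the database name test, table name users, and the credentials are placeholders for illustration, not values taken from this cluster:

# list the databases on the MySQL server
sqoop list-databases --connect jdbc:mysql://192.168.1.85:3306/ --username root --password 123456

# import the hypothetical test.users table into HDFS with a single map task
sqoop import --connect jdbc:mysql://192.168.1.85:3306/test --table users --username root --password 123456 -m 1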

 


9.  hadoop 2.x installation and configuration

1.1.  Installing and configuring hadoop 2.x

1.1.1.  Basic environment

Set up the basic environment as described in the configuration sections earlier in this document.

 

1.1.2.  Extract the archive

[root@xuegod63 ~]# ls
anaconda-ks.cfg  c.txt     install.log         test1   模板  图片  下载  桌面
b.txt            download  install.log.syslog  公共的  视频  文档  音乐
[root@xuegod63 ~]# cd download/
[root@xuegod63 download]# tar -xf hadoop-2.6.2.tar.gz
[root@xuegod63 download]# ls
apache-tomcat-7.0.77.tar.gz  jdk-8u65-linux-x64.rpm
commons-pool2-2.0.jar        jedis-2.5.0.jar
hadoop-2.6.2                 nginx-1.6.3.tar.gz
hadoop-2.6.2.tar.gz          redis-3.2.8.tar.gz
hadoop-native-64-2.6.0.tar   tomcat-redis-session-manager-2.0.0.jar
[root@xuegod63 download]#

 

 

 

1.1.3.  Create directories

[root@xuegod63 hadoop]# mkdir -p /home/hadoop/dfs/name /home/hadoop/dfs/data /home/hadoop/tmp
[root@xuegod63 hadoop]#

 

1.1.4.  Edit the following 7 configuration files

         File names: hadoop-env.sh, yarn-env.sh, slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml

hadoop-env.sh     // Java environment variables for the Hadoop scripts
yarn-env.sh       // Java runtime environment for the YARN framework; YARN separates resource management from the processing components, so a YARN-based architecture is no longer tied to MapReduce
slaves            // lists the servers that store data as DataNodes
core-site.xml     // core settings, including the HDFS entry point used by the Hadoop web interface
hdfs-site.xml     // configuration of the HDFS file system
mapred-site.xml   // configuration of MapReduce jobs
yarn-site.xml     // configuration of the YARN framework, mainly where its services listen

 

 

Edit hadoop-env.sh

[root@xuegod64 hadoop-2.6.2]# cd etc/hadoop/

[root@xuegod64 hadoop]# ls

capacity-scheduler.xml      kms-env.sh

configuration.xsl           kms-log4j.properties

container-executor.cfg      kms-site.xml

core-site.xml               log4j.properties

hadoop-env.cmd              mapred-env.cmd

hadoop-env.sh               mapred-env.sh

hadoop-metrics2.properties  mapred-queues.xml.template

hadoop-metrics.properties   mapred-site.xml.template

hadoop-policy.xml           slaves

hdfs-site.xml               ssl-client.xml.example

httpfs-env.sh               ssl-server.xml.example

httpfs-log4j.properties     yarn-env.cmd

httpfs-signature.secret     yarn-env.sh

httpfs-site.xml             yarn-site.xml

kms-acls.xml

[root@xuegod63 hadoop]# vim hadoop-env.sh
# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/usr/local/jdk1.7.0_15

[root@xuegod63 hadoop]#

 

 

Edit yarn-env.sh

[root@xuegod63 hadoop]# vim yarn-env.sh

 

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# User for YARN daemons
export HADOOP_YARN_USER=${HADOOP_YARN_USER:-yarn}

# resolve links - $0 may be a softlink
export YARN_CONF_DIR="${YARN_CONF_DIR:-$HADOOP_YARN_HOME/conf}"

# some Java parameters
export JAVA_HOME=/usr/local/jdk1.7.0_15
[root@xuegod63 hadoop]#

 

 

Edit core-site.xml

This is the core Hadoop configuration file. Two properties matter here: fs.defaultFS names the HDFS file system and binds it to port 9000 on the master host, and hadoop.tmp.dir sets the root location of Hadoop's temporary directory.

[root@xuegod63 hadoop]# vim core-site.xml

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

         <property>

              <name>fs.defaultFS</name>

              <value>hdfs://192.168.1.63:9000</value>

         </property>

         <property>

                   <name>io.file.buffer.size</name>

                   <value>131072</value>

         </property>

         <property>

                   <name>hadoop.tmp.dir</name>

                   <value>file:/home/hadoop/tmp</value>

                   <description>A base for other temporary directories.</description>

         </property>

 

</configuration>

[root@xuegod63 hadoop]#

 

 

Edit hdfs-site.xml

This is the HDFS configuration file. dfs.namenode.secondary.http-address sets the HTTP address of the secondary NameNode, dfs.namenode.name.dir and dfs.datanode.data.dir set the storage directories created above, and dfs.replication sets the number of block replicas, which should generally not exceed the number of worker nodes.

[root@xuegod63 hadoop]# vim hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

         <property>

              <name>dfs.namenode.secondary.http-address</name>

              <value>192.168.1.63:9001</value>

         </property>

         <property>

              <name>dfs.namenode.name.dir</name>

              <value>file:/home/hadoop/dfs/name</value>

         </property>

         <property>

              <name>dfs.datanode.data.dir</name>

              <value>file:/home/hadoop/dfs/data</value>

         </property>

         <property>

               <name>dfs.replication</name>

               <value>2</value>

         </property>

         <property>

               <name>dfs.webhdfs.enabled</name>

               <value>true</value>

         </property>

 

</configuration>

[root@xuegod63 hadoop]#

 

 

Edit mapred-site.xml

This configures MapReduce jobs. Because hadoop 2.x uses the YARN framework, a distributed deployment requires mapreduce.framework.name to be set to yarn. mapred.map.tasks and mapred.reduce.tasks can additionally set the numbers of map and reduce tasks, and this file also points to Hadoop's history server.

Hadoop ships with a history server that records completed MapReduce jobs: how many map and reduce tasks they used, and their submission, start, and completion times. It is not started by default; the command for starting it is shown in section 1.1.12 below.

[root@xuegod63 hadoop]# vim mapred-site.xml
<!-- Put site-specific property overrides in this file. -->

 

<configuration>

         <property>

            <name>mapreduce.framework.name</name>

             <value>yarn</value>

         </property>

         <property>

              <name>mapreduce.jobhistory.address</name>

              <value>192.168.1.63:10020</value>

         </property>

         <property>

              <name>mapreduce.jobhistory.webapp.address</name>

              <value>192.168.1.63:19888</value>

         </property>

 

</configuration>

 

[root@xuegod63 hadoop]#

 

 

Edit yarn-site.xml

This file configures the YARN framework, mainly the addresses on which its services listen.

[root@xuegod63 hadoop]# vim yarn-site.xml

<configuration>

 

<!-- Site specific YARN configuration properties -->

         <property>

             <name>yarn.nodemanager.aux-services</name>

             <value>mapreduce_shuffle</value>

         </property>

         <property>

              <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

              <value>org.apache.hadoop.mapred.ShuffleHandler</value>

         </property>

         <property>

             <name>yarn.resourcemanager.address</name>

             <value>192.168.1.63:8032</value>

         </property>

         <property>

               <name>yarn.resourcemanager.scheduler.address</name>

              <value>192.168.1.63:8030</value>

         </property>

         <property>

              <name>yarn.resourcemanager.resource-tracker.address</name>

              <value>192.168.1.63:8031</value>

         </property>

         <property>

               <name>yarn.resourcemanager.admin.address</name>

              <value>192.168.1.63:8033</value>

         </property>

         <property>

             <name>yarn.resourcemanager.webapp.address</name>

             <value>192.168.1.63:8088</value>

         </property>

 

</configuration>

[root@xuegod63 hadoop]#

 

 

1.1.5.  Format the NameNode

[root@xuegod63 bin]#
[root@xuegod63 bin]# ls
container-executor  hdfs      mapred.cmd               yarn
hadoop              hdfs.cmd  rcc                      yarn.cmd
hadoop.cmd          mapred    test-container-executor
[root@xuegod63 bin]#
[root@xuegod63 bin]# hdfs namenode -format

17/04/05 20:49:37 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = xuegod63/192.168.1.63
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.2
STARTUP_MSG:   classpath =
......
17/04/05 20:49:40 INFO util.GSet: Computing capacity for map BlocksMap
17/04/05 20:49:40 INFO util.GSet: VM type       = 64-bit
17/04/05 20:49:40 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
17/04/05 20:49:40 INFO util.GSet: capacity      = 2^21 = 2097152 entries
17/04/05 20:49:40 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
17/04/05 20:49:40 INFO blockmanagement.BlockManager: defaultReplication         = 2
17/04/05 20:49:40 INFO blockmanagement.BlockManager: maxReplication             = 512
17/04/05 20:49:40 INFO blockmanagement.BlockManager: minReplication             = 1
17/04/05 20:49:40 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
17/04/05 20:49:40 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
17/04/05 20:49:40 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
17/04/05 20:49:40 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
17/04/05 20:49:40 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
17/04/05 20:49:40 INFO namenode.FSNamesystem: fsOwner             = root (auth:SIMPLE)
17/04/05 20:49:40 INFO namenode.FSNamesystem: supergroup          = supergroup
17/04/05 20:49:40 INFO namenode.FSNamesystem: isPermissionEnabled = true
17/04/05 20:49:40 INFO namenode.FSNamesystem: HA Enabled: false
17/04/05 20:49:40 INFO namenode.FSNamesystem: Append Enabled: true
17/04/05 20:49:40 INFO util.GSet: Computing capacity for map INodeMap
17/04/05 20:49:40 INFO util.GSet: VM type       = 64-bit
17/04/05 20:49:40 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
17/04/05 20:49:40 INFO util.GSet: capacity      = 2^20 = 1048576 entries
17/04/05 20:49:40 INFO namenode.NameNode: Caching file names occuring more than 10 times
17/04/05 20:49:40 INFO util.GSet: Computing capacity for map cachedBlocks
17/04/05 20:49:40 INFO util.GSet: VM type       = 64-bit
17/04/05 20:49:40 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
17/04/05 20:49:40 INFO util.GSet: capacity      = 2^18 = 262144 entries
17/04/05 20:49:40 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
17/04/05 20:49:40 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
17/04/05 20:49:40 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
17/04/05 20:49:40 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
17/04/05 20:49:40 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
17/04/05 20:49:40 INFO util.GSet: Computing capacity for map NameNodeRetryCache
17/04/05 20:49:40 INFO util.GSet: VM type       = 64-bit
17/04/05 20:49:40 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
17/04/05 20:49:40 INFO util.GSet: capacity      = 2^15 = 32768 entries
17/04/05 20:49:40 INFO namenode.NNConf: ACLs enabled? false
17/04/05 20:49:40 INFO namenode.NNConf: XAttrs enabled? true
17/04/05 20:49:40 INFO namenode.NNConf: Maximum size of an xattr: 16384
17/04/05 20:49:41 INFO namenode.FSImage: Allocated new BlockPoolId: BP-822209180-192.168.1.63-1491396580931
17/04/05 20:49:41 INFO common.Storage: Storage directory /home/hadoop/dfs/name has been successfully formatted.
17/04/05 20:49:41 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/04/05 20:49:41 INFO util.ExitUtil: Exiting with status 0
17/04/05 20:49:41 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at xuegod63/192.168.1.63
************************************************************/
[root@xuegod63 bin]#

 

 

1.1.6.  Inspect the formatted data

[root@xuegod63 bin]#
[root@xuegod63 bin]# tree /home/hadoop/dfs
/home/hadoop/dfs
└── name
    └── current
        ├── fsimage_0000000000000000000
        ├── fsimage_0000000000000000000.md5
        ├── seen_txid
        └── VERSION

2 directories, 4 files
[root@xuegod63 bin]#

 

 

1.1.7.  Start the HDFS daemons

[root@xuegod63 bin]#
[root@xuegod63 bin]# cd ..
[root@xuegod63 hadoop-2.6.2]# ls
bin  include  LICENSE.txt  NOTICE.txt  sbin
etc  libexec  logs         README.txt  share
[root@xuegod63 hadoop-2.6.2]# ./sbin/start-dfs.sh
17/04/05 20:53:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [xuegod63]
xuegod63: starting namenode, logging to /root/download/hadoop-2.6.2/logs/hadoop-root-namenode-xuegod63.out
localhost: starting datanode, logging to /root/download/hadoop-2.6.2/logs/hadoop-root-datanode-xuegod63.out
Starting secondary namenodes [xuegod63]
xuegod63: starting secondarynamenode, logging to /root/download/hadoop-2.6.2/logs/hadoop-root-secondarynamenode-xuegod63.out
17/04/05 20:53:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@xuegod63 hadoop-2.6.2]#

 

1.1.8.  Check the NameNode process

[root@xuegod63 hadoop-2.6.2]# ps -aux | grep namenode --color
root     17512  0.0  0.0 103260   940 pts/3    S+   20:54   0:00 grep namenode --color
[root@xuegod63 hadoop-2.6.2]#

 

 

1.1.9.  Check the DataNode process

[root@xuegod63 hadoop-2.6.2]#
[root@xuegod63 hadoop-2.6.2]# ps -aux | grep datanode --color
root     17518  0.0  0.0 103256   860 pts/3    S+   20:56   0:00 grep datanode --color
[root@xuegod63 hadoop-2.6.2]#
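Note that in both checks above the grep only matched itself, because ps shows the daemons as long java command lines rather than by the names namenode and datanode. A more direct check is the JDK's jps tool; after start-dfs.sh one would typically expect entries named NameNode, DataNode, and SecondaryNameNode (this is an expectation, not output captured from this host):

jps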

 

 

 

1.1.10.          Start the YARN services

[root@xuegod63 hadoop-2.6.2]#
[root@xuegod63 hadoop-2.6.2]# ./sbin/start-yarn.
start-yarn.cmd  start-yarn.sh
[root@xuegod63 hadoop-2.6.2]# ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /root/download/hadoop-2.6.2/logs/yarn-root-resourcemanager-xuegod63.out
localhost: starting nodemanager, logging to /root/download/hadoop-2.6.2/logs/yarn-root-nodemanager-xuegod63.out
[root@xuegod63 hadoop-2.6.2]#

 

 

1.1.11.          Check the resourcemanager process

[root@xuegod63 hadoop-2.6.2]# ps -aux | grep resourcemanager
root     17962  0.0  0.0 103260   868 pts/3    S+   20:59   0:00 grep resourcemanager
[root@xuegod63 hadoop-2.6.2]#

 

Note: the start-dfs.sh and start-yarn.sh scripts can be replaced by start-all.sh.

[root@xuegod63 hadoop-2.6.2]# ./sbin/start-all.sh                 // start everything
[root@xuegod63 hadoop-2.6.2]# ./sbin/stop-all.sh                  // stop everything

 

 

 

1.1.12.          Start the jobhistory service

Start the jobhistory service to view the status of completed MapReduce jobs:

[root@xuegod63 hadoop-2.6.2]#
[root@xuegod63 hadoop-2.6.2]# ./sbin/mr-jobhistory-daemon.sh
Usage: mr-jobhistory-daemon.sh [--config <conf-dir>] (start|stop) <mapred-command>
[root@xuegod63 hadoop-2.6.2]# ./sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /root/download/hadoop-2.6.2/logs/mapred-root-historyserver-xuegod63.out
[root@xuegod63 hadoop-2.6.2]#

 

 

1.1.13.          Check HDFS status

[root@xuegod63 hadoop-2.6.2]#
[root@xuegod63 hadoop-2.6.2]# ls
bin  include  LICENSE.txt  NOTICE.txt  sbin
etc  libexec  logs         README.txt  share
[root@xuegod63 hadoop-2.6.2]# ./bin/hdfs dfsadmin -report
17/04/05 21:04:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 10568916992 (9.84 GB)
Present Capacity: 3472830464 (3.23 GB)
DFS Remaining: 3472805888 (3.23 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (1):

Name: 192.168.1.63:50010 (xuegod63)
Hostname: xuegod63
Decommission Status : Normal
Configured Capacity: 10568916992 (9.84 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7096086528 (6.61 GB)
DFS Remaining: 3472805888 (3.23 GB)
DFS Used%: 0.00%
DFS Remaining%: 32.86%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Apr 05 21:04:20 CST 2017

[root@xuegod63 hadoop-2.6.2]#

 

Startup error:

[root@xuegod63 hadoop-2.6.2]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
17/05/15 18:58:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [xuegod63]
The authenticity of host 'xuegod63 (192.168.1.63)' can't be established.
RSA key fingerprint is 06:42:3e:1a:f6:55:f1:ac:4d:15:c5:ac:c4:d8:0c:26.
Are you sure you want to continue connecting (yes/no)? yes
xuegod63: Warning: Permanently added 'xuegod63' (RSA) to the list of known hosts.
xuegod63: starting namenode, logging to /usr/local/hadoop-2.6.2/logs/hadoop-root-namenode-xuegod63.out
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 06:42:3e:1a:f6:55:f1:ac:4d:15:c5:ac:c4:d8:0c:26.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
localhost: starting datanode, logging to /usr/local/hadoop-2.6.2/logs/hadoop-root-datanode-xuegod63.out
Starting secondary namenodes [xuegod63]
xuegod63: starting secondarynamenode, logging to /usr/local/hadoop-2.6.2/logs/hadoop-root-secondarynamenode-xuegod63.out
17/05/15 18:59:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
resourcemanager running as process 3400. Stop it first.
localhost: starting nodemanager, logging to /usr/local/hadoop-2.6.2/logs/yarn-root-nodemanager-xuegod63.out
[root@xuegod63 hadoop-2.6.2]#

Fix for the "Unable to load native-hadoop library" warning: download the hadoop-native package and extract it into the lib/native directory of the Hadoop installation.
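For example, using the hadoop-native-64-2.6.0.tar archive that appears in the download directory earlier (the source path is an assumption), run from the hadoop-2.6.2 directory:

tar -xf /root/download/hadoop-native-64-2.6.0.tar -C lib/native/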

1.1.14.          Inspect the file and block layout

[root@xuegod63 hadoop-2.6.2]#
[root@xuegod63 hadoop-2.6.2]# ./bin/hdfs fsck / -files -blocks

17/04/05 21:05:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://xuegod63:50070
FSCK started by root (auth:SIMPLE) from /192.168.1.63 for path / at Wed Apr 05 21:05:45 CST 2017
/ <dir>
/tmp <dir>
/tmp/hadoop-yarn <dir>
/tmp/hadoop-yarn/staging <dir>
/tmp/hadoop-yarn/staging/history <dir>
/tmp/hadoop-yarn/staging/history/done <dir>
/tmp/hadoop-yarn/staging/history/done_intermediate <dir>
Status: HEALTHY
 Total size:    0 B
 Total dirs:    7
 Total files:   0
 Total symlinks:                0
 Total blocks (validated):      0
 Minimally replicated blocks:   0
 Over-replicated blocks:        0
 Under-replicated blocks:       0
 Mis-replicated blocks:         0
 Default replication factor:    2
 Average block replication:     0.0
 Corrupt blocks:                0
 Missing replicas:              0
 Number of data-nodes:          1
 Number of racks:               1
FSCK ended at Wed Apr 05 21:05:45 CST 2017 in 10 milliseconds


The filesystem under path '/' is HEALTHY
[root@xuegod63 hadoop-2.6.2]#

 

 

 

1.1.15.          View HDFS in a web browser

Open http://192.168.1.63:50070 in a browser.

 

 

 

1.1.16.          View the cluster status in a web browser

Open http://192.168.1.63:8088 in a browser.

 

 

 

1.2.  hadoop 2.x development

1.2.1.  Create files and directories

 

import java.io.FileInputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class App {
    private static final String HDFS_PATH = "hdfs://192.168.1.85:9000/hello";
    private static final String DIR_PATH = "/demo1";
    private static final String FILE_PATH = "/demo1/f100";

    public static void main(String[] args) {
        try {
            FileSystem fileSystem = FileSystem.get(new URI(HDFS_PATH), new Configuration());

            // create a directory in HDFS
            fileSystem.mkdirs(new Path(DIR_PATH));

            // create an HDFS file and upload a local file into it
            FSDataOutputStream out = fileSystem.create(new Path(FILE_PATH));
            FileInputStream in = new FileInputStream("H:/hadoop开发.docx");
            IOUtils.copyBytes(in, out, 1024, true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
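To compile and run this class outside an IDE, the Hadoop client jars must be on the classpath; a minimal sketch, assuming the local path passed to FileInputStream is replaced with a file that exists on the machine running the program:

javac -cp $(hadoop classpath) App.java
java -cp .:$(hadoop classpath) App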

 

 

HDFS file operations

import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class HDFSApp {

    private static final String HDFS_PATH = "hdfs://192.168.1.85:9000/hello";

    public static void main(String[] args) {
        try {
            // register the hdfs:// URL handler, then stream the file contents to stdout
            URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
            URL url = new URL(HDFS_PATH);
            IOUtils.copyBytes(url.openStream(), System.out, 1024, true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

 

1.2.2.  RPC remote invocation

Server side:

public class MyServer {

    public static final String SERVER_ADDRESS = "localhost";
    public static final int SERVER_PORT = 12345;

    public String hello(String name) {
        return "hello:" + name;
    }
}

 

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;
import org.apache.hadoop.ipc.Server;

public class RpcServer {
    public static final String SERVER_ADDRESS = "localhost";
    public static final int SERVER_PORT = 12345;

    public static void main(String[] args) {
        try {
            /**
             * Parameters:
             *   instance    : the object whose methods are exposed to clients
             *   bindAddress : the IP address to listen on for client connections
             *   port        : the port to listen on for client connections
             *   conf        : the configuration to use
             */
            Server server = RPC.getServer(new MyBiz(), SERVER_ADDRESS, SERVER_PORT, new Configuration());
            server.start(); // start the RPC service
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

 

Client side:

import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.ipc.RPC;

 

public class MyClient {

    public static void main(String[] args) throws Exception {
        /**
         * Parameters of RPC.waitForProxy:
         *   Class<? extends VersionedProtocol> protocol : the protocol interface to proxy
         *   long clientVersion,
         *   InetSocketAddress addr,
         *   Configuration conf
         */
        MyBizInterface proxy = (MyBizInterface) RPC.waitForProxy(MyBizInterface.class,
                MyBizInterface.VERSION,
                new InetSocketAddress(MyServer.SERVER_ADDRESS, MyServer.SERVER_PORT),
                new Configuration());
        String name = proxy.hello("hello world");
        System.out.println("client: " + name);
        RPC.stopProxy(proxy);
    }
}

 

 

Interface:

import java.io.IOException;

import org.apache.hadoop.ipc.VersionedProtocol;

public interface MyBizInterface extends VersionedProtocol {
    static final long VERSION = 12345;
    public abstract String hello(String name) throws IOException;
}

 

Implementation class:

import java.io.IOException;

public class MyBiz implements MyBizInterface {

    /**
     * @see rpc.MyBizInterface#hello(java.lang.String)
     */
    @Override
    public String hello(String name) throws IOException {
        return name;
    }

    @Override
    public long getProtocolVersion(String protocol, long clientVersion) throws IOException {
        return MyBizInterface.VERSION;
    }
}
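With RpcServer started first, running MyClient should print client: hello world, since MyBiz.hello simply returns its argument unchanged.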