Sqoop 1.4.6 Installation and Configuration


1. Environment Information

[hadoop@master sqoop-1.4.6]$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)

[hadoop@master sqoop-1.4.6]$ mysql --version
mysql  Ver 14.14 Distrib 5.6.37, for Linux (x86_64) using  EditLine wrapper

[hadoop@master sqoop-1.4.6]$ hadoop version
Hadoop 2.8.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 20fe5304904fc2f5a18053c389e43cd26f7a70fe
Compiled by vinodkv on 2017-06-02T06:14Z
Compiled with protoc 2.5.0
From source with checksum 60125541c2b3e266cbf3becc5bda666
This command was run using /home/hadoop/hadoop-2.8.1/share/hadoop/common/hadoop-common-2.8.1.jar

[hadoop@master sqoop-1.4.6]$ hive --version
Hive 2.1.1
Subversion git://jcamachorodriguez-rMBP.local/Users/jcamachorodriguez/src/workspaces/hive/HIVE-release2/hive -r 1af77bbf8356e86cabbed92cfa8cc2e1470a1d5c
Compiled by jcamachorodriguez on Tue Nov 29 19:46:12 GMT 2016
From source with checksum 569ad6d6e5b71df3cb04303183948d90


[hadoop@master sqoop-1.4.6]$ hbase version
HBase 1.2.6
Source code repository file:///home/busbey/projects/hbase/hbase-assembly/target/hbase-1.2.6 revision=Unknown
Compiled by busbey on Mon May 29 02:25:32 CDT 2017
From source with checksum 7e8ce83a648e252758e9dae1fbe779c9

2. Download

http://mirror.bit.edu.cn/apache/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz

That directory contains several Sqoop packages, so take care to pick the exact version above: the plain sqoop-1.4.6.tar.gz package is missing the sqoop-1.4.6.jar file.

2.1 Extract and Configure

Extract the tarball under /home/hadoop and rename the directory to sqoop-1.4.6:

[hadoop@master sqoop-1.4.6]$ cd /home/hadoop/sqoop-1.4.6
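The extract-and-rename step above can be sketched as follows. This is a hedged illustration: a dummy tarball is built in a temp directory so the commands are safe to run anywhere; against the real download, run only the `tar`/`mv` lines in /home/hadoop.

```shell
# Stand-in for the extract-and-rename step. A dummy tarball is built
# first so the example runs without the real download.
WORK=$(mktemp -d)
cd "$WORK"
mkdir -p sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib
tar -czf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz sqoop-1.4.6.bin__hadoop-2.0.4-alpha
rm -rf sqoop-1.4.6.bin__hadoop-2.0.4-alpha

# The actual steps from the text:
tar -xzf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz   # extract
mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha sqoop-1.4.6    # rename
ls -d "$WORK/sqoop-1.4.6"
```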
2.1.1 Copy Files

[hadoop@master sqoop-1.4.6]$ cp sqoop-1.4.6.jar  lib/     # copy sqoop-1.4.6.jar into the lib directory
[hadoop@master sqoop-1.4.6]$ cp mysql-connector-java-5.1.44-bin.jar  lib/     # copy the MySQL JDBC driver into lib; this package has to be downloaded and extracted separately, as mentioned in the earlier Hive installation


2.2 Configuration

sqoop-env.sh

[hadoop@master conf]$ cat sqoop-env.sh
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop/hadoop-2.8.1/

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop-2.8.1/share/hadoop/mapreduce

#set the path to where bin/hbase is available
export HBASE_HOME=/home/hadoop/hbase-1.2.6

#Set the path to where bin/hive is available
export HIVE_HOME=/home/hadoop/apache-hive-2.1.1

#Set the path for where zookeper config dir is
#export ZOOCFGDIR=                               # use the bundled ZooKeeper

Append the following variable settings to the end of /etc/profile:

export JAVA_HOME=/usr/java/jdk1.8.0_131/
export HADOOP_HOME=/home/hadoop/hadoop-2.8.1/
export HIVE_HOME=/home/hadoop/apache-hive-2.1.1
export HBASE_HOME=/home/hadoop/hbase-1.2.6
export SQOOP_HOME=/home/hadoop/sqoop-1.4.6    # added for Sqoop
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin:$SQOOP_HOME/bin  # added for Sqoop
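To confirm the new variables take effect, the profile can be re-sourced and inspected. A safe sketch, with a temp file standing in for /etc/profile:

```shell
# Append the Sqoop variables to a profile file and verify they load.
# A temp file stands in for /etc/profile so this is safe to run.
PROFILE=$(mktemp)
cat >> "$PROFILE" <<'EOF'
export SQOOP_HOME=/home/hadoop/sqoop-1.4.6
export PATH=$PATH:$SQOOP_HOME/bin
EOF
source "$PROFILE"    # on the real system: source /etc/profile
echo "$SQOOP_HOME"
```

On the real system, `source /etc/profile` followed by `sqoop version` confirms the PATH entry works.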


configure-sqoop 

Comment out the section below (otherwise warnings are printed at startup):

[hadoop@master bin]$ cat -n configure-sqoop
   ...........
   134  ## Moved to be a runtime check in sqoop.
   135  #if [ ! -d "${HCAT_HOME}" ]; then
   136  #  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
   137  #  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
   138  #fi
   139
   140  #if [ ! -d "${ACCUMULO_HOME}" ]; then
   141  #  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
   142  #  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
   143  #fi
   ..........
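Rather than commenting the lines by hand, a sed range expression can prefix each line of an `if`/`fi` block with `#`. A sketch against a stand-in copy of the HCatalog check; point the same sed at `$SQOOP_HOME/bin/configure-sqoop` for the real file (and likewise for the ACCUMULO_HOME block):

```shell
# Comment out an if/fi warning block by prefixing '#' from the line
# matching HCAT_HOME through the closing 'fi'. Demonstrated on a
# stand-in file so the example is safe to run.
F=$(mktemp)
cat > "$F" <<'EOF'
if [ ! -d "${HCAT_HOME}" ]; then
  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
fi
EOF
sed -i '/HCAT_HOME/,/^fi$/ s/^/#/' "$F"
cat "$F"
```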


3. Verification

[hadoop@master bin]$ sqoop import --connect jdbc:mysql://10.0.1.98/ykt --table paper_detail --username root --password 123456 --direct -m 1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.8.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-1.2.6/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/09/23 19:52:01 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
17/09/23 19:52:01 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/09/23 19:52:02 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/09/23 19:52:02 INFO tool.CodeGenTool: Beginning code generation
17/09/23 19:52:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `paper_detail` AS t LIMIT 1
17/09/23 19:52:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `paper_detail` AS t LIMIT 1
17/09/23 19:52:02 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop-2.8.1/share/hadoop/mapreduce
Note: /tmp/sqoop-hadoop/compile/f2ada91047344c6af2723a1a8044f440/paper_detail.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/09/23 19:52:06 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/f2ada91047344c6af2723a1a8044f440/paper_detail.jar
17/09/23 19:52:06 INFO manager.DirectMySQLManager: Beginning mysqldump fast path import
17/09/23 19:52:06 INFO mapreduce.ImportJobBase: Beginning import of paper_detail
17/09/23 19:52:07 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/09/23 19:52:10 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
17/09/23 19:52:11 INFO client.RMProxy: Connecting to ResourceManager at master/10.0.1.118:18040
17/09/23 19:52:22 INFO db.DBInputFormat: Using read commited transaction isolation
17/09/23 19:52:23 INFO mapreduce.JobSubmitter: number of splits:1
17/09/23 19:52:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1506138213755_0004
17/09/23 19:52:28 INFO impl.YarnClientImpl: Submitted application application_1506138213755_0004
17/09/23 19:52:29 INFO mapreduce.Job: The url to track the job: http://master:18088/proxy/application_1506138213755_0004/
17/09/23 19:52:29 INFO mapreduce.Job: Running job: job_1506138213755_0004
17/09/23 19:52:55 INFO mapreduce.Job: Job job_1506138213755_0004 running in uber mode : false
17/09/23 19:52:55 INFO mapreduce.Job:  map 0% reduce 0%
17/09/23 19:53:26 INFO mapreduce.Job:  map 100% reduce 0%
17/09/23 19:53:27 INFO mapreduce.Job: Job job_1506138213755_0004 completed successfully
17/09/23 19:53:28 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=160883
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=87
                HDFS: Number of bytes written=27954
                HDFS: Number of read operations=4
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=27272
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=27272
                Total vcore-seconds taken by all map tasks=27272
                Total megabyte-seconds taken by all map tasks=27926528
        Map-Reduce Framework
                Map input records=1
                Map output records=962
                Input split bytes=87
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=92
                CPU time spent (ms)=1150
                Physical memory (bytes) snapshot=105013248
                Virtual memory (bytes) snapshot=2092593152
                Total committed heap usage (bytes)=17776640
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=27954
17/09/23 19:53:28 INFO mapreduce.ImportJobBase: Transferred 27.2988 KB in 77.2979 seconds (361.6399 bytes/sec)
17/09/23 19:53:28 INFO mapreduce.ImportJobBase: Retrieved 962 records.
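The WARN line in the log above notes that putting the password on the command line is insecure; the Sqoop user guide offers `-P` (interactive prompt) and `--password-file` as alternatives. A local sketch of preparing a password file with restrictive permissions (the password value is the one used in this article; whether your Sqoop build trims a trailing newline varies, so printf is used to avoid one):

```shell
# Create a password file for --password-file. printf avoids a
# trailing newline, which some Sqoop versions would otherwise
# treat as part of the password.
PWFILE=$(mktemp)
printf '123456' > "$PWFILE"
chmod 400 "$PWFILE"
stat -c '%a' "$PWFILE"
```

The import above would then read: `sqoop import --connect jdbc:mysql://10.0.1.98/ykt --table paper_detail --username root --password-file file://$PWFILE --direct -m 1`.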

Check the imported data on HDFS.
If no HDFS directory is specified, the output is generated under /user/someuser/paper_detail/ by default.

[hadoop@master bin]$ hadoop fs -cat /user/hadoop/paper_detail/part-m-00000 | more
1,1,填空,22,5,1,1
1,1,填空,23,5,2,2
1,1,填空,31,5,3,3
1,2,解答题,403,5,1,4
1,2,解答题,394,5,2,5
1,3,多选题,987,5,1,6
1,4,单选题,757,5,1,7
1,4,单选题,133,5,2,8
2,1,单项选择,19,1,1,1
2,2,数字选择,18,2,1,2
2,2,数字选择,21,2,2,3
2,2,数字选择,24,2,3,4
2,2,数字选择,25,2,4,5
2,3,填空,20,5,1,6
2,3,填空,23,10,2,7
2,4,233,395,12,1,8
3,1,单项选择,16,2,1,1
3,1,单项选择,17,2,2,2
3,1,单项选择,18,2,3,3
3,1,单项选择,21,2,4,4
3,1,单项选择,24,2,5,5
3,1,单项选择,25,2,6,6
3,2,解答,62,2,1,7
3,2,解答,63,2,2,8
4,1,选择题,717,2,1,1
4,1,选择题,718,2,2,2
......


Sqoop official user guide: http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_controlling_the_import_process
