Sqoop: importing and exporting tables/data to and from Hive


Notes:

Copy a relational database table, together with its data, into Hive.
sqoop import: RDBMS --> Hive
Syntax:
sqoop import --connect jdbc:mysql://IP:PORT/database --username root --password PWD --table tablename --hive-import --hive-table hivetable -m 1
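For reference, a slightly fuller variant of the same import, as a sketch rather than a definitive recipe (all flags below are standard Sqoop 1.4.x options; IP, PORT, database, and table names are placeholders): -P prompts for the password interactively, which addresses the "insecure password" warning that shows up in the log further down; --hive-overwrite replaces any existing rows in the Hive table; --fields-terminated-by sets an explicit column delimiter, which matters again in the export section later.

# Sketch: placeholders (IP, PORT, database, tablename, hivetable) must be filled in.
# -P prompts for the password instead of passing it on the command line.
sqoop import \
  --connect "jdbc:mysql://IP:PORT/database?characterEncoding=UTF-8" \
  --username root \
  -P \
  --table tablename \
  --hive-import \
  --hive-table hivetable \
  --hive-overwrite \
  --fields-terminated-by '\t' \
  -m 1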


[hadoop@nameNode src]$ sqoop import --connect jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8 --username root --password root --table t_user --hive-import --hive-table hivetest1 -m 1


The command returns the following output:

16/04/22 15:16:48 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
16/04/22 15:16:48 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/04/22 15:16:48 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/04/22 15:16:48 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/04/22 15:16:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/04/22 15:16:48 INFO tool.CodeGenTool: Beginning code generation
16/04/22 15:16:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_user` AS t LIMIT 1
16/04/22 15:16:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_user` AS t LIMIT 1
16/04/22 15:16:49 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop/src/hadoop-2.5.0
Note: /tmp/sqoop-hadoop/compile/14d08d5f7ecc890e985eeb17942619db/t_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/04/22 15:16:54 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/14d08d5f7ecc890e985eeb17942619db/t_user.jar
16/04/22 15:16:54 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/04/22 15:16:54 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/04/22 15:16:54 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/04/22 15:16:54 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/04/22 15:16:54 INFO mapreduce.ImportJobBase: Beginning import of t_user
16/04/22 15:16:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/22 15:16:55 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/04/22 15:16:56 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/04/22 15:16:56 INFO client.RMProxy: Connecting to ResourceManager at nameNode/192.168.2.246:8032
16/04/22 15:17:00 INFO db.DBInputFormat: Using read commited transaction isolation
16/04/22 15:17:00 INFO mapreduce.JobSubmitter: number of splits:1
16/04/22 15:17:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1461290770642_0013
16/04/22 15:17:01 INFO impl.YarnClientImpl: Submitted application application_1461290770642_0013
16/04/22 15:17:01 INFO mapreduce.Job: The url to track the job: http://nameNode:8088/proxy/application_1461290770642_0013/
16/04/22 15:17:01 INFO mapreduce.Job: Running job: job_1461290770642_0013
16/04/22 15:17:16 INFO mapreduce.Job: Job job_1461290770642_0013 running in uber mode : false
16/04/22 15:17:16 INFO mapreduce.Job:  map 0% reduce 0%
16/04/22 15:17:25 INFO mapreduce.Job:  map 100% reduce 0%
16/04/22 15:17:25 INFO mapreduce.Job: Job job_1461290770642_0013 completed successfully
16/04/22 15:17:25 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=118487
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=87
                HDFS: Number of bytes written=1111
                HDFS: Number of read operations=4
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=6570
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=6570
                Total vcore-seconds taken by all map tasks=6570
                Total megabyte-seconds taken by all map tasks=6727680
        Map-Reduce Framework
                Map input records=11
                Map output records=11
                Input split bytes=87
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=95
                CPU time spent (ms)=1420
                Physical memory (bytes) snapshot=107536384
                Virtual memory (bytes) snapshot=2064531456
                Total committed heap usage (bytes)=30474240
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=1111
16/04/22 15:17:25 INFO mapreduce.ImportJobBase: Transferred 1.085 KB in 29.4684 seconds (37.7015 bytes/sec)
16/04/22 15:17:25 INFO mapreduce.ImportJobBase: Retrieved 11 records.
16/04/22 15:17:25 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_user` AS t LIMIT 1
16/04/22 15:17:25 WARN hive.TableDefWriter: Column create_date had to be cast to a less precise type in Hive
16/04/22 15:17:26 INFO hive.HiveImport: Loading uploaded data into Hive
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/04/22 15:17:29 INFO hive.HiveImport: 
16/04/22 15:17:29 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/home/hadoop/hadoop/src/hive-0.12.0-bin/lib/hive-common-0.12.0.jar!/hive-log4j.properties
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/src/hadoop-2.5.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/src/hive-0.12.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/04/22 15:17:39 INFO hive.HiveImport: OK
16/04/22 15:17:39 INFO hive.HiveImport: Time taken: 8.927 seconds
16/04/22 15:17:39 INFO hive.HiveImport: Loading data to table default.hivetest1
16/04/22 15:17:39 INFO hive.HiveImport: Table default.hivetest1 stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 1111, raw_data_size: 0]
16/04/22 15:17:39 INFO hive.HiveImport: OK
16/04/22 15:17:39 INFO hive.HiveImport: Time taken: 0.789 seconds
16/04/22 15:17:40 INFO hive.HiveImport: Hive import complete.
16/04/22 15:17:40 INFO hive.HiveImport: Export directory is empty, removing it.

Output like this means the Sqoop import into Hive succeeded.
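You can also confirm from the shell where the rows landed. A minimal sketch, assuming the default Hive warehouse path /user/hive/warehouse (the same path used later in this note) and the usual part-m-00000 file name that a single-mapper import produces:

# List the files Sqoop loaded for the new table
hadoop fs -ls /user/hive/warehouse/hivetest1
# Print the imported rows straight from HDFS (file name assumed)
hadoop fs -cat /user/hive/warehouse/hivetest1/part-m-00000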

Querying the result in Hive:


hive> show databases;
OK
default
Time taken: 6.138 seconds, Fetched: 1 row(s)
hive> use default;
OK
Time taken: 0.038 seconds
hive> show tables;
OK
hivetest1
test
Time taken: 0.336 seconds, Fetched: 2 row(s)
hive> select * from hivetest1;
OK
6       2014-01-12 13:36:19.0   admin@163.com   admin   f6fdffe48c908deb0f4c3bd36c032e72        1234567890      1       admin
45      2015-04-01 23:55:52.0   zhenjavax@1.com internet4       6512bd43d9caa6e02c990b0a82652dca        134567890       1       1
46      2015-04-02 17:10:18.0   zhenja222222vax@1.com   internet3       b59c67bf196a4758191e42f76670ceba        1234567890      1       111
47      2015-04-05 00:43:11.0   lm@163.com      lm      94a4ebeb834f927228209c2dd71c10e5        1234567890      1       lm
48      2015-04-05 03:39:19.0   admin@163.com   internet1       c6c73fee8fd9af7d3f1abd1611a40b32        123456790-      1       internet1
49      2015-04-05 03:39:53.0   admin@163.com   internet2       79e5b03556b7e87b508b5977d5f3f7e0        internet2       1       internet2
50      2015-04-05 03:41:50.0   lm@163.com      internet5       9ac010daa64dc3065b7b6eb584d231ae        123456780       1       internet5
51      2015-04-05 03:42:24.0   admin@163.com   internet6       1614782b1a29db28ce58ce1173d97ee0        123456780       1       internet6
52      2015-04-05 03:42:55.0   lm@163.com      internet7       6f3edab07447258b53b2f821187f6f1f        123456780       1       internet7
53      2015-04-05 03:43:21.0   admin@163.com   internet8       c7ef3b5c4827d840d0c264a1c06f1717        123456780       1       internet8
54      2015-04-05 03:43:53.0   admin@163.com   internet9       33c7785a99eeda46b8feb7a18b8dc45 123456780       1       internet9
Time taken: 3.335 seconds, Fetched: 11 row(s)
hive>
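Note the earlier warning "Column create_date had to be cast to a less precise type in Hive": Sqoop maps MySQL DATETIME columns to Hive STRING. A quick way to check the schema Sqoop generated (hive -e runs a single statement and exits):

hive -e "DESCRIBE hivetest1;"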



Hive --> MySQL

1. Query the data in the Hive table test
hive> select * from test;
OK
1       iPhone6s        132G    4700
2       iPhone6s        016G    4000
3       iPhone6s        12G     2400
4       iPhone5s        132G    400
5       iPhone4s        132G    400
6       iPhone7s        132G    400
7       xiaomiNoto      016G    700
8       huaweiX7        400G    7
9       sanxingX7s      100G    2
10      xiaomiH2        132G    5000
2. The table's underlying data file, as Hive stores it on HDFS:
[hadoop@nameNode ~]$ hadoop fs -cat /user/hive/warehouse/test/ce.txt
16/04/22 15:44:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1       iPhone6s        132G    4700
2       iPhone6s        016G    4000
3       iPhone6s        12G     2400
4       iPhone5s        132G    400
5       iPhone4s        132G    400
6       iPhone7s        132G    400
7       xiaomiNoto      016G    700
8       huaweiX7        400G    7
9       sanxingX7s      100G    2
10      xiaomiH2        132G    5000
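Before exporting, it is worth confirming what delimiter the file actually uses; od -c renders tabs as \t and Hive's default ^A delimiter as octal 001. A sketch:

# Inspect the first line byte by byte to see the real field delimiter
hadoop fs -cat /user/hive/warehouse/test/ce.txt | head -1 | od -c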
3. Export the Hive table to MySQL
Command used: sqoop export --connect jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8 --username root --password root --table test --export-dir /user/hive/warehouse/test/ce.txt

Running it produces the following error:

Error: java.io.IOException: Can't export data, please check failed map task logs
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.RuntimeException: Can't parse input data: '1       iPhone6s        132G    4700'
        at test.__loadFromFields(test.java:335)
        at test.parse(test.java:268)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
        ... 10 more
Caused by: java.lang.NumberFormatException: For input string: "1        iPhone6s        132G    4700"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.valueOf(Integer.java:766)
        at test.__loadFromFields(test.java:317)
        ... 12 more
16/04/22 16:04:27 INFO mapreduce.Job: Task Id : attempt_1461290770642_0018_m_000001_0, Status : FAILED
Error: java.io.IOException: Can't export data, please check failed map task logs
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.RuntimeException: Can't parse input data: '6       iPhone7s        132G    400'
        at test.__loadFromFields(test.java:335)
        at test.parse(test.java:268)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
        ... 10 more
Caused by: java.lang.NumberFormatException: For input string: "6        iPhone7s        132G    400"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.valueOf(Integer.java:766)
        at test.__loadFromFields(test.java:317)
        ... 12 more


The NumberFormatException shows Sqoop handing the entire line '1       iPhone6s        132G    4700' to Integer.valueOf: by default sqoop export expects comma-separated fields, but this file is tab-delimited, so the whole line looked like a single field. Tell Sqoop the real delimiter with --input-fields-terminated-by '\t' (quoting the JDBC URL also keeps the shell from expanding the '?'). Change the command to:

sqoop export --connect "jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8" --username root --password root --table test --input-fields-terminated-by '\t' --export-dir '/user/hive/warehouse/test/ce.txt' -m 2
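The '\t' here matches this particular file, since ce.txt is tab-separated. A table that Sqoop itself created with --hive-import (like hivetest1 above; the import log said "Using Hive-specific delimiters for output") is written with Hive's default ^A field delimiter instead, so exporting it would presumably look like this sketch (t_user_copy is a hypothetical MySQL target table with a matching schema):

# Sketch: exporting a table that Sqoop imported with Hive's default ^A delimiter.
# t_user_copy is hypothetical; it must already exist in MySQL with matching columns.
sqoop export \
  --connect "jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8" \
  --username root --password root \
  --table t_user_copy \
  --input-fields-terminated-by '\001' \
  --export-dir /user/hive/warehouse/hivetest1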

Check whether the data made it into the MySQL table.
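For example, with the standard mysql command-line client (a sketch; COUNT(*) should come back as 10 if all rows were exported):

# Query the target table from the MySQL side; -p prompts for the password
mysql -h 192.168.2.251 -u root -p \
      -e "SELECT COUNT(*) FROM test; SELECT * FROM test;" from_66internet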



