Sqoop: importing and exporting tables/data to and from Hive
Notes:
Copy a relational-database table and its data into Hive.
sqoop import: RDBMS --> Hive
Syntax:
sqoop import --connect jdbc:mysql://IP:PORT/database --username root --password PWD --table tablename --hive-import --hive-table hivetable -m 1
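Option by option, the same command reads as follows (a sketch with the placeholder values from the syntax line; not runnable as-is):

```shell
# --connect        JDBC URL of the source MySQL database
# --username / --password   MySQL credentials (use -P to be prompted
#                  instead of putting the password on the command line)
# --table          source table to copy
# --hive-import    load the result into Hive instead of leaving plain files on HDFS
# --hive-table     target Hive table (created if it does not exist)
# -m               number of map tasks; -m 1 produces a single output file
sqoop import \
  --connect "jdbc:mysql://IP:PORT/database" \
  --username root --password PWD \
  --table tablename \
  --hive-import --hive-table hivetable \
  -m 1
```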
[hadoop@nameNode src]$ sqoop import --connect jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8 --username root --password root --table t_user --hive-import --hive-table hivetest1 -m 1
The output:
16/04/22 15:16:48 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
16/04/22 15:16:48 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/04/22 15:16:48 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/04/22 15:16:48 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/04/22 15:16:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/04/22 15:16:48 INFO tool.CodeGenTool: Beginning code generation
16/04/22 15:16:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_user` AS t LIMIT 1
16/04/22 15:16:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_user` AS t LIMIT 1
16/04/22 15:16:49 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop/src/hadoop-2.5.0
Note: /tmp/sqoop-hadoop/compile/14d08d5f7ecc890e985eeb17942619db/t_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/04/22 15:16:54 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/14d08d5f7ecc890e985eeb17942619db/t_user.jar
16/04/22 15:16:54 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/04/22 15:16:54 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/04/22 15:16:54 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/04/22 15:16:54 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/04/22 15:16:54 INFO mapreduce.ImportJobBase: Beginning import of t_user
16/04/22 15:16:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/22 15:16:55 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/04/22 15:16:56 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/04/22 15:16:56 INFO client.RMProxy: Connecting to ResourceManager at nameNode/192.168.2.246:8032
16/04/22 15:17:00 INFO db.DBInputFormat: Using read commited transaction isolation
16/04/22 15:17:00 INFO mapreduce.JobSubmitter: number of splits:1
16/04/22 15:17:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1461290770642_0013
16/04/22 15:17:01 INFO impl.YarnClientImpl: Submitted application application_1461290770642_0013
16/04/22 15:17:01 INFO mapreduce.Job: The url to track the job: http://nameNode:8088/proxy/application_1461290770642_0013/
16/04/22 15:17:01 INFO mapreduce.Job: Running job: job_1461290770642_0013
16/04/22 15:17:16 INFO mapreduce.Job: Job job_1461290770642_0013 running in uber mode : false
16/04/22 15:17:16 INFO mapreduce.Job:  map 0% reduce 0%
16/04/22 15:17:25 INFO mapreduce.Job:  map 100% reduce 0%
16/04/22 15:17:25 INFO mapreduce.Job: Job job_1461290770642_0013 completed successfully
16/04/22 15:17:25 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=118487
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=87
		HDFS: Number of bytes written=1111
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Other local map tasks=1
		Total time spent by all maps in occupied slots (ms)=6570
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=6570
		Total vcore-seconds taken by all map tasks=6570
		Total megabyte-seconds taken by all map tasks=6727680
	Map-Reduce Framework
		Map input records=11
		Map output records=11
		Input split bytes=87
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=95
		CPU time spent (ms)=1420
		Physical memory (bytes) snapshot=107536384
		Virtual memory (bytes) snapshot=2064531456
		Total committed heap usage (bytes)=30474240
	File Input Format Counters
		Bytes Read=0
	File Output Format Counters
		Bytes Written=1111
16/04/22 15:17:25 INFO mapreduce.ImportJobBase: Transferred 1.085 KB in 29.4684 seconds (37.7015 bytes/sec)
16/04/22 15:17:25 INFO mapreduce.ImportJobBase: Retrieved 11 records.
16/04/22 15:17:25 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_user` AS t LIMIT 1
16/04/22 15:17:25 WARN hive.TableDefWriter: Column create_date had to be cast to a less precise type in Hive
16/04/22 15:17:26 INFO hive.HiveImport: Loading uploaded data into Hive
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/04/22 15:17:29 INFO hive.HiveImport:
16/04/22 15:17:29 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/home/hadoop/hadoop/src/hive-0.12.0-bin/lib/hive-common-0.12.0.jar!/hive-log4j.properties
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/src/hadoop-2.5.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/src/hive-0.12.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/04/22 15:17:39 INFO hive.HiveImport: OK
16/04/22 15:17:39 INFO hive.HiveImport: Time taken: 8.927 seconds
16/04/22 15:17:39 INFO hive.HiveImport: Loading data to table default.hivetest1
16/04/22 15:17:39 INFO hive.HiveImport: Table default.hivetest1 stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 1111, raw_data_size: 0]
16/04/22 15:17:39 INFO hive.HiveImport: OK
16/04/22 15:17:39 INFO hive.HiveImport: Time taken: 0.789 seconds
16/04/22 15:17:40 INFO hive.HiveImport: Hive import complete.
16/04/22 15:17:40 INFO hive.HiveImport: Export directory is empty, removing it.
Output like this means the Sqoop import into Hive succeeded.
Verify the result in Hive:
hive> show databases;
OK
default
Time taken: 6.138 seconds, Fetched: 1 row(s)
hive> use default;
OK
Time taken: 0.038 seconds
hive> show tables;
OK
hivetest1
test
Time taken: 0.336 seconds, Fetched: 2 row(s)
hive> select * from hivetest1;
OK
6	2014-01-12 13:36:19.0	admin@163.com	admin	f6fdffe48c908deb0f4c3bd36c032e72	1234567890	1	admin
45	2015-04-01 23:55:52.0	zhenjavax@1.com	internet4	6512bd43d9caa6e02c990b0a82652dca	134567890	1	1
46	2015-04-02 17:10:18.0	zhenja222222vax@1.com	internet3	b59c67bf196a4758191e42f76670ceba	1234567890	1	111
47	2015-04-05 00:43:11.0	lm@163.com	lm	94a4ebeb834f927228209c2dd71c10e5	1234567890	1	lm
48	2015-04-05 03:39:19.0	admin@163.com	internet1	c6c73fee8fd9af7d3f1abd1611a40b32	123456790-	1	internet1
49	2015-04-05 03:39:53.0	admin@163.com	internet2	79e5b03556b7e87b508b5977d5f3f7e0	internet2	1	internet2
50	2015-04-05 03:41:50.0	lm@163.com	internet5	9ac010daa64dc3065b7b6eb584d231ae	123456780	1	internet5
51	2015-04-05 03:42:24.0	admin@163.com	internet6	1614782b1a29db28ce58ce1173d97ee0	123456780	1	internet6
52	2015-04-05 03:42:55.0	lm@163.com	internet7	6f3edab07447258b53b2f821187f6f1f	123456780	1	internet7
53	2015-04-05 03:43:21.0	admin@163.com	internet8	c7ef3b5c4827d840d0c264a1c06f1717	123456780	1	internet8
54	2015-04-05 03:43:53.0	admin@163.com	internet9	33c7785a99eeda46b8feb7a18b8dc45	123456780	1	internet9
Time taken: 3.335 seconds, Fetched: 11 row(s)
Hive --> MySQL
1. Query the data in the Hive table test:
hive> select * from test;
OK
1 iPhone6s 132G 4700
2 iPhone6s 016G 4000
3 iPhone6s 12G 2400
4 iPhone5s 132G 400
5 iPhone4s 132G 400
6 iPhone7s 132G 400
7 xiaomiNoto 016G 700
8 huaweiX7 400G 7
9 sanxingX7s 100G 2
10 xiaomiH2 132G 5000
2. The table's underlying data file as stored by Hive on HDFS:
[hadoop@nameNode ~]$ hadoop fs -cat /user/hive/warehouse/test/ce.txt
16/04/22 15:44:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1 iPhone6s 132G 4700
2 iPhone6s 016G 4000
3 iPhone6s 12G 2400
4 iPhone5s 132G 400
5 iPhone4s 132G 400
6 iPhone7s 132G 400
7 xiaomiNoto 016G 700
8 huaweiX7 400G 7
9 sanxingX7s 100G 2
10 xiaomiH2 132G 5000
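Before exporting, it helps to confirm which field delimiter the file actually uses; `cat -A` makes control characters visible (a tab prints as `^I`, and Hive's default field delimiter \001 would print as `^A`). A sketch, with the cluster command commented out and the same check run on a local, hypothetical sample line:

```shell
# On the cluster (path from above): tabs show as ^I at each field boundary.
# hadoop fs -cat /user/hive/warehouse/test/ce.txt | head -n 1 | cat -A

# The same check on a local, tab-delimited sample line like the ones in ce.txt:
printf '1\tiPhone6s\t132G\t4700\n' | cat -A
```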
3. Export the Hive table into MySQL.
Command:
sqoop export --connect jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8 --username root --password root --table test --export-dir /user/hive/warehouse/test/ce.txt
This fails with:
Error: java.io.IOException: Can't export data, please check failed map task logs
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.RuntimeException: Can't parse input data: '1	iPhone6s	132G	4700'
	at test.__loadFromFields(test.java:335)
	at test.parse(test.java:268)
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
	... 10 more
Caused by: java.lang.NumberFormatException: For input string: "1	iPhone6s	132G	4700"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:580)
	at java.lang.Integer.valueOf(Integer.java:766)
	at test.__loadFromFields(test.java:317)
	... 12 more
16/04/22 16:04:27 INFO mapreduce.Job: Task Id : attempt_1461290770642_0018_m_000001_0, Status : FAILED
Error: java.io.IOException: Can't export data, please check failed map task logs
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.RuntimeException: Can't parse input data: '6	iPhone7s	132G	400'
	at test.__loadFromFields(test.java:335)
	at test.parse(test.java:268)
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
	... 10 more
Caused by: java.lang.NumberFormatException: For input string: "6	iPhone7s	132G	400"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:580)
	at java.lang.Integer.valueOf(Integer.java:766)
	at test.__loadFromFields(test.java:317)
	... 12 more
The mappers fail because each whole tab-delimited line is being parsed as the first (integer) column: sqoop export assumes comma-separated fields by default. Quoting the JDBC URL (so the shell cannot glob the ?) and declaring the tab delimiter fixes it:
sqoop export --connect "jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8" --username root --password root --table test --input-fields-terminated-by '\t' --export-dir '/user/hive/warehouse/test/ce.txt' -m 2
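The delimiter mismatch can be seen in isolation with plain `awk`: splitting a tab-delimited row on a comma leaves the entire line in field 1, which is exactly the string the generated parser then tries, and fails, to read as an integer (local illustration with a hypothetical sample row):

```shell
# One tab-delimited row, shaped like the rows in ce.txt:
line="$(printf '1\tiPhone6s\t132G\t4700')"

# Split on Sqoop's default comma delimiter: everything lands in field 1.
echo "$line" | awk -F',' '{print "fields:", NF}'

# Split on tab, as --input-fields-terminated-by '\t' tells Sqoop to do:
echo "$line" | awk -F'\t' '{print "fields:", NF, "first:", $1}'
```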
Check whether the data landed in the MySQL table.
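A hypothetical way to check from the shell, reusing the connection details from the export command (the mysql client must be available wherever this runs):

```shell
# Count the exported rows and peek at a few of them (hypothetical check;
# host, credentials, and database are the ones used in the export above).
mysql -h 192.168.2.251 -u root -proot from_66internet \
      -e "SELECT COUNT(*) FROM test; SELECT * FROM test LIMIT 3;"
```

As the Sqoop log itself warned, an inline password is insecure; `-p` with no value prompts interactively instead.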