Importing data from MySQL into HDFS


1. Set up the MySQL JDBC driver

Copy mysql-connector-java-5.1.6.jar into Sqoop's lib directory.
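
A minimal sketch of this step, assuming Sqoop is installed under /usr/local/sqoop (the path is an assumption; point it at your actual installation):

# SQOOP_HOME below is an assumed install path, not from the original article.
export SQOOP_HOME=/usr/local/sqoop
cp mysql-connector-java-5.1.6.jar $SQOOP_HOME/lib/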

2. Sqoop operations

Run sqoop help import to view the help for the import tool.

List the databases on the MySQL server:

[root@single bin]# ./sqoop list-databases --connect jdbc:mysql://192.168.2.1:3306/ --username root --P
15/12/12 23:05:59 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Enter password:
15/12/12 23:06:04 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
alibaba
newechshop
mysql
nextapp
numysql
performance_schema
sakila
sys
weixinmgr
wordpress
world
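
The -P/--P flag makes Sqoop prompt for the password interactively. As an alternative sketch, assuming Sqoop 1.4.4 or later (which added --password-file) and a hypothetical HDFS path, the password can be read from a file instead, keeping it out of the shell history:

# Hypothetical file name and HDFS path; write the password without a trailing newline.
echo -n 'yourpassword' > mysql.pwd
hadoop fs -put mysql.pwd /user/hadoop/mysql.pwd
./sqoop list-databases --connect jdbc:mysql://192.168.2.1:3306/ --username root --password-file /user/hadoop/mysql.pwd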

Run the import command to import the ts_user table into HDFS:

[root@single bin]# ./sqoop import --connect jdbc:mysql://192.168.2.1:3306/weixinmgr --username root --P --table ts_user -m 1
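
Here -m 1 runs the import with a single map task, so the table lands in HDFS as one part file. A sketch of commonly used additions (the target directory and delimiter below are illustrative, not from the original run): --target-dir controls where the files are written in HDFS, and --fields-terminated-by sets the column delimiter.

# --target-dir and --fields-terminated-by are standard Sqoop options;
# /user/hadoop/ts_user is a hypothetical destination directory.
./sqoop import \
  --connect jdbc:mysql://192.168.2.1:3306/weixinmgr \
  --username root -P \
  --table ts_user \
  --target-dir /user/hadoop/ts_user \
  --fields-terminated-by ',' \
  -m 1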

This first run of the Sqoop import against MySQL failed with an error.

A web search showed this is a bug in the MySQL connector; downloading a newer JAR fixes it.

[root@single bin]# wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.32.tar.gz
100%[================================================================================================================>] 3,795,067   17.0K/s   in 3m 32s
2015-12-12 23:29:13 (17.5 KB/s) - "mysql-connector-java-5.1.32.tar.gz" saved [3795067/3795067]
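
A sketch of swapping in the new driver; the jar name inside the tarball follows the standard Connector/J 5.1.x layout, and SQOOP_HOME is the assumed install path from step 1:

# Assumed tarball layout (mysql-connector-java-5.1.32-bin.jar) and install path.
tar -xzf mysql-connector-java-5.1.32.tar.gz
cp mysql-connector-java-5.1.32/mysql-connector-java-5.1.32-bin.jar $SQOOP_HOME/lib/
rm $SQOOP_HOME/lib/mysql-connector-java-5.1.6.jar   # drop the old driver so only one version is on the classpath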

Run the import again:

[hadoop@single bin]$ ./sqoop import --connect jdbc:mysql://192.168.2.1:3306/weixinmgr --username root --P --table ts_user -m 1
15/12/12 23:48:18 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Enter password:
15/12/12 23:48:22 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/12/12 23:48:22 INFO tool.CodeGenTool: Beginning code generation
15/12/12 23:48:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `ts_user` AS t LIMIT 1
15/12/12 23:48:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `ts_user` AS t LIMIT 1
15/12/12 23:48:23 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/java/hadoop-2.6.2
Note: /tmp/sqoop-hadoop/compile/ac79c6428b049576508e90af2cb05afe/ts_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/12/12 23:48:26 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/ac79c6428b049576508e90af2cb05afe/ts_user.jar
15/12/12 23:48:26 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/12/12 23:48:26 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/12/12 23:48:26 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/12/12 23:48:26 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/12/12 23:48:26 INFO mapreduce.ImportJobBase: Beginning import of ts_user
15/12/12 23:48:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/12 23:48:27 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/12/12 23:48:28 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/12/12 23:48:28 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/12/12 23:48:38 INFO db.DBInputFormat: Using read commited transaction isolation
15/12/12 23:48:38 INFO mapreduce.JobSubmitter: number of splits:1
15/12/12 23:48:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1449928075440_0002
15/12/12 23:48:43 INFO impl.YarnClientImpl: Submitted application application_1449928075440_0002
15/12/12 23:48:43 INFO mapreduce.Job: The url to track the job: http://single.hadoop.com:8088/proxy/application_1449928075440_0002/
15/12/12 23:48:43 INFO mapreduce.Job: Running job: job_1449928075440_0002
15/12/12 23:49:31 INFO mapreduce.Job: Job job_1449928075440_0002 running in uber mode : false
15/12/12 23:49:31 INFO mapreduce.Job:  map 0% reduce 0%
15/12/12 23:49:51 INFO mapreduce.Job:  map 100% reduce 0%
15/12/12 23:49:53 INFO mapreduce.Job: Job job_1449928075440_0002 completed successfully
15/12/12 23:49:54 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=124049
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=87
                HDFS: Number of bytes written=40
                HDFS: Number of read operations=4
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=15767
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=15767
                Total vcore-seconds taken by all map tasks=15767
                Total megabyte-seconds taken by all map tasks=16145408
        Map-Reduce Framework
                Map input records=1
                Map output records=1
                Input split bytes=87
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=178
                CPU time spent (ms)=2150
                Physical memory (bytes) snapshot=101388288
                Virtual memory (bytes) snapshot=2060509184
                Total committed heap usage (bytes)=28442624
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=40
15/12/12 23:49:54 INFO mapreduce.ImportJobBase: Transferred 40 bytes in 85.9276 seconds (0.4655 bytes/sec)
15/12/12 23:49:54 INFO mapreduce.ImportJobBase: Retrieved 1 records.

Check the result:

[hadoop@single hadoop-2.6.2]$ bin/hadoop fs -ls
drwxr-xr-x   - hadoop supergroup          0 2015-12-12 17:04 output
drwxr-xr-x   - hadoop supergroup          0 2015-12-12 23:49 ts_user
[hadoop@single hadoop-2.6.2]$ bin/hadoop fs -text ts_user/part-m-00000
1,admin,admin,0,1,2015-12-03,2015-12-23
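
The single CSV line matches the one record Sqoop reported. To cross-check against the source, a sketch using the mysql client (illustrative; it assumes the client is installed and the server accepts remote connections from this host):

# The row printed here should match the part-m-00000 contents above.
mysql -h 192.168.2.1 -u root -p -e "SELECT * FROM weixinmgr.ts_user;"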