Sqoop 1.99.7 Client Operations in Practice
Starting with 1.99.7, some of the client's command keywords and options have changed. Much of what is online still covers earlier versions, so commands or options copied from those articles often fail. This post records an actual local session, along with the points to watch out for, as a reference.
In the transcripts below, text after // is a comment or explanation; anything that needs special attention is called out in those comments.
1. Start the client with bin/sqoop.sh client
hadoop@master:~/bigdata/sqoop$ bin/sqoop.sh client
Setting conf dir: bin/../conf
Sqoop home directory: /home/hadoop/bigdata/sqoop
Sqoop Shell: Type 'help' or '\h' for help.
sqoop:000>
sqoop:000> show version
client version:
Sqoop 1.99.7 source revision 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
Compiled by abefine on Tue Jul 19 16:08:27 PDT 2016
sqoop:000>
sqoop:000> set server --host master --port 12000 --webapp sqoop // host is the machine running the Sqoop server (here the Hadoop NameNode host)
Server is set successfully
sqoop:000> show connector
0 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable // as long as `hadoop checknative -a` reports most libraries as true, this warning appears to be harmless. It may mean the .so is missing, or has the wrong format (e.g. 32-bit vs 64-bit); refer to http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html
+------------------------+---------+------------------------------------------------------------+----------------------+
| Name | Version | Class | Supported Directions |
+------------------------+---------+------------------------------------------------------------+----------------------+
| generic-jdbc-connector | 1.99.7 | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO |
| kite-connector | 1.99.7 | org.apache.sqoop.connector.kite.KiteConnector | FROM/TO |
| oracle-jdbc-connector | 1.99.7 | org.apache.sqoop.connector.jdbc.oracle.OracleJdbcConnector | FROM/TO |
| ftp-connector | 1.99.7 | org.apache.sqoop.connector.ftp.FtpConnector | TO |
| hdfs-connector | 1.99.7 | org.apache.sqoop.connector.hdfs.HdfsConnector | FROM/TO |
| kafka-connector | 1.99.7 | org.apache.sqoop.connector.kafka.KafkaConnector | TO |
| sftp-connector | 1.99.7 | org.apache.sqoop.connector.sftp.SftpConnector | TO |
+------------------------+---------+------------------------------------------------------------+----------------------+
For intellectual-property reasons, Sqoop ships with only the seven connectors above by default. The last column, "Supported Directions", indicates whether a connector supports data transfer in both directions.
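The NativeCodeLoader warning above usually means libhadoop.so is missing or was built for the wrong architecture. As a quick diagnostic (a sketch, not part of Sqoop; the path shown is only an example), the ELF header of the library reveals whether it is 32- or 64-bit:

```python
# Sketch: determine whether a shared library is 32-bit or 64-bit by reading
# its ELF header. Point it at your libhadoop.so, e.g.
# $HADOOP_HOME/lib/native/libhadoop.so (path is an example, not fixed).
def elf_class(path):
    with open(path, "rb") as f:
        header = f.read(5)
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: " + path)
    # byte 4 (EI_CLASS): 1 = 32-bit, 2 = 64-bit
    return {1: "32-bit", 2: "64-bit"}[header[4]]
```

If the result does not match your JVM's architecture, the native library is loaded from the wrong build and the warning is expected.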
2. Create links
sqoop:000> create link -c link1_generic_jdbc // the --cid option from earlier versions is no longer supported
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0003:Connector do not exists - link1_generic_jdbc
sqoop:000> create link -c generic-jdbc-connector // the -c argument must be one of the connector names listed above
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: link1_generic_jdbc // any name that identifies this link
Database connection
Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://master/mydb
Username: root
Password: ******
Fetch Size:
Connection Properties:
There are currently 0 values in the map:
entry#
SQL Dialect
Identifier enclose: // do not just press Enter here; type a space instead, otherwise start job later fails with a "GENERIC_JDBC_CONNECTOR_0016:Can't fetch schema -" error. Explanation: "Identifier enclose" specifies the delimiter placed around identifiers in the generated SQL. Some dialects quote identifiers with double quotes, as in select * from "table_name", but that is a syntax error in MySQL by default. The property defaults to a double quote, so pressing Enter keeps the bad default; it must be overridden with some harmless value, such as a space.
New link was successfully created with validation status WARNING and name link1_generic_jdbc
sqoop:000>
sqoop:000> show link
+--------------------+------------------------+---------+
| Name | Connector Name | Enabled |
+--------------------+------------------------+---------+
| link1_generic_jdbc | generic-jdbc-connector | true |
+--------------------+------------------------+---------+
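To make the identifier-enclose pitfall concrete, here is a small illustrative sketch (plain Python string handling, not Sqoop's actual code path) of what the generated SQL looks like with the default double quote versus a space:

```python
# Illustrative only: shows why the default identifier delimiter breaks MySQL.
def build_select(table, enclose='"'):
    # The connector wraps identifiers in the "Identifier enclose" character.
    return "SELECT * FROM {0}{1}{0}".format(enclose, table)

# Default double quote: a syntax error in MySQL unless sql_mode=ANSI_QUOTES.
print(build_select("user"))
# With a space, the statement degrades to plain, valid SQL.
print(build_select("user", " "))
```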
sqoop:000> create link -c hdfs-connector
Creating link for connector with name hdfs-connector
Please fill following values to create new link object
Name: link2_hdfs
HDFS cluster
URI: hdfs://master:9000 // the defaultFS configured for HDFS
Conf directory:
Additional configs::
There are currently 0 values in the map:
entry#
New link was successfully created with validation status OK and name link2_hdfs
sqoop:000> show link
+--------------------+------------------------+---------+
| Name | Connector Name | Enabled |
+--------------------+------------------------+---------+
| link1_generic_jdbc | generic-jdbc-connector | true |
| link2_hdfs | hdfs-connector | true |
+--------------------+------------------------+---------+
3. Create a job
sqoop:000> create job -f "link1_generic_jdbc" -t "link2_hdfs" // from link1 to link2
Creating job for links with from name link1_generic_jdbc and to name link2_hdfs
Please fill following values to create new job object
Name: jdbcToHdfs
Database source
Schema name: mysql // must match the database name in MySQL, i.e. the db name
Table name: user // mysql table name
SQL statement:
Column names:
There are currently 0 values in the list:
element#
Partition column:
Partition column nullable:
Boundary query:
Incremental read
Check column:
Last value:
Target configuration
Override null value:
Null value:
File format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
2 : PARQUET_FILE
Choose: 0
Compression codec:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
8 : CUSTOM
Choose: 0
Custom codec: // typing anything here by mistake may cause an error at the final step that persists even after correction; abort this create job command and start over
Output directory: /sqoop // destination directory in HDFS after the import
Append mode:
Throttling resources
Extractors: 1 // Extractors/Loaders control parallelism (roughly the number of map and reduce tasks); 1 each is fine for a test run
Loaders: 1
Classpath configuration
Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name jdbcToHdfs
sqoop:000>
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: COMMON_0023:Unable to create new job data - // a job with this name already exists; choose another name and create it again
sqoop:000> show job
+----+---------------+---------------------------------------------+-----------------------------+---------+
| Id | Name | From Connector | To Connector | Enabled |
+----+---------------+---------------------------------------------+-----------------------------+---------+
| 1 | jdbcToHdfs | link1_generic_jdbc (generic-jdbc-connector) | link2_hdfs (hdfs-connector) | true |
+----+---------------+---------------------------------------------+-----------------------------+---------+
4. Run the job
sqoop:000> start job -n jdbcToHdfs
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: GENERIC_JDBC_CONNECTOR_0001:Unable to get a connection - // the connection failed and the default logs say little; turn on verbose logging to see the actual cause
sqoop:000> set option --name verbose --value true // enable verbose mode to get more detailed logs
Verbose option was changed to true
Caused by: Exception: java.lang.Throwable Message: Unknown database 'mydb' // the database name entered when creating link1 does not exist; change it to a real database name and run the job again
Stack trace:
at org.apache.sqoop.connector.jdbc.GenericJdbcExecutor (GenericJdbcExecutor.java:121)
at org.apache.sqoop.connector.jdbc.GenericJdbcFromInitializer (GenericJdbcFromInitializer.java:58)
at org.apache.sqoop.connector.jdbc.GenericJdbcFromInitializer (GenericJdbcFromInitializer.java:47)
at org.apache.sqoop.driver.JobManager$1 (JobManager.java:520)
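Before re-running the job, it can help to rule out basic connectivity problems with the host behind the JDBC link. A minimal stdlib-Python sketch (host name from the connection string above; MySQL's default port is 3306):

```python
import socket

# Quick TCP reachability check for the database host behind the JDBC link.
# This only verifies the port is open; it does not validate credentials
# or confirm that the database ("mydb" above) actually exists.
def can_connect(host, port, timeout=3.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. can_connect("master", 3306)
```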
sqoop:000> status job -n jdbcToHdfs
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: MAPREDUCE_0003:Can't get RunningJob instance -
Caused by: Exception: java.io.IOException Message: java.net.ConnectException: Call From master/10.76.9.230 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Stack trace:
at org.apache.sqoop.submission.mapreduce.MapreduceSubmissionEngine (MapreduceSubmissionEngine.java:484)
at org.apache.sqoop.driver.JobManager (JobManager.java:714)
Hadoop's JobHistory server was not running. Its address is configured in etc/hadoop/mapred-site.xml:
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
hadoop@master:~/bigdata/hadoop/sbin$ ./mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /home/hadoop/bigdata/hadoop/logs/mapred-hadoop-historyserver-master.out
hadoop@master:~/bigdata/hadoop/sbin$ jps | grep JobHistoryServe
1717 JobHistoryServer
hadoop@master:~/bigdata/hadoop/sbin$
hadoop@master: hdfs dfs -ls / // check whether data has been imported into HDFS
After the fixes above, the job (recreated under the name jdbc2hdfs_2 following the earlier naming collision) was started again; check its status:
sqoop:000> status job -n jdbc2hdfs_2
Submission details
Job Name: jdbc2hdfs_2
Server URL: http://master:12000/sqoop/
Created by: hadoop
Creation date: 2017-06-15 10:41:25 CST
Lastly updated by: hadoop
External ID: job_1497255721792_0002
http://master:8088/proxy/application_1497255721792_0002/
2017-06-15 10:41:38 CST: RUNNING - 0.00 %
sqoop:000> status job -n jdbc2hdfs_2
Submission details
Job Name: jdbc2hdfs_2
Server URL: http://master:12000/sqoop/
Created by: hadoop
Creation date: 2017-06-15 10:41:25 CST
Lastly updated by: hadoop
External ID: job_1497255721792_0002
http://master:8088/proxy/application_1497255721792_0002/
2017-06-15 10:42:07 CST: FAILED
Exception: Job Failed with status:3
Opening the job URL in a browser only shows that it failed, without the specific cause; this still needs further investigation.
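One way to dig further is the YARN ResourceManager REST API (host and port taken from the proxy URL above), which reports an application's final state and a diagnostics string. A sketch, assuming default YARN settings:

```python
import json
import urllib.request

def parse_app_info(payload):
    # The ResourceManager returns {"app": {..., "state": ..., "diagnostics": ...}}
    app = payload["app"]
    return app["state"], app.get("diagnostics", "")

def app_state(rm_host_port, app_id):
    url = "http://%s/ws/v1/cluster/apps/%s" % (rm_host_port, app_id)
    with urllib.request.urlopen(url) as resp:
        return parse_app_info(json.load(resp))

# e.g. app_state("master:8088", "application_1497255721792_0002")
```

The diagnostics field often contains the actual failure reason that the proxy page does not show; `yarn logs -applicationId <id>` is another place to look.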