Sqoop1 从Oracle往Hive迁移数据
来源:互联网 发布:虚拟货币网站源码php 编辑:程序博客网 时间:2024/06/05 23:38
本篇幅讲述如何利用CDH Hadoop版本上面的Sqoop1从Oracle往Hive迁移数据,例子比较简单,仅供参考。
1 列出CDH目录下面的sqoop所有相关命令
[root@n12 bin]# pwd/opt/cloudera/parcels/CDH/bin[root@n12 bin]# ll sqoop*-rwxr-xr-x 1 root root 918 Jul 23 2016 sqoop-rwxr-xr-x 1 root root 1486 Jul 23 2016 sqoop2-rwxr-xr-x 1 root root 1795 Jul 23 2016 sqoop2-server-rwxr-xr-x 1 root root 1842 Jul 23 2016 sqoop2-tool-rwxr-xr-x 1 root root 926 Jul 23 2016 sqoop-codegen-rwxr-xr-x 1 root root 936 Jul 23 2016 sqoop-create-hive-table-rwxr-xr-x 1 root root 923 Jul 23 2016 sqoop-eval-rwxr-xr-x 1 root root 925 Jul 23 2016 sqoop-export-rwxr-xr-x 1 root root 923 Jul 23 2016 sqoop-help-rwxr-xr-x 1 root root 925 Jul 23 2016 sqoop-import-rwxr-xr-x 1 root root 936 Jul 23 2016 sqoop-import-all-tables-rwxr-xr-x 1 root root 922 Jul 23 2016 sqoop-job-rwxr-xr-x 1 root root 933 Jul 23 2016 sqoop-list-databases-rwxr-xr-x 1 root root 930 Jul 23 2016 sqoop-list-tables-rwxr-xr-x 1 root root 924 Jul 23 2016 sqoop-merge-rwxr-xr-x 1 root root 928 Jul 23 2016 sqoop-metastore-rwxr-xr-x 1 root root 926 Jul 23 2016 sqoop-version[root@n12 bin]# sqoop helpWarning: /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.Please set $ACCUMULO_HOME to the root of your Accumulo installation.17/02/24 14:29:04 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.2usage: sqoop COMMAND [ARGS]Available commands: codegen Generate code to interact with database records create-hive-table Import a table definition into Hive eval Evaluate a SQL statement and display the results export Export an HDFS directory to a database table help List available commands import Import a table from a database to HDFS import-all-tables Import tables from a database to HDFS import-mainframe Import datasets from a mainframe server to HDFS job Work with saved jobs list-databases List available databases on a server list-tables List available tables in a database merge Merge results of incremental imports metastore Run a standalone Sqoop metastore version Display version informationSee 'sqoop help COMMAND' for information on a specific command.
2 根据上述结果可知,运行sqoop命令,可以使用”sqoop tool-name [tool-arguments]”运行,也可以使用对应的Aliaes,如”sqoop import …”可以直接运行”sqoop-import”命令
3 sqoop支持从多种RDBMS与Hadoop平台的数据导入导出,典型的有Oracle,Mysql,Postgresql,具体请参考官网:http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_supported_databases
4 这里以从Oracle往Hive抽取数据为例,首先需要准备Oracle JDBC驱动ojdbc6.jar,Jar请自行从网上下载,这里不再赘述;将下载的ojdbc6.jar放到$SQOOP_HOME/lib目录下,如下
/opt/cloudera/parcels/CDH/lib/sqoop/lib[root@n12 lib]# ll ojdbc6.jar-rwxrwxrwx 1 root root 2739670 Feb 24 14:50 ojdbc6.jar
5 执行”sqoop list-tables”命令列出Oracle中的表,
[root@n12 bin]# sqoop-list-tables --driver oracle.jdbc.OracleDriver --connect jdbc:oracle:thin:@10.10.11.16:1521/esgyn --username system --password 12345Warning: /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.Please set $ACCUMULO_HOME to the root of your Accumulo installation.17/02/24 14:58:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.217/02/24 14:58:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.17/02/24 14:58:53 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.17/02/24 14:58:53 INFO manager.SqlManager: Using default fetchSize of 1000SYS_IOT_OVER_70800WWV_COLUMN_EXCEPTIONSWWV_FLOWSWWV_FLOWS_RESERVEDWWV_FLOW_ACTIVITY_LOG1$WWV_FLOW_ACTIVITY_LOG2$WWV_FLOW_ACTIVITY_LOG_NUMBER$WWV_FLOW_ALTERNATE_CONFIGWWV_FLOW_ALT_CONFIG_DETAILWWV_FLOW_ALT_CONFIG_PICKWWV_FLOW_APPLICATION_GROUPSWWV_FLOW_APP_BUILD_PREFWWV_FLOW_APP_COMMENTSWWV_FLOW_BANNERWWV_FLOW_BUILDER_AUDIT_TRAILWWV_FLOW_BUTTON_TEMPLATESWWV_FLOW_CALS...
6 执行”sqoop import”命令抽取Oracle中的一个表到Hive,如下
[root@n12 bin]# sqoop-import --connect jdbc:oracle:thin:@10.10.11.16:1521/esgyn --username system --password 12345 --table TEST_GROUPBY --split-by A --hive-import --create-hive-table --hive-table test_groupby --hive-overwrite --null-string '\\N' --null-non-string '\\N' --hive-drop-import-delims --verboseWarning: /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.Please set $ACCUMULO_HOME to the root of your Accumulo installation.17/02/24 15:08:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.217/02/24 15:08:46 DEBUG tool.BaseSqoopTool: Enabled debug logging....Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/jars/hive-common-1.1.0-cdh5.7.2.jar!/hive-log4j.propertiesOKTime taken: 2.409 secondsLoading data to table default.test_groupbychgrp: changing ownership of 'hdfs://n11.trafodion.local:8020/user/hive/warehouse/test_groupby/part-m-00000': User does not belong to hivechgrp: changing ownership of 'hdfs://n11.trafodion.local:8020/user/hive/warehouse/test_groupby/part-m-00001': User does not belong to hivechgrp: changing ownership of 'hdfs://n11.trafodion.local:8020/user/hive/warehouse/test_groupby/part-m-00002': User does not belong to hivechgrp: changing ownership of 'hdfs://n11.trafodion.local:8020/user/hive/warehouse/test_groupby/part-m-00003': User does not belong to hiveTable default.test_groupby stats: [numFiles=4, numRows=0, totalSize=60, rawDataSize=0]OKTime taken: 0.886 seconds
7 在Hive CLI中检查是否导入成功
hive> show tables;test_groupbyTime taken: 1.726 seconds, Fetched: 11 row(s)hive> select * from test_groupby;OK1.0 A B NULL NULL2.0 A C NULL NULLTime taken: 0.755 seconds, Fetched: 2 row(s)
8 除了可以直接使用”sqoop-import”这种命令直接运行sqoop命令,也可以使用”sqoop-job”将一个sqoop命令保存为一个job,这样可以在需要的时间再调用它,而且可以重复执行。使用–create创建job,–list查看job,–delete删除job,–exec执行job
[root@n12 bin]# sqoop-job --create job1 -- import --connect jdbc:oracle:thin:@10.10.11.16:1521/esgyn --username system --password 12345 --table TEST_GROUPBY --split-by A --hive-import --create-hive-table --hive-table test_groupby --hive-overwrite --null-string '\\N' --null-non-string '\\N' --hive-drop-import-delims --verboseWarning: /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.Please set $ACCUMULO_HOME to the root of your Accumulo installation.17/02/24 17:00:43 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.217/02/24 17:00:44 DEBUG tool.BaseSqoopTool: Enabled debug logging....[root@n12 bin]# sqoop-job --listWarning: /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.Please set $ACCUMULO_HOME to the root of your Accumulo installation.17/02/24 17:02:42 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.2Available jobs: job1[root@n12 bin]# sqoop-job --delete job1Warning: /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.Please set $ACCUMULO_HOME to the root of your Accumulo installation.17/02/24 17:03:12 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.2
- Sqoop1 从Oracle往Hive迁移数据
- hive元数据从mysql迁移到oracle
- OOzie调度sqoop1 Action 从mysql导入数据到hive
- sqoop1.4.2 从Oracle装载数据到HDFS
- 使用sqoop1,将sqlserver数据导入hive
- 数据迁移:从SqlServer到Oracle
- 从Sql Server迁移数据到Oracle
- 从oracle迁移数据到GreenPlum
- 数据迁移,从mysql到Oracle
- GoldenGate从oracle迁移数据到mysql
- HIVE 数据迁移
- hive数据迁移
- Hive数据迁移
- hive数据迁移
- hive数据迁移
- hive数据迁移 导表
- Hive 批量数据迁移
- 使用sqoop将oracle数据迁移到hive中
- 程序到进程
- javascript 中闭包
- 郑俊雅团队设计的布娃娃,好萌好可爱!
- jQuery大全
- RabbitMQ使用简记
- Sqoop1 从Oracle往Hive迁移数据
- redis安全策略
- accp7.0S2优化myschool数据库设计第五章项目
- codeforces 768C
- AppletViewer使用方法介绍
- SwaggerUI ASP.Net WebAPI2
- 理解spark闭包
- mybatis多对多关系映射
- GreenDAO