Installing sqoop-1.4.6 and Fixing Common Problems


Environment: Ubuntu 16.04, hadoop-2.7.3, hbase-1.3.0, zookeeper-3.4.10, hive-2.1.1


Download sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz from the web.
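One way to fetch it is shown below; the URL assumes the Apache archive mirror layout, so adjust it if you use a different mirror.

wget http://archive.apache.org/dist/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz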


Then extract it under the user's home directory:

ubuntuserver@Master:~$tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz

Rename it:

ubuntuserver@Master:~$sudo mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha sqoop-1.4.6

Move it to /usr/local:

ubuntuserver@Master:~$sudo mv sqoop-1.4.6 /usr/local/

Edit the sqoop-env.sh file:

ubuntuserver@Master:/usr/local/sqoop-1.4.6/conf$sudo cp sqoop-env-template.sh sqoop-env.sh

ubuntuserver@Master:/usr/local/sqoop-1.4.6/conf$sudo vim sqoop-env.sh

The changes are as follows:

# (Apache license header and commented-out template lines omitted)
# Set Hadoop-specific environment variables here.

# Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/usr/local/hadoop

# Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/usr/local/hadoop

# Set the path to where bin/hbase is available
export HBASE_HOME=/usr/local/hbase-1.3.0

# Set the path to where bin/hive is available
export HIVE_HOME=/usr/local/apache-hive-2.1.1-bin

# Set the path for where zookeeper config dir is
export ZOOCFGDIR=/usr/local/zookeeper-3.4.10

Edit the environment variables:

ubuntuserver@Master:~$sudo vim /etc/profile

Add the following:

export SQOOP_HOME=/usr/local/sqoop-1.4.6

export PATH=$SQOOP_HOME/bin:$PATH


Save the file, then run source /etc/profile to make the environment variables take effect.
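As an optional sanity check, you can confirm that the sqoop script is now on the PATH and prints its version (it may also print the HCatalog/Accumulo warnings dealt with below):

source /etc/profile
sqoop version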


Run the configure-sqoop command. If it reports warnings like the following:

ubuntuserver@Master:~$ configure-sqoop
Warning: /usr/local/sqoop-1.4.6/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop-1.4.6/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
ubuntuserver@Master:~$

then edit the configure-sqoop file in the $SQOOP_HOME/bin directory and comment out the following lines:


#if [ -z "${HCAT_HOME}" ]; then
#  if [ -d "/usr/lib/hive-hcatalog" ]; then
#    HCAT_HOME=/usr/lib/hive-hcatalog
#  elif [ -d "/usr/lib/hcatalog" ]; then
#    HCAT_HOME=/usr/lib/hcatalog
#  else
#    HCAT_HOME=${SQOOP_HOME}/../hive-hcatalog
#    if [ ! -d ${HCAT_HOME} ]; then
#       HCAT_HOME=${SQOOP_HOME}/../hcatalog
#    fi
#  fi
#fi
#if [ -z "${ACCUMULO_HOME}" ]; then
#  if [ -d "/usr/lib/accumulo" ]; then
#    ACCUMULO_HOME=/usr/lib/accumulo
#  else
#    ACCUMULO_HOME=${SQOOP_HOME}/../accumulo
#  fi
#fi

## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
#  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
#  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#if [ ! -d "${ACCUMULO_HOME}" ]; then
#  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
#  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi


Next, verify that Sqoop can actually be used. First run start-dfs.sh and start-yarn.sh to start the Hadoop cluster, then run the hive command to enter the Hive CLI.



Now you can see the table to be exported (techbbs_2017_06_18), its structure, and its contents.
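A minimal sketch of what to run in the Hive CLI to inspect the table (the LIMIT value is arbitrary):

show tables;
desc techbbs_2017_06_18;
select * from techbbs_2017_06_18 limit 10;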

Then go into MySQL and create the database and the target table.
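A sketch of the MySQL side, assuming the target table is techbbs_logs_stat as used in the export command below. The column names and types here are placeholders for illustration and must match the actual schema of the Hive table techbbs_2017_06_18:

CREATE DATABASE IF NOT EXISTS techbbs;
USE techbbs;
-- placeholder columns; replace with the real schema of techbbs_2017_06_18
CREATE TABLE techbbs_logs_stat (
    logdate VARCHAR(10),
    pv      INT,
    reguser INT,
    ip      INT,
    jumper  INT
);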


Verify that Sqoop can connect to MySQL:

Run the command: sqoop list-databases --connect jdbc:mysql://localhost:3306 --username root --password 123456


Note: in my experience, the localhost in this connection string must match the Hive metastore configuration. If a local MySQL database serves as the metastore, write localhost; if a remote MySQL server serves as the Hive metastore, write that server's IP address instead.
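For example, if Hive's metastore lives on a remote MySQL server, the connect string would point at that host instead (the IP address below is purely illustrative):

sqoop list-databases --connect jdbc:mysql://192.168.1.100:3306 --username root --password 123456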

The database techbbs appears in the output, which shows that the connection succeeded.


Now export the data from HDFS into MySQL by running the following command:

sqoop export --connect jdbc:mysql://localhost:3306/techbbs --username root --password 123456 --table techbbs_logs_stat --fields-terminated-by '\001' --export-dir '/user/hive/warehouse/techbbs_2017_06_18'

Note that the command is entered on a single line, with no line breaks. The delimiter '\001' (Ctrl-A) is Hive's default field terminator, which is why it is passed to --fields-terminated-by here.


The output shows that one record was successfully exported to MySQL. Now check in MySQL whether the data matches:
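A quick way to check, sketched here with the database and table names used above:

USE techbbs;
SELECT * FROM techbbs_logs_stat;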


You can see that the data in MySQL is exactly the same as the data queried in HDFS.

If the export fails with an error like the following:

java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3a.S3AFileSystem could not be instantiated
        at java.util.ServiceLoader.fail(ServiceLoader.java:232)
        at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
        at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
        at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
        at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
        at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2631)
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2650)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172)
        at org.apache.sqoop.mapreduce.ExportJobBase.getInputPath(ExportJobBase.java:223)
        at org.apache.sqoop.mapreduce.ExportJobBase.getInputFileType(ExportJobBase.java:505)
        at org.apache.sqoop.mapreduce.JdbcExportJob.configureInputFormat(JdbcExportJob.java:68)
        at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:424)
        at org.apache.sqoop.manager.SqlManager.exportTable(SqlManager.java:912)
        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:81)
        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.lang.NoClassDefFoundError: com/amazonaws/AmazonServiceException
        at java.lang.Class.getDeclaredConstructors0(Native Method)
        at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
        at java.lang.Class.getConstructor0(Class.java:3075)
        at java.lang.Class.newInstance(Class.java:412)
        at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
        ... 23 more
Caused by: java.lang.ClassNotFoundException: com.amazonaws.AmazonServiceException
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 28 more

Solution:

Copy the aws-java-sdk*.jar files from $HADOOP_HOME/share/hadoop/tools/lib into Sqoop's lib directory and the error will no longer occur.
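For example, assuming Sqoop is installed in /usr/local/sqoop-1.4.6 as set up above:

# you may need sudo, depending on who owns /usr/local/sqoop-1.4.6
cp $HADOOP_HOME/share/hadoop/tools/lib/aws-java-sdk*.jar /usr/local/sqoop-1.4.6/lib/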

