Reading HDFS file contents and importing them into MySQL (continued)

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 16:54

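Now I want to write a standalone class that reads the contents of an HDFS file and imports them into MySQL, that is, a plain Java API program driven from a main method.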


import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

Configuration conf = new Configuration(true);
conf.set("fs.default.name", "hdfs://cluster2");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
FileSystem fs = null;
try {
    fs = FileSystem.get(conf);
} catch (Exception e) {
    // LOG is the class's logger, declared elsewhere
    LOG.error("getFileSystem failed :" + e.getMessage());
}

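However, the code above fails with java.net.UnknownHostException:   hdfs://cluster2. Here cluster2 is the logical name of an HA nameservice rather than a real hostname, so the client cannot resolve it until the HA properties are set.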


At this point, because the cluster runs Hadoop YARN 2.2, I added more properties to conf, following the configuration in the post at http://www.oschina.net/code/snippet_121248_34430 .


The corrected version is as follows:

conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://cluster2");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
// "xx" stands for the real hostnames, which are elided here
conf.set("ha.zookeeper.quorum", "xx:2181,xx:2181,xx:2181");
// cluster2 is the logical HA nameservice with two NameNodes, nn1 and nn2
conf.set("dfs.nameservices", "cluster2");
conf.set("dfs.ha.namenodes.cluster2", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.cluster2.nn1", "xx:8020");
conf.set("dfs.namenode.rpc-address.cluster2.nn2", "xx:8020");
conf.set("dfs.client.failover.proxy.provider.cluster2",
    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
conf.set("hadoop.security.authentication", "kerberos");
conf.set("yarn.resourcemanager.scheduler.address", "xx:8030");

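The error message finally changed, but I could not resolve the new one either: the NameNode rejects SIMPLE authentication and accepts only TOKEN or KERBEROS.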

org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1681)
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
        at com.netease.weblogOffline.exp.mysql.OrgMediaSQL.main(OrgMediaSQL.java:126)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
        at org.apache.hadoop.ipc.Client.call(Client.java:1347)
        at org.apache.hadoop.ipc.Client.call(Client.java:1300)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy7.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy8.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)



My abilities are limited, and having found no workable solution, I could only go back to the original approach.


That is, use the hadoop command to run an empty job whose main work is simply to read the HDFS file contents.


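In the end I thought it over: a Configuration assembled by hand and launched with java -classpath will certainly carry fewer properties than the configuration files under the Hadoop installation.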
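To make that gap visible, one option is to list which properties a site file defines that the hand-written conf.set(...) calls never touch. The sketch below is only an illustration using the JDK's XML parser. It assumes the standard Hadoop layout of `<configuration><property><name>…</name><value>…</value></property></configuration>`. The class name SiteConfigDiff, both helper methods, and the inline sample XML are hypothetical and are not taken from the cluster in this post.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop: compares a *-site.xml file
// against the property names that were set by hand in code.
class SiteConfigDiff {

    /** Parses a Hadoop-style *-site.xml stream into name -> value pairs. */
    static Map<String, String> parseSiteXml(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(in);
        NodeList props = doc.getElementsByTagName("property");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = childText(p, "name");
            if (name != null) {
                result.put(name, childText(p, "value"));
            }
        }
        return result;
    }

    /** Returns the trimmed text of the first child with this tag, or null if absent. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** Returns the keys defined in the site file but never set by hand, sorted. */
    static Set<String> missingKeys(Map<String, String> site, Set<String> setByHand) {
        Set<String> missing = new TreeSet<>(site.keySet());
        missing.removeAll(setByHand);
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample standing in for a real core-site.xml.
        String sampleSite = "<configuration>"
                + "<property><name>fs.defaultFS</name><value>hdfs://cluster2</value></property>"
                + "<property><name>hadoop.security.authentication</name><value>kerberos</value></property>"
                + "<property><name>hadoop.rpc.protection</name><value>authentication</value></property>"
                + "</configuration>";
        Map<String, String> site = parseSiteXml(
                new ByteArrayInputStream(sampleSite.getBytes(StandardCharsets.UTF_8)));
        // Two of the keys that the hand-written conf.set(...) calls cover.
        Set<String> setByHand = new TreeSet<>(
                Arrays.asList("fs.defaultFS", "hadoop.security.authentication"));
        System.out.println("Only in site file: " + missingKeys(site, setByHand));
    }
}
```

Pointing parseSiteXml at the real core-site.xml and hdfs-site.xml of the cluster (their location depends on the installation) would show how many settings the hadoop command picks up automatically that a hand-built Configuration lacks.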

Better to just go through the hadoop command after all. I have made my peace with it.





