Packaging a MapReduce Program in Eclipse and Submitting It to a Hadoop Cluster


After getting the program to run on the Hadoop cluster from the command line, I set up the corresponding configuration in Eclipse and clicked Run on Hadoop.

The job ran successfully and the output was visible on HDFS, but it still was not being submitted to the real cluster environment.

After a lot of searching, I tried specifying the remote JobTracker address directly in the code, but that did not work either.

So I debugged the program in Eclipse first, and once it ran correctly, packaged it into a jar and uploaded it to the Hadoop cluster to run:

Export the jar directly, making sure the jar's META-INF/MANIFEST.MF file contains the Main-Class mapping:

Main-Class: WordCount

In fact, if you just click Next through the export wizard, this mapping is generated in the file automatically.
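For reference, the generated META-INF/MANIFEST.MF needs little more than the following (this assumes WordCount sits in the default package; with a package it would be a fully qualified name such as the hypothetical com.example.WordCount):

```
Manifest-Version: 1.0
Main-Class: WordCount
```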

Upload the built jar to the server. Assuming it lands in /opt, the command is:

hadoop jar /opt/myWordCount.jar WordCount /test_in /output12

This failed with:

Exception in thread "main" java.lang.UnsupportedClassVersionError: WordCount : Unsupported major.minor version 52.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:205)

Searching online pointed to a Java version mismatch: Eclipse on the Windows 7 machine was using Java 1.8, while the server ran Java 1.7.
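The class-file major version can be checked directly: every .class file starts with the magic number 0xCAFEBABE followed by minor and major version fields, where major 52 means Java 8 and 51 means Java 7. A minimal sketch (ClassVersion is a name I made up for illustration, not part of the original program):

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ClassVersion {
    // Read the major version from a compiled .class file header.
    public static int majorVersion(String path) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IOException("not a class file: " + path);
            }
            in.readUnsignedShort();        // minor version (skipped)
            return in.readUnsignedShort(); // major version
        }
    }

    // Map a class-file major version to the Java release that produces it.
    public static String javaRelease(int major) {
        return major >= 49 ? "1." + (major - 44) : "pre-1.5";
    }

    public static void main(String[] args) throws IOException {
        int major = majorVersion(args[0]);
        System.out.println(args[0] + ": major " + major
                + " -> Java " + javaRelease(major));
    }
}
```

Running this against the classes inside the jar would have shown major 52, i.e. compiled for Java 8, which the server's Java 7 JVM refuses to load.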

In Eclipse, go to Window -- Preferences -- Java -- Compiler -- Compiler compliance level, and select 1.7.

Re-export the jar and run it again.

This time a different error appeared:

14/11/07 10:33:46 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:47 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:48 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:49 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:50 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:51 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:52 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

The ResourceManager could not be reached, even though yarn-site.xml looked correctly configured.

On closer inspection, the port numbers in it did not match the defaults, so I changed the configuration file to the following:

        <property>
                <name>yarn.resourcemanager.address</name>
                <value>localhost:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>localhost:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>localhost:8031</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>192.168.0.7</value>
        </property>
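For reference, yarn-default.xml in Hadoop 2.x derives these addresses from yarn.resourcemanager.hostname, so when only the hostname is set, clients expect the standard ports 8032/8030/8031:

```xml
<property>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>${yarn.resourcemanager.hostname}:8031</value>
</property>
```

If yarn-site.xml puts the ResourceManager on a non-default port, any client that is not using the same yarn-site.xml will keep retrying against an address nobody is listening on, which matches the connection retries above.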

Running again produced the same error, so I commented out the explicitly specified JobTracker address in the code.

Surprisingly, yet another error appeared:

Usage: wordcount <in> <out>

Checking the code showed that this message is printed when the number of input arguments is not two. The command itself looked correct, though, so as a workaround I hardcoded the paths in the program and rebuilt the jar:

  FileInputFormat.addInputPath(job, new Path("hdfs://192.168.0.7:9000/test_in"));
  FileOutputFormat.setOutputPath(job, new Path("hdfs://192.168.0.7:9000/out1"));
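For context, the check that prints the usage message in the stock WordCount example is essentially the following (a simplified sketch without the Hadoop dependencies; the real driver first strips generic Hadoop options via GenericOptionsParser, and UsageCheck is a name of my own):

```java
// Simplified sketch of the argument check in the stock WordCount driver.
public class UsageCheck {
    // The driver expects exactly two arguments: <in> and <out>.
    public static boolean validArgs(String[] args) {
        return args.length == 2;
    }

    public static void main(String[] args) {
        if (!validArgs(args)) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
    }
}
```

Note that with Main-Class set in the manifest, the command line "hadoop jar /opt/myWordCount.jar WordCount /test_in /output12" hands WordCount to the program as args[0], so the check sees three arguments and fails.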

After submitting to the Hadoop cluster, the results finally came out.


I still have not fully worked out why passing the paths on the command line did not work; noting it here for now. A likely cause: once the manifest declares Main-Class, hadoop jar no longer consumes the WordCount token on the command line as a class name but passes it through to main(), so the program sees three arguments instead of two and prints the usage message.





