Running a Hadoop 2.6.0 MapReduce program from MyEclipse on Windows, without a plugin.


1. Create a new Maven project and add the following dependencies to the pom file:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
    <version>2.6.0</version>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0</version>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.6.0</version>
</dependency>


2. Copy the core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and log4j.properties configuration files out of hadoop-2.6.0/etc/hadoop on the cluster and into the project's src/main/resources directory, then add the corresponding hostname-to-IP mappings to the Windows hosts file.
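For example, taking the hostname mss9 and the address 192.168.1.169 that both appear in the logs below (the pairing between them is assumed), the entry in C:\Windows\System32\drivers\etc\hosts would look like:

```
192.168.1.169   mss9
```

Every hostname referenced by the cluster configuration files needs such a mapping, or the Windows client will fail to resolve the nodes.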

Notes:

1>. The following properties must be added inside the <configuration> element of mapred-site.xml:

<property>
    <name>mapred.remote.os</name>
    <value>Linux</value>
</property>
<property>
    <name>mapreduce.app-submission.cross-platform</name>
    <value>true</value>
</property>

Without them, the job fails with the following error:

org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control
at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2>. Package the finished MR program into a jar using MyEclipse's Export, then point the job configuration at it with conf.set("mapred.jar", "path to your jar"). Without this, the following error occurs:

Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.xxx.hadoop.WordCount$TokenizerMapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:742)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class com.xxx.hadoop.WordCount$TokenizerMapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2072)
... 8 more

With the configuration above in place, you can debug and run MR programs locally without issue.
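Putting the steps together, a driver for the WordCount job referenced in the stack traces might look like the sketch below. It is the standard WordCount pattern, not the author's exact code: the jar path is illustrative and must match wherever MyEclipse exported the jar, and the Hadoop 2.6.0 client jars from step 1 must be on the classpath to compile and run it.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        // The cluster-side settings are picked up automatically from the *.xml
        // files copied into src/main/resources in step 2, including the two
        // cross-platform properties from note 1>.
        Configuration conf = new Configuration();
        // Point the job at the jar exported from MyEclipse (note 2>); without
        // this the cluster cannot load TokenizerMapper and throws the
        // ClassNotFoundException shown above. The path is a placeholder.
        conf.set("mapred.jar", "D:/path/to/wordcount.jar");

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Run it with an HDFS input and output path as the two program arguments; the combiner reuses the reducer, which matches the combine counters in the log below.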


The run output is as follows:

16/09/09 11:32:21 INFO client.RMProxy: Connecting to ResourceManager at /192.168.1.169:8032
16/09/09 11:32:27 INFO input.FileInputFormat: Total input paths to process : 1
16/09/09 11:32:28 INFO mapreduce.JobSubmitter: number of splits:1
16/09/09 11:32:28 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/09/09 11:32:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1473235727794_0050
16/09/09 11:32:28 INFO impl.YarnClientImpl: Submitted application application_1473235727794_0050
16/09/09 11:32:28 INFO mapreduce.Job: The url to track the job: http://mss9:8088/proxy/application_1473235727794_0050/
16/09/09 11:32:28 INFO mapreduce.Job: Running job: job_1473235727794_0050
16/09/09 11:32:38 INFO mapreduce.Job: Job job_1473235727794_0050 running in uber mode : false
16/09/09 11:32:38 INFO mapreduce.Job:  map 0% reduce 0%
16/09/09 11:32:42 INFO mapreduce.Job:  map 100% reduce 0%
16/09/09 11:32:46 INFO mapreduce.Job:  map 100% reduce 100%
16/09/09 11:32:47 INFO mapreduce.Job: Job job_1473235727794_0050 completed successfully
16/09/09 11:32:48 INFO mapreduce.Job: Counters: 49
File System Counters
    FILE: Number of bytes read=228
    FILE: Number of bytes written=216169
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=287
    HDFS: Number of bytes written=170
    HDFS: Number of read operations=6
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
Job Counters 
    Launched map tasks=1
    Launched reduce tasks=1
    Data-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=2014
    Total time spent by all reduces in occupied slots (ms)=2042
    Total time spent by all map tasks (ms)=2014
    Total time spent by all reduce tasks (ms)=2042
    Total vcore-seconds taken by all map tasks=2014
    Total vcore-seconds taken by all reduce tasks=2042
    Total megabyte-seconds taken by all map tasks=2062336
    Total megabyte-seconds taken by all reduce tasks=2091008
Map-Reduce Framework
    Map input records=10
    Map output records=14
    Map output bytes=238
    Map output materialized bytes=228
    Input split bytes=103
    Combine input records=14
    Combine output records=13
    Reduce input groups=13
    Reduce shuffle bytes=228
    Reduce input records=13
    Reduce output records=13
    Spilled Records=26
    Shuffled Maps =1
    Failed Shuffles=0
    Merged Map outputs=1
    GC time elapsed (ms)=36
    CPU time spent (ms)=1020
    Physical memory (bytes) snapshot=425680896
    Virtual memory (bytes) snapshot=1784610816
    Total committed heap usage (bytes)=328204288
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters 
    Bytes Read=184
File Output Format Counters 
    Bytes Written=170

