spring-hadoop wordcount run on yarn


I used to think I had to configure a Hadoop path in IDEA. That turns out to be unnecessary: adding a few configuration entries is enough.

mapred-site.xml

<property>
    <name>mapred.remote.os</name>
    <value>Linux</value>
    <description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
</property>
<property>
    <name>mapreduce.app-submission.cross-platform</name>
    <value>true</value>
</property>
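For context, here is a minimal sketch of why the cross-platform flag matters when the submitting machine and the cluster run different operating systems. It assumes the client runs on Windows, which the post does not state. With the flag enabled, Hadoop writes placeholders such as `{{VAR}}` and `<CPS>` into the container launch context, and the NodeManager expands them for its own OS. The class and method names below are hypothetical illustrations, not Hadoop APIs.

```java
// Hypothetical illustration, not Hadoop code: renders the same classpath
// in Windows syntax, Linux syntax, and with cross-platform placeholders.
class CrossPlatformClasspathSketch {

    enum Style { WINDOWS, LINUX, CROSS_PLATFORM }

    // How a reference to an environment variable is written in each style.
    static String envRef(String name, Style style) {
        switch (style) {
            case WINDOWS: return "%" + name + "%";
            case LINUX:   return "$" + name;
            default:      return "{{" + name + "}}"; // expanded later on the cluster side
        }
    }

    // The classpath separator used by each style.
    static String separator(Style style) {
        switch (style) {
            case WINDOWS: return ";";
            case LINUX:   return ":";
            default:      return "<CPS>"; // replaced later on the cluster side
        }
    }

    // Joins envRef + suffix for every suffix, using the style's separator.
    static String classpath(Style style, String envName, String... suffixes) {
        StringBuilder sb = new StringBuilder();
        for (String suffix : suffixes) {
            if (sb.length() > 0) {
                sb.append(separator(style));
            }
            sb.append(envRef(envName, style)).append(suffix);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (Style style : Style.values()) {
            System.out.println(style + " -> " + classpath(style, "PWD", "", "/*"));
        }
    }
}
```

A Linux NodeManager cannot interpret the Windows form, which is why a job submitted from a Windows client typically needs the flag set to `true`.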

I am using the HDP distribution. The configuration files downloaded from HDP have a small problem: they are not recognized as-is.

yarn-site and mapred-site

<property>
    <name>yarn.application.classpath</name>
    <value>
        /usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*,/usr/hdp/current/hadoop-mapreduce-client/*,/usr/hdp/current/hadoop-mapreduce-client/lib/*
    </value>
</property>
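The `yarn.application.classpath` value is a comma-separated list of classpath entries. As a rough sketch of how such a value breaks down, the snippet below splits on commas and trims the surrounding whitespace and line breaks that the XML layout introduces. The class `YarnClasspathSketch` is a hypothetical helper for illustration and is not part of Hadoop; it only approximates how the value is read.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Hadoop class: splits a classpath property
// value into its individual entries.
class YarnClasspathSketch {

    // Splits on commas, trims each entry, and drops empty ones.
    static List<String> entries(String rawValue) {
        List<String> result = new ArrayList<>();
        for (String part : rawValue.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened value with the same shape as the property above,
        // including the leading and trailing whitespace from the XML.
        String raw = "\n        /usr/hdp/current/hadoop-client/*,"
                + "/usr/hdp/current/hadoop-client/lib/*\n    ";
        for (String entry : entries(raw)) {
            System.out.println(entry);
        }
    }
}
```

Printing the entries one per line makes it easier to spot a path that does not exist on the cluster nodes.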

application.yml

spring.main.web-environment: false
spring:
  hadoop:
    fs-uri: hdfs://mycluster:8020
    resource-manager-host: 10.50.1.152
    resource-manager-port: 8050

bean.xml: the earlier style still works, and the one below works as well

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
       http://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/hadoop
       http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">
    <!--&lt;!&ndash;create a hadoop job&ndash;&gt;-->
    <!--<hdp:job id="mr-job"-->
    <!--input-path="/user/wsh/input/"-->
    <!--output-path="/user/wsh/output/"-->
    <!--mapper="org.apache.hadoop.examples.WordCount$TokenizerMapper"-->
    <!--reducer="org.apache.hadoop.examples.WordCount$IntSumReducer"/>-->
    <!--&lt;!&ndash;run hadoop job&ndash;&gt;-->
    <!--<hdp:job-runner id="myjobs-runner" job-ref="mr-job" run-at-startup="true" />-->
    <!--&lt;!&ndash;use hadoop job tasklet&ndash;&gt;-->
    <!--<hdp:job-tasklet id="hadoop-tasklet" job-ref="mr-job" wait-for-completion="true"/>-->
    <!--use hadoop tool run a hadoop jar-->
    <hdp:jar-runner jar="hadoop-mapreduce-examples.jar" id="wordcount" run-at-startup="true">
        <hdp:arg value="wordcount"/>
        <hdp:arg value="/user/wsh/input"/>
        <hdp:arg value="/user/wsh/output"/>
    </hdp:jar-runner>
</beans>

Log output from the run:

2016-09-27 11:46:38.222 INFO 11016 --- [ main] o.a.h.m.lib.input.FileInputFormat : Total input paths to process : 1
2016-09-27 11:46:38.354 INFO 11016 --- [ main] o.apache.hadoop.mapreduce.JobSubmitter : number of splits:1
2016-09-27 11:46:38.442 INFO 11016 --- [ main] o.apache.hadoop.mapreduce.JobSubmitter : Submitting tokens for job: job_1474422695520_0050
2016-09-27 11:46:38.891 INFO 11016 --- [ main] o.a.h.y.client.api.impl.YarnClientImpl : Submitted application application_1474422695520_0050
2016-09-27 11:46:38.947 INFO 11016 --- [ main] org.apache.hadoop.mapreduce.Job : The url to track the job: http://rm.hadoop:8088/proxy/application_1474422695520_0050/
2016-09-27 11:46:38.948 INFO 11016 --- [ main] org.apache.hadoop.mapreduce.Job : Running job: job_1474422695520_0050
2016-09-27 11:46:47.141 INFO 11016 --- [ main] org.apache.hadoop.mapreduce.Job : Job job_1474422695520_0050 running in uber mode : false
2016-09-27 11:46:47.144 INFO 11016 --- [ main] org.apache.hadoop.mapreduce.Job : map 0% reduce 0%
2016-09-27 11:46:54.310 INFO 11016 --- [ main] org.apache.hadoop.mapreduce.Job : map 100% reduce 0%
2016-09-27 11:47:00.384 INFO 11016 --- [ main] org.apache.hadoop.mapreduce.Job : map 100% reduce 100%
2016-09-27 11:47:01.422 INFO 11016 --- [ main] org.apache.hadoop.mapreduce.Job : Job job_1474422695520_0050 completed successfully
2016-09-27 11:47:01.635 INFO 11016 --- [ main] org.apache.hadoop.mapreduce.Job : Counters: 49
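If you want to pick the application id or the map/reduce progress out of log lines like these, a small regex-based sketch could look as follows. The class `JobLogSketch` and its methods are hypothetical helpers for illustration; they only assume the line format shown in the log above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts a few fields from MapReduce client log lines.
class JobLogSketch {

    private static final Pattern PROGRESS =
            Pattern.compile("map (\\d+)% reduce (\\d+)%");
    private static final Pattern APP_ID =
            Pattern.compile("application_\\d+_\\d+");

    // Returns {mapPercent, reducePercent}, or null if the line has no progress info.
    static int[] progress(String line) {
        Matcher m = PROGRESS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    // Returns the first YARN application id found in the line, or null if there is none.
    static String applicationId(String line) {
        Matcher m = APP_ID.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String submitted = "o.a.h.y.client.api.impl.YarnClientImpl : "
                + "Submitted application application_1474422695520_0050";
        String running = "org.apache.hadoop.mapreduce.Job : map 100% reduce 0%";

        System.out.println(applicationId(submitted));
        int[] p = progress(running);
        System.out.println("map=" + p[0] + " reduce=" + p[1]);
    }
}
```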

0 0