Upgrading a Hadoop program from 1.x to 2.x

Source: Internet. Time: 2024/05/16 13:46

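Our platform environment recently had to be upgraded. Our program is built on Mahout 0.9 (we modified the Mahout source directly), and the compute platform, which used to run Hadoop 1.2.1, is being upgraded to 2.6.0. As a result, the Mahout program cannot run on the new platform as-is.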

1. First, change the Hadoop jar dependencies in the Mahout parent Maven project

In the pom.xml of the Mahout parent Maven project, change <hadoop.version>1.2.1</hadoop.version> to <hadoop.version>2.6.0</hadoop.version>.
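
For orientation, a version property like this normally sits in the <properties> section of the parent pom.xml. The fragment below is a minimal sketch that assumes the standard Maven layout; the surrounding element is illustrative and not copied from the Mahout pom.

<properties>
   <!-- previously 1.2.1 -->
   <hadoop.version>2.6.0</hadoop.version>
</properties>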

Remove the hadoop-core dependency and replace it with hadoop-hdfs:

<dependency>
   <groupId>org.apache.hadoop</groupId>
   <artifactId>hadoop-hdfs</artifactId>
   <version>${hadoop.version}</version>
</dependency>

A complete set of dependencies for a Hadoop program should include: hadoop-core (1.x) or hadoop-hdfs (2.x), hadoop-common, hadoop-mapreduce-client-core, and hadoop-mapreduce-client-common.
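
For the 2.x case, the list above could be declared roughly as follows. This is a sketch, not the actual Mahout pom: it assumes the versions are managed in the parent's <dependencyManagement> section, so that child modules can omit <version>, and that all four artifacts share ${hadoop.version}.

<dependencyManagement>
   <dependencies>
      <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-hdfs</artifactId>
         <version>${hadoop.version}</version>
      </dependency>
      <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-common</artifactId>
         <version>${hadoop.version}</version>
      </dependency>
      <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-mapreduce-client-core</artifactId>
         <version>${hadoop.version}</version>
      </dependency>
      <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-mapreduce-client-common</artifactId>
         <version>${hadoop.version}</version>
      </dependency>
   </dependencies>
</dependencyManagement>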

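After the replacement the build reports an error, with a message saying that a JDK artifact is missing. Adding the JDK tools dependency resolves it: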

<dependency>  
   <groupId>jdk.tools</groupId>  
   <artifactId>jdk.tools</artifactId>  
   <version>1.6</version>  
   <scope>system</scope>  
   <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>  
</dependency>  

2. Modify the mahout-core subproject: in its pom.xml, change

<profiles>
   <profile>
      <id>hadoop-0.20</id>
      <activation>
         <property>
            <name>!hadoop.version</name>
         </property>
      </activation>
      <dependencies>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
         </dependency>
      </dependencies>
   </profile>
   <profile>
      <id>hadoop-0.23</id>
      <activation>
         <property>
            <name>hadoop.version</name>
         </property>
      </activation>
      <dependencies>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
         </dependency>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-common</artifactId>
         </dependency>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
         </dependency>
      </dependencies>
   </profile>
</profiles>

to

<profiles>
   <profile>
      <id>hadoop-0.20</id>
      <activation>
         <property>
            <name>!hadoop.version</name>
         </property>
      </activation>
      <dependencies>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
         </dependency>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
         </dependency>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
         </dependency>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-common</artifactId>
         </dependency>
      </dependencies>
   </profile>
   <profile>
      <id>hadoop-0.23</id>
      <activation>
         <property>
            <name>hadoop.version</name>
         </property>
      </activation>
      <dependencies>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
         </dependency>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-common</artifactId>
         </dependency>
         <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
         </dependency>
      </dependencies>
   </profile>
</profiles>

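The only goal here is to swap the dependency jars. Why change the hadoop-0.20 profile rather than create a new one, which would also work? Mainly because it is the profile the Mahout project uses by default: its activation condition, !hadoop.version, applies whenever no hadoop.version property is passed to the build. The figure below shows this.

[Figure: the profile the Mahout project activates by default]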

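So the cleaner approach is to create your own profile and add the corresponding dependencies to it; to save effort, I simply modified the hadoop-0.20 one. The profile can also be specified explicitly when compiling the Mahout source. With these changes in place, the tests finally passed.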
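
A dedicated profile of that kind might look like the sketch below, placed inside the existing <profiles> element of the mahout-core pom.xml. The id hadoop-2 is a made-up name for illustration, not something Mahout defines, and the fragment assumes the dependency versions are managed in the parent pom.

<profile>
   <!-- hypothetical profile id, chosen for this example -->
   <id>hadoop-2</id>
   <dependencies>
      <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-hdfs</artifactId>
      </dependency>
      <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-common</artifactId>
      </dependency>
      <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-mapreduce-client-core</artifactId>
      </dependency>
      <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-mapreduce-client-common</artifactId>
      </dependency>
   </dependencies>
</profile>

It could then be selected at build time with something like mvn clean install -P hadoop-2,-hadoop-0.20, where the leading minus deactivates the hadoop-0.20 profile so that its dependencies are not pulled in alongside the new ones.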
