hadoop2.6.1+spark1.5.1

Source: Internet. Editor: 程序博客网. Time: 2024/05/22 09:45

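1. When deploying a Hadoop cluster, plan the cluster's IP addresses first and bind each one to a hostname in /etc/hosts.

This makes it easy to change an IP address later and to tell the machines apart.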
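As a minimal sketch of such a mapping, the snippet below writes three entries to a scratch file instead of touching /etc/hosts directly. The addresses and the hostnames master, slave1 and slave2 are placeholders chosen for illustration, not values from these notes.

```shell
# Hypothetical addresses and hostnames; replace them with your own plan.
# The entries go to a scratch file first so they can be reviewed before
# the same lines are added to /etc/hosts on every node.
hosts_fragment="${TMPDIR:-/tmp}/cluster-hosts.txt"

cat > "$hosts_fragment" <<'EOF'
192.168.1.100 master
192.168.1.101 slave1
192.168.1.102 slave2
EOF

# Look up the planned address for one hostname.
awk '$2 == "master" { print $1 }' "$hosts_fragment"
```

Once the list looks right, the same lines go into /etc/hosts on every machine, so that each node resolves the others by name.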

2. Compiling hadoop-eclipse-plugin.

Source on GitHub: https://github.com/winghc/hadoop2x-eclipse-plugin

Notes:

(1) ant must be installed;

(2) the Hadoop version number in hadoop2x-eclipse-plugin-master/ivy/libraries.properties must be changed to match your Hadoop version;

(3) the build command is: ant jar -Dversion=x.x.x -Dhadoop.version=x.x.x -Declipse.home=/path/to/eclipse -Dhadoop.home=/path/to/hadoop

Copy the compiled jar file into Eclipse's plugins folder.

After restarting Eclipse, if the Map/Reduce Locations module fails to load, the JDK that Eclipse is configured to use may be too old.

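3. Setting up a Map/Reduce location.

Host is the IP address of the Hadoop cluster's master, and the matching port is the one configured in core-site.xml.

Username must be the name of the Linux user that runs Hadoop on the cluster.

In addition, to be able to delete files in DFS from Eclipse on Windows, change the Windows account (under Users and Groups) so that its name matches the Linux user that runs Hadoop.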
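To show where the Host and Port values come from, here is a small sketch that splits a filesystem URI into its two parts. It assumes the relevant core-site.xml property is fs.defaultFS, and the URI itself is a made-up example; on a real node the value could be read with hdfs getconf -confKey fs.defaultFS.

```shell
# Hypothetical value; on a cluster node it would come from
#   hdfs getconf -confKey fs.defaultFS
default_fs="hdfs://192.168.1.100:9000"

authority="${default_fs#hdfs://}"   # strip the scheme, leaving host:port
host="${authority%%:*}"             # everything before the colon
port="${authority##*:}"             # everything after the colon

echo "Host: $host"
echo "Port: $port"
```

These two values are what the location dialog expects for the DFS side of the connection.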

4. firewalld configuration on CentOS.

For reasons I do not understand, running firewall-cmd --zone=public --remove-forward-port reports success,

but after either firewall-cmd --reload or systemctl restart firewalld.service

the forwarding rule is still there.
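One likely explanation, which these notes do not confirm: without --permanent, firewall-cmd changes only the runtime configuration, and both --reload and a service restart rebuild the runtime from the permanent configuration, which brings the rule back. The sketch below removes the rule from the permanent configuration instead. The rule string is a made-up example and has to match the existing forward rule exactly; by default the commands are only printed, and DRY_RUN=0 is needed to execute them.

```shell
# With DRY_RUN unset or set to 1 the commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remove_forward_port() {
    rule="$1"
    # Remove the rule from the permanent configuration ...
    run firewall-cmd --permanent --zone=public --remove-forward-port="$rule"
    # ... then reload so the runtime is rebuilt from the permanent configuration.
    run firewall-cmd --reload
    # List the remaining forward rules to confirm the result.
    run firewall-cmd --zone=public --list-forward-ports
}

# Hypothetical rule, for illustration only.
remove_forward_port "port=8080:proto=tcp:toport=80"
```

If the rule had been added with --permanent in the first place, this would explain why a runtime-only removal did not survive the reload.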

5. When running the Hadoop and Spark clusters, make sure the ports they use are open.
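As a rough sketch of opening those ports with firewalld, the function below prints one firewall-cmd command per port so the list can be reviewed first. The port numbers are assumptions: they are the stock defaults as far as I know (8020 NameNode RPC, 50070 NameNode web UI, 8088 ResourceManager web UI, 7077 Spark master, 8080 Spark master web UI), and a cluster whose configuration overrides them needs a different list.

```shell
# Assumed default ports; adjust to match your own configuration files.
PORTS="8020 50070 8088 7077 8080"

open_port_commands() {
    for port in $PORTS; do
        # --permanent writes the rule to the stored configuration.
        echo "firewall-cmd --permanent --zone=public --add-port=${port}/tcp"
    done
    # A reload is needed before permanent rules take effect at runtime.
    echo "firewall-cmd --reload"
}

open_port_commands
```

The printed commands can then be run as root on each node once the list matches the ports the cluster actually uses.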

0 0