Hadoop environment setup


The cluster has been running for several months, but I never had time to write it up; here are my notes.

I. Environment preparation

I am using Linux version 2.6.32-504.1.3.el6.x86_64 with CDH 4.7.7.

1. Turn off the firewall

service iptables stop

chkconfig iptables off (disable it permanently)

service iptables status (check the status)

2. Configure the IP address and DNS

vim /etc/sysconfig/network-scripts/ifcfg-eth0

[root@hadoop6 ~]# cd /etc/sysconfig/network-scripts
[root@hadoop6 network-scripts]# ls
ifcfg-eth0  ifdown       ifdown-ippp  ifdown-post    ifdown-sit     ifup-aliases  ifup-ippp  ifup-plip   ifup-ppp     ifup-tunnel       net.hotplug
ifcfg-eth1  ifdown-bnep  ifdown-ipv6  ifdown-ppp     ifdown-tunnel  ifup-bnep     ifup-ipv6  ifup-plusb  ifup-routes  ifup-wireless     network-functions
ifcfg-lo    ifdown-eth   ifdown-isdn  ifdown-routes  ifup           ifup-eth      ifup-isdn  ifup-post   ifup-sit     init.ipv6-global  network-functions-ipv6
[root@hadoop6 network-scripts]# vi /etc/sysconfig/network-scripts/ifcfg-eth0


DEVICE=eth0
TYPE=Ethernet
UUID=b4d8a22e-f413-4f18-8d7c-5e5dced6e93d
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=192.168.17.134
NETMASK=255.255.255.0
GATEWAY=192.168.17.2


DNS1=211.99.25.1
DNS2=202.106.0.20
DNS3=202.106.46.151
DNS4=8.8.8.8
DEFROUTE=yes
PEERDNS=no
PEERROUTES=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"

service network restart (apply the changes)

ifconfig (check that the configuration took effect)

3. Change the hostname

vim /etc/sysconfig/network (permanent change)

reboot for it to take effect

hostname (check the current hostname)
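A minimal sketch of what the relevant files could look like on the first node. The hostnames hadoop1-hadoop6 are the ones used later in this guide; every IP except hadoop6's 192.168.17.134 (from the ifcfg-eth0 example above) is a placeholder you should replace with your own addresses:

# /etc/sysconfig/network (on hadoop1)
NETWORKING=yes
HOSTNAME=hadoop1

# /etc/hosts (identical on every node; addresses other than .134 are examples)
192.168.17.129 hadoop1
192.168.17.130 hadoop2
192.168.17.131 hadoop3
192.168.17.132 hadoop4
192.168.17.133 hadoop5
192.168.17.134 hadoop6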

4. Disable SELinux

vim /etc/selinux/config (permanent change)


# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
# SELINUX=enforcing
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted


You can also set it temporarily with the setenforce 0 command, but that change does not survive a reboot.
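A quick way to confirm the SELinux state (getenforce only reports Disabled after the reboot; setenforce 0 shows up as Permissive):

getenforce    # Enforcing / Permissive / Disabled
sestatus      # detailed SELinux status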

5. Disable IPv6

Add a file under /etc/modprobe.d containing the line install ipv6 /bin/true, save and exit, then reboot the machine.
vim /etc/modprobe.d/ipv6off.conf
install ipv6 /bin/true

6. Set vm.overcommit_memory and turn the swap frequency (vm.swappiness) down to zero

The kernel parameters are explained below:

overcommit_memory controls the kernel's policy for memory allocation; its value can be 0, 1, or 2.

0: the kernel checks whether enough free memory is available; if so, the allocation succeeds, otherwise it fails and an error is returned to the application.

1: the kernel allows all allocations regardless of the current memory state.

2: the kernel uses strict accounting and refuses allocations that would exceed swap plus a configurable share of physical memory (vm.overcommit_ratio).

vim /etc/sysctl.conf

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.


# Controls IP packet forwarding
net.ipv4.ip_forward = 0


# Controls source route verification
net.ipv4.conf.default.rp_filter = 1


# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0


# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0


# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1


# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1


# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0


# Controls the default maximum size of a message queue
kernel.msgmnb = 65536


# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536


# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736


# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296


# control the value of vm.overcommit_memory
vm.overcommit_memory = 1


# control the value of vm.swappiness
vm.swappiness = 0
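After saving sysctl.conf you can apply the settings without rebooting and verify the two values that matter here:

sysctl -p                               # reload /etc/sysctl.conf
cat /proc/sys/vm/overcommit_memory      # should print 1
cat /proc/sys/vm/swappiness             # should print 0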


7. Adjust the ulimit settings (nofile/nproc) on all machines

In the /etc/security/limits.conf file:

hdfs - nofile 32768

hbase - nofile 32768

If this configuration does not take effect, you can also add it in a file under /etc/security/limits.d/.

In my view the right approach is to edit /etc/security/limits.conf directly; it has detailed comments, for example:
* soft nofile 32768
* hard nofile 65536

service sshd restart
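The limits only apply to new login sessions, so log in again and check them (a sanity check, not part of the original steps):

ulimit -n    # open-file limit (nofile); should show the new value
ulimit -u    # max user processes (nproc)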

8. Install the JDK and set the environment variables

I used jdk-8u20-linux-x64.rpm.

rpm -ivh jdk-8u20-linux-x64.rpm

Edit the file: vim /etc/profile

fi


for i in /etc/profile.d/*.sh ; do
    if [ -r "$i" ]; then
        if [ "${-#*i}" != "$-" ]; then
            . "$i"
        else
            . "$i" >/dev/null 2>&1
        fi
    fi
done


unset i
unset -f pathmunge
# set java environment (point JAVA_HOME at the directory your JDK RPM actually installed to)
export JAVA_HOME=/usr/java/jdk1.8.0_20
export PATH=$JAVA_HOME/bin:$PATH
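After editing /etc/profile, reload it and make sure the JDK is picked up:

source /etc/profile
java -version        # should report the version of the JDK you installed
echo $JAVA_HOME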

9. Passwordless SSH login

ssh-keygen -t rsa (press Enter through every prompt; this creates two files under /root/.ssh)

Change into that directory: cd /root/.ssh

Copy the contents of the .pub file into the authorized_keys file: cp id_rsa.pub authorized_keys

Delete the .pub file: rm id_rsa.pub

Collect the contents of every node's authorized_keys into one file, then copy that file back to each machine (the first login to each host requires a yes confirmation).
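One way to automate that distribution, sketched here under the assumption that you have root access to all six hosts, that ssh-copy-id is available, and that id_rsa.pub has not been deleted yet:

# run this on each node; it appends the node's public key to every other node
for host in hadoop1 hadoop2 hadoop3 hadoop4 hadoop5 hadoop6; do
  ssh-copy-id -i /root/.ssh/id_rsa.pub root@$host   # asks for the password once per host
done
# then verify a passwordless login from any node
ssh hadoop2 hostname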

10. Synchronize the time on all machines

(Omitted here.)

II. Installing CDH

$ sudo yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm

$ sudo rpm --import http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera

This uses the online repository; if you are using your own yum repository, change the paths to point at it.

III. ZooKeeper cluster installation and configuration

1. Install ZooKeeper with yum

Because of package dependencies you do not need to install the zookeeper base package separately; it is pulled in automatically. You only need:

yum install zookeeper-server

If you install it another way, you also need: yum install zookeeper



2. Give each node a unique id: /var/lib/zookeeper/myid must contain a number from 1 to 255 (the init command below writes it for you). Initialize and start ZooKeeper on every cluster node:

service zookeeper-server init --myid=1 (use a different id on each node)

service zookeeper-server start

jps

Configure every node in the cluster:

vim /etc/zookeeper/conf/zoo.cfg

server.1=hadoop1:2888:3888

server.2=hadoop2:2888:3888

server.3=hadoop3:2888:3888

server.4=hadoop4:2888:3888

server.5=hadoop5:2888:3888

server.6=hadoop6:2888:3888

3. Verify the cluster

You can check the listening ports with netstat -an | grep 2181 (or 2888 / 3888).

/usr/lib/zookeeper/bin/zkServer.sh status

Don't panic if something fails: zkServer.sh also reports an error while the other nodes are still down. Check the logs for details.
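Another quick liveness check, assuming nc (netcat) is installed; ruok and stat are ZooKeeper's built-in four-letter commands:

echo ruok | nc hadoop1 2181   # a healthy server replies "imok"
echo stat | nc hadoop1 2181   # prints the server mode (leader/follower) and connected clients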

Finally, format the HDFS HA state in ZooKeeper (run once, on one of the NameNodes, after the HDFS configuration in section VI is in place):

hdfs zkfc -formatZK

IV. Install the required packages on each node

Node roles:

Hadoop1 - namenode (management node)
Hadoop2 - datanode (data node)
Hadoop3 - datanode (data node)
Hadoop4 - datanode (data node)
Hadoop5 - datanode (data node)
Hadoop6 - namenode (management node)

Where to install / what to install:

On the JobTracker host:
sudo yum clean all; sudo yum install hadoop-0.20-mapreduce-jobtracker

On the NameNode host:
sudo yum clean all; sudo yum install hadoop-hdfs-namenode

On the Secondary NameNode host (if used):
sudo yum clean all; sudo yum install hadoop-hdfs-secondarynamenode

On every host except the JobTracker, NameNode, and Secondary (or Standby) NameNode:
sudo yum clean all; sudo yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode

On all cluster hosts:
sudo yum clean all; sudo yum install hadoop-client



V. Deploying MRv1 to the cluster

1. Copy the Hadoop configuration

Copy the default configuration directory conf.dist to a custom directory of your own; run this on every machine in the cluster:

sudo cp -r /etc/hadoop/conf.dist /etc/hadoop/conf.hadoop

$ sudo alternatives --verbose --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.hadoop 50

$ sudo alternatives --set hadoop-conf /etc/hadoop/conf.hadoop
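To confirm which configuration directory is now active (a quick check):

alternatives --display hadoop-conf   # /etc/hadoop/conf.hadoop should be listed as the current choice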

2. Create the directories

On the NameNode nodes, create the directories:
sudo mkdir -p /data/1/dfs/nn 

sudo chown -R hdfs:hdfs /data/1/dfs/nn 

sudo chmod 700 /data/1/dfs/nn 

chown hdfs:hadoop /tmp

chmod -R 1777 /tmp

mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging

chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging

chown -R mapred /var/lib/hadoop-hdfs/cache/mapred

mkdir -p /tmp/mapred/system

chown mapred:hadoop /tmp/mapred/system 

mkdir -p /user/root

chown root /user/root

On the DataNode nodes, create the directories:

mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn

chown -R hdfs:hdfs /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn

sudo mkdir -p /data/1/dfs/jn 

sudo chown -R hdfs:hdfs /data/1/dfs/jn

mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging

chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging

chown -R mapred /var/lib/hadoop-hdfs/cache/mapred

chown hdfs:hadoop /tmp

chmod -R 1777 /tmp

mkdir -p /tmp/mapred/system

chown mapred:hadoop /tmp/mapred/system

mkdir -p /user/root

chown root /user/root

mkdir -p /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local

chown -R mapred:hadoop /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local
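Since these commands have to be repeated on every DataNode, a small loop over SSH can save typing. This is only a sketch, assuming the passwordless root SSH set up earlier and the hadoop2-hadoop5 DataNodes from the role table above:

for host in hadoop2 hadoop3 hadoop4 hadoop5; do
  ssh root@$host 'mkdir -p /data/{1,2,3,4}/dfs/dn /data/{1,2,3,4}/mapred/local &&
                  chown -R hdfs:hdfs /data/{1,2,3,4}/dfs/dn &&
                  chown -R mapred:hadoop /data/{1,2,3,4}/mapred/local'
done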

VI. Edit the configuration files

1. core-site.xml

<configuration>
  <!--namespace-->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://wise</value>
  </property>
  <!--directory of tmp-->
  <property>
    <name>hadoop.tmp.dir</name>
        <value>/data/1/tmp</value>
  </property>
  <!--set ha zk-->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181,hadoop5:2181,hadoop6:2181</value>
  </property>
  <!--set ha trash time-->
  <property>
    <name>fs.trash.interval</name>
    <value>10080</value>
  </property>


  <property>
    <name>fs.trash.checkpoint.interval</name>
    <value>10080</value>
  </property>
 <!--
  <property>
    <name>io.file.buffer.size</name>
        <value>32768</value>
        <description>default is 4kb we change it to 128kb the buffer size is used for read and write</description>
  </property>
  -->
  <property>
     <name>hadoop.proxyuser.hadoop.hosts</name>
     <value>*</value>
  </property>
  <property>
     <name>hadoop.proxyuser.hadoop.groups</name>
     <value>*</value>
  </property>


</configuration>

2. hdfs-site.xml



<configuration>


  <!--set block replication-->
  <property>
    <name>dfs.replication</name>
        <value>1</value>
  </property>
  <!--set webhdfs enabled-->
  <property>
    <name>dfs.webhdfs.enabled</name>
        <value>true</value>
  </property>
  <!--set dfs permissions-->
  <property>
    <name>dfs.permissions</name>
        <value>false</value>
  </property>
  <!--
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
  </property>
  -->
  <!--set HA-->
  <property>
    <name>dfs.nameservices</name>
    <value>wise</value>
  </property>


  <property>
    <name>dfs.ha.namenodes.wise</name>
    <value>hadoop1,hadoop6</value>
  </property>


  <property>
    <name>dfs.namenode.rpc-address.wise.hadoop1</name>
    <value>hadoop1:8020</value>
  </property>


  <property>
    <name>dfs.namenode.rpc-address.wise.hadoop6</name>
    <value>hadoop6:8020</value>
  </property>


  <property>
    <name>dfs.namenode.http-address.wise.hadoop1</name>
    <value>hadoop1:50070</value>
  </property>


  <property>
    <name>dfs.namenode.http-address.wise.hadoop6</name>
    <value>hadoop6:50070</value>
  </property>


  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop2:8485;hadoop3:8485;hadoop4:8485/wise</value>
  </property>


  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/data/1/dfs/jn</value>
  </property>


  <property>
    <name>dfs.client.failover.proxy.provider.wise</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>


  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
  </property>
  <!--
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence(/bin/true)</value>
  </property>


  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>20000</value>
  </property>


  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>


  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>8192</value>
  </property>
 <!--
  <property>
    <name>dfs.blocksize</name>
        <value>268435468</value>
  </property>
  -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/1/dfs/nn</value>
  </property>


  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/1/dfs/dn</value>
  </property>


</configuration>

3. mapred-site.xml



<configuration>


  <property>
    <name>mapred.local.dir</name>
    <value>/data/1/mapred/local</value>
  </property>


  <property>
    <name>mapreduce.jobtracker.restart.recover</name>
    <value>true</value>
  </property>


  <property>
    <name>mapred.job.tracker</name>
    <value>logicaljt</value>
    <!-- host:port string is replaced with a logical name -->
  </property>


  <property>
    <name>mapred.jobtrackers.logicaljt</name>
    <value>jt1,jt2</value>
    <description>Comma-separated list of JobTracker IDs.</description>
  </property>


  <property>
    <name>mapred.jobtracker.rpc-address.logicaljt.jt1</name>
    <!-- RPC address for jt1 -->
    <value>hadoop1:8021</value>
  </property>


  <property>
    <name>mapred.jobtracker.rpc-address.logicaljt.jt2</name>
    <!-- RPC address for jt2 -->
    <value>hadoop6:8021</value>
  </property>


  <property>
    <name>mapred.job.tracker.http.address.logicaljt.jt1</name>
    <!-- HTTP bind address for jt1 -->
    <value>hadoop1:50030</value>
  </property>


  <property>
    <name>mapred.job.tracker.http.address.logicaljt.jt2</name>
    <!-- HTTP bind address for jt2 -->
    <value>hadoop6:50030</value>
  </property>


  <property>
    <name>mapred.ha.jobtracker.rpc-address.logicaljt.jt1</name>
    <!-- RPC address for jt1 HA daemon -->
    <value>hadoop1:8023</value>
  </property>


  <property>
    <name>mapred.ha.jobtracker.rpc-address.logicaljt.jt2</name>
    <!-- RPC address for jt2 HA daemon -->
    <value>hadoop6:8023</value>
  </property>


  <property>
    <name>mapred.ha.jobtracker.http-redirect-address.logicaljt.jt1</name>
    <!-- HTTP redirect address for jt1 -->
    <value>hadoop1:50031</value>
  </property>


  <property>
    <name>mapred.ha.jobtracker.http-redirect-address.logicaljt.jt2</name>
    <!-- HTTP redirect address for jt2 -->
    <value>hadoop6:50031</value>
  </property>


  <property>
    <name>mapred.jobtracker.restart.recover</name>
    <value>true</value>
  </property>


  <property>
    <name>mapred.job.tracker.persist.jobstatus.active</name>
    <value>true</value>
  </property>


  <property>
    <name>mapred.job.tracker.persist.jobstatus.hours</name>
    <value>1</value>
  </property>


  <property>
    <name>mapred.job.tracker.persist.jobstatus.dir</name>
    <value>/data/1/jobtracker/jobsInfo</value>
  </property>


  <property>
    <name>mapred.client.failover.proxy.provider.logicaljt</name>
    <value>org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider</value>
  </property>


  <property>
    <name>mapred.client.failover.max.attempts</name>
    <value>15</value>
  </property>


  <property>
    <name>mapred.client.failover.sleep.base.millis</name>
    <value>500</value>
  </property>


  <property>
    <name>mapred.client.failover.sleep.max.millis</name>
    <value>1500</value>
  </property>


  <property>
    <name>mapred.client.failover.connection.retries</name>
    <value>0</value>
  </property>


  <property>
    <name>mapred.client.failover.connection.retries.on.timeouts</name>
    <value>0</value>
  </property>


  <property>
    <name>mapred.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
  </property>


  <property>
    <name>mapred.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>


  <property>
    <name>mapred.ha.zkfc.port</name>
    <value>8018</value>
    <!-- Pick a different port for each failover controller when running one machine-->
  </property>


</configuration>

4. vi slaves

hadoop2
hadoop3
hadoop4
hadoop5

Copy the configuration to every machine.
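A sketch of pushing the whole configuration directory to the other nodes and pointing the alternatives system at it there as well (assumes passwordless root SSH and the conf.hadoop directory created earlier):

for host in hadoop2 hadoop3 hadoop4 hadoop5 hadoop6; do
  scp -r /etc/hadoop/conf.hadoop root@$host:/etc/hadoop/
  ssh root@$host 'alternatives --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.hadoop 50 &&
                  alternatives --set hadoop-conf /etc/hadoop/conf.hadoop'
done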

VII. Installation steps

1. First install the namenode, datanode, and journalnode packages and put the configuration files in place.

sudo yum install hadoop-hdfs-namenode 

sudo yum install hadoop-hdfs-datanode

sudo yum install hadoop-hdfs-journalnode (I installed this only on hadoop2, hadoop3, and hadoop4)

sudo service hadoop-hdfs-journalnode start

sudo service hadoop-hdfs-datanode start

Format the NameNode (I did this on hadoop1):

sudo -u hdfs hdfs namenode -format

Start the NameNode on hadoop1:

sudo service hadoop-hdfs-namenode start 

Bootstrap the standby NameNode and start it (on hadoop6):

sudo -u hdfs hdfs namenode -bootstrapStandby

sudo service hadoop-hdfs-namenode start
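At this point you can check the state of the two NameNodes; hadoop1 and hadoop6 are the NameNode IDs defined in dfs.ha.namenodes.wise, and with automatic failover enabled both may still report standby until the ZKFC in step 2 below is running:

sudo -u hdfs hdfs haadmin -getServiceState hadoop1   # active or standby
sudo -u hdfs hdfs haadmin -getServiceState hadoop6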


2. Configure automatic failover
On the NameNode hosts (hadoop1 & hadoop6):
 sudo yum install hadoop-hdfs-zkfc
 sudo service hadoop-hdfs-zkfc start
You can then check the state through the web UI,
and verify failover by killing the active NameNode process with kill -9 <pid>.
3. Install JobTracker HA (on hadoop1 & hadoop6)
sudo yum install hadoop-0.20-mapreduce-jobtrackerha
sudo yum install hadoop-0.20-mapreduce-zkfc
Format the ZooKeeper state:
sudo service hadoop-0.20-mapreduce-zkfc init
or
sudo -u mapred hadoop mrzkfc -formatZK
Start the services:
sudo service hadoop-0.20-mapreduce-zkfc start
sudo service hadoop-0.20-mapreduce-jobtrackerha start

Check the status:
sudo -u mapred hadoop mrhaadmin -getServiceState <id>

sudo -u mapred hadoop mrhaadmin -transitionToActive <id>
sudo -u mapred hadoop mrhaadmin -getServiceState <id>
Verify automatic failover.
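A sketch of a failover test using the jt1/jt2 JobTracker IDs defined in mapred-site.xml; the exact daemon name shown by jps can vary, so treat that part as an assumption:

sudo -u mapred hadoop mrhaadmin -getServiceState jt1   # find which JobTracker is active
sudo -u mapred hadoop mrhaadmin -getServiceState jt2
jps                                                    # on the active node, note the JobTracker HA daemon's pid
kill -9 <pid>                                          # kill the active JobTracker
sudo -u mapred hadoop mrhaadmin -getServiceState jt2   # the standby should become active shortly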













