Zookeeper + Hadoop + HBase Cluster Installation

download Hadoop from http://archive.cloudera.com/cdh4/cdh/4/

download HBase from http://archive.cloudera.com/cdh4/cdh/4/

====================================== Hadoop cluster configuration:

-------------------------------------------Master Server------------------------------------------

tar -zxvf *.tar.gz

configure /etc/hosts with each node's IP address and hostname
configure /etc/profile with $HADOOP_HOME, $HBASE_HOME, and $PATH (the full export block is at the end of this post)

--------------------------------------------- Configure passwordless SSH between the cluster servers -----------------------------------------------

Note: this step is important. Without it, every server prompts for a login password when the cluster starts, which is inconvenient and can cause other problems!


install OpenSSH
ssh-keygen -t rsa -P ''        # press Enter at each prompt; generates ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

--------------------------------------------Slave1 Server------------------------------------------
log in to the slave server and run (copy the master's public key under a separate name so the slave's own key pair is not overwritten):
scp root@master_server:/root/.ssh/id_rsa.pub /root/.ssh/master_id_rsa.pub
cat ~/.ssh/master_id_rsa.pub >> ~/.ssh/authorized_keys

--------------------------------------------Slave2 Server------------------------------------------
log in to the slave server and run (again under a separate file name):
scp root@master_server:/root/.ssh/id_rsa.pub /root/.ssh/master_id_rsa.pub
cat ~/.ssh/master_id_rsa.pub >> ~/.ssh/authorized_keys

then switch back to the master server and verify that each login no longer prompts for a password:
ssh master_server_ip
ssh slave1_server_ip
ssh slave2_server_ip
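
If any of these still prompts for a password, sshd usually refuses keys whose files are too permissive (a general OpenSSH requirement, not specific to this setup); run on every node:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys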

-------------------------------------------- Configure /etc/hosts ---------------------------------
127.0.0.1 localhost.localdomain localhost

ip4_address    hostname        (one entry per cluster node)

(It is best to set ip4_address to an externally reachable IP address up front; this comes up again in the HBase cluster configuration below.)
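
For example, for the three-node cluster used in this post (the 10.0.0.x addresses are placeholders; substitute each node's real, reachable IP):

127.0.0.1  localhost.localdomain localhost
10.0.0.1   master
10.0.0.2   slave1
10.0.0.3   slave2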


Finally, install the JDK, configure hadoop-env.sh, core-site.xml, and hdfs-site.xml, format the namenode, and run start-dfs.sh.

-------------------------------------------- hadoop-env.sh -----------------------------------------
specify the JDK install location, e.g.:
export JAVA_HOME=/apps/svr/jdk1.7.0_51/

-------------------------------------------- core-site.xml -----------------------------------------
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/vclound/developer/hadoop/tmp/</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <!-- NameNode URI only; the /hbase path belongs in hbase.rootdir, not here -->
    <value>hdfs://master_ip:9000</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/vclound/developer/hadoop/namedir/</value>
  </property>
</configuration>

-------------------------------------------- hdfs-site.xml -----------------------------------------
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/vclound/developer/hadoop/datadir/</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

--------------------------------------------   Other Jobs  -------------------------------------------
$> vi masters
enter the master's IP address or hostname (the ip/hostname entry must already exist in /etc/hosts)

$> vi slaves
enter each datanode's IP address or hostname (again, the entries must exist in /etc/hosts); see the example below
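
For the three-node layout used in this post, the two files could look like this (hostnames assume matching /etc/hosts entries):

masters:
master

slaves:
slave1
slave2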

Finally, reboot!

cd master_dir/bin
./hadoop namenode -format
../sbin/start-dfs.sh
http://master_ip:50070/
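
A quick way to confirm the daemons actually came up is jps, which lists the running Java processes (names vary by node role):

jps        # expect NameNode and SecondaryNameNode on the master, DataNode on the slaves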


If you also start the MapReduce daemons, visit http://master_ip:50030/

Important: when visiting these pages, especially after earlier failures, be sure to clear the browser cache and reopen the browser before retrying, otherwise the pages will not load!!!! (this wasted an entire afternoon)

You can also test with the following commands:

./hadoop dfsadmin -report        # shows the number of available nodes

Also make sure Hadoop is running with safe mode off; check with:

./hadoop dfsadmin -safemode get

which should return:

Safe mode is OFF

If it is ON, turn it off with the command below, otherwise other problems will follow:

./hadoop dfsadmin -safemode leave

====================================== zookeeper cluster configuration:

In zoo.cfg, set dataDir and clientPort, and list each server's election IP and ports in the following form:

server.sid1 = ip:port1:port2

server.sid2 = ip:port1:port2

server.sid3 = ip:port1:port2

In each server's zookeeper dataDir, create a myid file whose content is just that server's numeric sid.

Start the servers one by one: ./zkServer.sh start-foreground
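
Putting the above together, a minimal zoo.cfg could look like this (tickTime/initLimit/syncLimit are common example values, and 2888/3888 are the conventional quorum/election ports; adjust as needed):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/vclound/developer/zookeeper/data
clientPort=2181
server.1=master_ip:2888:3888
server.2=slave1_ip:2888:3888
server.3=slave2_ip:2888:3888

And on the server with sid 1 (likewise 2 and 3 on the other two):

echo 1 > /home/vclound/developer/zookeeper/data/myid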

====================================== HBase cluster configuration (hbase-site.xml):

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master_ip:9000/hbase</value>
    <!--<value>file:///home/vclound/developer/hbase/rootdir/</value>-->
  </property>
  <!-- dfs.datanode.max.xcievers only takes effect in hdfs-site.xml on the datanodes -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master_ip,slave1_ip,slave2_ip</value>
  </property>
  <property>
    <name>hbase.master</name>
    <!-- host:port, not an hdfs:// URL -->
    <value>master:60000</value>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>60000</value>
  </property>
  <property>
    <name>hbase.master.info.bindAddress</name>
    <value>master</value>
  </property>
  <property>
    <name>hbase.master.info.port</name>
    <value>60010</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/vclound/developer/zookeeper/data</value>
  </property>
</configuration>
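
With HDFS and zookeeper already running, start HBase from the master and run a quick smoke test in the shell (the 'test' table and 'cf' column family are throwaway names):

cd hbase_dir/bin
./start-hbase.sh
./hbase shell
hbase> status              # should report the live region servers
hbase> create 'test', 'cf'
hbase> list                # 'test' should appear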


================================ Special note ===============================

If the machines are internal company cloud servers or public-facing servers, be sure to change the following content in /etc/hosts:

127.0.0.1 localhost.localdomain localhost

(remove any extra "127.0.0.1 hostname" line that maps the machine's own hostname to loopback)

master_ip    master_hostname

slave1_ip    slave1_hostname

slave2_ip    slave2_hostname

slave3_ip    slave3_hostname

Be sure to change these to externally reachable IP addresses; otherwise the HBase slaves cannot connect to the master, running ./hbase shell on a slave throws MasterNotRunningException, and the logs fill with assorted baffling errors that are hard to pin down!!!

On Ubuntu, add the global environment variables to /etc/profile:

export HADOOP_HOME=/home/vcloud/developer/hadoop/hadoop-2.7.1/
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
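
After editing /etc/profile, reload it and sanity-check (hadoop version only resolves once $HADOOP_HOME/bin is on $PATH):

source /etc/profile
echo $HADOOP_CONF_DIR
hadoop version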

