CDH3 Installation
Cloudera is a commercial vendor built around open-source Apache Hadoop, offering
software and services based on it. Last November it raised $40 million in venture
funding, and Oracle, Dell, and others have announced partnerships with Cloudera.
IBM, Amazon, Microsoft, and others have joined the Hadoop camp as well, each
releasing its own Hadoop-as-a-Service. Hadoop looks set to become a mainstream
cloud-computing platform. Below are the steps for installing Hadoop on RedHat 5
using CDH3 (Cloudera's Distribution including Apache Hadoop, Version 3).
1. Host names and IP addresses of the three machines:
#vi /etc/hosts
172.16.130.136 masternode
172.16.130.137 slavenode1
172.16.130.138 slavenode2
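These mappings can be appended idempotently with a small script. This is only a sketch: for safety it writes to a local file `./hosts.cluster`; point `HOSTS_FILE` at `/etc/hosts` (as root) to apply it on a real node.

```shell
#!/bin/sh
# Append each cluster host mapping only if it is not already present.
# HOSTS_FILE defaults to a local file for illustration; set it to
# /etc/hosts (as root) to apply on a real node.
HOSTS_FILE="${HOSTS_FILE:-./hosts.cluster}"
touch "$HOSTS_FILE"
while read -r ip name; do
    grep -q "[[:space:]]$name\$" "$HOSTS_FILE" || echo "$ip $name" >> "$HOSTS_FILE"
done <<'EOF'
172.16.130.136 masternode
172.16.130.137 slavenode1
172.16.130.138 slavenode2
EOF
```

Because each entry is checked before it is appended, the script can be re-run safely without duplicating lines.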
2. Configure SSH for the root user
On masternode:
#ssh-keygen -t rsa
#cat /root/.ssh/id_rsa.pub >>/root/.ssh/authorized_keys
On each slavenode:
#ssh-keygen -t rsa
Copy authorized_keys to each slavenode.
Verify passwordless login from masternode:
#ssh slavenode1
#ssh slavenode2
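The key distribution above can also be done with ssh-copy-id, which appends the key and fixes remote permissions in one step. A dry-run sketch (it only prints the commands it would run; drop the leading `echo` to execute against the live nodes):

```shell
#!/bin/sh
# Print the ssh-copy-id invocation for each slave node (dry run).
# Remove the `echo` to actually push /root/.ssh/id_rsa.pub to the slaves.
slaves="slavenode1 slavenode2"
for host in $slaves; do
    echo ssh-copy-id -i /root/.ssh/id_rsa.pub "root@$host"
done
```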
3. Download the Cloudera repository package from:
http://archive.cloudera.com/redhat/cdh/cdh3-repository-1.0-1.noarch.rpm
4. Run on every node:
#sudo yum --nogpgcheck localinstall cdh3-repository-1.0-1.noarch.rpm
5. Install the Hadoop core package on every node:
# yum search hadoop
# sudo yum install hadoop-0.20
6. Install the namenode and jobtracker packages on masternode:
#sudo yum install hadoop-0.20-namenode
#sudo yum install hadoop-0.20-jobtracker
7. Install the datanode and tasktracker packages on each slavenode:
#sudo yum install hadoop-0.20-datanode
#sudo yum install hadoop-0.20-tasktracker
8. Configure the cluster (on masternode)
#sudo cp -r /etc/hadoop-0.20/conf.empty /etc/hadoop-0.20/conf.my_cluster
Register the custom configuration with alternatives:
#sudo alternatives --install /etc/hadoop-0.20/conf hadoop-0.20-conf /etc/hadoop-0.20/conf.my_cluster 50
Activate the registered configuration:
#sudo alternatives --set hadoop-0.20-conf /etc/hadoop-0.20/conf.my_cluster
Display the current configuration:
#sudo alternatives --display hadoop-0.20-conf
Remove the configuration (only if it is no longer needed):
#sudo alternatives --remove hadoop-0.20-conf /etc/hadoop-0.20/conf.my_cluster
9. Configure /etc/hadoop-0.20/conf/core-site.xml (the NameNode port defaults to 8020):
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://masternode/</value>
</property>
</configuration>
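Since the value above carries no explicit port, clients fall back to the 8020 default. A small shell sketch of how the host and effective port fall out of such a URI:

```shell
#!/bin/sh
# Split fs.default.name into host and port, applying the 8020 default
# when the URI (as in core-site.xml above) carries no explicit port.
uri="hdfs://masternode/"
host_port="${uri#hdfs://}"            # strip the scheme
host_port="${host_port%%/*}"          # strip the trailing path
host="${host_port%%:*}"
port="${host_port#*:}"
[ "$port" = "$host_port" ] && port=8020   # no ":" found -> default port
echo "$host $port"
```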
10. Configure /etc/hadoop-0.20/conf/hdfs-site.xml:
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/data/1/dfs/nn,/data/2/dfs/nn</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
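The comma-separated directory lists above correspond exactly to the local directories created in step 13. A sketch that expands the dfs.data.dir value into per-directory setup commands (printed only; run them, or pipe the output to sh, on each datanode):

```shell
#!/bin/sh
# Expand the comma-separated dfs.data.dir value into one
# mkdir/chown command per path.
dirs="/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn"
echo "$dirs" | tr ',' '\n' | while read -r d; do
    echo "mkdir -p $d && chown -R hdfs:hadoop $d"
done
```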
11. Configure /etc/hadoop-0.20/conf/mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>masternode:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map and
reduce task.
</description>
</property>
<property>
<name>mapred.local.dir</name>
<value>/data/1/mapred/local,/data/2/mapred/local,/data/3/mapred/local</value>
</property>
</configuration>
12. Configure /etc/hadoop-0.20/conf/masters and slaves
Add masternode to the masters file.
Add slavenode1 and slavenode2 to the slaves file.
13. Create the local directories referenced by the configuration:
masternode:
#sudo mkdir -p /data/1/dfs/nn /data/2/dfs/nn
#sudo chown -R hdfs:hadoop /data/1/dfs/nn /data/2/dfs/nn
#sudo chmod 700 /data/1/dfs/nn /data/2/dfs/nn
slavenode:
#sudo mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn
#sudo mkdir -p /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local
#sudo chown -R hdfs:hadoop /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn
#sudo chown -R mapred:hadoop /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local
14. Pack up the conf.my_cluster configuration directory and distribute it to each slavenode.
15. Activate the configuration on each slavenode with the alternatives commands from step 8.
16. Format the NameNode on masternode:
#sudo -u hdfs hadoop namenode -format
17. Start the HDFS daemons:
masternode:
#sudo service hadoop-0.20-namenode start
slavenode:
#sudo service hadoop-0.20-datanode start
18. Create the HDFS directories:
#sudo -u hdfs hadoop fs -mkdir /tmp
#sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
#sudo -u hdfs hadoop fs -mkdir /mapred/system
#sudo -u hdfs hadoop fs -chown mapred:hadoop /mapred/system
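Mode 1777 is a world-writable directory with the sticky bit set: anyone can create files in /tmp, but only a file's owner can delete it. A local illustration of what the same mode looks like:

```shell
#!/bin/sh
# Show mode 1777 on a local directory: the trailing "t" in the
# permission string is the sticky bit.
d=$(mktemp -d)
chmod 1777 "$d"
perms=$(ls -ld "$d" | cut -c1-10)
echo "$perms"    # drwxrwxrwt
rmdir "$d"
```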
19. Start the MapReduce daemons:
masternode:
#sudo service hadoop-0.20-jobtracker start
slavenode:
#sudo service hadoop-0.20-tasktracker start
20. Configure the daemons to start automatically at boot:
masternode:
#sudo chkconfig hadoop-0.20-namenode on
#sudo chkconfig hadoop-0.20-jobtracker on
slavenode:
#sudo chkconfig hadoop-0.20-datanode on
#sudo chkconfig hadoop-0.20-tasktracker on