CDH3 Install Guide


1       Install Hadoop

1.1     Add user hadoop

[root@gd02 ~]# adduser hadoop

Use vim to edit /etc/group so that the hadoop user belongs to the mapred and hdfs groups,

and the mapred and hdfs users belong to the hadoop group.

hadoop:x:105:mapred,hdfs

hdfs:x:106:hadoop

mapred:x:107:hadoop
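After editing the file by hand it is easy to miss a member. The following is a minimal sketch of a check, assuming the standard /etc/group layout (name:password:gid:members); the helper name group_has_member is hypothetical and not part of CDH3.

```shell
#!/bin/bash
# Hypothetical helper: succeed if USER is listed as a member of GROUP
# in an /etc/group-style file (fields: name:password:gid:member,member,...).
group_has_member() {
    local group_file="$1" group="$2" user="$3"
    awk -F: -v g="$group" -v u="$user" '
        $1 == g {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++) if (members[i] == u) found = 1
        }
        END { exit !found }
    ' "$group_file"
}

# Example (run on the node after editing /etc/group):
#   group_has_member /etc/group hadoop mapred && echo "mapred is in hadoop"
#   group_has_member /etc/group hdfs hadoop   && echo "hadoop is in hdfs"
```

The check only looks at the supplementary member list in the fourth field; it does not consider a user's primary group from /etc/passwd.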

1.2     Change ownership of the related directories to hadoop

chown -R hadoop:hadoop /usr/lib/hadoop-0.20/

chown -R hadoop:hadoop /usr/lib/hadoop-0.20/pids/

chown -R hadoop:hadoop /usr/lib/hadoop-0.20/logs/

chown -R hadoop:hadoop /usr/lib/hadoop-0.20/logs/*

1.3     Format HDFS

sudo -u hadoop hadoop namenode -format
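Formatting wipes the existing HDFS metadata, so a guard can help when the command is scripted. This is a sketch only: it assumes a formatted NameNode directory contains current/VERSION, and the directory you pass in must be whatever dfs.name.dir is set to on your cluster (not stated in this guide). The helper name namenode_needs_format is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if NAME_DIR does not yet look like a
# formatted NameNode directory (no current/VERSION file inside it).
namenode_needs_format() {
    local name_dir="$1"
    [ ! -f "$name_dir/current/VERSION" ]
}

# Example (replace the path with your own dfs.name.dir value):
#   if namenode_needs_format /path/to/dfs/name; then
#       sudo -u hadoop hadoop namenode -format
#   else
#       echo "NameNode already formatted, skipping" >&2
#   fi
```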

1.4     Automated scripts

1.4.1 Init.sh

#!/bin/bash

chown -R hadoop:hadoop /usr/lib/hadoop-0.20/

chown -R hadoop:hadoop /usr/lib/hadoop-0.20/pids/

chown -R hadoop:hadoop /usr/lib/hadoop-0.20/logs/

cd /usr/lib/hadoop-0.20/logs/

chown -R hadoop:hadoop *

1.4.2 Start-all.sh

#!/bin/sh

/etc/init.d/hadoop-0.20-namenode start

/etc/init.d/hadoop-0.20-secondarynamenode start

/etc/init.d/hadoop-0.20-jobtracker start

/etc/init.d/hadoop-zookeeper start

/etc/init.d/hadoop-hbase-master start

1.4.3 Stop-all.sh

#!/bin/sh

/etc/init.d/hadoop-zookeeper stop

/etc/init.d/hadoop-0.20-secondarynamenode stop

/etc/init.d/hadoop-0.20-jobtracker stop

/etc/init.d/hadoop-0.20-namenode stop
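The two scripts can also be folded into one, driven by a single service list. The sketch below is one possible way to do that, not the original author's scripts: it stops services in the exact reverse of the start order and includes hadoop-hbase-master, which differs from Stop-all.sh above. The INIT_DIR variable and the function names are assumptions added for illustration.

```shell
#!/bin/bash
# Services in start order, taken from Start-all.sh.
SERVICES="hadoop-0.20-namenode hadoop-0.20-secondarynamenode hadoop-0.20-jobtracker hadoop-zookeeper hadoop-hbase-master"

# Assumption: INIT_DIR can be overridden to try the script without
# touching the real init scripts.
INIT_DIR="${INIT_DIR:-/etc/init.d}"

# Print the given words in reverse order, separated by spaces.
reverse_words() {
    local out="" w
    for w in "$@"; do
        out="$w${out:+ $out}"
    done
    echo "$out"
}

# run_all start|stop : call every init script with the given action.
# "stop" walks the list in reverse so dependents go down first.
run_all() {
    local action="$1" list="$SERVICES" svc
    if [ "$action" = "stop" ]; then
        list=$(reverse_words $SERVICES)
    fi
    for svc in $list; do
        if [ -x "$INIT_DIR/$svc" ]; then
            "$INIT_DIR/$svc" "$action" || echo "warning: $svc $action failed" >&2
        else
            echo "skip: $INIT_DIR/$svc not found" >&2
        fi
    done
}

# Example:
#   run_all start
#   run_all stop
```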

2       Install Hbase

2.1     Change ownership of the related directories to hadoop

Change HBase ownership

chown -R hadoop:hadoop /usr/lib/hbase/

chown -R hadoop:hadoop /usr/lib/hbase/logs/

Change ZooKeeper ownership

chown -R hadoop:hadoop /local/zookeeper/

2.2     Automated scripts

3       Sqoop: Import Mysql to Hbase

#!/bin/bash

 

MySQL_Server="10.10.97.116"

MySQL_Port="3306"

DataBase="rsearch"

 

sqoop import --connect "jdbc:mysql://${MySQL_Server}:${MySQL_Port}/${DataBase}" --table institute --hbase-table institute --column-family institute --hbase-row-key domain --hbase-create-table --username 'root' -P
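The import above is written for a single table. A minimal sketch of a dry-run helper for other tables follows; it only prints the command so it can be reviewed before running. It reuses the same sqoop options as above and assumes, as in the original command, that the HBase table and column family share the MySQL table's name. The function name sqoop_import_cmd is hypothetical.

```shell
#!/bin/bash
MySQL_Server="10.10.97.116"
MySQL_Port="3306"
DataBase="rsearch"

# Hypothetical helper: print (do not run) the sqoop import command for
# one MySQL table, using the table name for the HBase table and family.
sqoop_import_cmd() {
    local table="$1" row_key="$2"
    echo "sqoop import" \
        "--connect jdbc:mysql://${MySQL_Server}:${MySQL_Port}/${DataBase}" \
        "--table ${table} --hbase-table ${table} --column-family ${table}" \
        "--hbase-row-key ${row_key} --hbase-create-table --username root -P"
}

# Example:
#   sqoop_import_cmd institute domain
```

Because of -P, sqoop prompts for the password when the printed command is actually run.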

4       Q&A

4.1     Synctime error

2011-06-21 08:41:10,470 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Master rejected startup because clock is out of sync
org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server gd03,60020,1308616870092 has been rejected; Reported time is too far out of sync with master. Time difference of 50375801ms > max allowed of 30000ms

Fix: synchronize the clocks of all nodes in the cluster.
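To see the same comparison the master makes, the sketch below checks whether two timestamps differ by more than the allowed skew. The 30000 ms default comes from the log message above; the function name clock_out_of_sync is hypothetical, and how you collect each node's time (for example over ssh) is left as an assumption.

```shell
#!/bin/bash
# Hypothetical helper: succeed if two epoch timestamps (in milliseconds)
# differ by more than MAX_MS (default 30000, the limit shown in the log).
clock_out_of_sync() {
    local a="$1" b="$2" max_ms="${3:-30000}"
    local diff=$(( a - b ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    [ "$diff" -gt "$max_ms" ]
}

# Example (gd03 is the region server named in the log; date +%s gives
# seconds, so multiply by 1000 to compare in milliseconds):
#   master_ms=$(( $(date +%s) * 1000 ))
#   node_ms=$(( $(ssh gd03 date +%s) * 1000 ))
#   clock_out_of_sync "$master_ms" "$node_ms" && echo "gd03 is out of sync"
```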

4.2     Permission error

2011-06-21 23:14:16,338 WARN org.apache.hadoop.mapred.JobTracker: Failed to operate on mapred.system.dir (hdfs://gd02:9000/mapred/system) because of permissions.

Fix: delete the mapred.system.dir directory on the datanode.

rm -rf /local/dfs

