Ganglia meeting Hadoop

1. Introduction

Ganglia is a monitoring system for grids and clusters. Depending on the version, it can be integrated with Hadoop. Ganglia consists of the following components:
  • gmond
Ganglia Monitoring Daemon (gmond) runs on each node in the cluster and collects statistics from the node it runs on as well as from other nodes in the cluster. Normally it is a multicast system where each gmond node receives data from its peers. However, since Amazon EC2 does not support multicast at this time, you must set up the Ganglia Monitoring Daemons in unicast mode, where each node in a cluster is configured to send its data to one pre-designated node.
  • gmetad
Ganglia Meta Daemon (gmetad) runs once per grid and collects data from one Ganglia Monitoring Daemon in each cluster. It stores the data it collects on the file system. We'll only be configuring one grid and therefore one gmetad. We'll be running gmetad on the same node that the PHP Web Front End is installed on.
  • Web Front End
A PHP application reads the data and provides a UI to visualize it over time as graphs. It requires the RRDtool library.


2. Cluster Configuration
  • Ubuntu 9.04
  • Ganglia Monitoring Core 3.0.7
  • Hadoop 0.20.2
Note: pay attention to the software versions. Out of the box, Ganglia 3.1.x is incompatible with Hadoop.


3. Installing gmond and gmetad

Installing dependencies:
$ sudo apt-get install build-essential librrd2-dev libapr1-dev libconfuse-dev libexpat1-dev python-dev

Creating a user called 'ganglia' and extracting the package:
$ sudo adduser --disabled-login --no-create-home ganglia
$ sudo tar -xzvf ganglia-3.0.7.tar.gz -C /opt

Changing owner to 'ganglia':
$ sudo chown -R ganglia:ganglia /opt/ganglia-3.0.7

Installing gmond:
(Assume the installation directory is /opt/ganglia-3.0.7 and the configuration directory is /etc)
$ cd /opt/ganglia-3.0.7
$ sudo ./configure --with-gmetad
$ sudo make && sudo make install

Installing Web front-end:
$ sudo apt-get install rrdtool
$ sudo apt-get install apache2 php5-mysql libapache2-mod-php5
$ sudo cp -r /opt/ganglia-3.0.7/web /var/www && sudo mv /var/www/web /var/www/ganglia

3.1 Running gmond

Generate a configuration file:
# gmond --default_config > /etc/gmond.conf

Edit /etc/gmond.conf, changing the following lines:
globals {
  user = ganglia
}
cluster {
  name = "<cluster_name>"
  owner = "<owner_name>"
  latlong = "unspecified"
  url = "unspecified"
}

(Disable multicast and define the host where nodes in the cluster send data)

udp_send_channel {
  #mcast_join = 239.2.11.71
  host = <hostname>
  port = 8649
  ttl = 1
}
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
}
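
The unicast edit above can be checked mechanically. The following is a minimal sketch, not part of Ganglia: the function name is_unicast_config is made up for this illustration, and it only assumes that an active multicast setting appears as an uncommented mcast_join line, as in the configuration shown above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Ganglia).
# Reads a gmond configuration on stdin and succeeds (exit status 0) only
# when no uncommented "mcast_join = ..." line is left, i.e. multicast has
# been disabled in both udp_send_channel and udp_recv_channel.
is_unicast_config() {
    ! grep -Eq '^[[:space:]]*mcast_join[[:space:]]*='
}
```

For example, running is_unicast_config < /etc/gmond.conf && echo "multicast disabled" should print the message once both mcast_join lines are commented out.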

Run gmond with sudo:
$ sudo gmond

Check the daemon with ps:
$ ps aux | grep gmond
nobody   24069 3.1 0.7  4304  1872 ? Ss    15:45   0:00 gmond
rhodesmi 24071 0.0 0.2  3004   756 pts/0 R+   15:45   0:00 grep gmond

Connect to the gmond port with telnet to check that everything is working:
$ telnet localhost 8649

Note: If XML lines appear in your terminal, everything is working fine.
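
The same check can be scripted instead of read by eye. The sketch below is an illustration only: count_ganglia_hosts is a made-up name, and it assumes that gmond reports each node as a line containing a <HOST ...> element in its XML dump.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Ganglia).
# Reads a gmond XML dump on stdin and prints the number of lines that
# contain an opening <HOST ...> element, i.e. the number of nodes reported.
# grep exits non-zero when nothing matches, so "|| true" keeps the function
# successful while still printing 0.
count_ganglia_hosts() {
    grep -c '<HOST ' || true
}
```

Assuming a netcat binary named nc is installed, nc localhost 8649 | count_ganglia_hosts prints the number of nodes the local gmond currently knows about. On the collector node in unicast mode this should approach the size of the cluster.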


3.2 Running gmetad

Like gmond, gmetad has a configuration file. Copy it to the configuration directory:
$ sudo cp gmetad/gmetad.conf /etc/

Afterwards, insert the following lines:
setuid_username "ganglia"
data_source "<master>" <hostname>
gridname "<cluster_name>"

Now, create a directory to store the RRD files:
$ sudo mkdir -p /var/lib/ganglia/rrds/
$ sudo chown -R ganglia:ganglia /var/lib/ganglia/rrds/

Run gmetad in debug mode to check that everything is working:
$ sudo gmetad -d 1

Open the front end to test the services:
http://<hostname>/ganglia/
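
If the page stays empty, it can help to confirm that gmetad is actually writing data. This is a minimal sketch under the assumption that gmetad stores one .rrd file per metric somewhere below the directory created earlier; count_rrd_files is a made-up name for this illustration.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Ganglia).
# Prints the number of .rrd files below the given directory.
# $1: RRD root directory (defaults to the directory created above)
count_rrd_files() {
    root=${1:-/var/lib/ganglia/rrds}
    # tr strips the padding some wc implementations add around the count
    find "$root" -type f -name '*.rrd' | wc -l | tr -d ' '
}
```

A result of 0 from count_rrd_files after gmetad has been running for a while suggests checking the data_source line and the ownership of the RRD directory.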

If everything is working fine, kill the gmetad process running in debug mode and start it as a daemon:
$ sudo gmetad

4. Configuring Hadoop


Insert the Ganglia context in $HADOOP_HOME/conf/hadoop-metrics.properties as:

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=<hostname>:8649
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.servers=<hostname>:8649
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.servers=<hostname>:8649
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rpc.period=10
rpc.servers=<hostname>:8649
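
The four contexts use the same three settings, so the entries can be generated rather than typed by hand. The following is a sketch only: ganglia_metrics_properties is a made-up function name, and the host passed to it is a placeholder for the node running gmond.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Ganglia).
# Prints the GangliaContext settings for the dfs, mapred, jvm and rpc
# contexts, matching the entries listed above.
# $1: host running gmond (placeholder), $2: gmond port (defaults to 8649)
ganglia_metrics_properties() {
    host=$1
    port=${2:-8649}
    for ctx in dfs mapred jvm rpc; do
        printf '%s.class=org.apache.hadoop.metrics.ganglia.GangliaContext\n' "$ctx"
        printf '%s.period=10\n' "$ctx"
        printf '%s.servers=%s:%s\n' "$ctx" "$host" "$port"
    done
}
```

The output can be reviewed on the terminal first and then merged into hadoop-metrics.properties by hand, replacing any existing entries for the same contexts.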


Note: restart Hadoop, then restart the gmond and gmetad daemons.


