High Availability options with Hadoop distributions

With disaster recovery the talk of the town after the recent storm on the US east coast, let’s take a look at what options are available for High Availability with the various distributions of Hadoop.


Let’s begin by understanding the HA issue in the otherwise highly reliable Apache Hadoop. An excerpt from the Cloudera documentation:

“Before …the NameNode was a single point of failure (SPOF) in an HDFS cluster. Each cluster had a single NameNode, and if that machine or process became unavailable, the cluster as a whole would be unavailable until the NameNode was either restarted or brought up on a separate machine.

This reduced the total availability of the HDFS cluster in two major ways:
1. In the case of an unplanned event such as a machine crash, the cluster would be unavailable until an operator restarted the NameNode.
2. Planned maintenance events such as software or hardware upgrades on the NameNode machine would result in periods of cluster downtime.”


Now let us look at the various HA options available:

(1) Cloudera:

1.1 Quorum-based Storage

“Quorum-based Storage refers to the HA implementation that uses the Quorum Journal Manager (QJM).
In order for the Standby node to keep its state synchronized with the Active node in this implementation, both nodes communicate with a group of separate daemons called JournalNodes… In the event of a failover, the Standby will ensure that it has read all of the edits from the JournalNodes before promoting itself to the Active state. This ensures that the namespace state is fully synchronized before a failover occurs.”
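The quorum rule behind QJM can be illustrated with a toy sketch (this is an illustration of the majority-acknowledgement idea, not Hadoop code): an edit is durable once a majority of JournalNodes persist it, so any live majority seen at failover time is guaranteed to contain the complete edit log.

```python
# Toy sketch of majority-ack edit logging, as used by QJM.
# Class and function names here are illustrative, not Hadoop APIs.

class JournalNode:
    def __init__(self):
        self.edits = []

    def write(self, edit):
        self.edits.append(edit)
        return True  # acknowledge the write

def log_edit(journal_nodes, edit):
    """Return True once a majority of JournalNodes persisted the edit."""
    acks = sum(1 for jn in journal_nodes if jn.write(edit))
    return acks > len(journal_nodes) // 2

def recover(journal_nodes):
    """On failover, the Standby reads the longest edit log visible on
    any JournalNode; the majority-ack rule guarantees it is complete."""
    return max((jn.edits for jn in journal_nodes), key=len)

jns = [JournalNode() for _ in range(3)]
assert log_edit(jns, "mkdir /project")
assert recover(jns)[-1] == "mkdir /project"
```

The same majority arithmetic is why JournalNode ensembles are deployed in odd-sized groups: three nodes tolerate one failure, five tolerate two.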

1.2 Shared Storage Using NFS

“In order for the Standby node to keep its state synchronized with the Active node, this implementation requires that the two nodes both have access to a directory on a shared storage device (for example, an NFS mount from a NAS).
When any namespace modification is performed by the Active node, it durably logs a record of the modification to an edit log file stored in the shared directory. The Standby node constantly watches this directory for edits, and when edits occur, the Standby node applies them to its own namespace. In the event of a failover, the Standby will ensure that it has read all of the edits from the shared storage before promoting itself to the Active state. This ensures that the namespace state is fully synchronized before a failover occurs.”


More details are available in the Cloudera High Availability Guide.

(2) HortonWorks

2.1 HortonWorks and VMware HA

“… jointly developed Hortonworks Data Platform High Availability (HA) Kit for VMware vSphere customers that enables full stack high availability for Hadoop 1.0 by eliminating the NameNode and JobTracker single points of failure. It is a flexible virtual machine-based high availability solution that integrates with the VMware vSphere™ platform’s HA functionality to monitor and automate failover for NameNode and JobTracker master services running within the Hortonworks Data Platform (HDP).”

2.2 Linux HA

An excerpt from the FAQ gives the gist of the Linux HA approach:
“If this was so easy to do with Linux HA or other tools, why didn’t the HDFS community do this earlier?
This is partly because the original HDFS team focused on very large clusters where cold failover was not practical. We assumed that Hadoop needed to provide its own built-in solution. As we’ve developed this technology, we’ve heard directly from our customers that HA solutions are complex and that they prefer using their existing, well understood, solutions.”


The following presentation offers more insight into HortonWorks’ initiatives for HA and what to expect once Hadoop 2.0 stabilizes.



(3) MapR


“MapR’s Lockless Storage Services feature a distributed HA architecture:
• The metadata is distributed across the entire cluster. Every node stores and serves a portion of the metadata.
• Every portion of the metadata is replicated on three different nodes (this number can be increased by the administrator). For example, the metadata corresponding to all the files and directories under /project/advertising would exist on three nodes. The three replicas are consistent at all times except, of course, for a short time after a failure.
• The metadata is persisted to disk, just like the data.

The following illustration shows how metadata is laid out in a MapR cluster (in this case, a small 8-node cluster).
Each colored triangle represents a portion of the overall metadata; or in MapR terminology, the metadata of a single volume:”
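The placement scheme described above can be sketched with a toy model (an assumption for illustration, not MapR code): each volume's metadata is assigned to three distinct nodes by ranking the cluster's nodes on a stable hash, so every volume gets a deterministic, spread-out replica set.

```python
# Toy sketch: pick three distinct replica nodes per volume by
# rendezvous-style hashing. Names and scheme are illustrative only.

import hashlib

def replica_nodes(volume, nodes, replicas=3):
    """Rank nodes by a stable hash of (volume, node) and take the
    top `replicas`; the same volume always maps to the same nodes."""
    ranked = sorted(
        nodes,
        key=lambda n: hashlib.md5(f"{volume}:{n}".encode()).hexdigest())
    return ranked[:replicas]

nodes = [f"node{i}" for i in range(1, 9)]        # a small 8-node cluster
placement = replica_nodes("/project/advertising", nodes)
assert len(set(placement)) == 3                  # three distinct replicas
assert set(placement) <= set(nodes)
```

Because the ranking changes per volume, different volumes land on different triples of nodes, which matches the picture of every node storing and serving a portion of the overall metadata.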

 

(4) IBM


4.1 Redundancy in hardware for the master nodes (NameNode, Secondary NameNode, JobTracker)
4.2 Use GPFS-SNC



For the academically oriented, there are a few research papers from IBM Research which offer more insight into the subject:
Feng Wang, Jie Qiu, Jie Yang, Bo Dong, Xinhui Li, and Ying Li. 2009. Hadoop high availability through metadata replication. In Proceedings of the First International Workshop on Cloud Data Management (CloudDB '09). ACM, New York, NY, USA, 37-44.

Rajagopal Ananthanarayanan, Karan Gupta, Prashant Pandey, Himabindu Pucha, Prasenjit Sarkar, Mansi Shah, and Renu Tewari. 2009. Cloud analytics: do we really need to reinvent the storage stack? In Proceedings of the 2009 Conference on Hot Topics in Cloud Computing (HotCloud '09). USENIX Association, Berkeley, CA, USA.

Lanyue Lu, Dean Hildebrand, and Renu Tewari. 2011. ZoneFS: Stripe remodeling in cloud data centers. In Proceedings of the 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST '11). IEEE Computer Society, Washington, DC, USA, 1-10.