Source Code Analysis of an Enterprise Hadoop Distribution, Part 2


This article takes a (no doubt presumptuous) look at the enterprise Hadoop source code of a large big-data solution vendor.

This installment focuses on HashMap vs TreeMap and on LightWeightHashSet,
in particular on the access speed and memory footprint of these data structures.

Change 2

Index: org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java
===================================================================
--- org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java (revision 37)
+++ org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java (revision 42)
@@ -22,6 +22,7 @@
 import java.util.ArrayList;
 import java.util.Calendar;
 import java.util.GregorianCalendar;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.TreeMap;
@@ -47,7 +48,8 @@
 class InvalidateBlocks {
   /** Mapping: DatanodeInfo -> Collection of Blocks */
   private final Map<DatanodeInfo, LightWeightHashSet<Block>> node2blocks =
-      new TreeMap<DatanodeInfo, LightWeightHashSet<Block>>();
+   //   new TreeMap<DatanodeInfo, LightWeightHashSet<Block>>();
+         new HashMap<DatanodeInfo, LightWeightHashSet<Block>>();
   /** The total number of blocks in the map. */
   private long numBlocks = 0L;
Index: org/apache/hadoop/hdfs/server/blockmanagement/CorruptReplicasMap.java
===================================================================
--- org/apache/hadoop/hdfs/server/blockmanagement/CorruptReplicasMap.java   (revision 37)
+++ org/apache/hadoop/hdfs/server/blockmanagement/CorruptReplicasMap.java   (revision 42)
@@ -46,9 +46,9 @@
     CORRUPTION_REPORTED  // client or datanode reported the corruption
   }
-  private final SortedMap<Block, Map<DatanodeDescriptor, Reason>> corruptReplicasMap =
-    new TreeMap<Block, Map<DatanodeDescriptor, Reason>>();
-
+  private final HashMap<Block, Map<DatanodeDescriptor, Reason>> corruptReplicasMap =
+  //  new TreeMap<Block, Map<DatanodeDescriptor, Reason>>();
+         new HashMap<Block, Map<DatanodeDescriptor, Reason>>();
   /**
    * Mark the block belonging to datanode as corrupt.
    *

The change touches two classes, InvalidateBlocks and CorruptReplicasMap.
InvalidateBlocks maintains the set of blocks scheduled for invalidation; CorruptReplicasMap maintains the set of corrupt replicas.

The modification simply replaces TreeMap with HashMap; the rest of this post looks at what that buys us.

1. TreeMap vs HashMap

|                 | TreeMap | HashMap |
|-----------------|---------|---------|
| Implementation  | Based on a red-black tree (a self-balancing binary search tree) | Based on a hash table |
| Time complexity | O(log n) on average | O(1) on average |
| Ordering        | Keys are kept sorted | Unordered |
| Thread safety   | Not thread safe | Not thread safe |
| Best suited for | Traversing keys in natural or custom order | Entries come back in no particular order; data is stored by the key's hashCode and can be fetched directly by key, so access is very fast. For plain insertion, deletion and lookup of elements, HashMap is the best choice |
| Drawbacks       | Inserts and deletes have to rebalance the tree, which costs some efficiency | |

HashMap is generally somewhat faster than TreeMap (a consequence of hash tables vs trees), so prefer HashMap and reach for TreeMap only when you actually need a sorted Map.
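
A small example of the difference in iteration order (a sketch; the HashMap output depends on hash codes and table size and may vary across JDK versions, only the TreeMap order is guaranteed):

import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class OrderDemo {
  public static void main(String[] args) {
    Map<String, Integer> hash = new HashMap<>();
    Map<String, Integer> tree = new TreeMap<>();
    for (String key : new String[] {"delta", "alpha", "charlie", "bravo"}) {
      hash.put(key, key.length());
      tree.put(key, key.length());
    }
    // HashMap: order depends on hash codes and table size, not on insertion or key order
    System.out.println(hash.keySet());
    // TreeMap: keys always come back sorted: [alpha, bravo, charlie, delta]
    System.out.println(tree.keySet());
  }
}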

Simulating bulk data of up to 1.5 million entries, and measuring both insertion and lookup performance, gives the following results:

| type                  | insert 100K | insert 500K | insert 1M | insert 1.5M | lookup 0-10K | lookup 0-250K | lookup 0-500K |
|-----------------------|-------------|-------------|-----------|-------------|--------------|---------------|---------------|
| HashMap               | 18 ms       | 93 ms       | 217 ms    | 303 ms      | 2 ms         | 13 ms         | 45 ms         |
| ConcurrentSkipListMap | 62 ms       | 227 ms      | 433 ms    | 689 ms      | 7 ms         | 80 ms         | 119 ms        |
| TreeMap               | 33 ms       | 228 ms      | 429 ms    | 584 ms      | 4 ms         | 34 ms         | 61 ms         |
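
A minimal sketch of how such a comparison could be reproduced (single-run wall-clock timings with no JIT warm-up; results will vary by machine and JDK, and a harness such as JMH would be needed for anything rigorous):

import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapBench {
  static long timeInsert(Map<Integer, Integer> map, int n) {
    long start = System.nanoTime();
    for (int i = 0; i < n; i++) {
      map.put(i, i);
    }
    return (System.nanoTime() - start) / 1_000_000; // milliseconds
  }

  static long timeLookup(Map<Integer, Integer> map, int n) {
    long start = System.nanoTime();
    for (int i = 0; i < n; i++) {
      map.get(i);
    }
    return (System.nanoTime() - start) / 1_000_000; // milliseconds
  }

  public static void main(String[] args) {
    int n = 1_500_000;
    Map<Integer, Integer> hashMap = new HashMap<>();
    Map<Integer, Integer> treeMap = new TreeMap<>();
    System.out.println("HashMap insert 1.5M: " + timeInsert(hashMap, n) + " ms");
    System.out.println("TreeMap insert 1.5M: " + timeInsert(treeMap, n) + " ms");
    System.out.println("HashMap lookup 0-500K: " + timeLookup(hashMap, 500_000) + " ms");
    System.out.println("TreeMap lookup 0-500K: " + timeLookup(treeMap, 500_000) + " ms");
  }
}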

References:
HashMap和TreeMap区别详解以及底层实现
Java8系列之重新认识HashMap

2. LightWeightHashSet vs HashSet

HashSet is implemented on top of HashMap: it uses a HashMap internally to store all of its elements, so the HashSet implementation is quite thin and most HashSet operations simply delegate to the corresponding methods of the underlying HashMap.

LightWeightHashSet

/**
 * A low memory linked hash set implementation, which uses an array for storing
 * the elements and linked lists for collision resolution. This class does not
 * support null element.
 *
 * This class is not thread safe.
 *
 */
public class LightWeightHashSet<T> implements Collection<T> {

  /**
   * An internal array of entries, which are the rows of the hash table. The
   * size must be a power of two.
   */
  protected LinkedElement<T>[] entries;

A low-memory implementation that stores the elements in an array and resolves collisions with linked lists.
Null elements are not supported, and the class is not thread safe.

Yi Liu added a comment - 30/Jul/15 08:42

Arpit Agarwal, sorry for late response

do you have any estimates of the memory saved by using LightWeightHashSet?

Yes, compared to java HashSet, there are two advantages from memory point of view:

1. Java HashSet internally uses a HashMap, so there is one more reference (4 bytes) for each entry compared to LightWeightHashSet, so we can save 4 * size bytes of memory.
2. In LightWeightHashSet, when elements become less, the size is shrinked a lot.

So we can see LightWeightHashSet is more better. The main issue is LightWeightHashSet#LinkedSetIterator doesn’t support remove currently, it’s easy to support it (similar to java HashSet). By the way, currently in Hadoop, we use LightWeightHashSet for all big objects required hash set except this one which needs to use remove.

A translated walk-through of the points above:

1. Because HashSet is built on top of HashMap, every entry carries one extra reference, and each reference costs 4 bytes.

    /**
     * Adds the specified element to this set if it is not already present.
     * More formally, adds the specified element <tt>e</tt> to this set if
     * this set contains no element <tt>e2</tt> such that
     * <tt>(e==null&nbsp;?&nbsp;e2==null&nbsp;:&nbsp;e.equals(e2))</tt>.
     * If this set already contains the element, the call leaves the set
     * unchanged and returns <tt>false</tt>.
     *
     * @param e element to be added to this set
     * @return <tt>true</tt> if this set did not already contain the specified
     * element
     */
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }
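
To make the "one extra reference per entry" point concrete, the sketch below contrasts the shape of the two entry objects (field and class names are illustrative and simplified, not the exact JDK or Hadoop classes):

// Shape of an entry in java.util.HashMap, which backs HashSet (simplified sketch).
class HashMapNodeSketch<K, V> {
  final int hash;               // cached hash code
  final K key;                  // the set element
  V value;                      // for HashSet this always references the shared PRESENT object: a wasted slot
  HashMapNodeSketch<K, V> next; // bucket chain
}

// Shape of an entry in a LightWeightHashSet-style set (simplified sketch).
class LightWeightEntrySketch<T> {
  final int hashCode;           // cached hash code
  final T element;              // the set element
  LightWeightEntrySketch<T> next;
  // no value field: one fewer reference, i.e. 4 bytes saved per entry with compressed oops
}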

2. LightWeightHashSet shrinks its internal array when the element count drops below a threshold (see the resize() code below), so as elements are removed its memory footprint also drops considerably.
Java's HashMap, by contrast, never shrinks; this is partly because shrinking is hard to do well and partly because it is hard to keep the performance impact predictable.
In effect, LightWeightHashSet trades some CPU time for memory here.

  /**
   * Resize the internal table to given capacity.
   */
  @SuppressWarnings("unchecked")
  private void resize(int cap) {
    int newCapacity = computeCapacity(cap);
    if (newCapacity == this.capacity) {
      return;
    }
    this.capacity = newCapacity;
    this.expandThreshold = (int) (capacity * maxLoadFactor);
    this.shrinkThreshold = (int) (capacity * minLoadFactor);
    this.hash_mask = capacity - 1;
    LinkedElement<T>[] temp = entries;
    entries = new LinkedElement[capacity];
    for (int i = 0; i < temp.length; i++) {
      LinkedElement<T> curr = temp[i];
      while (curr != null) {
        LinkedElement<T> next = curr.next;
        int index = getIndex(curr.hashCode);
        curr.next = entries[index];
        entries[index] = curr;
        curr = next;
      }
    }
  }
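
For context, here is a simplified sketch of the shrink-on-remove pattern that pairs with the resize() above (illustrative only, not the exact Hadoop code; removeElem and initialCapacity are assumed names for the unlink helper and the lower bound):

  // Illustrative sketch: once occupancy drops below shrinkThreshold, pay CPU (a full rehash)
  // to halve the table and give memory back.
  public boolean remove(Object o) {
    boolean removed = removeElem(o);   // assumed helper: unlink from its bucket chain, decrement size
    if (removed && size < shrinkThreshold && capacity > initialCapacity) {
      resize(capacity / 2);
    }
    return removed;
  }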

References:

Shrinking HashMaps (was Re: Proposal: Better HashMap.resize() when memory is tight)

Java集合—HashSet的源码分析

is-java-hashmap-clear-and-remove-memory-effective

Java-HashMap工作原理及实现

3. LightWeightGSet vs LightWeightHashSet

LightWeightGSet never resizes its table, no matter how much data it holds. LightWeightHashSet, on the other hand, expands once a threshold is crossed and, going beyond the plain HashSet, can also shrink. Both expanding and shrinking are expensive operations, because every element's slot in the new table has to be recomputed and the element copied over. So when the number of elements is known ahead of time, presizing the structure avoids all of that and noticeably improves performance, and LightWeightGSet is a very good fit for that case.
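
The same presizing idea applies to the plain JDK collections: if the element count is known up front, size the map so that it never crosses the resize threshold. A small sketch, assuming the default load factor of 0.75:

import java.util.HashMap;
import java.util.Map;

public class PresizeDemo {
  public static void main(String[] args) {
    int expected = 1_000_000;
    // Capacity chosen so that `expected` entries stay below the 0.75 load factor: no intermediate rehashes.
    Map<Long, Long> presized = new HashMap<>((int) (expected / 0.75f) + 1);
    for (long i = 0; i < expected; i++) {
      presized.put(i, i);
    }
    System.out.println(presized.size());
  }
}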

The size of LightWeightGSet's underlying array is fixed in the constructor and is never grown afterwards, so getting the initial size right matters a great deal. Here is how BlocksMap computes the size of its LightWeightGSet array:

/**
 * Let t = percentage of max memory.
 * Let e = round(log_2 t).
 * Then, we choose capacity = 2^e/(size of reference),
 * unless it is outside the close interval [1, 2^30].
 */
public static int computeCapacity(double percentage, String mapName) {
  return computeCapacity(Runtime.getRuntime().maxMemory(), percentage,
      mapName);
}

static int computeCapacity(long maxMemory, double percentage,
    String mapName) {
  ... // parameter validation
  //VM detection
  //See http://java.sun.com/docs/hotspot/HotSpotFAQ.html#64bit_detection
  final String vmBit = System.getProperty("sun.arch.data.model");
  //Percentage of max memory
  final double percentDivisor = 100.0/percentage;
  // maximum runtime memory: Runtime.getRuntime().maxMemory()
  final double percentMemory = maxMemory/percentDivisor;
  //compute capacity
  final int e1 = (int)(Math.log(percentMemory)/Math.log(2.0) + 0.5);
  final int e2 = e1 - ("32".equals(vmBit)? 2: 3);
  final int exponent = e2 < 0? 0: e2 > 30? 30: e2;
  final int c = 1 << exponent;
  ...
  return c;
}

The main logic is to take the JVM's maximum runtime memory and, from the given percentage, compute the largest number of elements the table should hold.
By default 2% of the maximum memory is used for storing block information.
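
As a worked example of the formula (assuming a 4 GB max heap, the default 2%, and a 64-bit JVM, so e2 = e1 - 3):

percentMemory = 4294967296 bytes * 2% ≈ 85,899,346
e1 = (int)(log2(85,899,346) + 0.5) = (int)(26.36 + 0.5) = 26
e2 = 26 - 3 = 23
capacity = 2^23 = 8,388,608 entries in the block map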

For the derivation of the formula, see the official design PDF.

shv Konstantin Shvachko added a comment - 30/Apr/10 01:43

Do you have an estimate on how much space this will save in NN’s memory footprint?
szetszwo Tsz Wo Nicholas Sze added a comment - 30/Apr/10 08:33

I believe we can save from 24 to 40 bytes per entry. It depends on the chosen implementation (will give more details later).

In a large clusters, there are ~60m blocks. Then, we may save from 1.5GB to 2.5GB NN memory.

References:

The official design document:

Reducing NameNode memory usage by an alternate hash table

HDFS中LightWeightGSet与HashMap结构解析

HDFS源码分析之LightWeightGSet

4. Summary

That wraps up the data-structure analysis. Hadoop contains quite a few purpose-built basic data structures like these,
e.g. LightWeightGSet, LightWeightHashSet, and LightWeightLinkedSet. Think of them as implementations customized for specific use cases, and make sure you understand their trade-offs before using them.

Below is a quick sketch of how the Hadoop classes relate to the JDK's basic collection types.

graph LR
LightWeightGSet-->GSet
LightWeightHashSet-->Collection
LightWeightLinkedSet-->LightWeightHashSet

JDK collection classes and how they relate:

graph LR
Collection-->List
List-->LinkedList
List-->ArrayList
List-->Vector
Vector-->Stack
Collection-->Set
Map-->Hashtable
Map-->HashMap
Map-->WeakHashMap

LightWeightGSet

/**
 * A low memory footprint {@link GSet} implementation,
 * which uses an array for storing the elements
 * and linked lists for collision resolution.
 *
 * No rehash will be performed.
 * Therefore, the internal array will never be resized.
 *
 * This class does not support null element.
 *
 * This class is not thread safe.
 *
 * @param <K> Key type for looking up the elements
 * @param <E> Element type, which must be
 *       (1) a subclass of K, and
 *       (2) implementing {@link LinkedElement} interface.
 */
@InterfaceAudience.Private
public class LightWeightGSet<K, E extends K> implements GSet<K, E> {

A low-memory implementation that stores the elements in an array and resolves collisions with linked lists.
It never rehashes, so the internal array never changes size.
Null elements are not supported, and the class is not thread safe.
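
To illustrate the "fixed array, no rehash" idea, here is a minimal hand-rolled sketch (not Hadoop's actual LightWeightGSet; class and method names are made up for illustration):

// Minimal fixed-capacity chained hash set: the table is sized once and never resized.
public class FixedCapacitySetSketch<T> {
  private static class Node<T> {
    final T value;
    Node<T> next;
    Node(T value, Node<T> next) { this.value = value; this.next = next; }
  }

  private final Node<T>[] table;
  private final int mask;

  @SuppressWarnings("unchecked")
  public FixedCapacitySetSketch(int capacity) {
    // capacity must be a power of two so that (hashCode & mask) picks a bucket
    table = (Node<T>[]) new Node[capacity];
    mask = capacity - 1;
  }

  public boolean add(T value) {
    int index = value.hashCode() & mask;
    for (Node<T> n = table[index]; n != null; n = n.next) {
      if (n.value.equals(value)) {
        return false; // already present
      }
    }
    table[index] = new Node<>(value, table[index]); // prepend to the bucket chain
    return true;
    // no load-factor check and no resize: if capacity was underestimated, chains simply grow longer
  }
}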

LightWeightHashSet

/**
 * A low memory linked hash set implementation, which uses an array for storing
 * the elements and linked lists for collision resolution. This class does not
 * support null element.
 *
 * This class is not thread safe.
 *
 */
public class LightWeightHashSet<T> implements Collection<T> {

A low-memory implementation that stores the elements in an array and resolves collisions with linked lists.
Null elements are not supported, and the class is not thread safe.

LightWeightLinkedSet

/**
 * A low memory linked hash set implementation, which uses an array for storing
 * the elements and linked lists for collision resolution. In addition it stores
 * elements in a linked list to ensure ordered traversal. This class does not
 * support null element.
 *
 * This class is not thread safe.
 *
 */
public class LightWeightLinkedSet<T> extends LightWeightHashSet<T> {

A low-memory implementation that stores the elements in an array and resolves collisions with linked lists.
In addition, the elements are threaded onto a linked list to guarantee ordered traversal.
Null elements are not supported, and the class is not thread safe.

Related Apache JIRAs

Keep an eye on the community and try to stand on the shoulders of giants rather than staying down in your own well.

HDFS NameNode restart optimization

https://issues.apache.org/jira/browse/HDFS-8792

BlockManager#postponedMisreplicatedBlocks should use a LightWeightHashSet to save memory

Description: LightWeightHashSet requires less memory than java HashSet.

https://issues.apache.org/jira/browse/HDFS-8793

https://issues.apache.org/jira/browse/HDFS-8794

Improve CorruptReplicasMap#corruptReplicasMap

Description: Currently we use TreeMap for corruptReplicasMap, actually the only need sorted place is getCorruptReplicaBlockIds which is used by test. So we can use HashMap. From memory and performance view, HashMap is better than TreeMap, a similar optimization HDFS-7433. Of course we need to make few change to getCorruptReplicaBlockIds.

https://issues.apache.org/jira/browse/HDFS-1890
A few improvements on the LeaseRenewer.pendingCreates map

Description:
- The class is better to be just a Map instead of a SortedMap.
- The value type is better to be DFSOutputStream instead of OutputStream.
- The variable name is better to be filesBeingWritten instead of pendingCreates since we have append.