数据结构(Map)

来源：互联网发布：清洗过期淘宝二次审核编辑：程序博客网时间：2024/06/05 00:49

HsdhMap

结构特点
HashMap的结构
1、table是一个Entry[]数组类型，而Entry实际上就是一个单向链表。哈希表的”key-value键值对”都是存储在Entry数组中的。

static class Entry<K,V> implements Map.Entry<K,V> {final K key;V value;Entry<K,V> next;int hash;}

2、size是HashMap的大小，它是HashMap保存的键值对的数量。

  /**    * The number of key-value mappings contained in this map.*/transient int size;void createEntry(int hash, K key, V value, int bucketIndex) {    Entry<K,V> e = table[bucketIndex];    table[bucketIndex] = new Entry<>(hash, key, value, e);    size++;}

3、threshold是HashMap的阈值，用于判断是否需要调整HashMap的容量。当HashMap中存储数据的数量达到threshold时，就需要将HashMap的容量加倍。

 /** * The next size value at which to resize (capacity * load factor). * @serial */// If table == EMPTY_TABLE then this is the initial capacity at which the// table will be created when inflated.int threshold;private void inflateTable(int toSize) {    // Find a power of 2 >= toSize    int capacity = roundUpToPowerOf2(toSize);    // 一般情况下 threshold = capacity * loadFactor    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);    table = new Entry[capacity];    initHashSeedAsNeeded(capacity);}

4、上述代码中的loadFactor就是加载因子。

/** * The load factor for the hash table. * * @serial */final float loadFactor;如果加载因子越大，对空间的利用更充分，但是查找效率会降低（链表长度会越来越长）。如果加载因子太小，那么表中的数据将过于稀疏（很多空间还没用，就开始扩容了），对空间造成严重浪费。如果我们在构造方法中不指定，则系统默认加载因子为0.75，这是一个比较理想的值，一般情况下我们是无需修改的。

扩容

1、容量特点：无论我们指定的容量为多少，构造方法都会将实际容量设为不小于指定容量的2的次方的一个数，且最大值不能超过2的30次方

public HashMap(int initialCapacity, float loadFactor) {    if (initialCapacity < 0)        throw new IllegalArgumentException("Illegal initial capacity: " +                                           initialCapacity);    if (initialCapacity > MAXIMUM_CAPACITY)        initialCapacity = MAXIMUM_CAPACITY;    if (loadFactor <= 0 || Float.isNaN(loadFactor))        throw new IllegalArgumentException("Illegal load factor: " +                                           loadFactor);    this.loadFactor = loadFactor;    threshold = initialCapacity;    init();}在构造方法中，只是进行了简单的初始化操作，容量的真实值并不是这里确定的。public V put(K key, V value) {// 如果是首次添加，进行容量的初始化if (table == EMPTY_TABLE) {    inflateTable(threshold);}if (key == null)    return putForNullKey(value);int hash = hash(key);int i = indexFor(hash, table.length);for (Entry<K,V> e = table[i]; e != null; e = e.next) {    Object k;    if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {        V oldValue = e.value;        e.value = value;        e.recordAccess(this);        return oldValue;    }}modCount++;addEntry(hash, key, value, i);return null;}

在put()方法中，对首次调用时候进行了判断，进行了数组的初始化操作,调用了初始化数组的方法inflateTable(threshold)。

private void inflateTable(int toSize) {// Find a power of 2 >= toSizeint capacity = roundUpToPowerOf2(toSize);threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);table = new Entry[capacity];initHashSeedAsNeeded(capacity);}

加载因子：加载因子越大，数组填充的越满，这样可以有效的利用空间，但是有一个弊端就是可能会导致冲突的加大，链表过长，反过来却又会造成内存空间的浪费。所以只能需要在空间和时间中找一个平衡点，那就是设置有效的加载因子。我们知道，很多时候为了提高查询效率的做法都是牺牲空间换取时间，到底该怎么取舍，那就要具体分析。
hashCode() 和 equals()

1、 hashCode的存在主要是用于查找的快捷性，如Hashtable，HashMap等，hashCode是用来在散列存储结构中确定对象的存储地址的；

int hash = hash(key);int i = indexFor(hash, table.length);static int indexFor(int h, int length) {// assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";return h & (length-1);}

由于保证了数组的容量是一个偶数，h & (length-1)能够均匀的分布在散列表。

2、如果两个对象相同，就是适用于equals(java.lang.Object) 方法，那么这两个对象的hashCode一定要相同；

// 对key为空值的处理，HashMap中key值可以为空if (key == null)   return putForNullKey(value);// hash值和equals()方法在put()方法中的运用int hash = hash(key);int i = indexFor(hash, table.length);for (Entry<K,V> e = table[i]; e != null; e = e.next) {Object k;if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {    V oldValue = e.value;    e.value = value;    e.recordAccess(this);    return oldValue;}}

3、如果对象的equals方法被重写，那么对象的hashCode也尽量重写，并且产生hashCode使用的对象，一定要和equals方法中使用的一致，否则就会违反上面提到的第2点；

4、两个对象的hashCode相同，并不一定表示两个对象就相同，也就是不一定适用于equals(java.lang.Object) 方法，只能够说明这两个对象在散列存储结构中相同的索引位，如Hashtable，他们“存放在同一个篮子里”。
HsdhMap
概述

1、LinkedHashMap是HashMap的子类，与HashMap有着同样的存储结构，但它加入了一个双向链表的头结点，将所有put到LinkedHashmap的节点一一串成了一个双向循环链表，因此它保留了节点插入的顺序，可以使节点的输出顺序与输入顺序相同。

/** * The head of the doubly linked list. */// 记录添加顺序的双向链表private transient Entry<K,V> header;// 记录添加顺序的双向链表private static class Entry<K,V> extends HashMap.Entry<K,V> {// These fields comprise the doubly linked list used for iteration.Entry<K,V> before, after;}// 添加节点方法void addEntry(int hash, K key, V value, int bucketIndex) {super.addEntry(hash, key, value, bucketIndex);// Remove eldest entry if instructed// 添加到双向节点节点的头部Entry<K,V> eldest = header.after;// 如果removeEldestEntry(eldest)为true，则删除最旧的节点，默认为falseif (removeEldestEntry(eldest)) {    removeEntryForKey(eldest.key);}}

2、LinkedHashMap可以用来实现LRU算法。

3、LinkedHashMap同样是非线程安全的，只在单线程环境下使用。
LRU算法

LRU（Least recently used，最近最少使用）算法根据数据的历史访问记录来进行淘汰数据，其核心思想是“如果数据最近被访问过，那么将来被访问的几率也更高”。
Lru算法

新数据插入到链表头部。
每当缓存命中（即缓存数据被访问），则将数据移到链表头部。
当链表满的时候，将链表尾部的数据丢弃。

上面3条中，LinkedHashMap实现了第一条，但是没有实现2、3两条规定

先看第2条，访问数据（查询，更新）时，在LinkedHashMap中，会做什么事情

首先看看更新操作，LinkedHashMap的更新操作，其实就是使用了HashMap的put()方法

public V put(K key, V value) {if (table == EMPTY_TABLE) {    inflateTable(threshold);}if (key == null)    return putForNullKey(value);int hash = hash(key);int i = indexFor(hash, table.length);for (Entry<K,V> e = table[i]; e != null; e = e.next) {    Object k;    // 如果是已经存在的数据    if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {        V oldValue = e.value;        e.value = value;        // 调用这个方法        e.recordAccess(this);        return oldValue;    }}modCount++;addEntry(hash, key, value, i);return null;}

我们可以看到已经存在的数据进行更新时，有调用了recordAccess(this)这个方法，但是，这个方法在HashMap中是一个空实现，这个方法的真正实现在LinkedHashMap中

void recordAccess(HashMap<K,V> m) {LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;// accessOrder默认值为falseif (lm.accessOrder) {    lm.modCount++;    // remove()和addBefore()组合，移动到链表的头部    remove();    addBefore(lm.header);}}

可以看见，如果accessOrder为true，就把节点移动到链表的头部。但是，默认值为false。把节点移动到链表的头部，是符合LRU的规则第二条的，可惜默认为false。

然后，我们来看看LinkedHashMap的查询方法

public V get(Object key) {Entry<K,V> e = (Entry<K,V>)getEntry(key);if (e == null)    return null;e.recordAccess(this);return e.value;}

我们发现，这个方法也调用了recordAccess(this)。但是，和上文描述情况一样，默认accessOrder参数为false，要把accessOrder设置为true，才满足Lru的规则第2条。

其实，在LinkedHashMap中，有设置accessOrder的构造方法。

public LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder) {super(initialCapacity, loadFactor);//this.accessOrder = accessOrder;}

我们在编写Lru的LinkedHashMap的时候通过调用这个构造方法就可以设置accessOrder

最后，我们来看看LinkedHashMap的添加操作

void addEntry(int hash, K key, V value, int bucketIndex) {super.addEntry(hash, key, value, bucketIndex);// Remove eldest entry if instructed// 找出链表中的尾部节点Entry<K,V> eldest = header.after;// 如果removeEldestEntry(eldest)为true就删除节点，默认为falseif (removeEldestEntry(eldest)) {    removeEntryForKey(eldest.key);}}

我们看出来，在removeEldestEntry(eldest)返回的是true的时候，能够实现Lru的第3条规定。

综合起来，我们编写出了这个实现了Lru算法的LinkedHashMap

public class LRULinkedHashMap<K, V> extends LinkedHashMap<K, V> {private static final long serialVersionUID = -5933045562735378538L;// 定义Lru缓存的默认容量private static final int LRU_MAX_CAPACITY = 1024;private int capacity;public LRULinkedHashMap() {    super();}// 通过构造方法设置accessOrderpublic LRULinkedHashMap(int initialCapacity, float loadFactor, boolean isLRU) {    super(initialCapacity, loadFactor, true);    capacity = LRU_MAX_CAPACITY;}// 通过构造方法设置accessOrderpublic LRULinkedHashMap(int initialCapacity, float loadFactor,        boolean isLRU, int lruCapacity) {    super(initialCapacity, loadFactor, true);    this.capacity = lruCapacity;}// 复写LinkedHashMap的removeEldestEntry()方法@Overrideprotected boolean removeEldestEntry(Map.Entry<K, V> eldest) {    System.out.println(eldest.getKey() + "=" + eldest.getValue());    if (size() > capacity) {        return true;    }    return false;}}

TreeMap

定制比较器和自然比较器

TreeMap不保证元素的先后添加顺序,但是会对集合中的元素做排序操作，底层由有红黑树算法实现(树结构,比较擅长做范围查询)。

TreeSet要么自然排序,要么定制排序。

自然排序: 要求在TreeSet集合中的对象必须实现java.lang.Comparable接口，并覆盖compareTo方法。
定制排序: 要求在构建TreeSet对象的时候，传入一个比较器对象(必须实现java.lang.Comparator接口)。在比较器中覆盖compare方法,并编写比较规则.
TreeSet判断元素对象重复的规则:compareTo/compare方法是否返回0.如果返回0,则视为是同一个对象.

以下我们来看看TreeMap中使用比较器的代码吧

// 如果传入了定制比较器Comparator<? super K> cpr = comparator;if (cpr != null) {// 红黑树的算法do {    parent = t;    cmp = cpr.compare(key, t.key);    if (cmp < 0)        t = t.left;    else if (cmp > 0)        t = t.right;    else        return t.setValue(value);} while (t != null);}else {if (key == null)    throw new NullPointerException();// 如果实现了自然比较器Comparable<? super K> k = (Comparable<? super K>) key;// 红黑树算法do {    parent = t;    cmp = k.compareTo(t.key);    if (cmp < 0)        t = t.left;    else if (cmp > 0)        t = t.right;    else        return t.setValue(value);} while (t != null);}

从上面的代码中我们可以看出来，如果TreeMap中先是判断定制比较器，然后判断自然比较器。如果两个比较器都实现了，TreeMap会使用定制比较器进行比较，如果两个比较器都没有实现，TreeMap会在Comparable

阅读全文

0 0