LinkedList Source Code Series (1)

Source: Internet | Published by: surpac软件 | Editor: 程序博客网 | Time: 2024/05/17 04:47

//:judy/List

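First, we know that LinkedList implements the List ADT (abstract data type) with a linked list, and the pros and cons of linked lists are well known, so I won't repeat them here.

Today's focus is the generic design of the LinkedList class, the Node class, and the list iterator class, along with the core, most frequently used source code.

Let's start with the main methods of the Collection, List, and Iterator interfaces: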

public interface Collection<T> extends Iterable<T> {
    int size();
    boolean isEmpty();
    void clear();
    boolean contains(T x);
    boolean add(T x);
    boolean remove(T x);
    // Extending Iterable is what makes the enhanced for loop available
    java.util.Iterator<T> iterator();
}

public interface List<T> extends Collection<T> {
    T get(int idx);
    T set(int idx, T newVal);
    void add(int idx, T x);
    void remove(int idx);
    ListIterator<T> listIterator(int pos);
}

public interface Iterator<T> {
    boolean hasNext();
    T next();
    void remove();
}

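To use the enhanced for loop, a class must implement the Iterable interface and override its iterator() method to return an iterator. That is why Collection, the root interface of the Java Collections Framework, extends Iterable. Now let's look at the design of the linked list: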
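To make the Iterable requirement concrete, here is a minimal sketch. IntRange is a hypothetical class invented for this illustration (it is not part of the article's list code); it only shows that returning an Iterator from iterator() is enough for the enhanced for loop to work.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical example class: the integers from `from` (inclusive) to `to` (exclusive).
class IntRange implements Iterable<Integer> {
    private final int from;
    private final int to;

    IntRange(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int cursor = from;

            @Override
            public boolean hasNext() {
                return cursor < to;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return cursor++;
            }
        };
    }

    // The enhanced for loop works on anything that implements Iterable.
    static int sum(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new IntRange(1, 4))); // 1 + 2 + 3
    }
}
```

The list class discussed next follows the same pattern by implementing Iterable<T>.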

public class MyLinkedList<T> implements Iterable<T> {
    private int size;
    private int modCount;   // lets iterators detect structural changes to the list
    private Node<T> first;  // always points to the first element of the list
    private Node<T> last;   // always points to the last element of the list

    private static class Node<T> {
        public Node(T val, Node<T> prev, Node<T> next) {
            this.data = val;
            this.prev = prev;
            this.next = next;
        }
        public T data;
        public Node<T> prev;
        public Node<T> next;
    }

    public Iterator<T> iterator(int index) {
        checkPositionIndex(index);
        return new LinkedListIterator(index);
    }

    private class LinkedListIterator implements java.util.ListIterator<T> {
        private Node<T> lastReturned = null;
        private Node<T> next;
        private int nextIndex;
        private int expectedModCount = modCount;

        public LinkedListIterator() {}

        public LinkedListIterator(int index) {
            // assert isPositionIndex(index)
            next = (index == size) ? null : getNode(index);
            nextIndex = index;
        }
        // hasNext(), next(), remove() and the other ListIterator methods are omitted here
    }
}

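At this point a natural question is why Node is a static nested class while LinkedListIterator is an inner (non-static) class. The answer follows from their definitions. A static nested class is not tied to any instance of the outer class and holds no implicit reference to one, whereas an inner class is bound to the particular outer instance that created it. Now think about what each one does in a linked list. A node only stores its element and its prev/next links, which keep the elements in a linear, one-to-one chain; it never needs to touch the list's own fields, so static is the right choice. An iterator, by contrast, belongs to one specific list: it walks that list's elements and has to read that list's size, modCount, and nodes.

The iterator's remove method also has an advantage over the collection's remove method. When Iterator.remove() runs, the position of the item is already known, whereas Collection.remove() first has to find the item, so the iterator version is more efficient. The catch is that while an Iterator is in use you must not call the collection's own structure-changing methods such as clear, remove, or add. Doing so invalidates the iterator, which then throws a ConcurrentModificationException. modCount is the mechanism used to detect such changes.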
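The difference between the two kinds of removal can be seen with the standard java.util.LinkedList. The sketch below is only an illustration; the class and method names (IteratorRemoveDemo, removeEvens, directRemoveFails) are made up for this example.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class IteratorRemoveDemo {
    // Removes even numbers through the iterator: safe, and no second search is needed.
    static List<Integer> removeEvens(List<Integer> input) {
        List<Integer> list = new LinkedList<>(input);
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    // Modifies the list directly while iterating.
    // Returns true if the iterator detected the change and threw.
    static boolean directRemoveFails(List<Integer> input) {
        List<Integer> list = new LinkedList<>(input);
        try {
            for (Integer value : list) {
                if (value % 2 == 0) {
                    list.remove(value); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
        System.out.println(removeEvens(numbers));       // [1, 3]
        System.out.println(directRemoveFails(numbers)); // true
    }
}
```

Note that the check is best-effort: if the direct removal happens to hit the second-to-last element, the loop may end without an exception, so code should never rely on the exception being thrown.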

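Typing is tiring, so let's go straight to the source. First, a salute to the great Josh Bloch, lead designer of the Java Collections Framework and affectionately nicknamed the "mother of Java"!

Source of public void clear():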

/**
 * Removes all of the elements from this list.
 * The list will be empty after this call returns.
 */
public void clear() {
    for (Node<T> tmp = first; tmp != null; ) {
        Node<T> next = tmp.next;
        tmp.data = null;  // help GC
        tmp.prev = null;
        tmp.next = null;
        tmp = next;
    }
    first = last = null;
    size = 0;
    modCount++;
}

I think the idea behind this code is an important one. Many people used to write clear like this:

public void clear() {
    first = last = null;
    size = 0;
    modCount++;
}

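The assumption is that the JVM will free the objects we created on the heap. But the real source clearly loops over every node of the LinkedList and sets each node's data, prev, and next references to null, which helps the GC. Here is the author's own comment on it: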

Description: Clearing all of the links between nodes is "unnecessary", but:

- helps a generational GC if the discarded nodes inhabit more than one generation
- is sure to free memory even if there is a reachable Iterator

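In other words, clearing the links is not strictly required, but it helps a generational collector when the discarded nodes live in more than one generation, and it guarantees the memory can be freed even if a still-reachable iterator holds on to a node (this is my own reading of the comment, so corrections are welcome). That is why clear() nulls out each node's data, prev, and next references instead of merely dropping the nodes. But why does that help the GC? To answer that we need to look at how the GC works: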

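In Thinking in Java, Bruce Eckel explains that garbage collectors in some other systems use a simple but slow technique called reference counting. Every object carries a reference counter: the count is incremented each time a reference is attached to the object and decremented each time a reference goes out of scope or is set to null. Managing the counts is a small overhead, but it is paid continuously over the whole lifetime of the program. The garbage collector walks the list of all objects and releases an object's storage as soon as it finds a count of zero. The drawback is circular references: objects that refer to each other can have nonzero counts even though they should be collected. Locating such self-referential groups takes significant extra work from the collector, so reference counting is mostly used to explain how garbage collection works and does not appear to be used in any JVM implementation.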

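Faster schemes are not based on reference counting. Their idea is that any live object must ultimately be traceable back to a reference that lives on the stack or in static storage. The chain may pass through several layers of objects. So, starting from the stack and static storage and walking all references, you can find every live object. For each reference found, you trace into the object it points to, then follow all the references in that object, and so on, until the entire web rooted at the stack and static storage has been visited. Every object you visit must be alive. This solves the problem of self-referential groups of objects: they are simply never reached.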
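The tracing idea can be sketched as a plain graph traversal. This is a toy model, not how any JVM is implemented: Obj is a hypothetical stand-in for a heap object, and the roots list plays the role of the stack and static storage.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ReachabilityDemo {
    // Hypothetical stand-in for a heap object: a name plus outgoing references.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name) {
            this.name = name;
        }
    }

    // Traces everything reachable from the roots with an iterative depth-first search.
    static Set<Obj> reachable(List<Obj> roots) {
        Set<Obj> live = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (live.add(current)) {          // first visit: record as live
                pending.addAll(current.refs); // then follow its references
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        Obj c = new Obj("c");
        Obj d = new Obj("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c); // c and d reference each other, but no root reaches them

        Set<Obj> live = reachable(Collections.singletonList(a));
        System.out.println(live.contains(b)); // true
        System.out.println(live.contains(c)); // false
    }
}
```

Under reference counting, c and d would each keep a count of 1 forever; with tracing they are simply never visited, so they count as garbage.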

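In this approach, the JVM uses an adaptive garbage collection scheme, and what it does with the live objects it finds depends on the JVM implementation. Two methods are described here. The first is stop-and-copy. As the name suggests, the program is first stopped, then every live object is copied from the current heap to another heap, and whatever is left behind is garbage. Objects are packed end to end as they are copied into the new heap, so the heap pointer can simply advance like a conveyor belt to the next unallocated region, which makes allocation efficient. However, a copying collector is inefficient in two respects.

Problem one: two heaps have to be maintained. Some JVMs deal with this by allocating the heap in several larger chunks as needed and copying from one chunk to another.

Problem two: once the program reaches a steady state it may produce little or no garbage, yet a copying collector still copies all the memory from one place to another, which is wasteful. To avoid this, some JVMs detect that no new garbage is being produced and switch to a different scheme, mark-and-sweep.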

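Mark-and-sweep follows the same logic: starting from the stack and static storage, it traces all references to find every live object and sets a mark on each one, but nothing is collected during this phase. Only when marking is complete does the sweep begin, and unmarked objects are released. No copying takes place, so the remaining heap space is fragmented, and a collector that needs contiguous space has to rearrange the surviving objects.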
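A toy version of the two phases might look like the sketch below. It models the heap as a small array of hypothetical Cell objects that reference each other by slot index; real collectors are far more involved, so treat this purely as an illustration of "mark everything first, then sweep".

```java
import java.util.ArrayList;
import java.util.List;

class MarkSweepDemo {
    // Hypothetical heap cell: references other cells by their slot index.
    static final class Cell {
        final List<Integer> refs = new ArrayList<>();
        boolean marked;
    }

    // Phase 1: mark every cell reachable from the given slot. Nothing is freed here.
    static void mark(Cell[] heap, int index) {
        Cell cell = heap[index];
        if (cell == null || cell.marked) {
            return;
        }
        cell.marked = true;
        for (int ref : cell.refs) {
            mark(heap, ref);
        }
    }

    // Runs a full cycle and returns the slots that were freed (set to null).
    static List<Integer> collect(Cell[] heap, int... roots) {
        for (int root : roots) {
            mark(heap, root);
        }
        List<Integer> freed = new ArrayList<>();
        for (int i = 0; i < heap.length; i++) { // Phase 2: sweep
            if (heap[i] == null) {
                continue;
            }
            if (heap[i].marked) {
                heap[i].marked = false;         // reset the mark for the next cycle
            } else {
                heap[i] = null;                 // unmarked: release the slot
                freed.add(i);
            }
        }
        return freed;
    }

    // Builds a five-slot heap: 0 -> 2 -> 4 is reachable from root 0, 1 <-> 3 is a dead cycle.
    static Cell[] sampleHeap() {
        Cell[] heap = new Cell[5];
        for (int i = 0; i < heap.length; i++) {
            heap[i] = new Cell();
        }
        heap[0].refs.add(2);
        heap[2].refs.add(4);
        heap[1].refs.add(3);
        heap[3].refs.add(1);
        return heap;
    }

    public static void main(String[] args) {
        System.out.println(collect(sampleHeap(), 0)); // [1, 3]
    }
}
```

After the sweep, slots 1 and 3 are free but separated by the live slot 2, which is the fragmentation the paragraph above describes.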

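In the JVM discussed here, memory is allocated in fairly large blocks. A large object gets a block of its own. Strict stop-and-copy requires copying every live object from the old heap to the new heap before the old one can be freed. With blocks, the collector can usually copy objects into dead blocks as it collects. Each block has a generation count that tracks whether it is alive; if a block is referenced from somewhere, its generation count is incremented. The collector periodically performs a full sweep: large objects are still not copied (only their generation count goes up), while blocks containing small objects are copied and compacted. The JVM monitors the collector, and if all objects are stable and the collector is becoming inefficient, it switches to mark-and-sweep. Likewise, it monitors mark-and-sweep, and if the heap becomes heavily fragmented, it switches back to stop-and-copy. This is the adaptive part.

In short, this is an "adaptive, block-based, stop-and-copy, mark-and-sweep" garbage collector!

To sum up: even though it is not strictly required, giving the GC a hand in clear() is well worth it.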

Source of linkFirst(T e):

/**
 * Links e as first element.
 */
private void linkFirst(T e) {
    final Node<T> f = first;
    // newNode.data = e; newNode.prev = null; newNode.next = f;
    final Node<T> newNode = new Node<T>(e, null, f);
    first = newNode;
    if (f == null) {
        last = newNode;
    } else {
        f.prev = newNode;
    }
    modCount++;
    size++;
}

linkFirst adds an element at the head of the doubly linked list, which is the Deque operation for enqueueing at the front. offerFirst(T e) and addFirst(T e) are thin wrappers around linkFirst(T e):

/**
 * Inserts the specified element at the front of this list.
 *
 * @param e the element to insert
 * @return {@code true} (as specified by {@link Deque#offerFirst})
 * @since 1.6
 */
public boolean offerFirst(T e) {
    addFirst(e);
    return true;
}

Source of void linkLast(T e):

/**
 * Links e as last element.
 */
void linkLast(T e) {
    final Node<T> l = last;
    // newNode.data = e; newNode.prev = l; newNode.next = null;
    final Node<T> newNode = new Node<T>(e, l, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    modCount++;
    size++;
}

This appends an element at the tail of the list, which is the Deque operation for enqueueing at the back:

/**
 * Inserts the specified element at the end of this list.
 *
 * @param e the element to insert
 * @return {@code true} (as specified by {@link Deque#offerLast})
 * @since 1.6
 */
public boolean offerLast(T e) {
    addLast(e);
    return true;
}
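For reference, the same two operations on the standard java.util.LinkedList behave as follows. The class and method names in this small sketch (DequeOfferDemo, build) are made up for the example.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;

class DequeOfferDemo {
    // Builds a deque from both ends and returns its contents from head to tail.
    static List<String> build() {
        Deque<String> deque = new LinkedList<>();
        deque.offerLast("b");   // [b]
        deque.offerFirst("a");  // [a, b]
        deque.offerLast("c");   // [a, b, c]
        return new ArrayList<>(deque);
    }

    public static void main(String[] args) {
        System.out.println(build()); // [a, b, c]
    }
}
```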

Source of T unlinkFirst(Node<T> f):

/**
 * Unlinks non-null first node f.
 */
private T unlinkFirst(Node<T> f) {
    // assert f == first && f != null
    final T item = f.data;
    final Node<T> next = f.next;
    f.data = null;
    f.next = null;          // help GC
    first = next;
    if (next == null)       // the list had only one element
        last = null;
    else
        next.prev = null;   // the former second node becomes the new head
    modCount++;
    size--;
    return item;
}

This backs the Deque operation for dequeueing at the front, pollFirst():

/**
 * Retrieves and removes the first element of this list,
 * or returns {@code null} if this list is empty.
 *
 * @return the first element of this list, or {@code null} if
 *     this list is empty
 * @since 1.6
 */
public T pollFirst() {
    final Node<T> f = first;
    return (f == null) ? null : unlinkFirst(f);
}

Source of T unlinkLast(Node<T> l):

/**
 * Unlinks non-null last node l.
 */
private T unlinkLast(Node<T> l) {
    // assert l == last && l != null
    final T item = l.data;
    final Node<T> prev = l.prev;
    l.data = null;
    l.prev = null;
    last = prev;
    if (prev == null)       // the list had only one element
        first = null;
    else
        prev.next = null;   // prev is the new last node
    modCount++;
    size--;
    return item;
}

The corresponding Deque operation, pollLast():

/**
 * Retrieves and removes the last element of this list,
 * or returns {@code null} if this list is empty.
 *
 * @return the last element of this list, or {@code null} if
 *     this list is empty
 * @since 1.6
 */
public T pollLast() {
    final Node<T> l = last;
    return (l == null) ? null : unlinkLast(l);
}
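The null-on-empty behaviour of pollFirst() and pollLast() is easy to check against java.util.LinkedList. The helper below (pollBothEnds, a name made up for this sketch) polls once from each end and records what came back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;

class DequePollDemo {
    // Polls once from the head and once from the tail.
    // An entry is null when the deque was already empty at that point.
    static List<Integer> pollBothEnds(List<Integer> items) {
        Deque<Integer> deque = new LinkedList<>(items);
        List<Integer> polled = new ArrayList<>();
        polled.add(deque.pollFirst());
        polled.add(deque.pollLast());
        return polled;
    }

    public static void main(String[] args) {
        System.out.println(pollBothEnds(Arrays.asList(1, 2, 3)));             // [1, 3]
        System.out.println(pollBothEnds(Collections.singletonList(7)));       // [7, null]
        System.out.println(pollBothEnds(Collections.<Integer>emptyList()));   // [null, null]
    }
}
```

Unlike removeFirst() and removeLast(), which throw NoSuchElementException on an empty list, the poll variants signal emptiness with null.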

T unlink(Node<T> x):

/**
 * Unlinks non-null node x.
 */
private T unlink(Node<T> x) {
    // assert x != null
    final T item = x.data;
    final Node<T> prev = x.prev;
    final Node<T> next = x.next;

    if (prev == null) {     // x is the head node
        first = next;
    } else {
        prev.next = next;
        x.prev = null;      // help GC
    }

    if (next == null) {     // x is the tail node
        last = prev;
    } else {
        next.prev = prev;
        x.next = null;      // help GC
    }

    x.data = null;
    modCount++;
    size--;
    return item;
}

This code for detaching an arbitrary node from the list is both basic and important, so make sure you know it well.

boolean remove(Object o):

/**
 * Removes the first occurrence of the specified element from this list,
 * if it is present.  If this list does not contain the element, it is
 * unchanged.  More formally, removes the element with the lowest index
 * {@code i} such that
 * <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>
 * (if such an element exists).  Returns {@code true} if this list
 * contained the specified element (or equivalently, if this list
 * changed as a result of the call).
 *
 * @param o element to be removed from this list, if present
 * @return {@code true} if this list contained the specified element
 */
public boolean remove(Object o) {
    if (o == null) {
        for (Node<T> x = first; x != null; x = x.next) {
            if (x.data == null) {
                this.unlink(x);
                return true;
            }
        }
    } else {
        for (Node<T> x = first; x != null; x = x.next) {
            if (o.equals(x.data)) {
                this.unlink(x);
                return true;
            }
        }
    }
    return false;
}

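The heart of this code is the test o == null ? get(i) == null : o.equals(get(i)). When the argument is null, the method still walks the list looking for a node whose data is null, a case that has to be handled separately because calling equals on null would fail. Otherwise it compares with equals as usual. If nothing matches, it returns false.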
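The null branch is observable with java.util.LinkedList, which permits null elements. This sketch uses a made-up helper, removeFirstMatch, that copies the input, removes one occurrence, and returns the result.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class RemoveObjectDemo {
    // Removes the first element equal to target (null is a legal target).
    static List<String> removeFirstMatch(List<String> items, String target) {
        List<String> list = new LinkedList<>(items);
        list.remove(target);
        return list;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", null, "b", null);
        System.out.println(removeFirstMatch(items, null)); // [a, b, null]
        System.out.println(removeFirstMatch(items, "b"));  // [a, null, null]
        System.out.println(removeFirstMatch(items, "z"));  // [a, null, b, null]
    }
}
```

Only the first matching element is removed; later duplicates stay in place.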

boolean addAll(int index,Collection<? extends T> coll):

/**
 * Inserts all of the elements in the specified collection into this
 * list, starting at the specified position.  Shifts the element
 * currently at that position (if any) and any subsequent elements to
 * the right (increases their indices).  The new elements will appear
 * in the list in the order that they are returned by the
 * specified collection's iterator.
 *
 * @param index index at which to insert the first element
 *              from the specified collection
 * @param coll collection containing elements to be added to this list
 * @return {@code true} if this list changed as a result of the call
 * @throws IndexOutOfBoundsException {@inheritDoc}
 * @throws NullPointerException if the specified collection is null
 */
public boolean addAll(int index, Collection<? extends T> coll) {
    checkPositionIndex(index);      // covered later
    Object[] a = coll.toArray();
    int newNum = a.length;          // number of elements to insert
    if (newNum == 0)
        return false;

    Node<T> pre, succ;              // pre: node before the insertion point; succ: node after it
    if (index == size) {            // appending at the end of the list
        pre = last;
        succ = null;
    } else {
        succ = getNode(index);      // succ is the node currently at index
        pre = succ.prev;            // pre is the node before succ
    }

    for (Object o : a) {
        @SuppressWarnings("unchecked")
        T e = (T) o;
        // newNode.data = e; newNode.prev = pre; newNode.next = null;
        Node<T> newNode = new Node<T>(e, pre, null);
        if (pre == null) {          // inserting at index 0
            first = newNode;
        } else {
            pre.next = newNode;
        }
        pre = newNode;              // pre advances to the node just added
    }

    if (succ == null) {             // index == size: pre is now the last new node
        last = pre;
    } else {                        // inserted in the middle: reconnect to succ
        pre.next = succ;
        succ.prev = pre;
    }

    modCount++;
    size += newNum;
    return true;
}
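The observable behaviour of addAll(index, coll) can be tried out on java.util.LinkedList. The helper name insertAll is made up for this sketch; it copies the base list so the inputs are left untouched.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class AddAllDemo {
    // Inserts every element of extra into a copy of base, starting at index.
    static List<Integer> insertAll(List<Integer> base, int index, List<Integer> extra) {
        List<Integer> list = new LinkedList<>(base);
        list.addAll(index, extra);
        return list;
    }

    public static void main(String[] args) {
        List<Integer> base = Arrays.asList(1, 2, 5);
        List<Integer> extra = Arrays.asList(3, 4);
        System.out.println(insertAll(base, 2, extra)); // [1, 2, 3, 4, 5]
        System.out.println(insertAll(base, 0, extra)); // [3, 4, 1, 2, 5]
        System.out.println(insertAll(base, 3, extra)); // [1, 2, 5, 3, 4] (index == size appends)
    }
}
```

An index greater than size is rejected with an IndexOutOfBoundsException, which is the job of the position check described next.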

void linkBefore(T e, Node<T> p):

/**
 * Inserts element e before non-null Node p.
 */
private void linkBefore(T e, Node<T> p) {
    // assert p != null
    final Node<T> pred = p.prev;
    // newNode.data = e; newNode.prev = pred; newNode.next = p;
    final Node<T> newNode = new Node<T>(e, pred, p);
    p.prev = newNode;
    if (pred == null)       // inserting at the head of the list
        first = newNode;
    else
        pred.next = newNode;
    modCount++;
    size++;
}

Next come the fairly robust argument checks:

/**
 * Tells if the argument is the index of an existing element.
 */
private boolean isElementIndex(int index) {
    return index >= 0 && index < size;
}

/**
 * Tells if the argument is the index of a valid position for an
 * iterator or an add operation.
 */
private boolean isPositionIndex(int index) {
    return index >= 0 && index <= size;
}

/**
 * Constructs an IndexOutOfBoundsException detail message.
 * Of the many possible refactorings of the error handling code,
 * this "outlining" performs best with both server and client VMs.
 */
private String outOfBoundsMsg(int index) {
    return "index:" + index + ",size=" + size;
}

private void checkElementIndex(int index) {
    if (!isElementIndex(index))
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

private void checkPositionIndex(int index) {
    if (!isPositionIndex(index))
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

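isElementIndex checks the index of an existing node, which must lie in [0, size-1]. isPositionIndex checks a position index, which may also equal size (addAll() needs this, for example), so its range is [0, size]. When a check fails, the corresponding checkElementIndex or checkPositionIndex method throws an IndexOutOfBoundsException.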
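The one-off difference between the two ranges shows up at index == size. The sketch below restates the two predicates with size passed in explicitly (a change made only so the example is self-contained) and compares them with what java.util.LinkedList accepts; getIsRejected is a made-up helper.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class IndexCheckDemo {
    // Same predicates as above, with size as an explicit parameter.
    static boolean isElementIndex(int index, int size) {
        return index >= 0 && index < size;
    }

    static boolean isPositionIndex(int index, int size) {
        return index >= 0 && index <= size;
    }

    // Returns true if list.get(index) is rejected as out of bounds.
    static boolean getIsRejected(List<?> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new LinkedList<>(Arrays.asList("a", "b", "c"));
        System.out.println(isElementIndex(3, list.size()));  // false: no element at index 3
        System.out.println(isPositionIndex(3, list.size())); // true: 3 is a valid insert position
        System.out.println(getIsRejected(list, 3));          // true
        list.add(3, "d");                                    // allowed because index == size
        System.out.println(list);                            // [a, b, c, d]
    }
}
```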

Node<T> getNode(int index):

/**
 * Returns the (non-null) Node at the specified element index.
 */
private Node<T> getNode(int idx) {
    // assert isElementIndex(idx)
    checkElementIndex(idx);
    if (idx < (size >> 1)) {        // target is in the first half: walk forward from first
        Node<T> p = first;
        for (int i = 0; i < idx; i++) {
            p = p.next;
        }
        return p;
    } else {                        // target is in the second half: walk backward from last
        Node<T> p = last;
        for (int i = size - 1; i > idx; i--) {
            p = p.prev;
        }
        return p;
    }
}

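There is a bit of a halving idea here: because the list is doubly linked, we can also walk backward from the tail, so the lookup starts from whichever end is closer to the target and needs at most about size/2 hops.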
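The saving can be quantified by counting link hops instead of walking real nodes. The sketch below mirrors the two loops of getNode arithmetically; hops and worstCase are names made up for this illustration.

```java
class GetNodeStepsDemo {
    // Number of next/prev hops a getNode-style lookup performs for idx in a list of the given size.
    static int hops(int idx, int size) {
        if (idx < (size >> 1)) {
            return idx;            // forward from first: idx hops
        }
        return size - 1 - idx;     // backward from last: (size - 1 - idx) hops
    }

    // Largest hop count over all valid indices.
    static int worstCase(int size) {
        int worst = 0;
        for (int idx = 0; idx < size; idx++) {
            worst = Math.max(worst, hops(idx, size));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(worstCase(10)); // 4
        System.out.println(worstCase(11)); // 5
    }
}
```

A singly linked list would need up to size - 1 hops for the last element, so starting from the nearer end roughly halves the worst case, although lookup by index is still linear in the size of the list.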

0 0