The substring() Method of String in JDK 6 and JDK 7

Source: Internet. Published by: 米兰治安知乎. Editor: 程序博客网. Time: 2024/04/29 21:34

Reposted from: http://www.programcreek.com/2013/09/the-substring-method-in-jdk-6-and-jdk-7/
References: http://www.importnew.com/7656.html
http://book.51cto.com/art/201504/472208.htm

The substring(int beginIndex, int endIndex) method is implemented differently in JDK 6 and JDK 7.

What substring() does

The substring(int beginIndex, int endIndex) method returns a string that starts at index beginIndex and ends at index endIndex - 1.

String x = "abcdef";
x = x.substring(1, 3);
System.out.println(x);

Output: bc
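The index rules are easiest to get wrong at the boundaries, so the following is a small sketch of the half-open range [beginIndex, endIndex). The class name SubstringBasics is made up for this illustration; the String calls are the standard ones.

```java
public class SubstringBasics {
    public static void main(String[] args) {
        String x = "abcdef";

        // Characters at indices 1 and 2; index 3 is excluded.
        System.out.println(x.substring(1, 3));           // bc

        // The result length is always endIndex - beginIndex.
        System.out.println(x.substring(1, 3).length());  // 2

        // Equal indices yield the empty string.
        System.out.println(x.substring(2, 2).isEmpty()); // true

        // An endIndex past the end of the string is rejected.
        try {
            x.substring(1, 7);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("out of range");
        }
    }
}
```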

What happens when substring() is called

You might assume that once x is assigned the result of x.substring(1,3), it points to a completely new string.

However, that is not entirely correct.

substring() in JDK 6

A String is essentially backed by a character array.
In JDK 6, the String class consists mainly of three fields:

private final char value[];
private final int offset;
private final int count;

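When substring() is called, a new String object is created, but its value field still refers to the same character array on the heap. The two strings differ only in their count and offset.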


// JDK 6
public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > count) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    if (beginIndex > endIndex) {
        throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
    }
    // Reuse this instance for the full range; otherwise share the backing array.
    return ((beginIndex == 0) && (endIndex == count)) ? this
            : new String(offset + beginIndex, endIndex - beginIndex, value);
}

As you can see, the implementation of substring() calls a String constructor to create the new String.

The JDK 6 String constructor:

String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}

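The memory leak originates in this constructor: the new String's value simply reuses the reference to the original String's value array (value is a final array), and only offset and count are changed.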

Once the original String has been garbage-collected, the part of value that the substring does not use is wasted space.
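The sharing behaviour cannot be observed directly on a current JDK, so the sketch below models it with a hypothetical class. SharedString and SharedArraySketch are names invented for this illustration and are not the real java.lang.String; only the three fields and the constructor shape follow the JDK 6 layout described above.

```java
public class SharedArraySketch {

    /** Hypothetical stand-in for the JDK 6 String layout; not the real class. */
    static final class SharedString {
        final char[] value;  // backing array, possibly shared with other instances
        final int offset;    // index of this string's first character in value
        final int count;     // number of characters in this string

        SharedString(String s) {
            this(0, s.length(), s.toCharArray());
        }

        // Same shape as the JDK 6 constructor: stores the reference, copies nothing.
        SharedString(int offset, int count, char[] value) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        SharedString substring(int beginIndex, int endIndex) {
            if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
                throw new StringIndexOutOfBoundsException();
            }
            return new SharedString(offset + beginIndex, endIndex - beginIndex, value);
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    public static void main(String[] args) {
        SharedString big = new SharedString(new String(new char[1000000]).replace('\0', 'a'));
        SharedString small = big.substring(1, 20);

        System.out.println(small.count);               // 19
        System.out.println(small.value == big.value);  // true
        System.out.println(small.value.length);        // 1000000
    }
}
```

In this model, small needs only 19 characters but keeps the full million-character array reachable for as long as small itself is reachable.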

Suppose you have a very large string but only ever use a small part of it via substring(). This causes a performance problem: you need only a small part, yet you keep the whole array alive.

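Take x = x.substring(1,20): the result is only 19 characters long and the rest of the value array is never used, but because value is still referenced, the unused part cannot be reclaimed.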

For JDK 6, the solution is the following, which makes x point to a real substring with its own array:
x = x.substring(beginIndex, endIndex) + "";

substring() in JDK 7

In JDK 7, substring() creates a new array on the heap.


// JDK 7
public String substring(int beginIndex, int endIndex) {
    // check boundary
    int subLen = endIndex - beginIndex;
    return new String(value, beginIndex, subLen);
}

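JDK 7 reworked String and dropped the offset and count fields. The content of a String is now determined by value alone, and the value array itself holds exactly the characters of that String.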

It is this change that resolves the memory leak.

// Constructor
public String(char value[], int offset, int count) {
    // check boundary
    // copy only the part that is needed
    this.value = Arrays.copyOfRange(value, offset, offset + count);
}
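To make the effect of the copy concrete, here is a minimal sketch of the copying strategy using a hypothetical class. CopiedString and CopyingSketch are invented names, not the real java.lang.String; the only real API relied on is Arrays.copyOfRange.

```java
import java.util.Arrays;

public class CopyingSketch {

    /** Hypothetical stand-in for the JDK 7 String layout; not the real class. */
    static final class CopiedString {
        final char[] value;  // holds exactly this string's characters

        CopiedString(String s) {
            this.value = s.toCharArray();
        }

        // Same shape as the JDK 7 constructor: copies only the requested range.
        CopiedString(char[] value, int offset, int count) {
            this.value = Arrays.copyOfRange(value, offset, offset + count);
        }

        CopiedString substring(int beginIndex, int endIndex) {
            if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
                throw new StringIndexOutOfBoundsException();
            }
            return new CopiedString(value, beginIndex, endIndex - beginIndex);
        }

        @Override
        public String toString() {
            return new String(value);
        }
    }

    public static void main(String[] args) {
        CopiedString big = new CopiedString(new String(new char[1000000]).replace('\0', 'a'));
        CopiedString small = big.substring(1, 20);

        System.out.println(small.value.length);        // 19
        System.out.println(small.value == big.value);  // false
    }
}
```

Because small owns a 19-character array of its own, the large array becomes collectable as soon as big is no longer referenced.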

One comment on the topic puts it this way:

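Strictly speaking, the difference between JDK 6 and JDK 7 lies in both String() and substring(): it is the constructor's implementation that changed.
String sub = old.substring(begin, end); In JDK 6 this call leaves sub.value and old.value referring to the same array in memory, so sub.value == old.value is true. In JDK 7 it is false.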

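So the JDK 6 way of taking a substring is in fact the efficient one, since nothing is copied. The JDK 7 String(char[], int, int) constructor calls Arrays.copyOfRange, and although the copy is ultimately carried out by the native System.arraycopy, it still has a cost when the string is large. In short, the JDK 7 change is more about safety than about performance.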
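A rough way to see where the copying cost comes from is to count the characters that would be copied by a loop that repeatedly strips the first character. This is only a sketch: CopyCostSketch and charsCopied are hypothetical names, and the count is a proxy for work done under a copying substring(), not a measurement of any particular JDK.

```java
import java.util.Arrays;

public class CopyCostSketch {

    // Repeatedly drops the first character and totals the lengths of the
    // substrings produced, i.e. the characters a copying substring() must copy.
    static long charsCopied(String s) {
        long copied = 0;
        while (!s.isEmpty()) {
            s = s.substring(1);
            copied += s.length();
        }
        return copied;
    }

    public static void main(String[] args) {
        char[] chars = new char[1000];
        Arrays.fill(chars, 'a');
        System.out.println(charsCopied(new String(chars))); // 499500
    }
}
```

For a string of n characters the loop copies n(n-1)/2 characters in total, whereas the shared-array approach would copy none. That is the trade-off the comment describes.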

0 0