A JVM Class a Day: String

Source: Internet | Editor: 程序博客网 | Date: 2024/04/30 12:51

java.lang.String turns out to have more than 3,000 lines of code. I had seriously underestimated it.

The String class represents character strings. All string literals in Java programs, such as "abc", are implemented as instances of this class.

Strings are constant; their values cannot be changed after they are created. String buffers support mutable strings.

Because String objects are immutable they can be shared.

For example, String str = "abc"; is equivalent to:

    char data[] = {'a', 'b', 'c'};
    String str = new String(data);

Here are some more examples of how strings can be used:

    System.out.println("abc");
    String cde = "cde";
    System.out.println("abc" + cde);
    String c = "abc".substring(2,3);
    String d = cde.substring(1, 2);
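To make the sharing and immutability claims concrete, here is a small sketch. The class and method names are made up for illustration. Note that "equivalent" in the Javadoc refers to content: a literal comes from the string pool, while new String(data) allocates a separate object.

```java
class SharedLiteralDemo {
    // Two literals with the same content refer to one pooled instance.
    static boolean literalsShareInstance() {
        String a = "abc";
        String b = "abc";
        return a == b;
    }

    // new String(char[]) allocates a fresh object with equal content.
    static boolean copyIsDistinctButEqual() {
        char[] data = {'a', 'b', 'c'};
        String copy = new String(data);
        return copy != "abc" && copy.equals("abc");
    }

    // substring returns a new String and leaves the receiver unchanged.
    static String originalAfterSubstring() {
        String cde = "cde";
        String d = cde.substring(1, 2);
        return cde + "/" + d;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());   // true
        System.out.println(copyIsDistinctButEqual());  // true
        System.out.println(originalAfterSubstring());  // cde/d
    }
}
```

This is why comparing strings with equals is the safe default: == only tells you whether two references point at the same object.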

The String class includes methods for examining individual characters of the sequence, for comparing strings, for searching strings, for extracting substrings, and for creating a copy of a string with all characters translated to uppercase or to lowercase.

Case mapping is based on the Unicode Standard version specified by the Character class.

The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings.

String concatenation is implemented through the StringBuilder (or StringBuffer) class and its append method.
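As a rough illustration of what that means, the two methods below produce the same result. This is only a sketch with invented names. The hand-written builder version mirrors the Javadoc's description, but a compiler is free to pick a different strategy, and newer JDKs do.

```java
class ConcatDemo {
    // Concatenation written with the + operator.
    static String withPlus(String name, int count) {
        return "name=" + name + ", count=" + count;
    }

    // Hand-written StringBuilder chain that yields the same text.
    static String withBuilder(String name, int count) {
        return new StringBuilder()
                .append("name=").append(name)
                .append(", count=").append(count)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("cde", 3));    // name=cde, count=3
        System.out.println(withBuilder("cde", 3)); // name=cde, count=3
    }
}
```

The int argument is converted to text by append, which is the "conversion of other objects to strings" mentioned above.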

A String represents a string in the UTF-16 format in which supplementary characters are represented by surrogate pairs (see the section "Unicode Character Representations" in the Character class for more information). Index values refer to char code units, so a supplementary character uses two positions in a String.
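A quick way to see the "two positions" rule is to build a string from one supplementary code point. The sketch below uses U+1F600 as an arbitrary example, and the class name is invented.

```java
class SurrogateDemo {
    // U+1F600 lies outside the Basic Multilingual Plane,
    // so UTF-16 stores it as a surrogate pair.
    static final String SMILE = new String(Character.toChars(0x1F600));

    public static void main(String[] args) {
        System.out.println(SMILE.length());                               // 2 char code units
        System.out.println(SMILE.codePointCount(0, SMILE.length()));      // 1 code point
        System.out.println(Character.isHighSurrogate(SMILE.charAt(0)));   // true
        System.out.println(Character.isLowSurrogate(SMILE.charAt(1)));    // true
    }
}
```

So length() counts char code units, not user-visible characters.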

    public final class String implements java.io.Serializable, Comparable<String>, CharSequence

String is an immutable class and, being final, cannot be subclassed. It implements the CharSequence interface, which declares four abstract methods:

    int length();
    char charAt(int index);
    CharSequence subSequence(int start, int end);
    public String toString();
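Because String is just one CharSequence implementation, code written against the interface also accepts StringBuilder and friends. A minimal sketch (the helper name count is made up):

```java
class CharSequenceDemo {
    // Counts occurrences of target using only length() and charAt().
    static int count(CharSequence seq, char target) {
        int n = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (seq.charAt(i) == target) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("banana", 'a'));                    // 3
        System.out.println(count(new StringBuilder("banana"), 'a')); // 3
    }
}
```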

A char array that is never modified after construction, plus a cached hash code:

    private final char value[];
    private int hash; // Default to 0
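The excerpt does not show how the hash field is used, so the following is only a sketch of the lazy-caching idea with a hypothetical class. The formula s[0]*31^(n-1) + ... + s[n-1] is the one documented for String.hashCode. Note that final only fixes the array reference. Immutability also depends on never handing the array out.

```java
final class CachedHashText {
    private final char[] value; // reference is fixed after construction
    private int hash;           // 0 means "not computed yet"

    CachedHashText(String text) {
        this.value = text.toCharArray(); // private copy, never exposed
    }

    // Computes the polynomial hash once, then reuses the stored result.
    int cachedHash() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            for (char c : value) {
                h = 31 * h + c;
            }
            hash = h;
        }
        return h;
    }

    public static void main(String[] args) {
        CachedHashText t = new CachedHashText("abc");
        System.out.println(t.cachedHash() == "abc".hashCode()); // true
    }
}
```

Caching is safe here only because the content can never change after construction.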

Class String is special cased within the Serialization Stream Protocol.

A String instance is written initially into an ObjectOutputStream in the following format: TC_STRING (utf String)

The String is written by method DataOutput.writeUTF.

A new handle is generated to refer to all future references to the string instance within the stream.
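The special casing can be observed by serializing a string and inspecting the raw bytes. The sketch below (class and helper names are invented) checks for the TC_STRING marker right after the 4-byte stream header, and shows that writing the same instance a second time produces a back-reference instead of a second copy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

class StringStreamDemo {
    // Serializes the given objects into one stream and returns the raw bytes.
    static byte[] serialize(Object... objects) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        String s = "abc";
        byte[] once = serialize(s);
        byte[] twice = serialize(s, s);
        // Byte 4 follows the stream header (2-byte magic + 2-byte version).
        System.out.println(once[4] == ObjectStreamConstants.TC_STRING);
        // The second write starts with a reference marker, not the text again.
        System.out.println(twice[once.length] == ObjectStreamConstants.TC_REFERENCE);
    }
}
```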

    private static final ObjectStreamField[] serialPersistentFields =
            new ObjectStreamField[0];

Default constructor: creates a zero-length array.

    public String() {
        this.value = new char[0];
    }

Copy constructor: shares the original's array, which is safe because the array is never modified.

    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }

Constructs a String from a char array. The array is copied:

    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }
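The Arrays.copyOf call is what keeps the string immutable even when the caller keeps a reference to the source array. A small check, with an invented class name:

```java
class DefensiveCopyDemo {
    // Builds a String from an array, then mutates the array afterwards.
    static String buildThenMutate() {
        char[] data = {'a', 'b', 'c'};
        String s = new String(data);
        data[0] = 'X'; // only the caller's array changes
        return s;
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate()); // abc
    }
}
```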

Copies part of a char array. count is the number of chars to copy:

    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset + count);
    }
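The comment about -1>>>1 (which equals Integer.MAX_VALUE) hints at why the check is written as offset > value.length - count rather than offset + count > value.length: the addition could overflow. The sketch below compares the two styles using hypothetical helper methods, then exercises the real constructor.

```java
class BoundsCheckDemo {
    // Naive check: offset + count can overflow and wrap to a negative number.
    static boolean naiveInRange(int length, int offset, int count) {
        return offset >= 0 && count >= 0 && offset + count <= length;
    }

    // Check in the style of the constructor: subtraction cannot overflow here
    // because length and count are both non-negative.
    static boolean safeInRange(int length, int offset, int count) {
        return offset >= 0 && count >= 0 && offset <= length - count;
    }

    public static void main(String[] args) {
        int huge = -1 >>> 1; // Integer.MAX_VALUE
        System.out.println(naiveInRange(3, huge, 1)); // true (wrongly accepted)
        System.out.println(safeInRange(3, huge, 1));  // false

        char[] data = {'a', 'b', 'c'};
        System.out.println(new String(data, 1, 2));   // bc
        try {
            new String(data, 2, 5);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("rejected");
        }
    }
}
```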

Allocates a new String that contains characters from a subarray of the Unicode code point array argument. The offset argument is the index of the first code point of the subarray, and the count argument specifies the length of the subarray.

    public String(int[] codePoints, int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        final int end = offset + count;

        // Pass 1: Compute precise size of char[]
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
        final char[] v = new char[n];
        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            else
                Character.toSurrogates(c, v, j++);
        }

        this.value = v;
    }
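The two passes can be observed from the outside: each supplementary code point adds one extra char, and an invalid code point is rejected. A minimal sketch with invented helper names:

```java
class CodePointArrayDemo {
    // Builds a String from all of the given code points.
    static String build(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    // Returns true if the constructor rejects the given code point.
    static boolean rejects(int codePoint) {
        try {
            build(codePoint);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // 'a' and 'b' take one char each, U+1F600 takes two.
        System.out.println(build(0x61, 0x1F600, 0x62).length()); // 4
        // 0x110000 is one past the largest valid code point.
        System.out.println(rejects(0x110000));                   // true
    }
}
```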

Converting bytes to a string:

    public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }

    public String(byte bytes[], int offset, int length, Charset charset) {
        if (charset == null)
            throw new NullPointerException("charset");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charset, bytes, offset, length);
    }

    public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        this(bytes, 0, bytes.length, charsetName);
    }

    public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }

    public String(byte bytes[], int offset, int length) {
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(bytes, offset, length);
    }

    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }
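From the caller's side, the main difference between these overloads is how the charset is named. Passing a Charset object needs no checked exception, while passing a name can fail with UnsupportedEncodingException. The sketch below uses invented helper names, and "no-such-charset" is simply a name assumed not to be installed.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Decodes with an explicit Charset object: no checked exception.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decodes with a charset name and reports whether the name was accepted.
    static boolean nameIsSupported(String charsetName) {
        try {
            new String(new byte[] {0x61}, charsetName);
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] eAcute = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 for U+00E9
        System.out.println(decodeUtf8(eAcute).equals("\u00e9")); // true
        System.out.println(nameIsSupported("UTF-8"));            // true
        System.out.println(nameIsSupported("no-such-charset"));  // false
    }
}
```

The overloads without any charset fall back to the platform default, so the result may vary between machines.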

Constructing from a StringBuffer (synchronized on the buffer) or a StringBuilder:

    public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

    public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }

    public int length() {
        return value.length;
    }

    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }

Returns the character (Unicode code point) at the specified index. The index refers to char values (Unicode code units) and ranges from 0 to length() - 1.

    public int codePointAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointAtImpl(value, index, value.length);
    }
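The difference between charAt and codePointAt only shows up with supplementary characters. In the sketch below (names invented, U+1F600 chosen arbitrarily), the index still counts char code units, so walking a string by code point has to step by Character.charCount.

```java
class CodePointAtDemo {
    // "a", then U+1F600 as a surrogate pair, then "b": four chars in total.
    static final String TEXT = "a" + new String(Character.toChars(0x1F600)) + "b";

    // Walks the string one code point at a time.
    static int countCodePoints(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i += Character.charCount(s.codePointAt(i))) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(TEXT.codePointAt(1) == 0x1F600); // true: full code point
        System.out.println(TEXT.charAt(1) == 0xD83D);       // true: high surrogate only
        System.out.println(TEXT.codePointAt(2) == 0xDE00);  // true: lone low surrogate
        System.out.println(countCodePoints(TEXT));          // 3
    }
}
```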

Copies characters from this string into dst starting at dstBegin. This method does not perform any range checking.

    void getChars(char dst[], int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, value.length);
    }

Returning a byte array:

    public byte[] getBytes(String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null) throw new NullPointerException();
        return StringCoding.encode(charsetName, value, 0, value.length);
    }

    public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }

    public byte[] getBytes() {
        return StringCoding.encode(value, 0, value.length);
    }
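The same string can encode to different byte lengths depending on the charset, which is why the no-argument getBytes() (platform default charset) is best avoided when the bytes leave the process. A small sketch with an invented class name:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class GetBytesDemo {
    // Number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // one char, U+00E9
        System.out.println(encodedLength(eAcute, StandardCharsets.UTF_8));      // 2
        System.out.println(encodedLength(eAcute, StandardCharsets.ISO_8859_1)); // 1
        System.out.println(encodedLength(eAcute, StandardCharsets.UTF_16BE));   // 2
    }
}
```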

To Be Continued...

0 0