Internationalization in GemFire/Geode (Part 2)

Source: Internet · Editor: 程序博客网 · Time: 2024/06/15 21:01

Code walkthrough

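We cannot read the source of the commercial product, so we will use the open-source Geode as our example and look at which areas touch on internationalization. (The author used an internally developed, syntax-aware code tool for this.) First, we turn to the writeString and readString methods in DataSerializer.java.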

    public static void writeString(String value, DataOutput out) throws IOException {
      …
      if (value == null) {
        if (isDebugEnabled) {
          logger.trace(LogMarker.SERIALIZER, "Writing NULL_STRING");
        }
        out.writeByte(DSCODE.NULL_STRING);
      } else {
        // Note this part! To limit the performance cost, the code first checks
        // whether the chars are single-byte or multi-byte, and only then decides
        // which write method to use.
        int len = value.length();
        int utfLen = len; // added for bug 40932
        for (int i = 0; i < len; i++) {
          char c = value.charAt(i);
          if ((c <= 0x007F) && (c >= 0x0001)) {
            // nothing needed
          } else if (c > 0x07FF) {
            utfLen += 2;
          } else {
            utfLen += 1;
          }
        }
        boolean writeUTF = utfLen > len;
        if (writeUTF) {
          if (utfLen > 0xFFFF) {
            if (isDebugEnabled) {
              logger.trace(LogMarker.SERIALIZER, "Writing utf HUGE_STRING of len={}", len);
            }
            out.writeByte(DSCODE.HUGE_STRING);
            out.writeInt(len);
            out.writeChars(value);
          } else {
            if (isDebugEnabled) {
              logger.trace(LogMarker.SERIALIZER, "Writing utf STRING of len={}", len);
            }
            out.writeByte(DSCODE.STRING);
            out.writeUTF(value);
          }
        } else {
          if (len > 0xFFFF) {
            if (isDebugEnabled) {
              logger.trace(LogMarker.SERIALIZER, "Writing HUGE_STRING_BYTES of len={}", len);
            }
            out.writeByte(DSCODE.HUGE_STRING_BYTES);
            out.writeInt(len);
            out.writeBytes(value);
          } else {
            if (isDebugEnabled) {
              logger.trace(LogMarker.SERIALIZER, "Writing STRING_BYTES of len={}", len);
            }
            out.writeByte(DSCODE.STRING_BYTES);
            out.writeShort(len);
            out.writeBytes(value);
          }
        }
      }
    }

    public static String readString(DataInput in) throws IOException {
      return InternalDataSerializer.readString(in, in.readByte());
    }
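To make the branching in writeString easier to follow, here is a minimal standalone sketch of the same length check. It does not use Geode at all: the class name StringEncodingChooser and both method names are hypothetical, and the returned strings are just labels borrowed from the DSCODE constant names in the listing, not the actual byte codes.

```java
class StringEncodingChooser {

    /** Counts the bytes the string needs in Java's modified UTF-8, using the same loop as the listing. */
    static int modifiedUtf8Length(String value) {
        int len = value.length();
        int utfLen = len;
        for (int i = 0; i < len; i++) {
            char c = value.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                // one byte, already counted in len
            } else if (c > 0x07FF) {
                utfLen += 2; // three bytes in total
            } else {
                utfLen += 1; // two bytes in total (this includes U+0000)
            }
        }
        return utfLen;
    }

    /** Returns a label for the branch that the writeString listing would take (labels are illustrative). */
    static String chooseFormat(String value) {
        int len = value.length();
        int utfLen = modifiedUtf8Length(value);
        if (utfLen > len) {
            // at least one char needs more than one byte
            return utfLen > 0xFFFF ? "HUGE_STRING" : "STRING";
        }
        // every char fits in a single byte
        return len > 0xFFFF ? "HUGE_STRING_BYTES" : "STRING_BYTES";
    }

    public static void main(String[] args) {
        String ascii = "Geode";
        String cjk = "\u56FD\u9645\u5316"; // three CJK characters: U+56FD U+9645 U+5316
        System.out.println(chooseFormat(ascii)); // STRING_BYTES
        System.out.println(chooseFormat(cjk));   // STRING
        System.out.println(modifiedUtf8Length(cjk)); // 9
    }
}
```

The point of the check is that the cheap writeBytes path is only taken when no information can be lost, because every char was confirmed to fit in one byte beforehand.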

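Next, look at the writeUTF method in HeapDataOutputStream.java. ASCII and non-ASCII characters are clearly handled by different logic here as well: writeAsciiUTF is used when the ASCII_STRINGS flag is set, and otherwise writeFullUTF encodes each char as one, two, or three bytes.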

    public void writeUTF(String str) throws UTFDataFormatException {
      if (this.ignoreWrites)
        return;
      checkIfWritable();
      if (ASCII_STRINGS) {
        writeAsciiUTF(str, true);
      } else {
        writeFullUTF(str, true);
      }
    }

    private void writeFullUTF(String str, boolean encodeLength) throws UTFDataFormatException {
      int strlen = str.length();
      if (encodeLength && strlen > 65535) {
        throw new UTFDataFormatException();
      }
      // Room is reserved here for up to 3 bytes per char, plus 2 bytes for the length.
      // 4-byte sequences are never written: a supplementary character arrives as
      // two surrogate chars, and each one takes the 3-byte branch below.
      {
        int maxLen = (strlen * 3);
        if (encodeLength) {
          maxLen += 2;
        }
        ensureCapacity(maxLen);
      }
      int utfSizeIdx = this.buffer.position();
      if (encodeLength) {
        // skip bytes reserved for length
        this.buffer.position(utfSizeIdx + 2);
      }
      for (int i = 0; i < strlen; i++) {
        int c = str.charAt(i);
        if ((c >= 0x0001) && (c <= 0x007F)) {
          this.buffer.put((byte) c);
        } else if (c > 0x07FF) {
          this.buffer.put((byte) (0xE0 | ((c >> 12) & 0x0F)));
          this.buffer.put((byte) (0x80 | ((c >> 6) & 0x3F)));
          this.buffer.put((byte) (0x80 | ((c >> 0) & 0x3F)));
        } else {
          this.buffer.put((byte) (0xC0 | ((c >> 6) & 0x1F)));
          this.buffer.put((byte) (0x80 | ((c >> 0) & 0x3F)));
        }
      }
      int utflen = this.buffer.position() - utfSizeIdx;
      if (encodeLength) {
        utflen -= 2;
        if (utflen > 65535) {
          // act as if we wrote nothing to this buffer
          this.buffer.position(utfSizeIdx);
          throw new UTFDataFormatException();
        }
        this.buffer.putShort(utfSizeIdx, (short) utflen);
      }
    }
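The one-, two-, and three-byte branches in writeFullUTF match the modified UTF-8 format that the JDK's own DataOutputStream.writeUTF produces, so the byte counts can be observed without Geode. The sketch below uses only java.io as a stand-in for HeapDataOutputStream; the class name ModifiedUtf8Demo and its helper methods are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ModifiedUtf8Demo {

    /** Serializes the string with writeUTF: a 2-byte length prefix followed by the encoded body. */
    static byte[] writeUtf(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeUTF(s);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Size of the encoded body, without the 2-byte length prefix. */
    static int bodyLength(String s) {
        return writeUtf(s).length - 2;
    }

    /** Writes the string with writeUTF and reads it back with readUTF. */
    static String roundTrip(String s) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(writeUtf(s)));
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String cjk = "\u4E2D";           // U+4E2D, a single char
        String emoji = "\uD83D\uDE00";   // U+1F600, stored as a surrogate pair (two chars)
        System.out.println(bodyLength(cjk));    // 3
        System.out.println(bodyLength(emoji));  // 6: each surrogate becomes 3 bytes
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length); // 4 in standard UTF-8
        System.out.println(roundTrip(emoji).equals(emoji)); // true
    }
}
```

So a character outside the Basic Multilingual Plane costs six bytes in this format rather than four, which is worth remembering when estimating how close a string is to the 65535-byte limit.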

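Finally, look at the decode method in UriUtils.java. The code declares public static final String DEFAULT_ENCODING = "UTF-8"; up front, which settles the default-charset question. This is a practice worth following in our own code.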

    public static String decode(final String encodedValue) {
      return decode(encodedValue, DEFAULT_ENCODING);
    }

    public static String decode(String encodedValue, final String encoding) {
      try {
        if (encodedValue != null) {
          String previousEncodedValue;

          do {
            previousEncodedValue = encodedValue;
            encodedValue = URLDecoder.decode(encodedValue, encoding);
          } while (!encodedValue.equals(previousEncodedValue));
        }

        return encodedValue;
      } catch (UnsupportedEncodingException ignore) {
        return encodedValue;
      }
    }
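The effect of pinning the encoding, and of decoding until the value stops changing, can be tried with the JDK alone. The following is a rough sketch rather than Geode's code: the class RepeatedDecodeDemo and its methods encode, decodeFully, and decodeWith are hypothetical names, and only URLEncoder and URLDecoder from java.net are real APIs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class RepeatedDecodeDemo {

    // Declared once, so no call site silently falls back to a platform default.
    static final String DEFAULT_ENCODING = "UTF-8";

    static String encode(String value) {
        try {
            return URLEncoder.encode(value, DEFAULT_ENCODING);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decodes with an explicitly named charset, to show what a wrong charset does. */
    static String decodeWith(String encodedValue, String encoding) {
        try {
            return URLDecoder.decode(encodedValue, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Keeps decoding until the value no longer changes, as the decode listing does. */
    static String decodeFully(String encodedValue) {
        String previous;
        do {
            previous = encodedValue;
            encodedValue = decodeWith(encodedValue, DEFAULT_ENCODING);
        } while (!encodedValue.equals(previous));
        return encodedValue;
    }

    public static void main(String[] args) {
        String original = "\u56FD\u9645\u5316"; // three CJK characters
        String once = encode(original);
        String twice = encode(once);              // encoded a second time by mistake
        System.out.println(once);                 // %E5%9B%BD%E9%99%85%E5%8C%96
        System.out.println(decodeFully(twice).equals(original));          // true
        System.out.println(decodeWith(once, "ISO-8859-1").equals(original)); // false: wrong charset
    }
}
```

One trade-off of the repeat-until-stable loop: a value whose decoded form legitimately contains a percent sign, such as "100%25", decodes to "100%" on the first pass and then makes URLDecoder throw an IllegalArgumentException on the second.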

High-risk areas for internationalization

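In the earlier post on Redis, we committed to memory two methods that are high-risk for internationalization: serialize / deserialize. Today, let's get to know two new faces: toData / fromData.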

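    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.Date;

    import org.apache.geode.DataSerializable;
    import org.apache.geode.DataSerializer;

    public class Player implements DataSerializable {
      private int id;
      private String name;
      private Date birthday;
      private FC club; // FC is the example's own class; its definition is not shown

      @Override
      public void toData(DataOutput out) throws IOException {
        out.writeInt(this.id);
        out.writeUTF(this.name); // UTF-aware write, safe for non-ASCII names
        DataSerializer.writeDate(this.birthday, out);
        DataSerializer.writeObject(this.club, out);
      }

      @Override
      public void fromData(DataInput in) throws IOException, ClassNotFoundException {
        // Fields must be read in the same order they were written.
        this.id = in.readInt();
        this.name = in.readUTF();
        this.birthday = DataSerializer.readDate(in);
        this.club = (FC) DataSerializer.readObject(in);
      }
    }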

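Whenever a field such as display_name or description may contain non-ASCII characters, be sure to write and read it with writeUTF / readUTF. Otherwise, garbled text will be waiting for you with open arms! (〒︿〒)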
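To see what goes wrong when the UTF-aware calls are skipped, the sketch below serializes a display name two ways using only java.io. DataOutputStream.writeBytes keeps just the low eight bits of each char, so anything beyond ASCII is damaged before it ever reaches the wire. The class DisplayNameRoundTrip and its two methods are hypothetical and stand in for a toData / fromData pair.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class DisplayNameRoundTrip {

    /** Writes one byte per char (high byte discarded) and reads the bytes back. */
    static String viaWriteBytes(String displayName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeShort(displayName.length());
            out.writeBytes(displayName); // lossy for any char above 0xFF
            out.flush();

            DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            byte[] raw = new byte[in.readUnsignedShort()];
            in.readFully(raw);
            return new String(raw, StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes with writeUTF and reads with readUTF, as in the Player example. */
    static String viaWriteUtf(String displayName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeUTF(displayName);
            out.flush();

            DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String ascii = "Messi";
        String cjk = "\u6885\u897F"; // a two-character CJK name: U+6885 U+897F
        System.out.println(viaWriteBytes(ascii).equals(ascii)); // true: ASCII survives
        System.out.println(viaWriteBytes(cjk).equals(cjk));     // false: high bytes were dropped
        System.out.println(viaWriteUtf(cjk).equals(cjk));       // true
    }
}
```

This is also why the writeString listing only takes its writeBytes path after confirming that every char fits in a single byte.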