SimpleDateFormat Source Code Analysis


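Today a friend mentioned that when a date string is parsed and one of its fields is out of range, the resulting time shifts instead of the parse failing.

I found the question interesting and went to read the source code. This post is built around that question, so let's get to the substance.

When parsing a date and time, we often write something like the following: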

import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

/**
 * Created by 优华 on 7/2/2015.
 */
public class testDateFormat {
    public static void main(String[] args) {
        DateFormat df = new SimpleDateFormat("yyyyMMdd-HH:mm");
        try {
            Date d = df.parse("20150619-240:60");
            System.out.println(d);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
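To make the shift visible, here is a minimal sketch that parses the same input and formats the result back with the same pattern. The class name LenientOverflowDemo and the method roundTrip are made up for illustration, and pinning the formatter to Locale.US and UTC is my own choice so the output does not depend on the machine. It also shows setLenient(false), which makes the same input fail instead of rolling over.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of the JDK or of the original post.
class LenientOverflowDemo {

    // Parses the text with "yyyyMMdd-HH:mm" and formats the result back
    // with the same formatter. Returns "unparseable" if parsing fails.
    static String roundTrip(String text, boolean lenient) {
        SimpleDateFormat df = new SimpleDateFormat("yyyyMMdd-HH:mm", Locale.US);
        df.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: fixed zone for a stable result
        df.setLenient(lenient);
        try {
            return df.format(df.parse(text));
        } catch (ParseException e) {
            return "unparseable";
        }
    }

    public static void main(String[] args) {
        // Lenient is the default: 240 hours and 60 minutes are carried over.
        System.out.println(roundTrip("20150619-240:60", true));
        // Strict mode rejects the out-of-range hour and minute.
        System.out.println(roundTrip("20150619-240:60", false));
    }
}
```

In lenient mode 240 hours become ten days and 60 minutes become one more hour, so the date moves from 19 June to 29 June at 01:00.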

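As we can see, the process has two parts: compiling the pattern and parsing the string.

I won't walk through pattern compilation in detail; I'll only cover the points worth noting.

When the SimpleDateFormat constructor runs, it calls a method whose name starts with init (initialize in the JDK source), and that method calls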

private char[] compile(String pattern)

The compiled result is stored in

transient private char[] compiledPattern;

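A Java char occupies two bytes, so each element of the compiled pattern holds 16 bits of information.

Here is how those 16 bits are laid out.

Low 8 bits: for an ordinary pattern letter such as y/Y (year), they record how many times the letter is repeated. This count is not arbitrary: as we will see when parsing the string, it can decide how many characters a field is allowed to read.

For a delimiter, the low 8 bits hold the character itself, for example 100000 for a space and 111010 for ':' (both in binary).

High 8 bits: for an ordinary pattern letter they record the letter's tag, for example 1 for y (year) and 10 for M (both in binary).

More precisely, the tag is the index of the letter in
static final String patternChars = "GyMdkHmsSEDFwWahKzZYuXL";

For the special characters space and ':' the tag is 1100100 in both cases (decimal 100, which is TAG_QUOTE_ASCII_CHAR in the source); I did not test other characters.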
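The layout can be checked by hand with a few lines of bit arithmetic. The sketch below is my own illustration, not JDK code: the class CompiledPatternEntryDemo and its methods are hypothetical, it only covers the short form (a count below 255), and it reuses the patternChars string and the value 100 for TAG_QUOTE_ASCII_CHAR quoted above.

```java
// Hypothetical demo class; it mirrors the 16-bit layout described above.
class CompiledPatternEntryDemo {
    static final String PATTERN_CHARS = "GyMdkHmsSEDFwWahKzZYuXL";
    static final int TAG_QUOTE_ASCII_CHAR = 100;

    // Tag (index in PATTERN_CHARS) in the high 8 bits, repeat count in the low 8 bits.
    static char encodeLetter(char letter, int count) {
        int tag = PATTERN_CHARS.indexOf(letter);
        if (tag < 0 || count < 1 || count >= 255) {
            throw new IllegalArgumentException("unsupported letter or count");
        }
        return (char) (tag << 8 | count);
    }

    // For an ASCII delimiter the low 8 bits carry the character itself.
    static char encodeDelimiter(char c) {
        if (c >= 128) {
            throw new IllegalArgumentException("ASCII delimiters only");
        }
        return (char) (TAG_QUOTE_ASCII_CHAR << 8 | c);
    }

    static int tagOf(char entry) {
        return entry >>> 8;
    }

    static int lowByteOf(char entry) {
        return entry & 0xff;
    }

    public static void main(String[] args) {
        char[] entries = {
            encodeLetter('y', 4), encodeLetter('M', 2), encodeDelimiter(' '), encodeDelimiter(':')
        };
        for (char entry : entries) {
            System.out.println(Integer.toBinaryString(tagOf(entry))
                    + " | " + Integer.toBinaryString(lowByteOf(entry)));
        }
    }
}
```

Decoding uses the same two expressions that appear in the JDK parse method: entry >>> 8 for the tag and entry & 0xff for the low byte.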

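I'll paste a test program at the end of the post; apart from main, its methods are copied from the JDK source.

That is all for pattern compilation. Next comes the parse method.

DateFormat's parse(String) calls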

public Date parse(String text, ParsePosition pos)

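The ParsePosition passed in starts at index 0. This method is abstract in DateFormat, and SimpleDateFormat provides the implementation.

Here is the method body: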

public Date parse(String text, ParsePosition pos) {
    checkNegativeNumberExpression();

    int start = pos.index;
    int oldStart = start;
    int textLength = text.length();

    boolean[] ambiguousYear = {false};

    CalendarBuilder calb = new CalendarBuilder();

    for (int i = 0; i < compiledPattern.length; ) {
        int tag = compiledPattern[i] >>> 8;
        int count = compiledPattern[i++] & 0xff;
        if (count == 255) {
            count = compiledPattern[i++] << 16;
            count |= compiledPattern[i++];
        }

        switch (tag) {
        case TAG_QUOTE_ASCII_CHAR:
            if (start >= textLength || text.charAt(start) != (char)count) {
                pos.index = oldStart;
                pos.errorIndex = start;
                return null;
            }
            start++;
            break;

        case TAG_QUOTE_CHARS:
            while (count-- > 0) {
                if (start >= textLength || text.charAt(start) != compiledPattern[i++]) {
                    pos.index = oldStart;
                    pos.errorIndex = start;
                    return null;
                }
                start++;
            }
            break;

        default:
            // Peek the next pattern to determine if we need to
            // obey the number of pattern letters for
            // parsing. It's required when parsing contiguous
            // digit text (e.g., "20010704") with a pattern which
            // has no delimiters between fields, like "yyyyMMdd".
            boolean obeyCount = false;

            // In Arabic, a minus sign for a negative number is put after
            // the number. Even in another locale, a minus sign can be
            // put after a number using DateFormat.setNumberFormat().
            // If both the minus sign and the field-delimiter are '-',
            // subParse() needs to determine whether a '-' after a number
            // in the given text is a delimiter or is a minus sign for the
            // preceding number. We give subParse() a clue based on the
            // information in compiledPattern.
            boolean useFollowingMinusSignAsDelimiter = false;

            if (i < compiledPattern.length) {
                int nextTag = compiledPattern[i] >>> 8;
                if (!(nextTag == TAG_QUOTE_ASCII_CHAR ||
                      nextTag == TAG_QUOTE_CHARS)) {
                    obeyCount = true;
                }

                if (hasFollowingMinusSign &&
                    (nextTag == TAG_QUOTE_ASCII_CHAR ||
                     nextTag == TAG_QUOTE_CHARS)) {
                    int c;
                    if (nextTag == TAG_QUOTE_ASCII_CHAR) {
                        c = compiledPattern[i] & 0xff;
                    } else {
                        c = compiledPattern[i+1];
                    }

                    if (c == minusSign) {
                        useFollowingMinusSignAsDelimiter = true;
                    }
                }
            }
            start = subParse(text, start, tag, count, obeyCount,
                             ambiguousYear, pos,
                             useFollowingMinusSignAsDelimiter, calb);
            if (start < 0) {
                pos.index = oldStart;
                return null;
            }
        }
    }

    // At this point the fields of Calendar have been set.  Calendar
    // will fill in default values for missing fields when the time
    // is computed.

    pos.index = start;

    Date parsedDate;
    try {
        parsedDate = calb.establish(calendar).getTime();
        // If the year value is ambiguous,
        // then the two-digit year == the default start year
        if (ambiguousYear[0]) {
            if (parsedDate.before(defaultCenturyStart)) {
                parsedDate = calb.addYear(100).establish(calendar).getTime();
            }
        }
    }
    // An IllegalArgumentException will be thrown by Calendar.getTime()
    // if any fields are out of range, e.g., MONTH == 17.
    catch (IllegalArgumentException e) {
        pos.errorIndex = start;
        pos.index = oldStart;
        return null;
    }

    return parsedDate;
}

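As we can see, the method walks through the precompiled compiledPattern and uses it to parse the string.

For each compiledPattern[i] it splits out the high 8 bits (the tag) and the low 8 bits (the count). If the entry is a delimiter, it goes through the dedicated quote cases, which compare the text literally; if the entry is a pattern field, it goes to the default branch, which ends by calling subParse. The parsed fields are collected in the CalendarBuilder variable calb, from which the Date is finally built and returned.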
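The delimiter cases are easy to observe from the outside through the ParsePosition. The following is a small sketch under my own assumptions: ParsePositionDemo and describe are hypothetical names, and Locale.US is pinned only to keep the run reproducible. When a delimiter in the text does not match the pattern, parse returns null, resets the index to where it started, and records the offending position in the error index.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical demo class for inspecting ParsePosition after a parse.
class ParsePositionDemo {

    // Parses the text from index 0 and reports the state of the ParsePosition.
    static String describe(String pattern, String text) {
        SimpleDateFormat df = new SimpleDateFormat(pattern, Locale.US);
        ParsePosition pos = new ParsePosition(0);
        Date d = df.parse(text, pos);
        return (d == null ? "failed" : "ok")
                + " index=" + pos.getIndex()
                + " errorIndex=" + pos.getErrorIndex();
    }

    public static void main(String[] args) {
        // The '-' in the pattern matches the text: the whole string is consumed.
        System.out.println(describe("yyyy-MM", "2015-06"));
        // The text has '/' where the pattern expects '-': failure at index 4.
        System.out.println(describe("yyyy-MM", "2015/06"));
    }
}
```

This matches the TAG_QUOTE_ASCII_CHAR case in the listing: pos.index is restored to oldStart and pos.errorIndex is set to the position of the mismatch.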

A few variables in the default branch deserve attention:

boolean obeyCount = false;

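This flag says whether the field must obey count, and whether it is set is the main cause of the problem described at the start.

It is true when there is no delimiter between this field and the next one and this field is not the last one in the pattern; otherwise it is false.

The flag matters most for a pattern such as yyyyMMdd with an input such as 201510101. DateFormat parsing is lenient by default and lets a field overflow (a seconds value of 86400, for instance); the Calendar that calb populates carries the excess into the next larger field.

With this restriction, yyyy reads only 4 digits and MM reads only 2, and the remaining 3 digits are assigned to dd because it is the last field. How that happens is covered below; let's keep reading.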
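Here is a quick way to see what happens to 201510101, as a sketch rather than anything from the JDK: ContiguousFieldsDemo and parseToIso are hypothetical names, and Locale.US and UTC are my own choices to keep the output stable. The parsed date is printed with a delimited pattern so the rollover is easy to read.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class for patterns without delimiters between fields.
class ContiguousFieldsDemo {

    // Parses the text with the given pattern and prints the result as yyyy-MM-dd.
    // Returns "unparseable" if parsing fails.
    static String parseToIso(String pattern, String text) {
        TimeZone utc = TimeZone.getTimeZone("UTC"); // assumption: fixed zone
        SimpleDateFormat in = new SimpleDateFormat(pattern, Locale.US);
        in.setTimeZone(utc);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        out.setTimeZone(utc);
        try {
            return out.format(in.parse(text));
        } catch (ParseException e) {
            return "unparseable";
        }
    }

    public static void main(String[] args) {
        // yyyy reads "2015", MM reads "10", dd is the last field and reads "10".
        System.out.println(parseToIso("yyyyMMdd", "20151010"));
        // Same pattern, one extra digit: dd reads "101" and the date rolls forward.
        System.out.println(parseToIso("yyyyMMdd", "201510101"));
    }
}
```

Day 101 of October 2015 is 100 days after 1 October, which lands on 9 January 2016.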

boolean useFollowingMinusSignAsDelimiter = false;

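This flag tells subParse that a '-' following a number should be treated as a delimiter rather than as that number's minus sign.

You may wonder why this is needed.

Try parsing 200512-60 with yyyyMMdd: the dd field is read as -60, so the result falls 60 days before the last day of November 2005.

Since a field value may be negative, a '-' in the text can be ambiguous, and that is why this flag exists.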
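The negative day can be confirmed with a ParsePosition, which also shows that the '-' was consumed as part of the number. This is only a sketch: NegativeDayDemo and describe are hypothetical names, and Locale.US and UTC are pinned by me.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class for a negative value in the last field.
class NegativeDayDemo {

    // Parses the text with yyyyMMdd and reports the date and how many
    // characters of the input were consumed.
    static String describe(String text) {
        TimeZone utc = TimeZone.getTimeZone("UTC"); // assumption: fixed zone
        SimpleDateFormat in = new SimpleDateFormat("yyyyMMdd", Locale.US);
        in.setTimeZone(utc);
        ParsePosition pos = new ParsePosition(0);
        Date d = in.parse(text, pos);
        if (d == null) {
            return "failed at " + pos.getErrorIndex();
        }
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        out.setTimeZone(utc);
        return out.format(d) + " consumed=" + pos.getIndex();
    }

    public static void main(String[] args) {
        // dd is the last field, so it reads "-60" as one signed number.
        System.out.println(describe("200512-60"));
    }
}
```

All nine characters are consumed, and day -60 of December 2005 resolves to 1 October 2005.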

start = subParse(text, start, tag, count, obeyCount,
                 ambiguousYear, pos,
                 useFollowingMinusSignAsDelimiter, calb);

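This is the final call to subParse. Its return value is the position where the next field starts, and the arguments are the ones described above.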

private int subParse(String text, int start, int patternCharIndex, int count,
                     boolean obeyCount, boolean[] ambiguousYear,
                     ParsePosition origPos,
                     boolean useFollowingMinusSignAsDelimiter, CalendarBuilder calb) {
    Number number;
    int value = 0;
    ParsePosition pos = new ParsePosition(0);
    pos.index = start;
    if (patternCharIndex == PATTERN_WEEK_YEAR && !calendar.isWeekDateSupported()) {
        // use calendar year 'y' instead
        patternCharIndex = PATTERN_YEAR;
    }
    int field = PATTERN_INDEX_TO_CALENDAR_FIELD[patternCharIndex];

    // If there are any spaces here, skip over them.  If we hit the end
    // of the string, then fail.
    for (;;) {
        if (pos.index >= text.length()) {
            origPos.errorIndex = start;
            return -1;
        }
        char c = text.charAt(pos.index);
        if (c != ' ' && c != '\t') {
            break;
        }
        ++pos.index;
    }
    // Remember the actual start index
    int actualStart = pos.index;

  parsing:
    {
        // We handle a few special cases here where we need to parse
        // a number value.  We handle further, more generic cases below.  We need
        // to handle some of them here because some fields require extra processing on
        // the parsed value.
        if (patternCharIndex == PATTERN_HOUR_OF_DAY1 ||
            patternCharIndex == PATTERN_HOUR1 ||
            (patternCharIndex == PATTERN_MONTH && count <= 2) ||
            patternCharIndex == PATTERN_YEAR ||
            patternCharIndex == PATTERN_WEEK_YEAR) {
            // It would be good to unify this with the obeyCount logic below,
            // but that's going to be difficult.
            if (obeyCount) {
                if ((start+count) > text.length()) {
                    break parsing;
                }
                number = numberFormat.parse(text.substring(0, start+count), pos);
            } else {
                number = numberFormat.parse(text, pos);
            }
            if (number == null) {
                if (patternCharIndex != PATTERN_YEAR || calendar instanceof GregorianCalendar) {
                    break parsing;
                }
            } else {
                value = number.intValue();

                if (useFollowingMinusSignAsDelimiter && (value < 0) &&
                    (((pos.index < text.length()) &&
                     (text.charAt(pos.index) != minusSign)) ||
                     ((pos.index == text.length()) &&
                      (text.charAt(pos.index-1) == minusSign)))) {
                    value = -value;
                    pos.index--;
                }
            }
        }

        boolean useDateFormatSymbols = useDateFormatSymbols();

        int index;
        switch (patternCharIndex) {
        case PATTERN_ERA: // 'G'
            if (useDateFormatSymbols) {
                if ((index = matchString(text, start, Calendar.ERA, formatData.getEras(), calb)) > 0) {
                    return index;
                }
            } else {
                Map<String, Integer> map = getDisplayNamesMap(field, locale);
                if ((index = matchString(text, start, field, map, calb)) > 0) {
                    return index;
                }
            }
            break parsing;

        case PATTERN_WEEK_YEAR: // 'Y'
        case PATTERN_YEAR:      // 'y'
            if (!(calendar instanceof GregorianCalendar)) {
                // calendar might have text representations for year values,
                // such as "\u5143" in JapaneseImperialCalendar.
                int style = (count >= 4) ? Calendar.LONG : Calendar.SHORT;
                Map<String, Integer> map = calendar.getDisplayNames(field, style, locale);
                if (map != null) {
                    if ((index = matchString(text, start, field, map, calb)) > 0) {
                        return index;
                    }
                }
                calb.set(field, value);
                return pos.index;
            }

            // If there are 3 or more YEAR pattern characters, this indicates
            // that the year value is to be treated literally, without any
            // two-digit year adjustments (e.g., from "01" to 2001).  Otherwise
            // we made adjustments to place the 2-digit year in the proper
            // century, for parsed strings from "00" to "99".  Any other string
            // is treated literally:  "2250", "-1", "1", "002".
            if (count <= 2 && (pos.index - actualStart) == 2
                && Character.isDigit(text.charAt(actualStart))
                && Character.isDigit(text.charAt(actualStart + 1))) {
                // Assume for example that the defaultCenturyStart is 6/18/1903.
                // This means that two-digit years will be forced into the range
                // 6/18/1903 to 6/17/2003.  As a result, years 00, 01, and 02
                // correspond to 2000, 2001, and 2002.  Years 04, 05, etc. correspond
                // to 1904, 1905, etc.  If the year is 03, then it is 2003 if the
                // other fields specify a date before 6/18, or 1903 if they specify a
                // date afterwards.  As a result, 03 is an ambiguous year.  All other
                // two-digit years are unambiguous.
                int ambiguousTwoDigitYear = defaultCenturyStartYear % 100;
                ambiguousYear[0] = value == ambiguousTwoDigitYear;
                value += (defaultCenturyStartYear/100)*100 +
                    (value < ambiguousTwoDigitYear ? 100 : 0);
            }
            calb.set(field, value);
            return pos.index;

        case PATTERN_MONTH: // 'M'
            if (count <= 2) // i.e., M or MM.
            {
                // Don't want to parse the month if it is a string
                // while pattern uses numeric style: M or MM.
                // [We computed 'value' above.]
                calb.set(Calendar.MONTH, value - 1);
                return pos.index;
            }

            if (useDateFormatSymbols) {
                // count >= 3 // i.e., MMM or MMMM
                // Want to be able to parse both short and long forms.
                // Try count == 4 first:
                int newStart;
                if ((newStart = matchString(text, start, Calendar.MONTH,
                                            formatData.getMonths(), calb)) > 0) {
                    return newStart;
                }
                // count == 4 failed, now try count == 3
                if ((index = matchString(text, start, Calendar.MONTH,
                                         formatData.getShortMonths(), calb)) > 0) {
                    return index;
                }
            } else {
                Map<String, Integer> map = getDisplayNamesMap(field, locale);
                if ((index = matchString(text, start, field, map, calb)) > 0) {
                    return index;
                }
            }
            break parsing;

        case PATTERN_HOUR_OF_DAY1: // 'k' 1-based.  eg, 23:59 + 1 hour =>> 24:59
            if (!isLenient()) {
                // Validate the hour value in non-lenient
                if (value < 1 || value > 24) {
                    break parsing;
                }
            }
            // [We computed 'value' above.]
            if (value == calendar.getMaximum(Calendar.HOUR_OF_DAY) + 1) {
                value = 0;
            }
            calb.set(Calendar.HOUR_OF_DAY, value);
            return pos.index;

        case PATTERN_DAY_OF_WEEK:  // 'E'
            {
                if (useDateFormatSymbols) {
                    // Want to be able to parse both short and long forms.
                    // Try count == 4 (DDDD) first:
                    int newStart;
                    if ((newStart=matchString(text, start, Calendar.DAY_OF_WEEK,
                                              formatData.getWeekdays(), calb)) > 0) {
                        return newStart;
                    }
                    // DDDD failed, now try DDD
                    if ((index = matchString(text, start, Calendar.DAY_OF_WEEK,
                                             formatData.getShortWeekdays(), calb)) > 0) {
                        return index;
                    }
                } else {
                    int[] styles = { Calendar.LONG, Calendar.SHORT };
                    for (int style : styles) {
                        Map<String,Integer> map = calendar.getDisplayNames(field, style, locale);
                        if ((index = matchString(text, start, field, map, calb)) > 0) {
                            return index;
                        }
                    }
                }
            }
            break parsing;

        case PATTERN_AM_PM:    // 'a'
            if (useDateFormatSymbols) {
                if ((index = matchString(text, start, Calendar.AM_PM,
                                         formatData.getAmPmStrings(), calb)) > 0) {
                    return index;
                }
            } else {
                Map<String,Integer> map = getDisplayNamesMap(field, locale);
                if ((index = matchString(text, start, field, map, calb)) > 0) {
                    return index;
                }
            }
            break parsing;

        case PATTERN_HOUR1: // 'h' 1-based.  eg, 11PM + 1 hour =>> 12 AM
            if (!isLenient()) {
                // Validate the hour value in non-lenient
                if (value < 1 || value > 12) {
                    break parsing;
                }
            }
            // [We computed 'value' above.]
            if (value == calendar.getLeastMaximum(Calendar.HOUR) + 1) {
                value = 0;
            }
            calb.set(Calendar.HOUR, value);
            return pos.index;

        case PATTERN_ZONE_NAME:  // 'z'
        case PATTERN_ZONE_VALUE: // 'Z'
            {
                int sign = 0;
                try {
                    char c = text.charAt(pos.index);
                    if (c == '+') {
                        sign = 1;
                    } else if (c == '-') {
                        sign = -1;
                    }
                    if (sign == 0) {
                        // Try parsing a custom time zone "GMT+hh:mm" or "GMT".
                        if ((c == 'G' || c == 'g')
                            && (text.length() - start) >= GMT.length()
                            && text.regionMatches(true, start, GMT, 0, GMT.length())) {
                            pos.index = start + GMT.length();

                            if ((text.length() - pos.index) > 0) {
                                c = text.charAt(pos.index);
                                if (c == '+') {
                                    sign = 1;
                                } else if (c == '-') {
                                    sign = -1;
                                }
                            }

                            if (sign == 0) {    /* "GMT" without offset */
                                calb.set(Calendar.ZONE_OFFSET, 0)
                                    .set(Calendar.DST_OFFSET, 0);
                                return pos.index;
                            }

                            // Parse the rest as "hh:mm"
                            int i = subParseNumericZone(text, ++pos.index,
                                                        sign, 0, true, calb);
                            if (i > 0) {
                                return i;
                            }
                            pos.index = -i;
                        } else {
                            // Try parsing the text as a time zone
                            // name or abbreviation.
                            int i = subParseZoneString(text, pos.index, calb);
                            if (i > 0) {
                                return i;
                            }
                            pos.index = -i;
                        }
                    } else {
                        // Parse the rest as "hhmm" (RFC 822)
                        int i = subParseNumericZone(text, ++pos.index,
                                                    sign, 0, false, calb);
                        if (i > 0) {
                            return i;
                        }
                        pos.index = -i;
                    }
                } catch (IndexOutOfBoundsException e) {
                }
            }
            break parsing;

        case PATTERN_ISO_ZONE:   // 'X'
            {
                if ((text.length() - pos.index) <= 0) {
                    break parsing;
                }

                int sign;
                char c = text.charAt(pos.index);
                if (c == 'Z') {
                    calb.set(Calendar.ZONE_OFFSET, 0).set(Calendar.DST_OFFSET, 0);
                    return ++pos.index;
                }

                // parse text as "+/-hh[[:]mm]" based on count
                if (c == '+') {
                    sign = 1;
                } else if (c == '-') {
                    sign = -1;
                } else {
                    ++pos.index;
                    break parsing;
                }
                int i = subParseNumericZone(text, ++pos.index, sign, count,
                                            count == 3, calb);
                if (i > 0) {
                    return i;
                }
                pos.index = -i;
            }
            break parsing;

        default:
     // case PATTERN_DAY_OF_MONTH:         // 'd'
     // case PATTERN_HOUR_OF_DAY0:         // 'H' 0-based.  eg, 23:59 + 1 hour =>> 00:59
     // case PATTERN_MINUTE:               // 'm'
     // case PATTERN_SECOND:               // 's'
     // case PATTERN_MILLISECOND:          // 'S'
     // case PATTERN_DAY_OF_YEAR:          // 'D'
     // case PATTERN_DAY_OF_WEEK_IN_MONTH: // 'F'
     // case PATTERN_WEEK_OF_YEAR:         // 'w'
     // case PATTERN_WEEK_OF_MONTH:        // 'W'
     // case PATTERN_HOUR0:                // 'K' 0-based.  eg, 11PM + 1 hour =>> 0 AM
     // case PATTERN_ISO_DAY_OF_WEEK:      // 'u' (pseudo field);

            // Handle "generic" fields
            if (obeyCount) {
                if ((start+count) > text.length()) {
                    break parsing;
                }
                number = numberFormat.parse(text.substring(0, start+count), pos);
            } else {
                number = numberFormat.parse(text, pos);
            }
            if (number != null) {
                value = number.intValue();

                if (useFollowingMinusSignAsDelimiter && (value < 0) &&
                    (((pos.index < text.length()) &&
                     (text.charAt(pos.index) != minusSign)) ||
                     ((pos.index == text.length()) &&
                      (text.charAt(pos.index-1) == minusSign)))) {
                    value = -value;
                    pos.index--;
                }

                calb.set(field, value);
                return pos.index;
            }
            break parsing;
        }
    }

    // Parsing failed.
    origPos.errorIndex = pos.index;
    return -1;
}

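That is the source of subParse. Read the details if you are interested; here we only care about how it parses a numeric field.

It first checks obeyCount. If the count must be obeyed, it cuts the text off count characters after start (text.substring(0, start+count)) and hands that to numberFormat to parse.

If the count need not be obeyed, it parses a number starting at pos and reads as many digits as it can (this is the key to the whole question).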
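The two branches can be reproduced with a plain NumberFormat. This is a sketch under assumptions: NumberParseDemo and its methods are hypothetical, and building the formatter with NumberFormat.getIntegerInstance(Locale.US) and grouping turned off is my approximation of the numberFormat that SimpleDateFormat uses internally.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical demo class contrasting the obeyCount and non-obeyCount branches.
class NumberParseDemo {

    // Assumption: an integer-only formatter without grouping separators.
    static NumberFormat newIntegerFormat() {
        NumberFormat nf = NumberFormat.getIntegerInstance(Locale.US);
        nf.setGroupingUsed(false);
        return nf;
    }

    // obeyCount == true: the text is cut off at start + count before parsing.
    static long parseBounded(String text, int start, int count) {
        ParsePosition pos = new ParsePosition(start);
        Number n = newIntegerFormat().parse(text.substring(0, start + count), pos);
        if (n == null) {
            throw new IllegalArgumentException("no number at index " + start);
        }
        return n.longValue();
    }

    // obeyCount == false: the parser reads digits until something else shows up.
    static long parseGreedy(String text, int start) {
        ParsePosition pos = new ParsePosition(start);
        Number n = newIntegerFormat().parse(text, pos);
        if (n == null) {
            throw new IllegalArgumentException("no number at index " + start);
        }
        return n.longValue();
    }

    public static void main(String[] args) {
        String text = "201510101";
        System.out.println(parseBounded(text, 0, 4)); // the yyyy field
        System.out.println(parseBounded(text, 4, 2)); // the MM field
        System.out.println(parseGreedy(text, 6));     // the dd field, unbounded
        // With delimiters every field is unbounded, so HH reads all of "240".
        System.out.println(parseGreedy("20150619-240:60", 9));
    }
}
```

The greedy call is what lets 240 through as an hour and 101 through as a day; the calendar then carries the excess upward.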

Here is the Javadoc from the source of the abstract class NumberFormat (it sits on parseObject, which simply delegates to parse):

/**
 * Parses text from a string to produce a <code>Number</code>.
 * <p>
 * The method attempts to parse text starting at the index given by
 * <code>pos</code>.
 * If parsing succeeds, then the index of <code>pos</code> is updated
 * to the index after the last character used (parsing does not necessarily
 * use all characters up to the end of the string), and the parsed
 * number is returned. The updated <code>pos</code> can be used to
 * indicate the starting point for the next call to this method.
 * If an error occurs, then the index of <code>pos</code> is not
 * changed, the error index of <code>pos</code> is set to the index of
 * the character where the error occurred, and null is returned.
 * <p>
 * See the {@link #parse(String, ParsePosition)} method for more information
 * on number parsing.
 *
 * @param source A <code>String</code>, part of which should be parsed.
 * @param pos A <code>ParsePosition</code> object with index and error
 *            index information as described above.
 * @return A <code>Number</code> parsed from the string. In case of
 *         error, returns null.
 * @exception NullPointerException if <code>pos</code> is null.
 */
@Override
public final Object parseObject(String source, ParsePosition pos) {
    return parse(source, pos);
}

That description may not be very clear. What is actually used here is the implementation in DecimalFormat:

public class DecimalFormat extends NumberFormat {

    /**
     * Parses text from a string to produce a <code>Number</code>.
     * <p>
     * The method attempts to parse text starting at the index given by
     * <code>pos</code>.
     * If parsing succeeds, then the index of <code>pos</code> is updated
     * to the index after the last character used (parsing does not necessarily
     * use all characters up to the end of the string), and the parsed
     * number is returned. The updated <code>pos</code> can be used to
     * indicate the starting point for the next call to this method.
     * If an error occurs, then the index of <code>pos</code> is not
     * changed, the error index of <code>pos</code> is set to the index of
     * the character where the error occurred, and null is returned.
     * <p>
     * The subclass returned depends on the value of {@link #isParseBigDecimal}
     * as well as on the string being parsed.
     * <ul>
     *   <li>If <code>isParseBigDecimal()</code> is false (the default),
     *       most integer values are returned as <code>Long</code>
     *       objects, no matter how they are written: <code>"17"</code> and
     *       <code>"17.000"</code> both parse to <code>Long(17)</code>.
     *       Values that cannot fit into a <code>Long</code> are returned as
     *       <code>Double</code>s. This includes values with a fractional part,
     *       infinite values, <code>NaN</code>, and the value -0.0.
     *       <code>DecimalFormat</code> does <em>not</em> decide whether to
     *       return a <code>Double</code> or a <code>Long</code> based on the
     *       presence of a decimal separator in the source string. Doing so
     *       would prevent integers that overflow the mantissa of a double,
     *       such as <code>"-9,223,372,036,854,775,808.00"</code>, from being
     *       parsed accurately.
     *       <p>
     *       Callers may use the <code>Number</code> methods
     *       <code>doubleValue</code>, <code>longValue</code>, etc., to obtain
     *       the type they want.
     *   <li>If <code>isParseBigDecimal()</code> is true, values are returned
     *       as <code>BigDecimal</code> objects. The values are the ones
     *       constructed by {@link java.math.BigDecimal#BigDecimal(String)}
     *       for corresponding strings in locale-independent format. The
     *       special cases negative and positive infinity and NaN are returned
     *       as <code>Double</code> instances holding the values of the
     *       corresponding <code>Double</code> constants.
     * </ul>
     * <p>
     * <code>DecimalFormat</code> parses all Unicode characters that represent
     * decimal digits, as defined by <code>Character.digit()</code>. In
     * addition, <code>DecimalFormat</code> also recognizes as digits the ten
     * consecutive characters starting with the localized zero digit defined in
     * the <code>DecimalFormatSymbols</code> object.
     *
     * @param text the string to be parsed
     * @param pos  A <code>ParsePosition</code> object with index and error
     *             index information as described above.
     * @return     the parsed value, or <code>null</code> if the parse fails
     * @exception  NullPointerException if <code>text</code> or
     *             <code>pos</code> is null.
     */
    @Override
    public Number parse(String text, ParsePosition pos) {

That's it for this post.

Finally, here is the test code for pattern compilation. Everything except the main method comes from the JDK source.

package DateFormat;

/**
 * Created by 优华 on 7/2/2015.
 */
public class TestCompile {
    public static void main(String[] args) {
        TestCompile tc = new TestCompile();
        char[] pat = tc.compile("yyyyMMdd:hhmmss");
        for (char temp : pat) {
            System.out.print(Integer.toBinaryString(temp >>> 8) + "|");
        }
        System.out.println();
        System.out.println(pat.length);
        System.out.println(Integer.toBinaryString('a'));
    }

    static final int PATTERN_MONTH                =  2; // M
    static final int PATTERN_ISO_ZONE             = 21; // X
    private final static int TAG_QUOTE_ASCII_CHAR       = 100;
    private final static int TAG_QUOTE_CHARS            = 101;
    transient private boolean forceStandaloneForm = false;
    static final String  patternChars = "GyMdkHmsSEDFwWahKzZYuXL";

    private char[] compile(String pattern) {
        int length = pattern.length();
        boolean inQuote = false;
        StringBuilder compiledCode = new StringBuilder(length * 2);
        StringBuilder tmpBuffer = null;
        int count = 0, tagcount = 0;
        int lastTag = -1, prevTag = -1;

        for (int i = 0; i < length; i++) {
            char c = pattern.charAt(i);

            if (c == '\'') {
                // '' is treated as a single quote regardless of being
                // in a quoted section.
                if ((i + 1) < length) {
                    c = pattern.charAt(i + 1);
                    if (c == '\'') {
                        i++;
                        if (count != 0) {
                            encode(lastTag, count, compiledCode);
                            tagcount++;
                            prevTag = lastTag;
                            lastTag = -1;
                            count = 0;
                        }
                        if (inQuote) {
                            tmpBuffer.append(c);
                        } else {
                            compiledCode.append((char)(TAG_QUOTE_ASCII_CHAR << 8 | c));
                        }
                        continue;
                    }
                }
                if (!inQuote) {
                    if (count != 0) {
                        encode(lastTag, count, compiledCode);
                        tagcount++;
                        prevTag = lastTag;
                        lastTag = -1;
                        count = 0;
                    }
                    if (tmpBuffer == null) {
                        tmpBuffer = new StringBuilder(length);
                    } else {
                        tmpBuffer.setLength(0);
                    }
                    inQuote = true;
                } else {
                    int len = tmpBuffer.length();
                    if (len == 1) {
                        char ch = tmpBuffer.charAt(0);
                        if (ch < 128) {
                            compiledCode.append((char)(TAG_QUOTE_ASCII_CHAR << 8 | ch));
                        } else {
                            compiledCode.append((char)(TAG_QUOTE_CHARS << 8 | 1));
                            compiledCode.append(ch);
                        }
                    } else {
                        encode(TAG_QUOTE_CHARS, len, compiledCode);
                        compiledCode.append(tmpBuffer);
                    }
                    inQuote = false;
                }
                continue;
            }
            if (inQuote) {
                tmpBuffer.append(c);
                continue;
            }
            if (!(c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z')) {
                if (count != 0) {
                    encode(lastTag, count, compiledCode);
                    tagcount++;
                    prevTag = lastTag;
                    lastTag = -1;
                    count = 0;
                }
                if (c < 128) {
                    // In most cases, c would be a delimiter, such as ':'.
                    compiledCode.append((char)(TAG_QUOTE_ASCII_CHAR << 8 | c));
                } else {
                    // Take any contiguous non-ASCII alphabet characters and
                    // put them in a single TAG_QUOTE_CHARS.
                    int j;
                    for (j = i + 1; j < length; j++) {
                        char d = pattern.charAt(j);
                        if (d == '\'' || (d >= 'a' && d <= 'z' || d >= 'A' && d <= 'Z')) {
                            break;
                        }
                    }
                    compiledCode.append((char)(TAG_QUOTE_CHARS << 8 | (j - i)));
                    for (; i < j; i++) {
                        compiledCode.append(pattern.charAt(i));
                    }
                    i--;
                }
                continue;
            }

            int tag;
            if ((tag = patternChars.indexOf(c)) == -1) {
                throw new IllegalArgumentException("Illegal pattern character " +
                        "'" + c + "'");
            }
            if (lastTag == -1 || lastTag == tag) {
                lastTag = tag;
                count++;
                continue;
            }
            encode(lastTag, count, compiledCode);
            tagcount++;
            prevTag = lastTag;
            lastTag = tag;
            count = 1;
        }

        if (inQuote) {
            throw new IllegalArgumentException("Unterminated quote");
        }

        if (count != 0) {
            encode(lastTag, count, compiledCode);
            tagcount++;
            prevTag = lastTag;
        }

        forceStandaloneForm = (tagcount == 1 && prevTag == PATTERN_MONTH);

        // Copy the compiled pattern to a char array
        int len = compiledCode.length();
        char[] r = new char[len];
        compiledCode.getChars(0, len, r, 0);
        return r;
    }

    /**
     * Encodes the given tag and length and puts encoded char(s) into buffer.
     */
    private static void encode(int tag, int length, StringBuilder buffer) {
        if (tag == PATTERN_ISO_ZONE && length >= 4) {
            throw new IllegalArgumentException("invalid ISO 8601 format: length=" + length);
        }
        if (length < 255) {
            buffer.append((char)(tag << 8 | length));
        } else {
            buffer.append((char)((tag << 8) | 0xff));
            buffer.append((char)(length >>> 16));
            buffer.append((char)(length & 0xffff));
        }
    }
}

0 0