Lexical scanning of number tokens in lcc

Source: Internet | Publisher: 网络法律法规传播 | Editor: 程序博客网 | Time: 2024/06/17 18:45

. how to recognize integer constants
(1) get the value from the literal string & check for overflow
> integer in hex

       int d = 0;
       while(*++rcp) {
         if (map[*rcp] & DIGIT) {
           d = *rcp - '0';
         }
         else if(*rcp >= 'a' && *rcp <= 'f') { // bounds are inclusive: 'a' and 'f' are digits too
           d = *rcp - 'a' + 10;
         }
         else if(*rcp >= 'A' && *rcp <= 'F') {
           d = *rcp - 'A' + 10;
         }
         else // maybe 0x1234u, or 0x1ul
           break;

         // if any of the top 4 bits of n is set, n << 4 would drop it
         if( n & ~(~0UL >> 4) ) {
           overflow = 1; // !!! overflow
         }
         else {
           n = (n << 4) + d;
         }
       }

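> integer in octal

       int err = 0;
       for(; map[*rcp] & DIGIT; rcp++) { // advance only after the digit has been used
         if( *rcp == '8' || *rcp == '9' ) {
           err = 1; // not an octal digit
         }
         // if any of the top 3 bits of n is set, n << 3 would drop it
         if( n & ~(~0UL >> 3) ) {
           overflow = 1; // !!! overflow
         }
         else {
           n = (n << 3) + (*rcp - '0');
         }
       }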

> integer in decimal

       for( n = *token - '0'; map[*rcp] & DIGIT; ) {
          int d = *rcp++ - '0'; // consume the digit, otherwise the loop never ends

          if( n > (ULONG_MAX - d) / 10 ) { // !!! overflow, 10 * n + d > ULONG_MAX
            overflow = 1;
          }
          else {
            n = n * 10 + d;
          }
       }
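
The three loops above can be combined into one standalone routine. The following is a minimal sketch, not lcc's code: the name scan_integer and its signature are hypothetical, and it uses <ctype.h> instead of lcc's map[] table. Unlike the octal loop above, it simply stops at '8' or '9' rather than recording an error.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical helper (not part of lcc): scans the digits of a C integer
   literal at s, stores the value in *n, and sets *overflow when the value
   does not fit in unsigned long. Returns a pointer to the first character
   after the digits, where a suffix such as "u" or "ul" would begin. */
static const char *scan_integer(const char *s, unsigned long *n, int *overflow)
{
    unsigned long v = 0;

    *overflow = 0;
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        /* hexadecimal: 4 bits per digit */
        for (s += 2; isxdigit((unsigned char)*s); s++) {
            int d = isdigit((unsigned char)*s)
                  ? *s - '0'
                  : tolower((unsigned char)*s) - 'a' + 10;
            if (v & ~(~0UL >> 4))   /* top 4 bits in use: the shift would lose them */
                *overflow = 1;
            else
                v = (v << 4) + d;
        }
    } else if (s[0] == '0') {
        /* octal: 3 bits per digit; stops at the first non-octal character */
        for (; *s >= '0' && *s <= '7'; s++) {
            if (v & ~(~0UL >> 3))   /* top 3 bits in use */
                *overflow = 1;
            else
                v = (v << 3) + (unsigned long)(*s - '0');
        }
    } else {
        /* decimal: 10*v + d > ULONG_MAX  <=>  v > (ULONG_MAX - d) / 10 */
        for (; isdigit((unsigned char)*s); s++) {
            unsigned long d = (unsigned long)(*s - '0');
            if (v > (ULONG_MAX - d) / 10)
                *overflow = 1;
            else
                v = v * 10 + d;
        }
    }
    *n = v;
    return s;
}
```

Calling scan_integer("0x1Fu", &n, &overflow) leaves 31 in n and returns a pointer to the 'u', which is where suffix handling would take over.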

(2) give an appropriate integer type & value
If the value does not fit in the type that the suffix or base suggests, store it in a wider integer type.

     IF with suffix 'u'|'U' followed by 'l'|'L', or the reverse order // stored in unsigned long type
        tval.type <- unsignedlong
     ELSE IF with suffix 'u'|'U' // stored in unsigned int or unsigned long
        IF overflow || n > unsignedtype->u.sym->u.limits.max.i
           tval.type <- unsignedlong //store in unsigned long, if unsigned is insufficient
        ELSE
           tval.type <- unsignedtype
     ELSE IF with suffix 'l'|'L'
        IF overflow || n > longtype->u.sym->u.limits.max.i
           tval.type <- unsignedlong
        ELSE
           tval.type <- longtype
     ELSE IF overflow || n > longtype->u.sym->u.limits.max.i
        tval.type <- unsignedlong
     ELSE IF n > inttype->u.sym->u.limits.max.i
        tval.type <- longtype
     ELSE IF base != 10 && n > inttype->u.sym->u.limits.max.i
        tval.type <- unsignedtype
     ELSE
        tval.type <- inttype

     // store an appropriate value
     CASE tval.type->op:
       INT)
         IF overflow || n > tval.type->u.sym->u.limits.max.i
            warning overflows
            tval.u.c.v.i <- tval.type->u.sym->u.limits.max.i
         ELSE
            tval.u.c.v.i <- n
       UNSIGNED)
         IF overflow || n > tval.type->u.sym->u.limits.max.u
            warning overflows
            tval.u.c.v.u <- tval.type->u.sym->u.limits.max.u
         ELSE
            tval.u.c.v.u <- n
     ESAC
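
The type-selection half of the pseudocode above can be sketched in standalone C. This is an illustration under assumptions, not lcc's source: the enum tags and the function pick_type are hypothetical, the suffix is passed as a separate string, and the limits come from <limits.h> rather than from lcc's type symbols. The branch order follows the pseudocode, so the base != 10 test repeats the condition checked just before it and can never be taken.

```c
#include <limits.h>

/* Hypothetical type tags, standing in for lcc's inttype, unsignedtype,
   longtype and unsignedlong. */
enum lit_type { T_INT, T_UNSIGNED, T_LONG, T_ULONG };

/* Hypothetical helper: picks a type for an integer constant with value n.
   overflow is nonzero when the scanner already saw the value exceed
   unsigned long; suffix holds the letters after the digits ("", "u", "L",
   "ul", "LU", ...). */
static enum lit_type pick_type(unsigned long n, int overflow, int base,
                               const char *suffix)
{
    int has_u = 0, has_l = 0;

    for (; *suffix; suffix++) {
        if (*suffix == 'u' || *suffix == 'U')
            has_u = 1;
        else if (*suffix == 'l' || *suffix == 'L')
            has_l = 1;
    }

    if (has_u && has_l)
        return T_ULONG;
    if (has_u)
        return (overflow || n > UINT_MAX) ? T_ULONG : T_UNSIGNED;
    if (has_l)
        return (overflow || n > LONG_MAX) ? T_ULONG : T_LONG;
    if (overflow || n > LONG_MAX)
        return T_ULONG;
    if (n > INT_MAX)
        return T_LONG;
    if (base != 10 && n > INT_MAX)  /* same test as the branch above: never reached */
        return T_UNSIGNED;
    return T_INT;
}
```

For example, pick_type(5, 0, 10, "") yields T_INT, while pick_type(ULONG_MAX, 0, 16, "") yields T_ULONG because the value exceeds LONG_MAX.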

. how to recognize floating constants
(1) DFA of a floating constant
    f -> [+|-] f_part [exp_part]
    f_part -> i_part | i_part.[i_part]
    exp_part -> (e|E) [+|-] i_part
    i_part -> [0-9]+   // [2]
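
A hand-written recognizer for the grammar above might look as follows. It is a minimal sketch with hypothetical names (digits, match_float), it is not how lcc scans (lcc hands the text to strtod()), and it treats the exponent sign as optional, as C does.

```c
#include <ctype.h>
#include <stddef.h>

/* i_part: skips a run of decimal digits and returns how many were seen. */
static size_t digits(const char *s)
{
    size_t i = 0;

    while (isdigit((unsigned char)s[i]))
        i++;
    return i;
}

/* Hypothetical recognizer: returns the length of the longest prefix of s
   that matches f, or 0 when s does not start with one. */
static size_t match_float(const char *s)
{
    size_t i = 0, k;

    if (s[i] == '+' || s[i] == '-')     /* [+|-] */
        i++;
    if ((k = digits(s + i)) == 0)       /* the leading i_part is mandatory */
        return 0;
    i += k;
    if (s[i] == '.')                    /* i_part.[i_part] */
        i += 1 + digits(s + i + 1);
    if (s[i] == 'e' || s[i] == 'E') {   /* [exp_part] */
        size_t j = i + 1;

        if (s[j] == '+' || s[j] == '-')
            j++;
        if ((k = digits(s + j)) > 0)
            i = j + k;                  /* commit only when digits follow */
    }
    return i;
}
```

With this sketch, match_float("1.5e-3") returns 6, and match_float("1e10f") returns 4, leaving the 'f' suffix for the caller.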

(2) how to get the value of a floating constant & check for overflow
lcc computes the value of a floating constant and checks for overflow with the standard C library function strtod().
This can be seen in the function fcon().

(3) give an appropriate floating type & value

    // 0. !!! differs from the type assignment of integer constants
    // 1. if there is a suffix, it selects the floating type
    // 2. with no suffix, the constant is stored as double
    errno = 0;  // clear any stale value so the ERANGE test is reliable
    tval.u.c.v.d = strtod(token, NULL);
    if (errno == ERANGE) {
       warning overflow...
    }
    if (*cp == 'f' || *cp == 'F') {
       ++cp;
       if (tval.u.c.v.d > floattype->u.sym->u.limits.max.d) {
         warning overflow...
       }
       tval.type = floattype;
    }
    else if (*cp == 'l' || *cp == 'L') {
       ++cp;
       tval.type = longdouble;
    }
    else {
       if (tval.u.c.v.d > doubletype->u.sym->u.limits.max.d) {
          warning overflow...
       }
       tval.type = doubletype;
    }
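
The same flow can be tried outside lcc with a small standalone function. This is a sketch under assumptions: the struct, the enum tags and scan_float are hypothetical names, FLT_MAX from <float.h> stands in for floattype's limit, and the suffix is located through strtod()'s end pointer rather than lcc's cp.

```c
#include <errno.h>
#include <float.h>
#include <stdlib.h>

/* Hypothetical type tags, standing in for lcc's floattype, doubletype
   and longdouble. */
enum flt_type { T_FLOAT, T_DOUBLE, T_LDOUBLE };

struct flt_const {
    double value;
    enum flt_type type;
    int range_error;    /* nonzero when the constant is out of range */
};

/* Hypothetical helper: converts the floating constant at s and picks its
   type from the suffix that follows the digits. */
static struct flt_const scan_float(const char *s)
{
    struct flt_const c;
    char *end;

    errno = 0;                          /* strtod() only sets errno on error */
    c.value = strtod(s, &end);
    c.range_error = (errno == ERANGE);  /* may also be reported for underflow */

    if (*end == 'f' || *end == 'F') {
        if (c.value > FLT_MAX)          /* fits in double but not in float */
            c.range_error = 1;
        c.type = T_FLOAT;
    } else if (*end == 'l' || *end == 'L') {
        c.type = T_LDOUBLE;
    } else {
        c.type = T_DOUBLE;
    }
    return c;
}
```

For instance, scan_float("1e39f") reports a range error because 1e39 exceeds FLT_MAX, while scan_float("1e39") does not.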

. lcc source code layout
DFA of integer constant:
|-> static Symbol icon(unsigned long n, int overflow, int base) // see note1:
DFA of floating constant:
|-> static Symbol fcon(void) // see note2:

note1:
   icon() reads the suffix of an integer constant, such as "u", "l", or a "ul" combination, and returns
   a symbol for the integer constant. [3]
note2:
   fcon() scans the remainder of a floating constant (the decimal point, the digits after the
   decimal point, and the exponent part), interprets a suffix such as "l", "L", "f", or "F", and
   returns a symbol for the floating constant.

[1]
unsigned long long; // treated as 64 bits wide even on a machine with a 32-bit word
On the IA32 architecture, a 64-bit integer is implemented using two 32-bit registers (eax and edx).

However, lcc-4.1 does not support C99 yet.

[2] 0123 // octal integer
0123.134 // floating constant despite the leading 0; strtod() converts it to a floating value

[3] see ch3.6 for an overall picture of constant symbols.