C Reference Manual Reading Notes: 010 Definition and Replacement

Source: Internet  Published: md5.js用法  Editor: 程序博客网  Time: 2024/05/22 10:57

1. Synopsis

    The #define preprocessor command causes a name (identifier) to become defined as a macro to the preprocessor. A sequence of tokens, called the body of the macro, is associated with the name. When the name of the macro is recognized in the program source text or in the arguments of certain other preprocessor commands, it is treated as a call to that macro; the name is effectively replaced by a copy of the body. If the macro is defined to accept arguments, then the actual arguments following the macro name are substituted for formal parameters in the macro body.

Example:

    If a macro sum with two arguments is defined by

        #define sum(x,y)   ((x)+(y))

    then the preprocessor replaces the source program line

        result = sum(5,a*b);

    with the simple (and perhaps unintended) text substitution

        result = ( (5) + (a*b) );


    Since the preprocessor does not distinguish reserved words from other identifiers, it is possible, in principle, to use a C reserved word as the name of a preprocessor macro, but to do so is usually bad programming practice. Macro names are never recognized within comments, string or character constants, or #include file names.
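    As a small sketch of the last point, the fragment below reuses the sum macro from the example above; the function names expanded_call and string_text are hypothetical and exist only for illustration. The macro call in ordinary source text is expanded, while the same characters inside a string constant are not.

```c
#define sum(x, y) ((x) + (y))

/* A macro call in ordinary source text is expanded. */
int expanded_call(void)
{
    return sum(5, 2 * 3);       /* becomes ((5) + (2 * 3)) */
}

/* The same characters inside a string constant are left alone. */
const char *string_text(void)
{
    return "sum(5, 2 * 3)";
}
```

    Calling expanded_call() yields 11, and string_text() still returns the literal text with the macro name in it.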


2. Objectlike Macro Definitions

    The #define command has two forms depending on whether a left parenthesis immediately follows the name to be defined. The simpler, objectlike form has no left parenthesis:

        #define name sequence-of-tokens(optional)

    An objectlike macro takes no arguments. It is invoked merely by mentioning its name. When the name is encountered in the source program text, the name is replaced by the body (the associated sequence-of-tokens, which may be empty). The syntax of the #define command does not require an equal sign or any other special delimiter token after the name being defined. The body starts right after the name.

    The objectlike macro is particularly useful for introducing named constants into a program, so that a "magic number" such as the length of a table may be written in exactly one place and then referred to elsewhere by name. This makes it easier to change the number later.

    Another important use of objectlike macros is to isolate implementation-dependent restrictions on the names of externally defined functions and variables.

Example:

    When a C compiler permits long internal identifiers, but the target computer requires short external names, the preprocessor may be used to hide these short names:

        #define error_handler eh73

        extern void error_handler();

    The long name can then be used like this:

        error_handler(...);


    Here are some typical macro definitions:

        #define BLOCK_SIZE    0x100

        #define TRACK_SIZE    (16*BLOCK_SIZE)

    A common programming error is to include an extraneous equal sign:

       #define NUMBER_DRIVERS = 5              /* probably wrong */

    This is a valid definition, but it causes the name NUMBER_DRIVERS to be defined as "= 5" rather than "5". If one were then to write the code fragment

        if ( count != NUMBER_DRIVERS ) ...

    it would be expanded to

        if ( count != = 5 ) ...

    which is syntactically invalid. For similar reasons, also be careful to avoid an extraneous semicolon:

        #define NUMBER_DRIVERS    5;    /* probably wrong */
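    One way to see what a macro body really contains is to turn it into a string. The sketch below assumes the # stringization operator of Standard C (described in section 9) and uses hypothetical names throughout (SHOW_BODY, SHOW_BODY_, DRIVERS_OK, DRIVERS_EQUALS, DRIVERS_SEMI). The extra level of macro call is there so that the argument is expanded before it is converted to a string.

```c
/* Two-level helper: the outer macro expands its argument first,
   the inner one converts the result to a string constant. */
#define SHOW_BODY_(x) #x
#define SHOW_BODY(x)  SHOW_BODY_(x)

#define DRIVERS_OK     5
#define DRIVERS_EQUALS = 5      /* extraneous equal sign */
#define DRIVERS_SEMI   5;       /* extraneous semicolon */

const char *body_ok(void)     { return SHOW_BODY(DRIVERS_OK); }
const char *body_equals(void) { return SHOW_BODY(DRIVERS_EQUALS); }
const char *body_semi(void)   { return SHOW_BODY(DRIVERS_SEMI); }
```

    With a typical Standard C preprocessor, the three functions return the strings "5", "= 5", and "5;", which shows that the stray characters become part of the body.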


3. Defining Macros with Parameters

    The more complex, functionlike macro definition declares the names of formal parameters within parentheses separated by commas:

        #define name( identifier-list(optional) ) sequence-of-tokens(optional)

    where identifier-list is a comma-separated list of formal parameter names. In C99, an ellipsis (...; three periods) may also appear after identifier-list to indicate a variable argument list.

    The left parenthesis must immediately follow the name of the macro with no intervening whitespace. If whitespace separates the left parenthesis from the macro name, the definition is considered to define a macro that takes no arguments and has a body beginning with a left parenthesis.
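    The effect of that whitespace can be sketched as follows. The names ADD_ONE, BOXED, and whitespace_demo are hypothetical; the only difference between the two definitions that matters here is the space before the left parenthesis.

```c
#define ADD_ONE(x) ((x) + 1)    /* functionlike: "(" touches the name */
#define BOXED (value)           /* objectlike: the body is ( value )  */

int whitespace_demo(void)
{
    int value = 41;
    return ADD_ONE(BOXED);      /* expands to (((value)) + 1) */
}
```

    Here BOXED takes no arguments at all; its body simply begins with a left parenthesis, so whitespace_demo() returns 42.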

    The names of the formal parameters must be identifiers, no two the same. There is no requirement that any of the parameter names be mentioned in the body (although normally they are mentioned). A functionlike macro can have an empty formal parameter list (i.e., zero formal parameters). This kind of macro is useful to simulate a function that takes no arguments.

    A functionlike macro takes as many actual arguments as there are formal parameters. The macro is invoked by writing its name, a left parenthesis, then one actual argument token sequence for each formal parameter, then a right parenthesis. The actual argument token sequences are separated by commas. (When a functionlike macro with no formal parameters is invoked, an empty actual argument list must be provided.) When a macro is invoked, whitespace may appear between the macro name and the left parenthesis or in the actual arguments. (Some older and deficient preprocessor implementations do not permit the actual argument token list to extend across multiple lines unless the lines to be continued end with a \.)

    An actual argument token sequence may contain parentheses if they are properly nested and balanced, and it may contain commas if each comma appears within a set of parentheses. (This restriction prevents confusion with the commas that separate the actual arguments.) Braces and subscripting brackets likewise may appear within macro arguments, but they cannot contain commas and do not have to balance. Parentheses and commas appearing within character-constant and string-constant tokens are not counted in the balancing of parentheses and the delimiting of actual arguments.

    In C99, arguments to a macro can be empty, that is, consist of no tokens.

Example:

    Here is the definition of a macro that multiplies its two arguments:

        #define product(x,y) ((x)*(y))

    It is invoked twice in the following statement:

        x = product(a+3,b) + product(c,d);

    The arguments to the product macro could be function (or macro) calls. The commas within the function argument list do not affect the parsing of the macro arguments:

        return product( f(g,b), g(a,b) );  /* OK */


    The getchar() macro has an empty parameter list:

        #define getchar()  getc(stdin)

    When it is invoked, an empty argument list is provided:

        while( (c=getchar()) != EOF ) ...

    (Note: getchar(), stdin, and EOF are defined in the standard header stdio.h.)


    We can also define a macro that takes as its argument an arbitrary statement:

        #define insert(stmt)    stmt

    The invocation

        insert({a=1; b=1;})

    works properly, but if we change the two assignment statements to a single statement containing two assignment expressions:

        insert({a=1, b=1;})

    then the preprocessor will complain that we have too many macro arguments for insert. To fix the problem, we would have to write:

        insert( {(a=1, b=1);} )
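    The comma rules can be tried out with the product macro from the example above. In this sketch the helper function larger and the wrapper comma_demo are hypothetical. Both the commas in the function call and the comma operator in the second argument sit inside parentheses, so neither is taken as a macro argument separator.

```c
#define product(x, y) ((x) * (y))

int larger(int a, int b) { return a > b ? a : b; }

int comma_demo(void)
{
    int t = 0;
    /* First argument:  larger(2, 3)     -> 3
       Second argument: (t = 4, t + 1)   -> 5 (comma operator, parenthesized) */
    return product(larger(2, 3), (t = 4, t + 1));
}
```

    The call therefore sees exactly two arguments and comma_demo() returns 15. Without the extra parentheses around the second argument, the macro would be given three arguments.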


    Defining functionlike macros to be used in statement contexts can be tricky. The following macro swaps the values in its two arguments, x and y, which are assumed to be of a type whose value can be converted to unsigned long and back without change, and to not involve the identifier _temp.

        #define swap(x,y)  {unsigned long _temp = x; x=y; y=_temp;}

    The problem is that it is natural to want to place a semicolon after swap, as you would if swap were really a function:

        if ( x > y ) swap (x, y);    /* whoops*/

        else x = y;

    This results in an error since the expansion includes an extra semicolon. We put the expanded statements on separate lines next to illustrate the problem more clearly:

        if ( x > y ) { unsigned long _temp = x; x = y; y = _temp; }

        ;

        else x = y;

    A clever way to avoid the problem is to define the macro body as a do-while statement, which consumes the semicolon:

        #define swap(x, y)  \
                 do { unsigned long _temp = x; x = y; y = _temp; } while(0)
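    A minimal sketch of the do-while form in use follows. The function sort_two is hypothetical, and the macro parameters are additionally parenthesized here as a precaution that the definition above does not take.

```c
#define swap(x, y) \
    do { unsigned long _temp = (x); (x) = (y); (y) = _temp; } while (0)

/* Puts the smaller value in *a; returns 1 if a swap took place. */
int sort_two(unsigned long *a, unsigned long *b)
{
    if (*a > *b)
        swap(*a, *b);       /* the semicolon completes the do-while */
    else
        return 0;
    return 1;
}
```

    Because the macro call plus its semicolon now forms exactly one statement, the else still pairs with the if.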


    When a functionlike macro call is encountered, the entire macro call is replaced, after parameter processing, by a processed copy of the body. Parameter processing proceeds as follows. Actual argument token strings are associated with the corresponding formal parameter names. A copy of the body is then made in which every occurrence of a formal parameter name is replaced by a copy of the actual argument token sequence associated with it. This copy of the body then replaces the macro call. The entire process of replacing a macro call with the processed copy of its body is called macro expansion; the processed copy of the body is called the expansion of the macro call.


Example:

    Consider this macro definition, which provides a convenient way to make a loop that counts from a given value up to (and including) some limit:

        #define incr(v,low,high) \
              for( (v) = (low); (v) <= (high); ++(v) )

    To print a table of the cubes of the integers from 1 to 20, we could write:

        #include <stdio.h>

         int main()

         {

              int j;

              incr(j,1,20)

                   printf("%2d  %6d\n",j, j*j*j);


              return 0;

         }

    The call to the macro incr is expanded to produce this loop:

         for( (j) = (1); (j) <= (20); ++(j) )

    The liberal use of parentheses ensures that complicated actual arguments are not misinterpreted by the compiler.


4. Rescanning of Macro Expressions

    Once a macro call has been expanded, the scan for macro calls resumes at the beginning of the expansion so that names of macros may be recognized within the expansion for the purpose of further macro replacement. Macro replacement is not performed on any part of a #define command, not even the body, at the time the command is processed and the macro name defined. Macro names are recognized within the body only after the body has been expanded for some particular macro call.

    Macro replacement is also not performed within the actual argument token string of a functionlike macro call at the time the macro call is being scanned. Macro names are recognized within actual argument token strings only during the rescanning of the expansion, assuming that the corresponding formal parameter in fact occurred one or more times within the body (thereby causing the actual argument token string to appear one or more times in the expansion).


Example:

     Given the following definitions:

         #define plus(x,y)  add(y,x)

         #define add(x,y)   ((x)+(y))

    The invocation

        plus(plus(a,b),c)

    is expanded as shown next.

        Step            Result

        1. original     plus(plus(a,b),c)

        2.              add(c,plus(a,b))

        3.              ((c)+(plus(a,b)))

        4.              ((c)+(add(b,a)))

        5. final        ((c)+(((b)+(a))))
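    Since addition is commutative, the reversal of arguments is not visible in the result. The following sketch uses hypothetical subtraction variants of the same two macros (rsub and sub) so that the outcome depends on the nested names being recognized during rescanning. Note that rsub is defined before sub; this is fine because the body of a #define is not scanned for macros when it is processed.

```c
#define rsub(x, y) sub(y, x)        /* subtract in reverse order */
#define sub(x, y)  ((x) - (y))

int rescan_demo(void)
{
    /* rsub(rsub(1,2),10)
       -> sub(10, rsub(1,2))
       -> ((10) - (((2) - (1)))) */
    return rsub(rsub(1, 2), 10);
}
```

    The function returns 9, that is, 10 - (2 - 1).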


    Macros appearing in their own expansion--either immediately or through some intermediate sequence of nested macro expansions--are not reexpanded in Standard C. This permits a programmer to redefine a function in terms of the old function. Older C preprocessors traditionally do not detect this recursion and will attempt to continue the expansion until they are stopped by some system error.


Example:

    The following macro changes the definition of the square root function to handle negative arguments in a different fashion than is normal:

        #define sqrt(x)    ( (x) < 0 ? sqrt(-(x)) : sqrt(x) )

    Except that it evaluates its argument more than once, this macro works as intended in Standard C, but might cause an error in older compilers. Similarly:

        #define char unsigned char
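    The same idea can be tried without the math library. In the sketch below, scale and scale_demo are hypothetical names; the function is defined first, and the macro of the same name then wraps it. Inside the macro body the name scale is not expanded again, so it refers to the function.

```c
/* An ordinary function, defined before the macro of the same name. */
int scale(int x) { return 2 * x; }

/* From here on, scale(...) in the source is a macro call.  The inner
   scale is not reexpanded, so it calls the function above. */
#define scale(x) (scale(x) + 1)

int scale_demo(void)
{
    return scale(10);       /* expands to (scale(10) + 1) */
}
```

    scale_demo() returns 21: the function doubles 10, and the macro adds 1.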


5. Predefined Macros

    Preprocessors for Standard C are required to define certain objectlike macros. The name of each begins and ends with two underscore characters. None of these predefined macros may be undefined (#undef) or redefined by the programmer.

    The __LINE__ and __FILE__ macros are useful when printing certain kinds of error messages. The __DATE__ and __TIME__ macros can be used to record when a compilation occurred. The values of __TIME__ and __DATE__ remain constant throughout the compilation. The values of the __FILE__ and __LINE__ macros are established by the implementation, but are subject to alteration by the #line directive (such as #line 300 or #line 500 "cppgp.c"). The C99 predefined identifier __func__ is similar in purpose to __LINE__, but is actually a block-scope variable, not a macro. It supplies the name of the enclosing function.

    The __STDC__ and __STDC_VERSION__ macros are useful for writing code compatible with Standard and non-Standard C implementations. The __STDC_HOSTED__ macro was introduced in C99 to distinguish hosted from freestanding implementations. The remaining C99 macros indicate whether the implementation's floating-point and wide character facilities adhere to other relevant international standards. (Adherence is recommended, but not required.)

    Implementations routinely define additional macros to communicate information about the environment, such as the type of computer for which the program is being compiled. Exactly which macros are defined is implementation-dependent, although UNIX implementations customarily predefine unix. Unlike the built-in macros, these macros may be undefined. Standard C requires implementation-specific macro names to begin with a leading underscore followed by either an uppercase letter or another underscore. (The macro unix does not meet that criterion.)

    An example using the predefined macros will be given under the next subject.
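    Until then, here is a minimal sketch of __FILE__, __LINE__, __func__, and #line, assuming a C99 compiler. The function names current_function, where_am_i, and line_after_directive are hypothetical.

```c
#include <stdio.h>

/* __func__ (C99) names the enclosing function. */
const char *current_function(void)
{
    return __func__;
}

/* Builds a message of the form "file:line (function)". */
int where_am_i(char *out, size_t size)
{
    return snprintf(out, size, "%s:%d (%s)", __FILE__, __LINE__, __func__);
}

/* #line renumbers the source line that follows it. */
#line 300
int line_after_directive(void) { return __LINE__; }
```

    The file name and the first line number depend on where the code is compiled, but line_after_directive() returns 300 because of the #line directive just above it.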


6. Undefining and Redefining Macros

    The #undef command can be used to make a name be no longer defined:

        #undef name

    This command causes the preprocessor to forget any macro definition of name. It is not an error to undefine a name currently not defined. Once a name has been undefined, it may then be given a completely new definition (using #define) without error. Macro replacement is not performed within #undef commands.

    The benign redefinition of macros is allowed in Standard C and many other implementations. That is, a macro may be redefined if the new definition is the same, token for token, as the existing definition. The redefinition must include whitespace in the same locations as in the original definition, although the particular whitespace characters can be different. We think programmers should avoid depending on benign redefinitions. It is generally better style to have a single point of definition for all program entities, including macros. (Some older implementations of C may not allow any kind of redefinition.)

Example:

    In the following definitions, the redefinition of NULL is allowed, but neither redefinition of FUNC is valid. (The first includes whitespace not in the original definition, and the second changes two tokens.)

        #define NULL 0

        #define FUNC(x)    x+4

        #define NULL    /* null pointer */ 0

        #define FUNC(x)    x + 4

        #define FUNC(y)    y+4

    (But when I tested this on Fedora 10 with gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC), both FUNC redefinitions were accepted as well. Why?)


    When the programmer, for legitimate reasons, cannot tell if a previous definition exists, #ifndef can be used to test for an existing definition so that a redefinition can be avoided:

        #ifndef MAX_TABLE_SIZE

        #define MAX_TABLE_SIZE 1000

        #endif

    This idiom is particularly useful with implementations that allow macro definitions in the command that invokes the C compiler. For example, the following UNIX invocation of C provides an initial definition of the macro MAX_TABLE_SIZE as 5000. The C programmer would then check for the definition as shown before:

        cc -c -DMAX_TABLE_SIZE=5000 prog.c


    Although disallowed in Standard C, a few older preprocessor implementations handle #define and #undef so as to maintain a stack of definitions. When a name is redefined with a #define, its old definition is pushed onto a stack and then the new definition replaces the old one. When a name is undefined with #undef, the current definition is discarded and the most recent previous definition (if any) is restored.


7. Precedence Errors In Macro Expansions

    Macros operate purely by textual substitution of tokens. Parsing of the body into declarations, expressions, or statements occurs only after the macro expansion process. This can lead to surprising results if care is not taken. As a rule, it is safest to always parenthesize each parameter appearing in the macro body. The entire body, if it is syntactically an expression, should also be parenthesized.


Example:

    Consider this macro definition:

        #define  SQUARE(x)    x*x

    The idea is that SQUARE takes an argument expression and produces a new expression to compute the square of that argument. For example, SQUARE(5) expands to 5*5. However, the expression SQUARE(z+1) expands to z+1*z+1, which is parsed as z+(1*z)+1 rather than the expected (z+1)*(z+1). A definition of SQUARE that avoids this problem is:

        #define SQUARE(x)    ((x)*(x))

    The outer parentheses are needed to prevent misinterpretation of an expression such as (short)SQUARE(z+1).
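    The difference between the variants can be checked directly. In this sketch SQUARE_BARE and SQUARE_INNER are hypothetical names for the unparenthesized and the partly parenthesized definitions, and the division is a further illustration (not from the text above) of why the outer parentheses matter.

```c
#define SQUARE_BARE(x)   x*x            /* no parentheses at all      */
#define SQUARE_INNER(x)  (x)*(x)        /* parameters only            */
#define SQUARE(x)        ((x)*(x))      /* parameters and entire body */

int bare_of_sum(int z)      { return SQUARE_BARE(z+1); }        /* z+1*z+1       */
int full_of_sum(int z)      { return SQUARE(z+1); }             /* ((z+1)*(z+1)) */
int inner_in_division(void) { return 100 / SQUARE_INNER(5); }   /* 100/(5)*(5)   */
int full_in_division(void)  { return 100 / SQUARE(5); }         /* 100/((5)*(5)) */
```

    With z equal to 2, bare_of_sum gives 5 while full_of_sum gives 9; the two divisions give 100 and 4.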


8. Side Effects In Macro Arguments

    Macros can also produce problems due to side effects. Because the macro's actual arguments may be textually replicated, they may be executed more than once, and side effects in the actual arguments may occur more than once. In contrast, a true function call--which the macro invocation resembles--evaluates argument expressions exactly once, so any side effects of the expression occur exactly once. Macros must be used with care to avoid such problems.


Example:

    Consider the macro SQUARE from the prior example and also a function square that does (almost) the same thing:

        int square(int x) { return x*x; }

    The macro can square integers or floating-point numbers; the function can square only integers. Also, calling the function is likely to be somewhat slower at run time than using the macro. But these differences are less important than the problem of side effects. In the program fragment

        a = 3;

        b = square(a++);

    the variable b gets the value 9 and the variable a ends up with the value 4. However, in the superficially similar program fragment

        a = 3;

        b = SQUARE(a++);

    the variable b may get the value 12 and the variable a may end up with the value 5 because the expansion of the last fragment is

        a = 3;

        b = ((a++)*(a++));

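    (We say that 12 and 5 may be the resulting values of b and a because Standard C implementations may evaluate the expression ((a++)*(a++)) in different ways; strictly speaking, modifying a twice without an intervening sequence point makes the behavior undefined.)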
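    The repeated evaluation can be observed safely by using a function call as the argument instead of a++. This is only a sketch: the counter evaluations and the functions next_value, evaluations_with_function, and evaluations_with_macro are hypothetical names.

```c
#define SQUARE(x) ((x)*(x))

int square(int x) { return x*x; }

int evaluations;                    /* how many times the argument ran */
int next_value(void) { return ++evaluations; }

int evaluations_with_function(void)
{
    evaluations = 0;
    (void)square(next_value());     /* argument evaluated once */
    return evaluations;
}

int evaluations_with_macro(void)
{
    evaluations = 0;
    (void)SQUARE(next_value());     /* ((next_value())*(next_value())) */
    return evaluations;
}
```

    The function version reports one evaluation of the argument and the macro version reports two.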


9. Converting Tokens to Strings

   There is a mechanism in Standard C to convert macro parameters (afterexpansion) to string constants. Before this, programmers had to dependon a loophole in many C preprocessors that achieved the same result ina different way.

    In Standard C, the # token appearing within a macro definition is recognized as a unary "stringization" operator that must be followed by the name of a macro formal parameter. During macro expansion, the # and the formal parameter name are replaced by the corresponding actual argument enclosed in string quotes. When creating the string, each sequence of whitespace in the argument's token list is replaced by a single space character, and any embedded quotation or backslash characters are preceded by a backslash character to preserve their meaning in the string. Whitespace at the beginning and end of the argument is ignored, so an empty argument (even with whitespace between the commas) expands to the empty string "".


Example:

    Consider the Standard C definition of macro TEST:

        #define TEST(a, b)    printf( #a " < " #b " = %d\n", (a)<(b) )

    The statements TEST(0, 0XFFFF); TEST('\n', 10); would expand into

        printf("0" " < " "0XFFFF" " = %d\n", (0)<(0XFFFF));

        printf("'\\n'" " < " "10" " = %d\n", ('\n')<(10));

    After concatenation of adjacent strings, these become:

        printf("0 < 0XFFFF = %d\n", (0)<(0XFFFF));

        printf("'\\n' < 10 = %d\n", ('\n')<(10));
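    The whitespace and escaping rules can be checked with a one-parameter macro. In this sketch STRINGIZE and the three functions are hypothetical names.

```c
#define STRINGIZE(x) #x

/* Runs of whitespace collapse to one space; the ends are trimmed. */
const char *spaced(void)     { return STRINGIZE(   a   +   b   ); }

/* Quotes and backslashes inside a string constant gain a backslash. */
const char *quoted(void)     { return STRINGIZE("hi\n"); }

/* The backslash inside a character constant is escaped as well. */
const char *char_const(void) { return STRINGIZE('\n'); }
```

    The three results are the strings written in C as "a + b", "\"hi\\n\"", and "'\\n'". Printed, the last two show the source spelling of the argument, not a newline.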


    A number of non-Standard C compilers will substitute for macro formal parameters inside string and character constants. Standard C prohibits this.

    The handling of whitespace in non-ISO implementations is likely to vary from compiler to compiler--another reason to avoid depending on this feature except in Standard C implementations.


10. Token Merging In Macro Expansions

    Merging of tokens to form new tokens in Standard C is controlled by the presence of a merging operator, ##, in macro definitions. In a macro replacement list--before rescanning for more macros--the two tokens surrounding any ## operator are combined into a single token. There must be such tokens: ## must not appear at the beginning or end of a replacement list. If the combination does not form a valid token, the result is undefined.

        #define TEMP(i)   temp ## i

        TEMP(1) = TEMP(2+k) + x;

    After preprocessing, this becomes

        temp1 = temp2 + k + x;

 

    In the previous example, a curious situation can arise when expanding TEMP() + x. The macro definition is valid, but ## is left with no right-hand token to combine (unless it grabs +, which we do not want). This problem is resolved by treating the formal parameter i as if it expanded to a special "empty" token just for the benefit of ##. Thus, the expansion of TEMP() + x would be temp + x as expected.
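    Both cases can be compiled and run. The sketch below assumes a C99 compiler (for the empty macro argument); the function names paste_demo and empty_paste_demo and the variable values are made up for illustration.

```c
#define TEMP(i) temp ## i

int paste_demo(void)
{
    int temp1 = 0, temp2 = 10, k = 3, x = 4;
    TEMP(1) = TEMP(2+k) + x;    /* temp1 = temp2+k + x; */
    return temp1;
}

int empty_paste_demo(void)
{
    int temp = 5, x = 4;
    return TEMP() + x;          /* temp + x */
}
```

    Only the first token of the argument 2+k is merged with temp, so paste_demo() returns 17; empty_paste_demo() returns 9.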

 

    Token concatenation must not be used to produce a universal character name.

 

    As with the conversion of macro arguments to strings, programmers can obtain something like this merging capability through a loophole in many non-Standard C implementations. Although the original definition of C explicitly described macro bodies as being sequences of tokens, not sequences of characters, nevertheless many C compilers expand and rescan macro bodies as if they were character sequences. This becomes apparent primarily in the case where the compiler also handles comments by eliminating them entirely (rather than replacing them with a space)--a situation exploited by some cleverly written programs.

 

Example:

    Consider the following example:

        #define INC    ++

        #define TAB    internal_table

        #define INCTAB table_of_increments

        #define CONC(x,y) x/**/y

        CONC(INC,TAB)

    Standard C interprets the body of CONC as two tokens, x and y, separated by a space. (Comments are converted to a space.) The call CONC(INC,TAB) expands to the two tokens INC TAB. However, some non-Standard implementations simply eliminate comments and rescan macro bodies for tokens; they expand CONC(INC,TAB) to the single token INCTAB.

 

    Step              1                2             3          4

    Standard          CONC(INC,TAB)    INC/**/TAB    INC TAB    ++ internal_table

    non-Standard      CONC(INC,TAB)    INC/**/TAB    INCTAB     table_of_increments

 

11. Variable Argument Lists In Macros

    In C99, a functionlike macro can have as its last or only formal parameter an ellipsis, signifying that the macro may accept a variable number of arguments:

        #define name( identifier-list, ... ) sequence-of-tokens(optional)

        #define name( ... ) sequence-of-tokens(optional)

    When such a macro is invoked, there must be at least as many actual arguments as there are identifiers in identifier-list. The trailing argument(s), including any separating commas, are merged into a single sequence of preprocessing tokens called the variable arguments. The identifier __VA_ARGS__ appearing in the replacement list of the macro definition is treated as if it had been a macro parameter whose argument was the merged variable arguments. That is, __VA_ARGS__ is replaced by the list of extra arguments, including their comma separators. __VA_ARGS__ can only appear in a macro definition that includes ... in its parameter list.

    Macros with a variable number of arguments are often used to interface to functions that take a variable number of arguments, such as printf. By using the # stringization operator, they can also be used to convert a list of arguments to a single string without having to enclose the arguments in parentheses.
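    Both uses can be sketched in a form that is easy to check, assuming a C99 compiler. The macro names format_into and list_as_string and the functions describe and listed are hypothetical; snprintf writes into a caller-supplied buffer so that the result can be inspected.

```c
#include <stdio.h>

/* Forwards a format string and its arguments to snprintf. */
#define format_into(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)

/* Converts the whole argument list, commas included, to one string. */
#define list_as_string(...) #__VA_ARGS__

int describe(char *buf, size_t size, int n)
{
    return format_into(buf, size, "%d items", n);
}

const char *listed(void)
{
    return list_as_string(a, b, c, d);
}
```

    Calling describe with n equal to 3 stores "3 items" in the buffer, and listed() returns "a, b, c, d".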

 

Example:

    These directives create a macro my_printf that can write its arguments to either the standard error or the standard output.

        #ifdef DEBUG

        #define my_printf( ... ) fprintf(stderr, __VA_ARGS__)

        #else

        #define my_printf( ... ) fprintf(stdout, __VA_ARGS__)

        #endif

 

    Given the definition

        #define make_em_a_string( ... ) #__VA_ARGS__

    the invocation

        make_em_a_string(a, b, c, d)

    expands to the string

        "a, b, c, d"

 

12. Other Problems

    Some non-Standard implementations do not perform stringent error checking on macro definitions and calls, including permitting an incomplete token in the macro body to be completed by text appearing after the macro call. The lack of error checking by certain implementations does not make clever exploitation of that lack legitimate. Standard C reaffirms that macro bodies must be sequences of well-formed tokens.

 

Example:

    For example, the following fragment in one of these non-ISO implementations:

        #define STRING_START_PART   "This is a split

        ...

        printf(STRING_START_PART string."); /* !!!! Yuk */

    will, after preprocessing, result in the source text

        printf("This is a split string.");
