Linux GNU C Features

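===========================================
GNU C Extensions Used by the Linux Kernel
===========================================

GNU CC is a powerful cross-platform C compiler. It extends the C language in many ways, and these extensions give strong support for optimization, object-code layout, and stricter checking. This article calls C with the GNU extensions "GNU C".

The Linux kernel uses GNU C extensions so heavily that GNU CC is the only compiler able to build it; at times, building the kernel has even required a particular GNU CC version. This article summarizes the GNU C extensions the kernel uses, so that when you meet unfamiliar syntax or semantics in kernel source you can find a first answer here. For details, see gcc.info. The examples are taken from Linux 2.4.18.


Statement Expressions
=====================

GNU C treats a compound statement enclosed in parentheses as an expression, called a statement expression. It may appear anywhere an expression is allowed, and inside it you can use loops, local variables, and other constructs that are otherwise only available in compound statements. For example:

++++ include/linux/kernel.h
    #define min_t(type,x,y) \
            ({ type __x = (x); type __y = (y); __x < __y ? __x: __y; })

++++ net/ipv4/tcp_output.c
            int full_space = min_t(int, tp->window_clamp, tcp_full_space(sk));

The last statement of the compound statement should be an expression; its value becomes the value of the statement expression. The macro above is a safe "minimum" macro. In standard C it is usually written as:

    #define min(x,y) ((x) < (y) ? (x) : (y))

This definition evaluates x and y twice each, so an argument with side effects produces wrong results. The statement expression evaluates each argument exactly once and avoids the problem. Statement expressions are mostly used in macro definitions.


Typeof
======

The macro in the previous section needs to be told the type of its arguments. With typeof you can write more general macros that do not need to know the argument types in advance. For example:

++++ include/linux/kernel.h
141: #define min(x,y) ({ \
142:         const typeof(x) _x = (x);       \
143:         const typeof(y) _y = (y);       \
144:         (void) (&_x == &_y);            \
145:         _x < _y ? _x : _y; })

Here typeof(x) denotes the type of x. Line 142 defines a local variable _x with the same type as x and initializes it to x. Line 144 checks that x and y have the same type: comparing pointers to two different types makes the compiler emit a warning. typeof can be used anywhere a type name can, and is mostly used in macro definitions.


Zero-Length Arrays
==================

GNU C allows arrays of length zero, which is very useful when defining the header structure of a variable-length object. For example:

++++ include/linux/minix_fs.h
    struct minix_dir_entry {
            __u16 inode;
            char name[0];
    };

The last member of the structure is a zero-length array and occupies no space in the structure. In standard C (C89) the array would have to be declared with length 1, which makes computing the object size at allocation time more awkward.


Variadic Macros
===============

In GNU C a macro can accept a variable number of arguments, just like a function. For example:

++++ include/linux/kernel.h
    #define pr_debug(fmt,arg...) \
            printk(KERN_DEBUG fmt,##arg)

Here arg stands for the remaining arguments, of which there may be zero or more. These arguments, together with the commas between them, form the value of arg, which is substituted when the macro is expanded. For example:

    pr_debug("%s:%d",filename,line)

expands to

    printk("<7>" "%s:%d", filename, line)

The ## handles the case where arg matches no arguments. Its value is then empty, and in this special case the GNU C preprocessor discards the comma before ##, so that

    pr_debug("success!\n")

expands to

    printk("<7>" "success!\n")

Note that there is no trailing comma.


Labeled Elements
================

Standard C (C89) requires the initializers of an array or structure to appear in a fixed order. GNU C lets them appear in any order by naming the array index or structure field being initialized. To specify an array index, write '[INDEX] =' before the value; to specify a range, use the form '[FIRST ... LAST] ='. For example:

++++ arch/i386/kernel/irq.c
    static unsigned long irq_affinity [NR_IRQS] = { [0 ... NR_IRQS-1] = ~0UL };

This initializes every element of the array to ~0UL and can be seen as a shorthand. To specify a structure member, write 'FIELDNAME:' before the value. For example:

++++ fs/ext2/file.c
    struct file_operations ext2_file_operations = {
            llseek:         generic_file_llseek,
            read:           generic_file_read,
            write:          generic_file_write,
            ioctl:          ext2_ioctl,
            mmap:           generic_file_mmap,
            open:           generic_file_open,
            release:        ext2_release_file,
            fsync:          ext2_sync_file,
    };

This initializes member llseek of ext2_file_operations to generic_file_llseek, member read to generic_file_read, and so on. I consider this one of the best GNU C extensions: when the structure definition changes and member offsets move, this style of initialization still initializes the named members correctly. Members that do not appear in the initializer are set to 0. (C99 later standardized the '.FIELDNAME =' spelling; 'FIELDNAME:' is the older GNU form.)


Case Ranges
===========

GNU C allows a single case label to specify a range of consecutive values. For example:

++++ arch/i386/kernel/irq.c
                            case '0' ... '9': c -= '0'; break;
                            case 'a' ... 'f': c -= 'a'-10; break;
                            case 'A' ... 'F': c -= 'A'-10; break;

    case '0' ... '9':

is equivalent to

    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':


Declaration Attributes
======================

GNU C lets you attach special attributes to the declarations of functions, variables, and types, to enable manual code optimization and more careful checking. To specify an attribute, write after the declaration

    __attribute__ (( ATTRIBUTE ))

where ATTRIBUTE is the attribute specification; several attributes are separated by commas. GNU C supports more than a dozen attributes; the most commonly used ones are described here.

* noreturn

The noreturn attribute applies to functions and says that the function never returns. This lets the compiler generate slightly better code and, most importantly, suppresses spurious warnings such as those about uninitialized variables. For example:

++++ include/linux/kernel.h
    # define ATTRIB_NORET  __attribute__((noreturn))
    ...
    asmlinkage NORET_TYPE void do_exit(long error_code)
            ATTRIB_NORET;

* format (ARCHETYPE, STRING-INDEX, FIRST-TO-CHECK)

The format attribute applies to functions and says that the function takes printf-, scanf-, or strftime-style arguments. The most common mistake with such functions is a mismatch between the format string and the arguments; with the format attribute the compiler checks the argument types against the format string. For example:

++++ include/linux/kernel.h
    asmlinkage int printk(const char * fmt, ...)
            __attribute__ ((format (printf, 1, 2)));

This says that the first parameter is the format string and that the arguments from the second one onward are checked against it.

* unused

The unused attribute applies to functions and variables and says that the function or variable may go unused. It stops the compiler from warning about it.

* section ("section-name")

The section attribute applies to functions and variables. Normally the compiler places functions in the .text section and variables in .data or .bss; with the section attribute you can have a function or variable placed in a named section. For example:

++++ include/linux/init.h
    #define __init          __attribute__ ((__section__ (".text.init")))
    #define __exit          __attribute__ ((unused, __section__(".text.exit")))
    #define __initdata      __attribute__ ((__section__ (".data.init")))
    #define __exitdata      __attribute__ ((unused, __section__ (".data.exit")))
    #define __initsetup     __attribute__ ((unused,__section__ (".setup.init")))
    #define __init_call     __attribute__ ((unused,__section__ (".initcall.init")))
    #define __exit_call     __attribute__ ((unused,__section__ (".exitcall.exit")))

The linker can group code or data from the same section together, and the Linux kernel uses this technique a great deal. For example, the system's initialization code is placed in a section of its own, so that its memory can be freed once initialization is finished.

* aligned (ALIGNMENT)

The aligned attribute applies to variables and to structure or union types. It specifies the alignment, in bytes, of a variable, structure field, structure, or union. For example:

++++ include/asm-i386/processor.h
    struct i387_fxsave_struct {
            unsigned short  cwd;
            unsigned short  swd;
            unsigned short  twd;
            unsigned short  fop;
            long    fip;
            long    fcs;
            long    foo;
    ......
    } __attribute__ ((aligned (16)));

This says that variables of this structure type are aligned on 16-byte boundaries. The compiler normally chooses a suitable alignment itself; explicit alignment is usually needed because of architectural constraints or for optimization.

* packed

The packed attribute applies to variables and types. On a variable or structure field it requests the smallest possible alignment; on an enum, structure, or union type it requests that the type use the least memory possible. For example:

++++ include/asm-i386/desc.h
    struct Xgt_desc_struct {
            unsigned short size;
            unsigned long address __attribute__((packed));
    };

The field address is allocated immediately after size. packed is mostly used to define hardware-related structures, so that no alignment holes appear between members.


Current Function Name
=====================

GNU CC predefines two identifiers that hold the name of the current function: __FUNCTION__ holds the name as it appears in the source, and __PRETTY_FUNCTION__ holds the name in a language-specific form. In C functions the two are identical; in C++ functions __PRETTY_FUNCTION__ includes extra information such as the return type. The Linux kernel uses only __FUNCTION__.

++++ fs/ext2/super.c
    void ext2_update_dynamic_rev(struct super_block *sb)
    {
            struct ext2_super_block *es = EXT2_SB(sb)->s_es;

            if (le32_to_cpu(es->s_rev_level) > EXT2_GOOD_OLD_REV)
                    return;

            ext2_warning(sb, __FUNCTION__,
                         "updating to rev %d because of new feature flag, "
                         "running e2fsck is recommended",
                         EXT2_DYNAMIC_REV);

Here __FUNCTION__ is replaced by the string "ext2_update_dynamic_rev". Although __FUNCTION__ looks like the standard C __FILE__, it is actually substituted by the compiler, whereas __FILE__ is substituted by the preprocessor.


Built-in Functions
==================

GNU C provides a large number of built-in functions. Many of them are built-in versions of standard C library functions, such as memcpy; these behave like the corresponding library functions and are not discussed here. The names of the other built-in functions usually begin with __builtin.

* __builtin_return_address (LEVEL)

The built-in function __builtin_return_address returns the return address of the current function or of one of its callers. The argument LEVEL gives the number of stack frames to walk up: 0 means the return address of the current function, 1 means the return address of the current function's caller, and so on. For example:

++++ kernel/sched.c
                    printk(KERN_ERR "schedule_timeout: wrong timeout "
                           "value %lx from %p\n", timeout,
                           __builtin_return_address(0));

* __builtin_constant_p(EXP)

The built-in function __builtin_constant_p tests whether a value is a compile-time constant. It returns 1 if the value of EXP is a constant and 0 otherwise. For example:

++++ include/asm-i386/bitops.h
    #define test_bit(nr,addr) \
    (__builtin_constant_p(nr) ? \
     constant_test_bit((nr),(addr)) : \
     variable_test_bit((nr),(addr)))

Many computations have a better implementation when an argument is constant. With the idiom above, GNU C compiles only the constant version or only the non-constant version depending on the argument, which keeps the interface general while still producing the best code when the argument is constant.

* __builtin_expect(EXP, C)

The built-in function __builtin_expect gives the compiler branch-prediction information. Its return value is the value of the integer expression EXP; C must be a compile-time constant. For example:

++++ include/linux/compiler.h
    #define likely(x)       __builtin_expect((x),1)
    #define unlikely(x)     __builtin_expect((x),0)

++++ kernel/sched.c
564:         if (unlikely(in_interrupt())) {
565:                 printk("Scheduling in interrupt\n");
566:                 BUG();
567:         }

The meaning of this built-in is that the expected value of EXP is C. The compiler can use this to reorder basic blocks so that the program runs faster in the expected case. The example above says that being in interrupt context is rare, so the object code for lines 565-566 may be placed farther away, keeping the frequently executed code compact.


1. Statements and declarations in expressions
   Argument safety
   in Standard C:
      #define max(a,b) ((a)>(b)?(a):(b))
   in Gcc:  ( assume type is int )
      #define maxint(a,b) \
              ({ int _a = (a), _b = (b); \
                 _a > _b ? _a : _b; })

2. Locally declared labels
   in Gcc:
      __label__  name ;

3. Labels as values
   The "&&" operator returns the address of a label
   in Gcc:
      void * ptr;
      ptr = &&foo;
      goto *ptr;

      static void *array[] = { &&foo, &&bar, &&hack };
      goto *array[ i ];

      static const int array[] = { &&foo - &&foo, &&bar - &&foo, &&hack - &&foo };
      goto * (&&foo + array[ i ]);

4. Nested functions

5. Constructing function calls
   in Gcc:
     void * __builtin_apply_args()
     Returns the address of a stack block holding the argument pointer register, the structure value address, and all registers.
     void * __builtin_apply( void (*function)(), void * arguments, size_t size)
     Calls (*function) with the arguments described by *arguments and size.
     void __builtin_return( void * result)
     Returns result from the containing function.

6. Naming an expression's type
    typedef

7. Referring to a type with typeof()
    In a *.h file, use __typeof__ instead.

8. Generalized lvalues
   in Gcc (each pair is equivalent):
     (a , b) += 5              a, (b += 5)
     &(a,b)                    a, &b
     ( a ? b : c ) = 5         ( a ? b = 5 : ( c = 5 ))
     (int)a = 5                (int) ( a = (char *)(int) 5 )
     (int)a += 5               (int) ( a = (char *)((int)a + 5))

     *&(int)f = 1

9. Conditionals with omitted operands
   in Gcc:
     x ? : y       is equivalent to      x ? x : y

10. Double-word integers
   in Gcc:
     long long
     unsigned long long
     LL
     ULL

11. Complex numbers
   in Gcc:
      _Complex
      __complex__
      __imag__

12. Hexadecimal floating-point constants (hex floats)
   in Gcc:
      0x1.f is 1 15/16
      p3    multiplies by 8
      so
      0x1.fp3 is 1.55e1

13. Arrays of length zero

14. Arrays of variable length
   in Gcc:
      struct entry
      tester (int len, char data[len][len]){
      }

15. Macros with a variable number of arguments
   in Gcc:
      #define debug(format, ...) fprintf(stderr, format, __VA_ARGS__)
      "..." is the variable argument

16. Slightly looser rules for escaped newlines

17. String literals with embedded newlines

18. Non-lvalue arrays may have subscripts
   in Gcc:
   struct foo { int a[4]; };
   struct foo f();
   bar (int index){
       return f().a[index];
   }

19. Arithmetic on void- and function-pointers
   arg with gcc:
       -Wpointer-arith  -- warn if such arithmetic is used

20. Non-constant initializers
   in Gcc:
      foo(float f, float g){
         float beat_freqs[2] = { f-g, f+g };
      }

21. Compound literals

22. Designated initializers
   in Gcc:
      int a[6] = { [4] = 29, [2] = 15};
      int widths[] = { [0 ... 9] = 1, [10 ... 99] = 2, [100] = 3};

23. Case ranges

24. Cast to a union type

25. Mixed declarations and code

26. Declaring attributes of functions
    in Gcc:
    __attribute__
    14 general attributes: noreturn, pure, const, format, format_arg, no_instrument_function, section, constructor, destructor, unused, weak, malloc, alias, no_check_memory_usage; plus target-specific ones: regparm(number), stdcall, cdecl, longcall, short_call, ...

27. Attribute syntax
    This section describes the syntax with which __attribute__ may be used and the constructs to which attribute specifiers bind in C. Some details may differ for C++ and Objective C. Because of infelicities in the grammar for attributes, some of the forms described here may not be parsed successfully in all cases.
      see 5.26 [function attributes]
          5.33 [variable attributes]
          5.34 [type attributes]
    An attribute specifier has the form __attribute__((attribute-list)).
    An attribute list is a possibly empty, comma-separated sequence of attributes, where each attribute is one of:
      1. empty
      2. a word
      ......
    An attribute specifier list is a sequence of one or more attribute specifiers, not separated by any other tokens.
    An attribute specifier may appear after the colon following a label, other than a case or default label. The only attribute that makes sense there is unused, for code with possibly unused labels that is compiled with '-Wall'.
    An attribute specifier may appear as part of a struct, union, or enum specifier; it is ignored if the struct, union, or enum specifier is empty.
    Otherwise, an attribute specifier appears as part of a declaration, of a parameter declaration for an unnamed parameter, or of a type name. In many places an attribute specifier can stand in for a declaration specifier; some examples follow.
    When an attribute specifier follows a function or array declarator in a parameter declaration, it is meant to apply to the pointer type that the parameter is implicitly converted to, but this is not yet correctly implemented.
    Any list of specifiers ... p172.s4
    An attribute specifier list may appear at the start of a declaration and then applies to every declarator that follows it:
        __attribute__((noreturn)) void d0(void), __attribute__((format(printf,1,2))) d1(const char *, ...), d2(void)
    Here 'noreturn' applies to all the declarations, while 'format' applies only to d1; applying it to d2 would be an error.
    An attribute specifier list may appear immediately before the comma (,), equals sign (=), or semicolon (;) that ends a declarator:
        void (****f)(void) __attribute__((noreturn))
    At present the 'noreturn' attribute applies to f, which causes a warning because f is not a function; in the future it may apply to the function ****f.  p173

28. Prototypes and old-style function definitions

29. C++ style comments

30. Dollar signs in identifier names

31. The 'ESC' character constant
   ascii ESC = '\e'

32. Inquiring on the alignment of variables and types
   __alignof__(foo)
   exp:
   __alignof__(double) is 8 on many RISC machines
   __alignof__(double) is 4 or even 2 on more traditional machines

33. Variable attributes
   8 attributes:
     aligned, mode, nocommon, packed, section, transparent_union, unused, weak; plus shared (WinNT only)
   NOTE:
     mode(mode): byte, __byte__, word, __word__, pointer, __pointer__
     nocommon: compiling with '-fno-common' makes every variable nocommon; space is allocated for it directly and zero-filled, and a variable may be initialized in only one source file
     section("section-name"): normally the compiler places data in .data and .bss
       exp:
           struct duart a __attribute__((section ("DUART_A"))) = {0};
           struct duart b __attribute__((section ("DUART_B"))) = {0};
           char stack[10000] __attribute__((section ("STACK"))) = {0};
           int init_data __attribute__((section ("INITDATA"))) = 0;
     model(model-name): small, medium, large
         small -- within the lower 16 MB
         medium, large -- 32-bit address space

34. Type attributes
     in Gcc:
         unused: gcc emits no warnings for anything declared unused
         packed: on an enum, struct, or union, represent the type with the least memory possible
         aligned(alignment): specifies the minimum alignment for storage of the type
         aligned: aligned can only increase the alignment; to decrease it, use packed

35. An inline function is as fast as a macro
   in Gcc:
      static inline int inc(int *a){
              return (*a)++;
      }

38.1 Defining global register variables
     in Gcc:
        register int *foo asm ("a5");
        Here "a5" is the name of a register. Choose one that is saved and restored across function calls, so that library routines do not clobber it.

38.2 Specifying registers for local variables
     in Gcc:
        register int *foo asm ("a5");

39   Alternate keywords
     in option:
        '-traditional' disables certain keywords
        '-ansi', '-std=?' disable some keywords, e.g. '-std=c99'
        '-pedantic' causes warnings for many GNU C extensions

40. Incomplete enum types

41. Function names as strings
   in Gcc:
      '__FUNCTION__' holds the name of the function as it appears in the source
      '__PRETTY_FUNCTION__' holds the name of the function pretty-printed in a language-specific style
   exp (the string-concatenation form is accepted only by older GCC releases):
      char here[] = "Function " __FUNCTION__ " in " __FILE__;
      static const char __func__[] = "function-name"; // std=c99

42. Getting the return or frame address of a function
   in Gcc:
      void * __builtin_return_address ( unsigned int level );
      NOTE: level must be a constant integer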
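
Several of the extensions described in this article can be tried outside the kernel in a small user-space file. The sketch below is not kernel code: hex_digit_value, parse_hex, demo_ops, weights, and weight_total are hypothetical names made up for illustration, and the min, likely, and unlikely macros are re-created here in the style of the kernel excerpts rather than copied from any header. It assumes gcc in its default GNU mode, and it uses the __typeof__ spelling and the C99 '.field =' designator form so that it also builds under stricter -std settings.

```c
#include <assert.h>
#include <stddef.h>

/* Statement expression + typeof: each argument is evaluated exactly once.
 * The pointer comparison only makes the compiler warn when the two
 * argument types differ. */
#define min(x, y) ({                        \
        const __typeof__(x) _x = (x);       \
        const __typeof__(y) _y = (y);       \
        (void) (&_x == &_y);                \
        _x < _y ? _x : _y; })

/* Branch hints in the style of include/linux/compiler.h. */
#define likely(x)       __builtin_expect((x), 1)
#define unlikely(x)     __builtin_expect((x), 0)

/* Case ranges: value of one hexadecimal digit, or -1 if c is not one. */
static int hex_digit_value(char c)
{
        switch (c) {
        case '0' ... '9': return c - '0';
        case 'a' ... 'f': return c - 'a' + 10;
        case 'A' ... 'F': return c - 'A' + 10;
        default:          return -1;
        }
}

/* Hypothetical operations table, initialized by field name. */
struct demo_ops {
        int (*open)(void);
        int (*digit)(char c);
        int (*release)(void);
};

static const struct demo_ops ops = {
        .digit = hex_digit_value,       /* open and release stay NULL */
};

/* Range designators: elements 0-3 get 1, elements 4-7 get 2. */
static int weights[8] = { [0 ... 3] = 1, [4 ... 7] = 2 };

static int weight_total(void)
{
        int sum = 0;

        for (size_t i = 0; i < sizeof(weights) / sizeof(weights[0]); i++)
                sum += weights[i];
        return sum;
}

/* Parse a hexadecimal string; returns -1 on the first invalid character. */
static long parse_hex(const char *s)
{
        long value = 0;

        assert(s != NULL);
        for (; *s; s++) {
                int d = ops.digit(*s);

                if (unlikely(d < 0))    /* bad input is the rare path */
                        return -1;
                value = value * 16 + d;
        }
        return value;
}
```

With these definitions, parse_hex("1F") returns 31 and parse_hex("xyz") returns -1. After int i = 5, the call min(i++, 10) yields 5 and leaves i at 6, because the statement expression evaluates i++ only once.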