Chapter 6


6.6 Table Lookup

In this section we will write the innards of a table-lookup package to illustrate more aspects of structures. This code is typical of what might be found in the symbol table management routines of a macro processor or a compiler. For example, consider the #define statement. When a line like


#define IN 1

is encountered, the name IN and the replacement text 1 are stored in a table. Later, when the name IN appears in a statement like

state = IN;

it must be replaced by 1.

There are two routines that manipulate the names and replacement texts. install(s, t) records the name s and the replacement text t in a table; s and t are just character strings. lookup(s) searches for s in the table and returns a pointer to the place where it was found, or NULL if it wasn't there.

The algorithm is a hash search: the incoming name is converted into a small non-negative integer, which is then used to index into an array of pointers. An array element points to the beginning of a linked list of blocks describing names that have that hash value; it is NULL if no names have hashed to that value (see Figure 6-4).


A block in the list is a structure containing pointers to the name, the replacement text, and the next block in the list. A null next-pointer marks the end of the list.

struct nlist {               /* table entry: */
    struct nlist *next;      /* next entry in chain */
    char *name;              /* defined name */
    char *defn;              /* replacement text */
};

The pointer array is just

#define HASHSIZE 101

static struct nlist *hashtab[HASHSIZE];  /* pointer table */

The hashing function, which is used by both lookup and install, loops over the string: on each pass it multiplies the value computed so far by 31 (the "scrambling" step) and adds the current character, then finally returns the remainder modulo the array size. This is not the best possible hash function, but it is short and effective.

/* hash: form hash value for string s */
unsigned hash(char *s)
{
    unsigned hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % HASHSIZE;
}

Unsigned arithmetic ensures that the hash value is non-negative. It also means that overflow in the loop simply wraps around instead of being undefined.
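To make the arithmetic concrete, here is a minimal sketch that restates hash with a const-qualified parameter (an addition made here so that string literals can be passed cleanly; the text's version takes a plain char *). Assuming an ASCII character set, the name IN works out as follows: after 'I' the value is 73, after 'N' it is 78 + 31 * 73 = 2341, and 2341 % 101 = 18.

```c
#define HASHSIZE 101

/* hash: form hash value for string s (same arithmetic as in the text;
   the const qualifier is an assumption of this sketch) */
unsigned hash(const char *s)
{
    unsigned hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;   /* scale previous value, add char */
    return hashval % HASHSIZE;         /* always in 0 .. HASHSIZE-1 */
}
```

So on an ASCII machine the entry for IN would live on the chain that starts at hashtab[18], and the empty string hashes to 0.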

The hashing process produces a starting index in the array hashtab; if the string is to be found anywhere, it will be in the list of blocks beginning there. The search is performed by lookup. If lookup finds the entry already present, it returns a pointer to it; if not, it returns NULL.

/* lookup: look for s in hashtab */
struct nlist *lookup(char *s)
{
    struct nlist *np;

    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return np;  /* found */
    return NULL;        /* not found */
}

The for loop in lookup is the standard idiom for walking along a linked list:

for (ptr = head; ptr != NULL; ptr = ptr->next)
    ...
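As a small illustration of the same idiom outside the hash table, the following sketch counts the nodes on a list. The struct node type and the function name list_length are hypothetical, introduced only for this example.

```c
#include <stddef.h>

struct node {                /* hypothetical list node for this sketch */
    struct node *next;       /* next node, or NULL at the end */
    int value;
};

/* list_length: count the nodes reachable from head */
size_t list_length(const struct node *head)
{
    const struct node *ptr;
    size_t n = 0;

    for (ptr = head; ptr != NULL; ptr = ptr->next)
        n++;
    return n;
}
```

An empty list (head equal to NULL) never enters the loop body, so the function returns 0 without any special case.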

install uses lookup to determine whether the name being installed is already present; if so, the new definition will supersede the old one. Otherwise, a new entry is created. install returns NULL if for any reason there is no room for a new entry.

struct nlist *lookup(char *);
char *strdup(char *);

/* install: put (name, defn) in hashtab */
struct nlist *install(char *name, char *defn)
{
    struct nlist *np;
    unsigned hashval;

    if ((np = lookup(name)) == NULL) {  /* not found */
        np = (struct nlist *) malloc(sizeof(*np));
        if (np == NULL || (np->name = strdup(name)) == NULL)
            return NULL;
        hashval = hash(name);
        np->next = hashtab[hashval];
        hashtab[hashval] = np;
    } else  /* already there */
        free((void *) np->defn);  /* free previous defn */
    if ((np->defn = strdup(defn)) == NULL)
        return NULL;
    return np;
}
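The pieces of this section can be gathered into one compilable unit. What follows is a sketch, not the text's exact code: it uses const-qualified parameters, a hypothetical helper dupstr in place of strdup (to avoid clashing with any strdup the host library already declares), and it copies the definition before touching the table, so that a failed allocation neither leaks the new block nor destroys an existing definition. Those choices are assumptions of this example.

```c
#include <stdlib.h>
#include <string.h>

#define HASHSIZE 101

struct nlist {               /* table entry */
    struct nlist *next;      /* next entry in chain */
    char *name;              /* defined name */
    char *defn;              /* replacement text */
};

static struct nlist *hashtab[HASHSIZE];   /* pointer table */

/* dupstr: hypothetical helper; return a malloc'ed copy of s, or NULL */
static char *dupstr(const char *s)
{
    char *p = malloc(strlen(s) + 1);      /* +1 for '\0' */

    if (p != NULL)
        strcpy(p, s);
    return p;
}

/* hash: form hash value for string s */
static unsigned hash(const char *s)
{
    unsigned hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % HASHSIZE;
}

/* lookup: look for s in hashtab */
struct nlist *lookup(const char *s)
{
    struct nlist *np;

    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return np;                    /* found */
    return NULL;                          /* not found */
}

/* install: put (name, defn) in hashtab; on failure return NULL
   and leave the table unchanged */
struct nlist *install(const char *name, const char *defn)
{
    struct nlist *np = lookup(name);
    char *newdefn = dupstr(defn);
    unsigned hashval;

    if (newdefn == NULL)
        return NULL;
    if (np == NULL) {                     /* not found: make new entry */
        np = malloc(sizeof(*np));
        if (np == NULL || (np->name = dupstr(name)) == NULL) {
            free(np);                     /* free(NULL) is a no-op */
            free(newdefn);
            return NULL;
        }
        hashval = hash(name);
        np->next = hashtab[hashval];      /* push on front of chain */
        hashtab[hashval] = np;
    } else
        free(np->defn);                   /* discard previous defn */
    np->defn = newdefn;
    return np;
}
```

With this in place, calling install("IN", "1") and then lookup("IN") yields an entry whose defn compares equal to "1"; a second install("IN", "2") replaces the definition in the same entry rather than adding another one.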

6.7 Typedef

C provides a facility called typedef for creating new data type names. For example, the declaration


typedef int Length;


makes the name Length a synonym for int. The type Length can be used in declarations, casts, etc., in exactly the same ways that the type int can be:

Length len, maxlen;
Length *lengths[];

Similarly, the declaration

typedef char *String;


makes String a synonym for char * or character pointer, which may then be used in declarations and casts:


String p, lineptr[MAXLINES], alloc(int);
int strcmp(String, String);
p = (String) malloc(100);
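Because String names a complete type, every variable in a declaration list gets that type. A macro that merely expands to char * behaves differently, as this small sketch suggests; the macro STRING_MACRO and the variable names are hypothetical and exist only for the contrast.

```c
typedef char *String;          /* as in the text */
#define STRING_MACRO char *    /* hypothetical macro, for contrast only */

String p1, p2;                 /* both p1 and p2 are char * */
STRING_MACRO m1, m2;           /* expands to: char *m1, m2;
                                  so m1 is char * but m2 is a plain char */
```

Here sizeof m2 is 1, since m2 is a char, while p1 and p2 both have the size of a pointer.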


Notice that the type being declared in a typedef appears in the position of a variable name, not right after the word typedef. Syntactically, typedef is like the storage classes extern, static, etc. We have used capitalized names for typedefs, to make them stand out.

As a more complicated example, we could make typedefs for the tree nodes shown earlier in this chapter:

typedef struct tnode *Treeptr;

typedef struct tnode {       /* the tree node: */
    char *word;              /* points to the text */
    int count;               /* number of occurrences */
    struct tnode *left;      /* left child */
    struct tnode *right;     /* right child */
} Treenode;

This creates two new type names called Treenode (a structure) and Treeptr (a pointer to the structure); they behave like type keywords in declarations. Then the routine talloc could become

Treeptr talloc(void)
{
    return (Treeptr) malloc(sizeof(Treenode));
}
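To show the two names in use together, here is a minimal sketch that allocates and initializes a leaf node. The helper newnode is hypothetical and not part of the original tree program; note that it stores the pointer it is given rather than copying the string.

```c
#include <stdlib.h>

typedef struct tnode *Treeptr;

typedef struct tnode {       /* the tree node: */
    char *word;              /* points to the text */
    int count;               /* number of occurrences */
    struct tnode *left;      /* left child */
    struct tnode *right;     /* right child */
} Treenode;

/* talloc: make a Treenode */
Treeptr talloc(void)
{
    return (Treeptr) malloc(sizeof(Treenode));
}

/* newnode: hypothetical helper; allocate a leaf for word w, or NULL */
Treeptr newnode(char *w)
{
    Treeptr p = talloc();

    if (p != NULL) {
        p->word = w;                 /* stores the pointer, does not copy */
        p->count = 1;
        p->left = p->right = NULL;   /* a new node has no children */
    }
    return p;
}
```

Declaring p as Treeptr or as struct tnode * makes no difference to the compiler; the typedef name only changes how the declaration reads.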


It must be emphasized that a typedef declaration does not create a new type in any sense; it merely adds a new name for some existing type. Nor are there any new semantics: variables declared this way have exactly the same properties as variables whose declarations are spelled out explicitly. In effect, typedef is like #define, except that since it is interpreted by the compiler, it can cope with textual substitutions that are beyond the capabilities of the preprocessor. For example,


typedef int (*PFI)(char *, char *);

creates the type PFI, for "pointer to function (of two char * arguments) returning int," which can be used in contexts like

PFI strcmp, numcmp;

in the sort program of Chapter 5.
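A short sketch of how such a type might be used: a function that returns one of two comparison routines. Two details here are assumptions of this example and differ from the declaration above: PFI is declared with const char * parameters so that the standard library's strcmp can be assigned to it without a type mismatch, and the helper choose_cmp is hypothetical. The numcmp shown is a simple stand-in that compares the numeric values of its arguments.

```c
#include <stdlib.h>
#include <string.h>

/* pointer to function (of two const char * arguments) returning int */
typedef int (*PFI)(const char *, const char *);

/* numcmp: compare s1 and s2 numerically */
int numcmp(const char *s1, const char *s2)
{
    double v1 = atof(s1);
    double v2 = atof(s2);

    if (v1 < v2)
        return -1;
    else if (v1 > v2)
        return 1;
    else
        return 0;
}

/* choose_cmp: hypothetical helper; pick a comparison routine */
PFI choose_cmp(int numeric)
{
    return numeric ? numcmp : strcmp;
}
```

Without the typedef, the declaration of choose_cmp would have to be written as int (*choose_cmp(int numeric))(const char *, const char *), which is considerably harder to read.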

Besides purely aesthetic issues, there are two main reasons for using typedefs. The first is to parameterize a program against portability problems. If typedefs are used for data types that may be machine-dependent, only the typedefs need change when the program is moved. One common situation is to use typedef names for various integer quantities, then make an appropriate set of choices of short, int, and long for each host machine. Types like size_t and ptrdiff_t from the standard library are examples.
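A brief sketch of the idea, using the two library types just mentioned plus a program-defined one. The typedef name Count and the functions span and total are hypothetical; the point is only that the choice of long is made in one place.

```c
#include <stddef.h>

typedef long Count;      /* hypothetical: the one line to change when porting */

/* span: number of elements from first up to (not including) last */
ptrdiff_t span(const int *first, const int *last)
{
    return last - first;
}

/* total: sum n elements, accumulating in the program's own Count type */
Count total(const int *v, size_t n)
{
    Count sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += v[i];
    return sum;
}
```

If a target machine needed a different width for sums, only the typedef of Count would have to change; the functions that use it would stay as they are.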

The second purpose of typedefs is to provide better documentation for a program: a type called Treeptr may be easier to understand than one declared only as a pointer to a complicated structure.
