前言的闲话以及第一章的入门(四).2

来源:互联网 编辑:程序博客网 时间:2024/05/16 17:32

1.5.2 Character Counting(字符计数)

#include <stdio.h>

/* count characters in input; 1st version */

main()

{

    long nc;

    nc = 0;

    while (getchar() != EOF)

        ++nc;

    printf("%ld\n", nc);

}

The statement

    ++nc;

presents a new operator, ++, which means increment by one. You could instead write nc = nc + 1 but ++nc is more concise and often more efficient. There is a corresponding operator -- to decrement by 1. The operators ++ and -- can be either prefix operators (++nc) or postfix operators (nc++); these two forms have different values in expressions, but ++nc and nc++ both increment nc.

引入了一个新的运算符++,其功能是执行加1 操作。可以用语句nc = nc + 1 代替它,但语句++nc 更精炼一些,且通常效率也更高。与该运算符相应的是自减运算符--。++与--这两个运算符既可以作为前缀运算符(如++nc),也可以作为后缀运算符(如nc++)。这两种形式在表达式中具有不同的值,但++nc与nc++都使nc的值加1。

The character counting program accumulates its count in a long variable instead of an int. long integers are at least 32 bits. Although on some machines, int and long are the same size, on others an int is 16 bits, with a maximum value of 32767, and it would take relatively little input to overflow an int counter. The conversion specification %ld tells printf that the corresponding argument is a long integer.

该字符计数程序使用long类型的变量存放计数值,而没有使用int类型的变量。long整型数(长整型)至少要占用32位存储单元。在某些机器上int与long类型的长度相同,但在一些机器上,int类型的值可能只有16 位存储单元的长度(最大值为32767),这样,相当小的输入都可能使int类型的计数变量溢出。转换说明%ld告诉printf函数其对应的参数是long整型。

 

It may be possible to cope with even bigger numbers by using a double (double precision float). We will also use a for statement instead of a while, to illustrate another way to write the loop.

使用double(双精度浮点数)类型可以处理更大的数字。我们在这里不使用while 循环语句,而用for循环语句来展示编写此循环的另一种方法:

 

#include <stdio.h>

/* count characters in input; 2nd version */

main()

{

    double nc;

    for (nc = 0; getchar() != EOF; ++nc)

        ;

    printf("%.0f\n", nc);

}

printf uses %f for both float and double; %.0f suppresses the printing of the decimal point and the fraction part, which is zero.

对于float与double类型,printf函数都使用%f进行说明。%.0f强制不打印小数点和小数部分,因此小数部分的位数为0。


The body of this for loop is empty, because all the work is done in the test and increment parts. But the grammatical rules of C require that a for statement have a body. The isolated semicolon, called a null statement, is there to satisfy that requirement. We put it on a separate line to make it visible.

在该程序段中,for 循环语句的循环体是空的,这是因为所有工作都在测试(条件)部分与增加步长部分完成了。但C语言的语法规则要求for循环语句必须有一个循环体,因此用单独的分号代替。单独的分号称为空语句,它正好能满足for 语句的这一要求。把它单独放在一行是为了更加醒目。

 

Before we leave the character counting program, observe that if the input contains no characters, the while or for test fails on the very first call to getchar, and the program produces zero, the right answer. This is important. One of the nice things about while and for is that they test at the top of the loop, before proceeding with the body. If there is nothing to do, nothing is done, even if that means never going through the loop body. Programs should act intelligently when given zero-length input. The while and for statements help ensure that programs do reasonable things with boundary conditions.

在结束讨论字符计数程序之前,我们考虑以下情况:如果输入中不包含字符,那么,在第一次调用getchar 函数的时候,while 语句或for 语句中的条件测试从一开始就为假,程序的执行结果将为0,这也是正确的结果。这一点很重要。while 语句与for 语句的优点之一就是在执行循环体之前就对条件进行测试,如果条件不满足,则不执行循环体,这就可能出现循环体一次都不执行的情况。在出现0 长度的输入时,程序的处理应该灵活一些,在出现边界条件时,while语句与for语句有助于确保程序执行合理的操作。

 

1.5.3 Line Counting(行计数)

The next program counts input lines. As we mentioned above, the standard library ensures that an input text stream appears as a sequence of lines, each terminated by a newline. Hence, counting lines is just counting newlines:

接下来的这个程序用于统计输入中的行数。我们在上面提到过,标准库保证输入文本流以行序列的形式出现,每一行均以换行符结束。因此,统计行数等价于统计换行符的个数。


#include <stdio.h>

/* count lines in input */

main()

{

    int c, nl;

    nl = 0;

    while ((c = getchar()) != EOF)

        if (c == '\n')

            ++nl;

    printf("%d\n", nl);

}

The body of the while now consists of an if, which in turn controls the increment ++nl. The if statement tests the parenthesized condition, and if the condition is true, executes the statement (or group of statements in braces) that follows. We have again indented to show what is controlled by what.

在该程序中,while循环语句的循环体是一个if语句,它控制自增语句++nl。if语句先测试圆括号中的条件,如果该条件为真,则执行其后的语句(或括在花括号中的一组语句)。这里再次用缩进方式表明语句之间的控制关系。

 

The double equals sign == is the C notation for "is equal to" (like Pascal's single = or Fortran's .EQ.). This symbol is used to distinguish the equality test from the single = that C uses for assignment. A word of caution: newcomers to C occasionally write = when they mean ==.

双等于号==是C语言中表示“等于”关系的运算符(类似于Pascal中的单等于号=及Fortran中的.EQ.)。由于C 语言将单等于号=作为赋值运算符,因此使用双等于号==表示相等的逻辑关系,以示区分。这里提醒注意,在表示“等于”逻辑关系的时候(应该用==),C语言初学者有时会错误地写成单等于号=。


A character written between single quotes represents an integer value equal to the numerical value of the character in the machine's character set. This is called a character constant, although it is just another way to write a small integer. So, for example, 'A' is a character constant; in the ASCII character set its value is 65, the internal representation of the character A. Of course, 'A' is to be preferred over 65: its meaning is obvious, and it is independent of a particular character set.

单引号中的字符表示一个整型值,该值等于此字符在机器字符集中对应的数值,我们称之为字符常量。但是,它只不过是小的整型数的另一种写法而已。例如,'A'是一个字符常量;在ASCII字符集中其值为65(即字符A的内部表示值为65)。当然,用'A'要比用65 好,因为'A'的意义更清楚,且与特定的字符集无关。

 

The escape sequences used in string constants are also legal in character constants, so '\n' stands for the value of the newline character, which is 10 in ASCII. You should note carefully that '\n' is a single character, and in expressions is just an integer; on the other hand, "\n" is a string constant that happens to contain only one character.

字符串常量中使用的转义字符序列也是合法的字符常量,比如,'\n'代表换行符的值,在ASCII字符集中其值为10。我们应当注意到,'\n'是单个字符,在表达式中它不过是一个整型数而已;而"\n"是一个仅包含一个字符的字符串常量。


1.5.4 Word Counting(单词计数)


The fourth in our series of useful programs counts lines, words, and characters, with the loose definition that a word is any sequence of characters that does not contain a blank, tab or newline. This is a bare-bones version of the UNIX program wc.

我们将介绍的第4 个实用程序用于统计行数、单词数与字符数。这里对单词的定义比较宽松,它是任何其中不包含空格、制表符或换行符的字符序列。下面这段程序是UNIX 系统中wc程序的骨干部分:

 

#include <stdio.h>

#define IN 1 /* inside a word */

#define OUT 0 /* outside a word */

/* count lines, words, and characters in input */

main()

{

    int c, nl, nw, nc, state;

    state = OUT;

    nl = nw = nc = 0;

    while ((c = getchar()) != EOF) {

        ++nc;

        if (c == '\n')

            ++nl;

        if (c == ' ' || c == '\n' || c == '\t')

            state = OUT;

        else if (state == OUT) {

            state = IN;

            ++nw;

        }

    }

    printf("%d %d %d\n", nl, nw, nc);

}

Every time the program encounters the first character of a word, it counts one more word. The variable state records whether the program is currently in a word or not; initially it is "not in a word", which is assigned the value OUT. We prefer the symbolic constants IN and OUT to the literal values 1 and 0 because they make the program more readable. In a program as tiny as this, it makes little difference, but in larger programs, the increase in clarity is well worth the modest extra effort to write it this way from the beginning. You'll also find that it's easier to make extensive changes in programs where magic numbers appear only as symbolic constants.

程序执行时,每当遇到单词的第一个字符,它就作为一个新单词加以统计。state 变量记录程序当前是否正位于一个单词之中,它的初值是“不在单词中”,即初值被赋为OUT。我们在这里使用了符号常量IN与OUT,而没有使用其对应的数值1 与0,这样程序更易读。在较小的程序中,这种做法也许看不出有什么优势,但在较大的程序中,如果从一开始就这样做,因此而增加的一点工作量与提高程序可读性带来的好处相比是值得的。读者也会发现,如果程序中的幻数都以符号常量的形式出现,对程序进行大量修改就会相对容易得多。


The line

     nl = nw = nc = 0;

sets all three variables to zero. This is not a special case, but a consequence of the fact that an assignment is an expression with a value and assignments associate from right to left. It's as if we had written

将把其中的3个变量nl、nw与nc都设置为0。这种用法很常见,但要注意这样一个事实:在兼有值与赋值两种功能的表达式中,赋值结合次序是由右至左。所以上面这条语句等同于


     nl = (nw = (nc = 0));

The operator || means OR, so the line

     if (c == ' ' || c == '\n' || c == '\t')

says "if c is a blank or c is a newline or c is a tab". (Recall that the escape sequence \t is a visible representation of the tab character.) There is a corresponding operator && for AND; its precedence is just higher than ||. Expressions connected by && or || are evaluated left to right, and it is guaranteed that evaluation will stop as soon as the truth or falsehood is known. If c is a blank, there is no need to test whether it is a newline or tab, so these tests are not made. This isn't particularly important here, but is significant in more complicated situations, as we will soon see.

的意义是“如果c 是空格,或c 是换行符,或c 是制表符”(前面讲过,转义字符序列\t 是制表符的可见表示形式)。相应地,运算符&&代表AND(逻辑与),它仅比||高一个优先级。由&&或||连接的表达式由左至右求值,并保证在求值过程中只要能够判断最终的结果为真或假,求值就立即终止。如果c 是空格,则没有必要再测试它是否为换行符或制表符,这样就不必执行后面两个测试。在这里,这一点并不特别重要,但在某些更复杂的情况下这样做就有必要了,不久我们将会看到这种例子。

 

The example also shows an else, which specifies an alternative action if the condition part of an if statement is false. The general form is

    if (expression)

        statement1

    else

        statement2

One and only one of the two statements associated with an if-else is performed. If the expression is true, statement1 is executed; if not, statement2 is executed. Each statement can be a single statement or several in braces. In the word count program, the one after the else is an if that controls two statements in braces.

其中,if-else 中的两条语句有且仅有一条语句被执行。如果表达式的值为真,则执行语句1,否则执行语句2。这两条语句都既可以是单条语句,也可以是括在花括号内的语句序列。在单词计数程序中,else 之后的语句仍是一个if 语句,该if 语句控制了包含在花括号内的两条语句。

 

interleaved 交换使用

shrinks 收缩

compact 紧凑的

impenetrable 不可理解的

curb 限制, 克制, 抑制

indented v. 缩进;切割成锯齿状(indent的过去分词)

escape sequence 转义序列

precedence 优先级

