[Dragon Book Answers] Chapter 3 Solutions (Incomplete)

Source: Internet  Editor: 程序博客网  Time: 2024/04/30 09:24

Exercise 3.3


Problem 3.3.1

Consult the language reference manuals to determine

  1. The sets of characters that form the input alphabet (excluding those that may only appear in character strings or comments),

  2. The lexical form of numerical constants, and

  3. The lexical form of identifiers.

for each of the following languages:

  1. C

  2. C++

  3. C#

  4. Fortran

  5. Java

  6. Lisp

  7. SQL

Answer:

This exercise is not that important. Looking up the reference manual for every language would take a lot of time for little benefit, so we skip it for now and will come back to it if there is time left.


Problem 3.3.2

Describe the languages denoted by the following regular expressions:

  1. a (a | b)* a

  2. ((ε|a)b*)*

  3. (a|b)*a(a|b)(a|b)

  4. a*ba*ba*ba*

  5. (aa|bb)*((ab|ba)(aa|bb)*(ab|ba)(aa|bb)*)*

Answer:

  1. Strings of a’s and b’s that begin and end with a (so the length is at least 2).

  2. All strings of a’s and b’s, including the empty string. This one is a little tricky: at first glance the a looks constrained, but since both (ε|a) and b* can be repeated freely, any string can be built.

  3. Strings of a’s and b’s whose third character from the end is a.

  4. Strings of a’s and b’s containing exactly three b’s.

  5. Strings of a’s and b’s with an even number of a’s and an even number of b’s.
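The descriptions above can be sanity-checked by brute force. The sketch below is only an illustration: it assumes Python's `re` module as a stand-in for the book's notation, writes ε as an empty alternative, takes the fifth expression with its Kleene stars, and uses hypothetical helper names (`strings`, `CASES`, `mismatches`).

```python
import re
from itertools import product

def strings(max_len, alphabet="ab"):
    """Yield every string over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

# Each entry pairs a regular expression with the English description
# of its language, written as a predicate.
CASES = [
    (r"a(a|b)*a", lambda s: len(s) >= 2 and s[0] == "a" and s[-1] == "a"),
    (r"((|a)b*)*", lambda s: True),
    (r"(a|b)*a(a|b)(a|b)", lambda s: len(s) >= 3 and s[-3] == "a"),
    (r"a*ba*ba*ba*", lambda s: s.count("b") == 3),
    (r"(aa|bb)*((ab|ba)(aa|bb)*(ab|ba)(aa|bb)*)*",
     lambda s: s.count("a") % 2 == 0 and s.count("b") % 2 == 0),
]

def mismatches(max_len=7):
    """Return (pattern, string) pairs where regex and description disagree."""
    bad = []
    for pattern, described in CASES:
        rx = re.compile(pattern)
        for s in strings(max_len):
            if (rx.fullmatch(s) is not None) != described(s):
                bad.append((pattern, s))
    return bad

if __name__ == "__main__":
    print(mismatches())
```

An empty result only shows agreement on short strings; it is evidence, not a proof.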


Problem 3.3.3

In a string of length n, how many of the following are there?

  1. Prefixes

  2. Suffixes

  3. Proper prefixes

  4. Substrings

  5. Subsequences

Answer:

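  1. n+1: one prefix for each length from 0 to n.

  2. n+1, by the same argument.

  3. n-1: every prefix except the empty string and the string itself.

  4. n(n+1)/2 + 1. There are n-k+1 substrings of length k for k = 1, 2, …, n, which sums to n(n+1)/2, plus the empty string. (This counts by position; if characters repeat, some of these substrings coincide.)

  5. 2^n. Each of the n characters is either kept or dropped, so this is a subset-counting problem rather than a permutation problem (again counting by position).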
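A small enumeration can confirm these counts. This is a sketch with a hypothetical helper `counts`; it collects distinct strings in sets, so the formulas are met exactly only when all characters of the input differ.

```python
from itertools import combinations

def counts(s):
    """Count distinct prefixes, suffixes, proper prefixes, substrings, subsequences."""
    n = len(s)
    prefixes = {s[:i] for i in range(n + 1)}
    suffixes = {s[i:] for i in range(n + 1)}
    # Proper prefixes exclude the empty string and s itself.
    proper_prefixes = prefixes - {"", s}
    substrings = {s[i:j] for i in range(n + 1) for j in range(i, n + 1)}
    # combinations() keeps the original order, so each tuple is a subsequence.
    subsequences = {"".join(c) for k in range(n + 1) for c in combinations(s, k)}
    return {
        "prefixes": len(prefixes),
        "suffixes": len(suffixes),
        "proper_prefixes": len(proper_prefixes),
        "substrings": len(substrings),
        "subsequences": len(subsequences),
    }

if __name__ == "__main__":
    print(counts("abcd"))  # n = 4, all characters distinct
    print(counts("aaa"))   # repeated characters give smaller counts
```

For "abcd" the counts agree with n+1, n+1, n-1, n(n+1)/2 + 1 and 2^n at n = 4; for "aaa" the substring and subsequence counts drop because many positions yield the same string.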


Problem 3.3.4

Most languages are case sensitive, so keywords can be written only one way, and the regular expressions describing their lexemes are very simple. However, some languages, like SQL, are case insensitive, so a keyword can be written either in lowercase or in uppercase, or in any mixture of cases. Thus, the SQL keyword SELECT can also be written select, Select, or sElEcT, for instance. Show how to write a regular expression for a keyword in a case-insensitive language. Illustrate the idea by writing the expression for “select” in SQL.

Answer:

select [Ss][Ee][Ll][Ee][Cc][Tt]
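The general idea is one character class per letter of the keyword, in order. A minimal sketch, assuming the keyword consists of letters only and using a hypothetical helper name `case_insensitive`:

```python
import re

def case_insensitive(keyword):
    """Build a regex with one [Xx] character class per letter of the keyword."""
    return "".join(f"[{c.upper()}{c.lower()}]" for c in keyword)

if __name__ == "__main__":
    pattern = case_insensitive("select")
    print(pattern)
    for word in ["SELECT", "select", "Select", "sElEcT", "selcet"]:
        print(word, re.fullmatch(pattern, word) is not None)
```

The misspelling "selcet" is rejected, which shows why the order of the classes matters.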


Problem 3.3.5

Write regular definitions for the following languages:

  1. All strings of lowercase letters that contain the five vowels in order.

  2. All strings of lowercase letters in which the letters are in ascending lexicographic order.

  3. Comments, consisting of a string surrounded by /* and */, without an intervening */, unless it is inside double-quotes (").

  4. All strings of digits with no repeated digits.
    Hint: Try this problem first with a few digits, such as {0, 1, 2}.

  5. All strings of digits with at most one repeated digit.

  6. All strings of a’s and b’s with an even number of a’s and an odd number of b’s.

  7. The set of Chess moves, in the informal notation, such as p-k4 or kbp*qn.

  8. All strings of a’s and b’s that do not contain the substring abb.

  9. All strings of a’s and b’s that do not contain the subsequence abb.

Answer:

1.

other [bcdfghjklmnpqrstvwxyz]
res (other)* a (other | a)* e (other | e)* i (other | i)* o (other | o)* u (other | u)*

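Note that this assumes no a may appear once an e has appeared (and likewise for each later vowel). Under that reading the definition matches strings whose vowels occur in order.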
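The definition can be tried on a few words. This sketch assumes Python's `re` module in place of the book's notation and uses hypothetical names (`OTHER`, `RES`, `vowels_in_order`).

```python
import re

# "other" from the regular definition: the lowercase consonants.
OTHER = "[bcdfghjklmnpqrstvwxyz]"

# "res": after each vowel, only consonants or that same vowel may repeat
# until the next vowel in the sequence a, e, i, o, u appears.
RES = (
    f"{OTHER}*a({OTHER}|a)*e({OTHER}|e)*i({OTHER}|i)*"
    f"o({OTHER}|o)*u({OTHER}|u)*"
)

def vowels_in_order(word):
    """True if the whole word matches the regular definition."""
    return re.fullmatch(RES, word) is not None

if __name__ == "__main__":
    for word in ["facetious", "abstemious", "aeiou", "sequoia", "aeioua"]:
        print(word, vowels_in_order(word))
```

The word "aeioua" is rejected because an a comes back after the u, which is exactly the restriction noted above.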

2.

a* b* c* … z*

This one is just a matter of listing the letters in order.

3.

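\/\*([^*"]*|".*"|\*+[^/])*\*\/

This one needs some explanation. [^*"]* is a string of any length made of any characters other than * and ". ".*" is a string enclosed in two double quotes, inside which any character except newline is allowed. \*+[^/] covers the case where * appears and is followed by a character that is not /. The key to this problem is the use of the negation operator ^ and the handling of closure. Between the comment delimiters sits one big closure whose body covers the cases above. Note, however, that this description cannot contain an unpaired double quote. (I copied this answer directly from 沉鱼姐姐; I think that to support a lone double quote, it would be enough to add an "or double quote" alternative inside the big closure.)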
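To see how this expression behaves, it can be run against a few inputs. The sketch below assumes Python's `re` module as a stand-in for Lex (in Python a double quote is an ordinary character, so it needs no escaping); `COMMENT` and `is_comment` are hypothetical names.

```python
import re

# The expression from the answer, matched against the whole input.
COMMENT = re.compile(r'\/\*([^*"]*|".*"|\*+[^/])*\*\/')

def is_comment(text):
    """True if the entire text is one comment according to the expression."""
    return COMMENT.fullmatch(text) is not None

if __name__ == "__main__":
    samples = [
        '/* hello */',
        '/* "*/" still inside */',  # */ inside double quotes does not end the comment
        '/* a */ b */',             # an intervening */ outside quotes is rejected
        '/* "unterminated */',      # unpaired double quote is rejected
        '/***/',                    # rejected by the expression as written
    ]
    for text in samples:
        print(repr(text), is_comment(text))
```

The last sample suggests one more limitation besides the unpaired quote: a comment whose body ends in extra stars is not accepted by the expression as written.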

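4, 5, 6, 7.

I looked at the answers for these. They rely on transition diagrams plus some state-diagram minimization techniques, which I cannot follow yet. I will fill these in after reading the later sections over the next few days.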

8.

b*(a+b?)*

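Finally, something easy to understand. The string may start with any number of b’s; once an a appears, what follows can only be runs of a’s, each optionally followed by a single b. That is what the trailing closure expresses.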
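This answer can be checked exhaustively on short strings. A sketch, assuming Python's `re` module and hypothetical names (`NO_ABB_SUBSTRING`, `substring_counterexamples`):

```python
import re
from itertools import product

NO_ABB_SUBSTRING = re.compile(r"b*(a+b?)*")

def substring_counterexamples(max_len):
    """Strings over {a, b} where the regex and the 'no substring abb' rule disagree."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            accepted = NO_ABB_SUBSTRING.fullmatch(s) is not None
            if accepted != ("abb" not in s):
                bad.append(s)
    return bad

if __name__ == "__main__":
    print(substring_counterexamples(8))
```

The length bound is kept small on purpose, since the nested repetition can backtrack heavily on longer rejected inputs.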

9.

b* | b*a+ | b*a+ba*

Strings that satisfy this condition come in only the three shapes listed above, so simply enumerating them gives the final result.
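The three-case enumeration can be compared with a direct subsequence test. This is a sketch under the same assumptions as before (Python's `re`, short strings only); `has_subsequence` and `subsequence_counterexamples` are hypothetical helpers.

```python
import re
from itertools import product

NO_ABB_SUBSEQUENCE = re.compile(r"b*|b*a+|b*a+ba*")

def has_subsequence(s, pattern):
    """True if the characters of pattern occur in s in order, not necessarily adjacent."""
    it = iter(s)
    # 'ch in it' advances the iterator, so each search resumes after the last hit.
    return all(ch in it for ch in pattern)

def subsequence_counterexamples(max_len):
    """Strings over {a, b} where the regex and the 'no subsequence abb' rule disagree."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            accepted = NO_ABB_SUBSEQUENCE.fullmatch(s) is not None
            if accepted != (not has_subsequence(s, "abb")):
                bad.append(s)
    return bad

if __name__ == "__main__":
    print(subsequence_counterexamples(8))
```

The intuition the check confirms is that after the first a, at most one more b may appear anywhere in the string.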


Problem 3.3.6

Write character classes for the following sets of characters:

  1. The first ten letters (up to “j”) in either upper or lower case.

  2. The lowercase consonants.

  3. The “digits” in a hexadecimal number (choose either upper or lower case for the “digits” above 9).

  4. The characters that can appear at the end of a legitimate English sentence (e.g., exclamation point).

Answer:

1.

[A-Ja-j]

2.

[bcdfghjklmnpqrstvwxyz]

3.

[0-9a-f]

4.

[.?!]
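Each class can be expanded into the set of characters it accepts and compared with the intended set. A sketch, assuming Python's `re` character-class syntax matches the book's for these simple cases; `members` is a hypothetical helper.

```python
import re
import string

def members(char_class, universe):
    """Return the characters of universe accepted by the character class."""
    return {c for c in universe if re.fullmatch(char_class, c)}

if __name__ == "__main__":
    print(sorted(members("[A-Ja-j]", string.ascii_letters)))
    print(sorted(members("[bcdfghjklmnpqrstvwxyz]", string.ascii_lowercase)))
    print(sorted(members("[0-9a-f]", string.hexdigits)))
    print(sorted(members("[.?!]", string.punctuation)))
```

Comparing against `set(string.ascii_lowercase) - set("aeiou")` is a quick way to catch a consonant that was left out of the second class.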


Problem 3.3.7

Note that these regular expressions give all of the following symbols (operator characters) a special meaning:

\ " . ^ $ [ ] * + ? { } | /

Their special meaning must be turned off if they are needed to represent themselves in a character string. We can do so by quoting the character within a string of length one or more; e.g., the regular expression "**" matches the string **. We can also get the literal meaning of an operator character by preceding it by a backslash. Thus, the regular expression \*\* also matches the string **. Write a regular expression that matches the string "\.

Answer:

\"\\

This one is simple: put a backslash in front of each symbol.
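The backslash form happens to carry over to Python's `re` module, which makes a quick check possible. This is only an illustration: Lex-style quoting with double quotes is not part of Python's regex syntax, and `PATTERN` and `TARGET` are hypothetical names.

```python
import re

# Regex text: backslash, double quote, backslash, backslash.
PATTERN = r'\"\\'

# Target string: exactly two characters, a double quote and a backslash.
TARGET = '"\\'

if __name__ == "__main__":
    print(len(TARGET))
    print(re.fullmatch(PATTERN, TARGET) is not None)
```

The raw string literal keeps the regex text identical to the answer, so the only escaping in play is that of the regular expression itself.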


Problem 3.3.8-12

To be written.


Exercise 3.4

