Exercise 48: Advanced User Input


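The text games so far required the user to type every command exactly, character for character, before it would run.

We would rather let users phrase things in different ways, for example with abbreviations or different capitalization.

So the best place to start is working out how to get the words the user typed and decide what each one is for.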

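Our Game Lexicon

I created the following lexicon of words for the game:

Direction words: north, south, east, west, down, up, left, right, back
Verbs: go, stop, kill, eat
Stop words: the, in, of, from, at, it
Nouns: door, bear, princess, cabinet
Numbers: any string made of the digits 0 through 9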

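Breaking Up a Sentence

Now that we have a lexicon of words, we need a way to break a sentence up so we can work out what it means. Our definition of a sentence is: words separated by spaces.

So we have:

stuff = raw_input('> ')
words = stuff.split()

That is enough for now. The split() call in stuff.split() cuts a string into a list of pieces at a separator; called with no argument, as here, it splits on whitespace.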
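A quick sketch of how split() behaves. It assumes a hard-coded, made-up string in place of raw_input so that it runs without a prompt, and it is written to run on Python 3 as well:

```python
# Hypothetical input; in the game this would come from raw_input('> ').
stuff = "  go   north\tthe door "

# With no argument, split() cuts on any run of whitespace and drops
# leading/trailing whitespace, so extra spaces or tabs do not matter.
words = stuff.split()

# Passing an explicit separator behaves differently: an empty string
# shows up wherever two separators are adjacent.
pieces = "go  north".split(" ")

print(words)
print(pieces)
```

For free-form user input the no-argument form is usually the safer choice, since a stray double space does not produce empty words.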

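Lexicon Tuples

Once we know how to break a sentence into words, all that is left is to go through the words one by one and work out what type each one is.

For this we will use a Python data structure called a "tuple".

A tuple is simply a list that cannot be modified. You create one much as you create a list, with commas between the members, but using parentheses () in place of square brackets.

first_word = ('direction', 'north')
second_word = ('verb', 'go')
sentence = [first_word, second_word]

This creates (TYPE, WORD) pairs that let you identify each word and act on it.

In short: take the user's input (a sentence), split it into words with split(), identify each word, and finally reassemble the results into a sentence.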
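The two properties of tuples used here, unpacking and immutability, can be seen in a small sketch. The `modified` flag is a hypothetical name introduced only for this illustration:

```python
first_word = ('direction', 'north')
second_word = ('verb', 'go')
sentence = [first_word, second_word]

# A tuple unpacks into its parts, which is how a (TYPE, WORD) pair is read.
for word_type, word in sentence:
    print("%s -> %s" % (word_type, word))

# Unlike a list, a tuple cannot be changed after it is created:
# item assignment raises TypeError.
try:
    first_word[1] = 'south'
except TypeError:
    modified = False
else:
    modified = True

print(modified)
```

The list that holds the tuples can still grow or shrink; only the individual pairs are fixed.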

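Scanning Input

Now it is time to write the scanner. Zed does not give the code here; you have to write it yourself.

The scanner takes the user's input string as its argument and returns a list of (TOKEN, WORD) tuples, which plays the role of a sentence. If a word is not in the predefined lexicon, the WORD should still be returned, but the TOKEN should be set to a special error token. That token tells the user where they went wrong.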
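To make the error-token rule concrete, here is a minimal sketch rather than the full scanner. The names `word_types` and `tag` are hypothetical, the vocabulary is a two-word stand-in, and the 'error' token name is assumed from the test file used later in this exercise:

```python
# Tiny stand-in vocabulary, just for illustration.
word_types = {'bear': 'noun', 'princess': 'noun'}


def tag(word):
    # dict.get returns the fallback when the word is unknown, so the
    # word itself survives and only its token says 'error'.
    return (word_types.get(word, 'error'), word)


tagged = [tag(w) for w in "bear IAS princess".split()]
print(tagged)
```

Using a fallback value instead of a direct `word_types[word]` lookup is one way to avoid a KeyError on words the game does not know.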

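Exceptions and Numbers

Here Zed helps with one piece: converting numbers. It is done with "exceptions". An exception is an error you get from running a function.

When a function runs into an error it "raises" an exception, and you then have to "handle" that exception.

For example: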
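A minimal sketch of such a failure, assuming the arbitrary non-numeric string "hell" as input. Typing int("hell") at the interactive prompt would end in a traceback; here the exception is caught so that its message can be printed:

```python
try:
    int("hell")  # "hell" is not a number, so int() raises ValueError
except ValueError as err:
    message = str(err)
    print("ValueError: " + message)
```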

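The ValueError here is an exception raised by int(), because the argument we gave it is not a number.

int() could in principle have returned a value to tell you it hit an error, but since it only returns integers, it has no spare value left to use as an error signal.

You handle an exception with the try and except keywords:

def convert_number(s):
    try:
        return int(s)
    except ValueError:
        return None

Put the code you want to try in the try block, and the code to run when something goes wrong in the except block. Here we try calling int() on something that might be a number; if that fails, we catch the error and return None.

In the scanner you write, use this function to test whether something is a number.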
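A short usage sketch of convert_number, with sample inputs chosen for illustration. One detail worth noticing is that the result should be compared against None explicitly, because the valid number 0 is falsy:

```python
def convert_number(s):
    try:
        return int(s)
    except ValueError:
        return None


for text in ["1234", "bear", "0"]:
    number = convert_number(text)
    # "if number:" would wrongly treat 0 as "not a number".
    if number is not None:
        print("%r is the number %d" % (text, number))
    else:
        print("%r is not a number" % text)
```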

====================================================================================================

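I strongly suspect the translation of the book has problems in this section: either something was left out or it follows a different edition. The missing part is added below.

For a correct translation see http://www.kancloud.cn/kancloud/learn-python-hard-way/49926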

====================================================================================================

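A Test First Challenge

Test first is a programming tactic where you write an automated test that pretends the code already works, and then write the code that makes the test actually pass. It is useful when you cannot yet see how to implement the code but can imagine how you would have to use it. For example, if you know you need to use a new class in another module but do not quite know how to implement that class, write the test first.

I am going to give you a test, and you will write the code that makes it work. To do this, follow this procedure:

1. Create one small part of the test I give you.
2. Make sure it runs and fails, so you know the test really is checking that a feature works.
3. Go to your source file lexicon.py and write the code that makes this test pass.
4. Repeat until you have implemented everything in the test.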

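When you get to step 3 it also helps to combine this with another way of writing code:

1. Write the skeleton of the function or class you need.
2. Write comments inside it describing how the function works.
3. Write the code that does what the comments describe.
4. Remove the comments.

This way of writing code is called "pseudo code". It works well when you do not know how to implement something but can describe it in your own words.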

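Combining the "test first" and "pseudo code" tactics gives a simple programming process:

1. Write a bit of test that fails.
2. Write the skeleton of the function, method, or class the test needs.
3. Fill the skeleton with comments, in your own words, explaining what it does.
4. Replace the comments with code until the test passes.
5. Repeat.

In this exercise you will practice this method by making the test I give you run against lexicon.py.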
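As an illustration of the skeleton-plus-comments stage for this exercise, a skeleton might look like the sketch below. The wording of the comments is my own guess at a reasonable plan, not something prescribed by the book:

```python
def scan(sentence):
    # Split the sentence into words on whitespace.
    # For each word:
    #     if it converts to a number, record ('number', value)
    #     else if it is in the lexicon, record (its type, word)
    #     otherwise record ('error', word)
    # Return the list of recorded pairs.
    pass
```

At this stage the function still returns None, so the tests are expected to fail; each comment is then replaced by real code, one test at a time.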

====================================================================================================

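What You Should Test

Here is the test file you should use, test/lexicon_tests.py.

Remember to run the test script with nosetests, not with the python command.

You can use nosetests -x, which stops at the first error or failure.

A video tutorial is available for reference at http://www.bilibili.com/video/av3964988/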

from nose.tools import *
from ex48 import lexicon


def test_directions():
    assert_equal(lexicon.scan("north"), [('direction', 'north')])
    result = lexicon.scan("north south east")
    assert_equal(result, [('direction', 'north'),
                          ('direction', 'south'),
                          ('direction', 'east')])


def test_verbs():
    assert_equal(lexicon.scan("go"), [('verb', 'go')])
    result = lexicon.scan("go kill eat")
    assert_equal(result, [('verb', 'go'),
                          ('verb', 'kill'),
                          ('verb', 'eat')])


def test_stops():
    assert_equal(lexicon.scan("the"), [('stop', 'the')])
    result = lexicon.scan("the in of")
    assert_equal(result, [('stop', 'the'),
                          ('stop', 'in'),
                          ('stop', 'of')])


def test_nouns():
    assert_equal(lexicon.scan("bear"), [('noun', 'bear')])
    result = lexicon.scan("bear princess")
    assert_equal(result, [('noun', 'bear'),
                          ('noun', 'princess')])


def test_numbers():
    assert_equal(lexicon.scan("1234"), [('number', 1234)])
    result = lexicon.scan("3 91234")
    assert_equal(result, [('number', 3),
                          ('number', 91234)])


def test_errors():
    assert_equal(lexicon.scan("ASDFADFASDF"), [('error', 'ASDFADFASDF')])
    result = lexicon.scan("bear IAS princess")
    assert_equal(result, [('noun', 'bear'),
                          ('error', 'IAS'),
                          ('noun', 'princess')])

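If we run nosetests right away, it reports that there is no scan function. The garbled characters in the output come from the comments and can be ignored.

So we need to add a scan function to lexicon.py:

def scan(sentence):
    pass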

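Run nosetests again and the error has changed. Following this new error, we write a little more.

We want the function to return the right thing.

def scan(sentence):
    return [('direction', 'north')]

Now try the tests again.

Next we carry on and make it really work, so that it returns a real result.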

# -*- coding: utf-8 -*-
lexicon = {
    'north': ('direction', 'north')
}


def scan(sentence):
    words = sentence.split()
    result = []
    for word in words:
        result.append(lexicon[word])
    return result

Running nosetests -x now reports a KeyError, because south is not in our lexicon yet. So we add it.

# -*- coding: utf-8 -*-
lexicon = {
    'north': ('direction', 'north'),
    'south': ('direction', 'south')
}


def scan(sentence):
    words = sentence.split()
    result = []
    for word in words:
        result.append(lexicon[word])
    return result

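Run the tests again.

In fact, the word itself (north and so on) is already stored in lexicon as the key, so the value on the right only needs to hold the type.

Change the code accordingly, and note that scan now builds a pair:

# -*- coding: utf-8 -*-
lexicon = {
    'north': 'direction',
    'south': 'direction'
}


def scan(sentence):
    words = sentence.split()
    result = []
    for word in words:
        pair = (lexicon[word], word)
        result.append(pair)
    return result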

Try again.

Now we add east and west as well:

# -*- coding: utf-8 -*-
lexicon = {
    'north': 'direction',
    'south': 'direction',
    'east': 'direction',
    'west': 'direction'
}


def scan(sentence):
    words = sentence.split()
    result = []
    for word in words:
        pair = (lexicon[word], word)
        result.append(pair)
    return result

Run the tests again.

This time the KeyError is for 'go', which shows that the first part has passed.

From here we can follow the same pattern to get through the whole test. This is certainly not the only way to do it, so feel free to experiment with other layouts.

Recognizing numbers may be a challenge; that is tried next.

# -*- coding: utf-8 -*-
lexicon = {
    'north': 'direction',
    'south': 'direction',
    'east': 'direction',
    'west': 'direction',
    'go': 'verb',
    'kill': 'verb',
    'eat': 'verb',
    'the': 'stop',
    'in': 'stop',
    'of': 'stop',
    'bear': 'noun',
    'princess': 'noun',
    91234: 'number',
    3: 'number',
    'IAS': 'error'
}


def scan(sentence):
    words = sentence.split()
    result = []
    for word in words:
        pair = (lexicon[word], word)
        result.append(pair)
    return result

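With everything the test file checks added to the lexicon, run the tests: the numbers are where it goes wrong. Going back to the book, it has already given a way to handle numbers.

There is one more question: perhaps the 91234 in the book was meant to be 1234? Let's change it and see, and change IAS later on as well.

But my skills are limited, and whatever I change it is still wrong, unless 1234 is turned into a string...

Here is the failing program, in case someone can take a look at it for me:

# -*- coding: utf-8 -*-
lexicon = {
    'north': 'direction',
    'south': 'direction',
    'east': 'direction',
    'west': 'direction',
    'go': 'verb',
    'kill': 'verb',
    'eat': 'verb',
    'the': 'stop',
    'in': 'stop',
    'of': 'stop',
    'bear': 'noun',
    'princess': 'noun',
    '1234': 'number',
    '3': 'number',
    'ASDFADFASDF': 'error'
}


def scan(sentence):
    words = sentence.split()
    result = []
    for word in words:  # word is defined as soon as the loop starts
        if word.isdigit():
            number = int(word)
            word = number
        else:
            pass
        print lexicon[word]
        print word
        pair = (lexicon[word], word)
        result.append(pair)
    return result

It still reports an error at 1234, so I will not paste the result.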
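The failure at 1234 has a concrete cause: after word = int(word), the lookup is lexicon[1234] with an integer key, while the dictionary only contains the string key '1234', so a KeyError is raised. More generally, numbers and unknown words cannot all be listed in a dictionary in advance. One possible way out, offered as a sketch and not as the book's reference solution, is to try convert_number first and fall back to 'error' for anything the lexicon does not contain:

```python
lexicon = {
    'north': 'direction', 'south': 'direction',
    'east': 'direction', 'west': 'direction',
    'go': 'verb', 'kill': 'verb', 'eat': 'verb',
    'the': 'stop', 'in': 'stop', 'of': 'stop',
    'bear': 'noun', 'princess': 'noun',
}


def convert_number(s):
    try:
        return int(s)
    except ValueError:
        return None


def scan(sentence):
    result = []
    for word in sentence.split():
        number = convert_number(word)
        if number is not None:
            # Numbers never touch the dictionary, so no key is needed.
            result.append(('number', number))
        else:
            # Unknown words keep their text and get the 'error' token.
            result.append((lexicon.get(word, 'error'), word))
    return result


print(scan("go north 1234 IAS"))
```

With this shape the lexicon only has to list real vocabulary; '1234', '3', 'IAS' and 'ASDFADFASDF' no longer need entries of their own.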

0 0