Reading the web.py source (1): a first look at the whole


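I had some free time recently and decided to read the web.py source. Reading the current code directly turned out to be hard going, so I cloned the repository from GitHub and checked out the very first version, from 2006. There are 700-odd commits, and I plan to go through them one by one and write down what I learn.

The earliest web.py is a single file, web.py, a little over 1000 lines long, 300 of which are the template engine. I can only admire Aaron Swartz; he truly was a world-famous hacker.

After reading it, I stripped out modules such as the database layer and kept only the WSGI request flow, which I will post piece by piece below.

First, the main block in web.py: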

    if __name__ == "__main__":
        urls = ('/web.py', 'source')
        class source:
            def GET(self):
                header('Content-Type', 'text/python')
                print open(__file__).read()
        run(urls)

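It defines a sequence that maps URLs to handlers, then starts the server and waits for requests.

Here is a rough breakdown of the structure.

That is roughly the flow.

Along the way I came across a few interesting pieces of code.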
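To make the URL-to-handler mapping concrete, here is a minimal Python 3 sketch of how a flat `urls` tuple could be dispatched. It is not web.py's actual implementation: the `dispatch` function, the `handlers` dictionary, and the handler returning a string (instead of printing) are all assumptions made for illustration.

```python
import re

class source:
    """Hypothetical handler; returns a string instead of printing."""
    def GET(self):
        return "source of web.py"

# Flat sequence of pattern / handler-name pairs, as in the snippet above.
urls = ('/web.py', 'source')

# Hypothetical lookup table from handler name to handler class.
handlers = {'source': source}

def dispatch(urls, path, method, handlers):
    """Find the first pattern that matches `path` and call the handler method."""
    # Walk the flat tuple two items at a time: pattern, handler name.
    for pattern, name in zip(urls[::2], urls[1::2]):
        match = re.fullmatch(pattern, path)
        if match:
            handler = handlers[name]()  # instantiate the handler class
            # Captured regex groups, if any, become method arguments.
            return getattr(handler, method)(*match.groups())
    return "not found"

print(dispatch(urls, '/web.py', 'GET', handlers))
print(dispatch(urls, '/missing', 'GET', handlers))
```

Keeping the mapping as plain data means routing can be changed without touching the handler classes.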

    class memoize:
        def __init__(self, func):
            self.func = func
            self.cache = {}
        def __call__(self, *args, **kwargs):
            key = (args, tuple(kwargs.items()))
            if key not in self.cache:
                self.cache[key] = self.func(*args, **kwargs)
            return self.cache[key]

This class wraps a function and caches its return values, keyed by the arguments it was called with.

    import re
    re_compile = memoize(re.compile)
    r = re_compile('\d+')
    r = re_compile('\d+')
    r = re_compile('\w+')

With a debugging print added to the cache-miss branch of memoize, the output is

    ('not cache', (('\\d+',), ()))
    ('not cache', (('\\w+',), ()))

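Clearly the compiled regex is cached: the second call with '\d+' never reaches the miss branch. This improves efficiency, and it also makes the code feel more elegant.

One thing puzzled me at first.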
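The same experiment can be reproduced in Python 3 with a small variant of the class. This is a sketch, not the original code: the `misses` list is a hypothetical addition that replaces the debugging print, and the keyword arguments are sorted so that their order does not affect the cache key.

```python
import re

class memoize:
    """Cache a function's return values, keyed by its arguments."""
    def __init__(self, func):
        self.func = func
        self.cache = {}
        self.misses = []  # hypothetical addition: records each cache miss

    def __call__(self, *args, **kwargs):
        # Sorting makes f(a=1, b=2) and f(b=2, a=1) share one cache entry.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in self.cache:
            self.misses.append(key)
            self.cache[key] = self.func(*args, **kwargs)
        return self.cache[key]

re_compile = memoize(re.compile)
first = re_compile(r'\d+')
second = re_compile(r'\d+')  # served from the cache
third = re_compile(r'\w+')

print(first is second)         # True
print(len(re_compile.misses))  # 2
```

Only hashable arguments can be used this way, since the key is stored in a dictionary.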

    class source:
        def GET(self):
            print input()
            header('Content-Type', 'text/python')
            print open(__file__).read()

Why print? In the web.py we normally use, handlers return a value, and nothing is written to the console.

Then I found the following code.

    class _outputter:
        def write(self, x):
            if hasattr(ctx, 'output'): output(x)
            else: _oldstdout.write(x)
    if not '_oldstdout' in globals():
        _oldstdout = sys.stdout
        sys.stdout = _outputter()
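The idea of that snippet, replacing `sys.stdout` with an object that diverts writes into the response body, can be sketched in Python 3 as follows. The names `Outputter`, `run_handler`, and `hello` are hypothetical, and a plain list stands in for `ctx.output`; this illustrates the technique and is not web.py's code.

```python
import sys

class Outputter:
    """File-like object that diverts writes into a buffer while one is active."""
    def __init__(self, fallback):
        self.fallback = fallback  # the real stdout
        self.buffer = None        # list of chunks while a request is active

    def write(self, text):
        if self.buffer is not None:
            self.buffer.append(text)   # request in progress: capture output
        else:
            self.fallback.write(text)  # no request: behave like normal stdout
        return len(text)

    def flush(self):
        self.fallback.flush()

def run_handler(handler):
    """Call `handler` and return everything it printed as one string."""
    out = Outputter(sys.stdout)
    old_stdout, sys.stdout = sys.stdout, out
    out.buffer = []
    try:
        handler()
        return "".join(out.buffer)
    finally:
        sys.stdout = old_stdout  # always restore the real stdout

def hello():
    print("Hello,")
    print("world")

body = run_handler(hello)
```

Here `body` holds the text the handler printed, and nothing reaches the console while the handler runs.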

So what is context (ctx)?

    class threadeddict:
        def __init__(self, d): self.__dict__['_threadeddict__d'] = d
        def __getattr__(self, a): return getattr(self.__d[currentThread()], a)
        def __getitem__(self, i): return self.__d[currentThread()][i]
        def __setattr__(self, a, v): return setattr(self.__d[currentThread()], a, v)
        def __setitem__(self, i, v): self.__d[currentThread()][i] = v

    _context = {currentThread():Storage()}
    ctx = context = threadeddict(_context)

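As we can see, threadeddict is an object that stores data indexed by the current thread. It is the web.ctx object we use every day, so that is how it is implemented!

All the information about the current request and its response lives in it, including the response body, context.output.

But why implement output with print? Wouldn't return work just as well, and be more intuitive?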
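The per-thread behaviour can be checked with a small Python 3 sketch. `ThreadedDict` below is a simplified stand-in written for this note, not the original class: it creates each thread's store lazily, and it uses plain dictionaries where web.py uses `Storage`.

```python
import threading

class ThreadedDict:
    """Proxy attribute access to a separate store for each thread."""
    def __init__(self):
        # Write through __dict__ to avoid triggering our own __setattr__.
        self.__dict__['_stores'] = {}

    def _store(self):
        # Create the calling thread's store on first use.
        return self._stores.setdefault(threading.current_thread(), {})

    def __getattr__(self, name):
        try:
            return self._store()[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._store()[name] = value

ctx = ThreadedDict()
results = {}

def worker(label):
    ctx.output = "body for " + label  # each thread writes its own value
    results[label] = ctx.output

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never set ctx.output, so it sees no such attribute.
print(hasattr(ctx, 'output'))
print(results)
```

The `hasattr(ctx, 'output')` check is the same test `_outputter.write` uses to decide whether a request is in progress. In current Python, the standard library's `threading.local` provides this kind of per-thread isolation directly.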

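My guess is that it was meant to allow output in several pieces, for things like large files: return can hand back a value only once, whereas print can be called repeatedly. I first assumed that yield did not exist yet in the Python 2.3 era, but generators were added in Python 2.2 and enabled by default in 2.3, so this looks more like a design choice than a limitation.

I am looking forward to what the later commits bring!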
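For comparison, a handler that emits its body in several pieces can also be written as a generator. The sketch below is Python 3 and purely illustrative; `handler_with_yield` and `collect` are hypothetical names and not part of web.py.

```python
def handler_with_yield():
    """Hypothetical handler that produces its body in several chunks."""
    yield "first chunk\n"
    yield "second chunk\n"

def collect(handler):
    """Join the chunks a generator-style handler yields into one body."""
    return "".join(handler())

body = collect(handler_with_yield)
print(body, end="")
```

A server could also send each chunk as soon as it is yielded instead of joining them, which is the property that matters for large responses.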


0 0