Python WSGI: a detailed walk-through of how a browser request is processed

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 12:03

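I have spent the last two days reading how Python's WSGI server parses a request and produces a response, so I am recording it here as the first post in my Django series. In a few days I will write another post on the WSGI initialization process.

Assumptions:

(1) This post follows exactly what the server does to handle the request after IP:9000 is entered in the browser.

(2) The WSGI web server's socket is polled with select in asynchronous mode. select is given a timeout interval, and when it times out a while loop calls select again.

(3) Class hierarchy: the actual class of the simple server is WSGIServer, which inherits from HTTPServer; HTTPServer inherits from TCPServer, and TCPServer inherits from BaseServer. The server class works directly with a request-handler class, whose chain from the top is WSGIRequestHandler -> BaseHTTPRequestHandler -> StreamRequestHandler -> BaseRequestHandler.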
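The two inheritance chains can be checked directly. The snippet below is a minimal sketch, assuming Python 3's standard-library wsgiref.simple_server (the excerpts in this post come from Python 2 sources); the helper name chain is made up for illustration.

```python
from wsgiref.simple_server import WSGIServer, WSGIRequestHandler


def chain(cls):
    """Return the class names along cls's method resolution order (hypothetical helper)."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


if __name__ == "__main__":
    # Print both chains from the most derived class down to the base class.
    print(" -> ".join(chain(WSGIServer)))
    print(" -> ".join(chain(WSGIRequestHandler)))
```

Django's development server subclasses these wsgiref classes, so its own chain has one extra level on top.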

Processing flow:

1. When a client sends a request, select is triggered and returns the corresponding socket, as shown below:

                                           

    r, w, e = _eintr_retry(select.select, [self], [], [],
                           poll_interval)
    if self in r:
        print("s_handle_request_noblock")
        self._handle_request_noblock()
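The select-with-timeout loop described above can be reproduced outside the server. This is a minimal sketch in Python 3, not the socketserver implementation; wait_for_client, demo, and the max_polls parameter are hypothetical names introduced only for this example.

```python
import select
import socket


def wait_for_client(listener, poll_interval=0.5, max_polls=20):
    """Poll listener with select(); on timeout, loop and call select() again."""
    for polls in range(1, max_polls + 1):
        readable, _, _ = select.select([listener], [], [], poll_interval)
        if listener in readable:
            # The listening socket is readable: a connection is waiting.
            conn, addr = listener.accept()
            return conn, addr, polls
    return None, None, max_polls


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    try:
        # No client yet, so every select() call times out.
        idle = wait_for_client(listener, poll_interval=0.01, max_polls=3)
        client = socket.create_connection(listener.getsockname())
        conn, addr, polls = wait_for_client(listener)
        accepted = conn is not None
        conn.close()
        client.close()
    finally:
        listener.close()
    return idle, accepted
```

The real server loops until shutdown is requested instead of counting polls; the counter here only keeps the example finite.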


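2. Execution then enters _handle_request_noblock():

1) Inside the function, get_request() is called (it calls accept internally and returns the client connection and the client address):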

                                          request, client_address = self.get_request()

2) It then calls process_request to handle the request:

                                          self.process_request(request, client_address)

This function is defined in the ThreadingMixIn class:

    def process_request(self, request, client_address):
        """Start a new thread to process the request."""
        print("ThreadingMixIn")
        t = threading.Thread(target = self.process_request_thread,
                             args = (request, client_address))
        t.daemon = self.daemon_threads
        t.start()

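As the code shows, process_request starts a new thread to handle the request, and the thread function is process_request_thread.

3) After start(), the new thread runs the thread function while the main thread returns to select and keeps listening. The body of process_request_thread is: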

    try:
        print " process_request_thread"
        self.finish_request(request, client_address)
        self.shutdown_request(request)
    except:
        self.handle_error(request, client_address)
        self.shutdown_request(request)
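To see the hand-off to a worker thread, here is a minimal sketch using Python 3's socketserver.ThreadingTCPServer, which mixes ThreadingMixIn into TCPServer. RecordingHandler, handled_in, done, and demo are hypothetical names used only for this illustration.

```python
import socket
import socketserver
import threading

handled_in = []            # names of the threads that ran handle()
done = threading.Event()   # set once the handler has run


class RecordingHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Runs inside the thread started by ThreadingMixIn.process_request().
        handled_in.append(threading.current_thread().name)
        done.set()


def demo():
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), RecordingHandler)
    try:
        client = socket.create_connection(server.server_address)
        server.handle_request()   # accept, then hand the request to a new thread
        done.wait(timeout=5)      # wait for the worker thread to run handle()
        client.close()
    finally:
        server.server_close()
    return handled_in[-1], threading.current_thread().name
```

The two returned names differ, which matches the description above: the thread that called handle_request() is free to go back to select while the worker finishes the request.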
The finish_request function handles the actual request:
    def finish_request(self, request, client_address):
        """Finish one request by instantiating RequestHandlerClass."""
        print "finish_request",request
        print self.RequestHandlerClass
        self.RequestHandlerClass(request, client_address, self)

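This function handles the actual request by instantiating RequestHandlerClass. The class is actually WSGIRequestHandler, which was passed in when WSGIServer was initialized and stored in WSGIServer.RequestHandlerClass.

3. Now we enter the RequestHandlerClass processing flow

1) From the inheritance chain of RequestHandlerClass, instantiation runs the base-class initializer BaseRequestHandler.__init__(self, request, client_address, server), shown below: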

    def __init__(self, request, client_address, server):
        self.request = request
        self.client_address = client_address
        self.server = server
        print "BaseRequestHandler"
        self.setup()
        try:
            self.handle()
        finally:
            self.finish()
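The initializer performs some necessary assignments (recording the server and so on) and then calls self.setup(), followed by self.handle() and finally self.finish().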
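The fact that simply instantiating the handler class drives setup(), handle(), and finish() can be observed with a small experiment. This is a minimal sketch against Python 3's socketserver; TracingHandler, order, and demo are hypothetical names for illustration.

```python
import socket
import socketserver

order = []  # records the sequence in which the hooks run


class TracingHandler(socketserver.BaseRequestHandler):
    def setup(self):
        order.append("setup")

    def handle(self):
        order.append("handle")

    def finish(self):
        order.append("finish")


def demo():
    # The handler class is handed to the server at construction time and
    # stored as server.RequestHandlerClass.
    with socketserver.TCPServer(("127.0.0.1", 0), TracingHandler) as server:
        with socket.create_connection(server.server_address):
            server.handle_request()  # accept, then instantiate TracingHandler
        return server.RequestHandlerClass is TracingHandler
```

Nothing in the example calls the three hooks explicitly; they run as a side effect of finish_request() creating the handler object.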

2) Execution enters self.setup(), shown below:

    def setup(self):
        self.connection = self.request
        if self.timeout is not None:
            self.connection.settimeout(self.timeout)
        if self.disable_nagle_algorithm:
            self.connection.setsockopt(socket.IPPROTO_TCP,
                                       socket.TCP_NODELAY, True)
        self.rfile = self.connection.makefile('rb', self.rbufsize)
        self.wfile = self.connection.makefile('wb', self.wbufsize)
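self.rfile and self.wfile are file objects created from the connection socket with makefile(): rfile is used to read the request and wfile to write the response. (I do not yet fully understand their details and workflow, so I am marking this for further study.)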
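What makefile() gives you can be tried on a connected socket pair without any server. This is a minimal sketch in Python 3; the Host value and the function name demo are placeholders invented for the example.

```python
import socket


def demo():
    # A connected pair of sockets stands in for the server and browser ends.
    server_side, client_side = socket.socketpair()
    rfile = server_side.makefile("rb", -1)  # buffered reader, like self.rfile
    wfile = server_side.makefile("wb", 0)   # unbuffered writer, like self.wfile
    try:
        client_side.sendall(b"GET / HTTP/1.1\r\nHost: example\r\n\r\n")
        request_line = rfile.readline(65537)   # same call that handle() makes
        wfile.write(b"HTTP/1.1 200 OK\r\n\r\n")
        reply = client_side.recv(1024)
    finally:
        rfile.close()
        wfile.close()
        server_side.close()
        client_side.close()
    return request_line, reply
```

Reading through rfile lets the handler use line-oriented calls such as readline() instead of raw recv() on the socket.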
4. After self.setup() finishes, execution enters self.handle(), where the data is actually processed:

    def handle(self):
        """Copy of WSGIRequestHandler, but with different ServerHandler"""
        self.raw_requestline = self.rfile.readline(65537)
        if len(self.raw_requestline) > 65536:
            self.requestline = ''
            self.request_version = ''
            self.command = ''
            self.send_error(414)
            return
        if not self.parse_request():  # An error code has been sent, just exit
            return
        handler = ServerHandler(
            self.rfile, self.wfile, self.get_stderr(), self.get_environ()
        )
        handler.request_handler = self      # backpointer for logging
        handler.run(self.server.get_app())
self.raw_requestline holds the line read from the buffer; printing it shows that its content is "GET / HTTP/1.1".

1) Execution enters self.parse_request():

    def parse_request(self):
        """Parse a request (internal).

        The request should be stored in self.raw_requestline; the results
        are in self.command, self.path, self.request_version and
        self.headers.

        Return True for success, False for failure; on failure, an
        error is sent back.
        """
        self.command = None  # set in case of error on the first line
        self.request_version = version = self.default_request_version
        self.close_connection = 1
        requestline = self.raw_requestline
        requestline = requestline.rstrip('\r\n')
        self.requestline = requestline
        words = requestline.split()
        if len(words) == 3:
            command, path, version = words
            if version[:5] != 'HTTP/':
                self.send_error(400, "Bad request version (%r)" % version)
                return False
            try:
                base_version_number = version.split('/', 1)[1]
                version_number = base_version_number.split(".")
                # RFC 2145 section 3.1 says there can be only one "." and
                #   - major and minor numbers MUST be treated as
                #      separate integers;
                #   - HTTP/2.4 is a lower version than HTTP/2.13, which in
                #      turn is lower than HTTP/12.3;
                #   - Leading zeros MUST be ignored by recipients.
                if len(version_number) != 2:
                    raise ValueError
                version_number = int(version_number[0]), int(version_number[1])
            except (ValueError, IndexError):
                self.send_error(400, "Bad request version (%r)" % version)
                return False
            if version_number >= (1, 1) and self.protocol_version >= "HTTP/1.1":
                self.close_connection = 0
            if version_number >= (2, 0):
                self.send_error(505,
                          "Invalid HTTP Version (%s)" % base_version_number)
                return False
        elif len(words) == 2:
            command, path = words
            self.close_connection = 1
            if command != 'GET':
                self.send_error(400,
                                "Bad HTTP/0.9 request type (%r)" % command)
                return False
        elif not words:
            return False
        else:
            self.send_error(400, "Bad request syntax (%r)" % requestline)
            return False
        self.command, self.path, self.request_version = command, path, version

        # Examine the headers and look for a Connection directive
        self.headers = self.MessageClass(self.rfile, 0)

        conntype = self.headers.get('Connection', "")
        if conntype.lower() == 'close':
            self.close_connection = 1
        elif (conntype.lower() == 'keep-alive' and
              self.protocol_version >= "HTTP/1.1"):
            self.close_connection = 0
        return True
The function first processes the request line that was read, then calls MessageClass(self.rfile, 0) to read the request headers. Printing them gives:

Accept: */*
Accept-Language: zh-Hans-CN,zh-Hans;q=0.8,en-US;q=0.5,en;q=0.3
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; InfoPath.3; LBBROWSER)
Accept-Encoding: gzip, deflate
Host: 172.16.0.108:9000
Connection: Keep-Alive
Cookie: login_region="http://127.0.0.1:5000/v2.0"; login_domain=; csrftoken=9RJulphcUYk5sP9YRjmpwAj1FThRG3Ko
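The core of this parsing step can be condensed into a few lines. The following is a simplified sketch in Python 3, not the library's parse_request: it skips the error responses and ignores the server's own protocol_version when deciding on keep-alive. The function name parse is hypothetical; http.client.parse_headers is the standard-library helper for reading a header block.

```python
import http.client
import io


def parse(raw):
    """Split a raw request into command, path, version tuple, headers, keep-alive flag."""
    stream = io.BytesIO(raw)
    request_line = stream.readline(65537).decode("iso-8859-1").rstrip("\r\n")
    command, path, version = request_line.split()
    # "HTTP/1.1" -> (1, 1); major and minor are compared as separate integers.
    major, minor = version.split("/", 1)[1].split(".")
    version_number = (int(major), int(minor))
    # Read header lines up to the blank line that ends the header block.
    headers = http.client.parse_headers(stream)
    keep_alive = version_number >= (1, 1)
    conntype = headers.get("Connection", "").lower()
    if conntype == "close":
        keep_alive = False
    elif conntype == "keep-alive":
        keep_alive = True
    return command, path, version_number, headers, keep_alive
```

Comparing versions as integer tuples is what makes HTTP/2.4 sort below HTTP/2.13, as the RFC comment in the library code requires; a plain string comparison would get that wrong.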

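2) Next, handler = ServerHandler() is instantiated. The inheritance chain is ServerHandler -> ServerHandler -> SimpleHandler -> BaseHandler (the name appears twice because Django's ServerHandler subclasses the wsgiref class of the same name).

3) Next, handler.run(self.server.get_app()) is executed: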

    def run(self, application):
        """Invoke the application"""
        # Note to self: don't move the close()!  Asynchronous servers shouldn't
        # call close() from finish_response(), so if you close() anywhere but
        # the double-error branch here, you'll break asynchronous servers by
        # prematurely closing.  Async servers must return from 'run()' without
        # closing if there might still be output to iterate over.
        try:
            self.setup_environ()
            print(2222222222222222222222222222222222222)
            self.result = application(self.environ, self.start_response)
            self.finish_response()
        except:
            try:
                self.handle_error()
            except:
                # If we get an error handling an error, just give up already!
                self.close()
                raise   # ...and let the actual server figure it out
Here application is the value of self.server.get_app(), which was set earlier when WSGIServer was initialized; it is an instance of the WSGIHandler class.
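The same run(application) call can be exercised in isolation with wsgiref's SimpleHandler, writing to in-memory buffers instead of a socket. This is a minimal sketch in Python 3; the application function is a tiny stand-in for Django's WSGIHandler instance, and the /demo path is a made-up value.

```python
import io
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Tiny WSGI app standing in for the WSGIHandler instance."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello from " + environ["PATH_INFO"].encode("ascii")]


def demo():
    environ = {"PATH_INFO": "/demo"}
    setup_testing_defaults(environ)  # fills REQUEST_METHOD, SERVER_PROTOCOL, ...
    stdin, stdout, stderr = io.BytesIO(), io.BytesIO(), io.StringIO()
    handler = SimpleHandler(stdin, stdout, stderr, environ)
    handler.run(application)   # setup_environ -> application(...) -> finish_response
    return stdout.getvalue()
```

The returned bytes contain the status line, the headers, a blank line, and then the body, which is what the browser would have received over the socket.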

4) Next, the __call__ method of the WSGIHandler instance is invoked, shown below:

    def __call__(self, environ, start_response):
        # Set up middleware if needed. We couldn't do this earlier, because
        # settings weren't available.
        if self._request_middleware is None:
            with self.initLock:
                try:
                    # Check that middleware is still uninitialized.
                    if self._request_middleware is None:
                        self.load_middleware()
                except:
                    # Unload whatever middleware we got
                    self._request_middleware = None
                    raise

        set_script_prefix(get_script_name(environ))
        signals.request_started.send(sender=self.__class__, environ=environ)
        try:
            request = self.request_class(environ)
        except UnicodeDecodeError:
            logger.warning('Bad Request (UnicodeDecodeError)',
                exc_info=sys.exc_info(),
                extra={
                    'status_code': 400,
                }
            )
            response = http.HttpResponseBadRequest()
        else:
            response = self.get_response(request)

        response._handler_class = self.__class__

        status = '%s %s' % (response.status_code, response.reason_phrase)
        response_headers = [(str(k), str(v)) for k, v in response.items()]
        for c in response.cookies.values():
            response_headers.append((str('Set-Cookie'), str(c.output(header=''))))
        start_response(force_str(status), response_headers)
        if getattr(response, 'file_to_stream', None) is not None and environ.get('wsgi.file_wrapper'):
            response = environ['wsgi.file_wrapper'](response.file_to_stream)
        return response
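5) __call__ first checks whether the middleware has already been loaded. For background on middleware, search online or see http://djangobook.py3k.cn/2.0/chapter17/:

To quote one passage from it: a Django installation does not require any middleware; MIDDLEWARE_CLASSES can be empty if you like. The order in which middleware appears is very important. During the request and view phases, Django applies middleware in the order given in MIDDLEWARE_CLASSES, while during the response and exception phases it calls them in reverse order. In other words, Django treats MIDDLEWARE_CLASSES as an ordered set of wrappers around the view function: the request passes through them from top to bottom, and the response passes through them in the opposite direction.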

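The actual loading is implemented by load_middleware(self), which collects the following hooks:

Request pre-processing hook: process_request(self, request): called after the request is received and before the view to run has been determined

View pre-processing hook: process_view(self, request, view, args, kwargs): called after the view has been determined and before it actually runs

Response post-processing hook: process_response(self, request, response): called after the view has run

Exception post-processing hook: process_exception(self, request, exception): called when the view raises an exception

The hooks are stored in lists on the handler. The process_request hooks go into self._request_middleware (which is why __call__ checks that attribute), and the other hook types are kept in their own lists.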

6) Next, a request object is instantiated by calling request = self.request_class(environ).

7) Then response = self.get_response(request) is called:

The function body is too long to paste here. Its main job is to return a response:

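    1 First call the request middleware process_request(). If it returns a response that is not None, jump to step 4; if it returns None, go to step 2.

    2 Run the view middleware process_view. If it returns None, execute response = wrapped_callback(); if an exception is raised, jump to step 3.

    3 If an exception is raised while calling the view, call the exception middleware process_exception. If the exception middleware itself raises, the error is reported.

    4 If the response produced by the steps above supports deferred rendering, apply the template-response middleware and then render the response. Continue with step 5.

    5 Call the response middleware process_response(self, request, response).

    6 Return the final response.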
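The ordering rule (request hooks top to bottom, response hooks in reverse) is easy to model. The code below is a simplified sketch and not Django's get_response: it leaves out process_view, process_exception, and template responses, and every name in it (Middleware, get_response, demo) is hypothetical.

```python
class Middleware:
    """Hypothetical middleware that only records when its hooks run."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    def process_request(self, request):
        self.log.append(self.name + ".process_request")
        return None  # None means "keep going to the next middleware"

    def process_response(self, request, response):
        self.log.append(self.name + ".process_response")
        return response


def get_response(request, middleware, view, log):
    response = None
    for mw in middleware:                 # request phase: declared order
        response = mw.process_request(request)
        if response is not None:          # short-circuit: skip the view
            break
    if response is None:
        log.append("view")
        response = view(request)
    for mw in reversed(middleware):       # response phase: reverse order
        response = mw.process_response(request, response)
    return response


def demo():
    log = []
    stack = [Middleware("A", log), Middleware("B", log)]
    response = get_response("GET /", stack, lambda request: "body", log)
    return response, log
```

With two middleware A and B, the log shows A and B on the way in and B then A on the way out, which is the "wrappers around the view" picture from the quoted passage.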

8) Next, start_response() is executed. This function was passed in at step 4.3:

    def start_response(self, status, headers,exc_info=None):
        """'start_response()' callable as specified by PEP 333"""
        if exc_info:
            try:
                if self.headers_sent:
                    # Re-raise original exception if headers sent
                    raise exc_info[0], exc_info[1], exc_info[2]
            finally:
                exc_info = None        # avoid dangling circular ref
        elif self.headers is not None:
            raise AssertionError("Headers already set!")
        assert type(status) is StringType,"Status must be a string"
        assert len(status)>=4,"Status must be at least 4 characters"
        assert int(status[:3]),"Status message must begin w/3-digit code"
        assert status[3]==" ", "Status message must have a space after code"
        if __debug__:
            for name,val in headers:
                assert type(name) is StringType,"Header names must be strings"
                assert type(val) is StringType,"Header values must be strings"
                assert not is_hop_by_hop(name),"Hop-by-hop headers not allowed"
        self.status = status
        self.headers = self.headers_class(headers)
        return self.write
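This function is not very complicated and can be read at face value: it validates the status string and the headers, stores them, and returns self.write.

9) When step 8) returns, self.result = application(self.environ, self.start_response) has finished executing, and self.result now holds the returned response.

5. Finally, self.finish_response() from step 4.3 is executed: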

    def finish_response(self):
        """Send any iterable data, then close self and the iterable

        Subclasses intended for use in asynchronous servers will
        want to redefine this method, such that it sets up callbacks
        in the event loop to iterate over the data, and to call
        'self.close()' once the response is finished.
        """
        try:
            if not self.result_is_file() or not self.sendfile():
                for data in self.result:
                    self.write(data)
                self.finish_content()
        finally:
            self.close()
When self.write runs for the first time, it sends the headers first by calling self.send_headers(), and then sends the body data back to the client.
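The interplay between start_response(), write(), and the delayed sending of headers can be modelled in a few lines. This is a simplified sketch, not wsgiref's BaseHandler: MiniHandler and demo are hypothetical, and details such as Content-Length, hop-by-hop checks, and empty bodies (which the real handler covers in finish_content()) are left out.

```python
import io


class MiniHandler:
    """Toy model: headers are stored by start_response and sent on first write."""

    def __init__(self, out):
        self.out = out
        self.status = None
        self.headers = None
        self.headers_sent = False

    def start_response(self, status, headers, exc_info=None):
        if self.headers is not None and not exc_info:
            raise AssertionError("Headers already set!")
        # Status must look like "200 OK": 3 digits, a space, then a reason.
        assert len(status) >= 4 and status[:3].isdigit() and status[3] == " "
        self.status, self.headers = status, list(headers)
        return self.write

    def send_headers(self):
        lines = ["HTTP/1.0 " + self.status]
        lines += ["%s: %s" % pair for pair in self.headers]
        self.out.write(("\r\n".join(lines) + "\r\n\r\n").encode("iso-8859-1"))
        self.headers_sent = True

    def write(self, data):
        if not self.headers_sent:   # first chunk: headers go out before the body
            self.send_headers()
        self.out.write(data)

    def run(self, application, environ):
        result = application(environ, self.start_response)
        for data in result:         # mirrors the loop in finish_response()
            self.write(data)


def demo():
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"hello, ", b"world"]

    out = io.BytesIO()
    MiniHandler(out).run(app, {})
    return out.getvalue()
```

Deferring the headers until the first write gives the application a chance to replace them (via exc_info) if something fails before any body data has been produced.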

6. Returning up the call chain to step 2.3, self.shutdown_request(request) is called next, which closes the socket connection.

At this point, the request is complete.




