Preventing 302 redirects with urllib2 in Python

Source: Internet  Editor: 程序博客网  Date: 2024/06/05 12:49

Reposted from: http://www.jb51.net/article/51942.htm


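Note: Python's urllib2 follows redirects (301, 302) automatically when it fetches a page with urlopen. Sometimes, however, we need the status information of the 301 or 302 response itself, which means capturing the debug information from before the redirect takes place.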

The following code prevents a 302 response from being redirected to the new page:

#!/usr/bin/python
# -*- coding: utf-8 -*-
# Filename: states_code.py

import urllib2


class RedirctHandler(urllib2.HTTPRedirectHandler):
    """Redirect handler that refuses to follow 301/302 responses."""

    def http_error_301(self, req, fp, code, msg, headers):
        # Returning None leaves the response unhandled, so urllib2
        # raises HTTPError instead of following the redirect.
        pass

    def http_error_302(self, req, fp, code, msg, headers):
        pass


def getUnRedirectUrl(url, timeout=10):
    # debuglevel=1 prints the raw request and response headers.
    debug_handler = urllib2.HTTPHandler(debuglevel=1)
    opener = urllib2.build_opener(debug_handler, RedirctHandler)

    html = None
    response = None
    error_info = None  # stays None when no error occurs
    try:
        response = opener.open(url, timeout=timeout)
        html = response.read()
    except urllib2.URLError as e:
        if hasattr(e, 'code'):
            error_info = e.code
        elif hasattr(e, 'reason'):
            error_info = e.reason
    finally:
        if response:
            response.close()
    if html:
        return html
    else:
        return error_info


html = getUnRedirectUrl('http://jb51.net')
print html
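The listing above targets Python 2. In Python 3, urllib2 was split into urllib.request and urllib.error, so the same idea might be sketched as follows. This is only a sketch under stated assumptions, not the original author's code: it overrides redirect_request instead of http_error_301 and http_error_302, and the names NoRedirectHandler, get_unredirected, extra_handlers, _DemoHandler and demo, as well as the /old and /new paths, are hypothetical. The small local server exists only so the example runs without network access.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: never build a follow-up request for a redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "not handled here"; with no other redirect
        # handler installed, urllib raises HTTPError for the 3xx response.
        return None


def get_unredirected(url, timeout=10, extra_handlers=()):
    """Return (status, Location header, body) without following redirects."""
    opener = urllib.request.build_opener(NoRedirectHandler, *extra_handlers)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status, None, response.read()
    except urllib.error.HTTPError as e:
        # The unfollowed 3xx (or any 4xx/5xx) lands here; its status code,
        # headers and body are still readable from the exception object.
        return e.code, e.headers.get("Location"), e.read()


class _DemoHandler(BaseHTTPRequestHandler):
    """Local test server (assumption): /old answers 302, anything else 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo():
    """Start the local server on a free port and query both paths."""
    server = HTTPServer(("127.0.0.1", 0), _DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # An empty ProxyHandler keeps any configured proxy out of the loopback demo.
    no_proxy = [urllib.request.ProxyHandler({})]
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        return (get_unredirected(base + "/old", extra_handlers=no_proxy),
                get_unredirected(base + "/new", extra_handlers=no_proxy))
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for result in demo():
        print(result)
```

Unlike getUnRedirectUrl, this sketch returns the status code and the Location header together, so the caller can see where the redirect would have led without actually following it.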