Scrapy Crawler: Proxy IP Configuration

Source: Internet  Editor: 程序博客网  Time: 2024/04/28 20:18

Steps to set up a proxy IP in Scrapy:
1. In the Scrapy project, create a new file "middlewares.py":

    import base64

    # Custom downloader middleware that sends every request through a proxy
    class ProxyMiddleware(object):
        # Override process_request, which Scrapy calls for each outgoing request
        def process_request(self, request, spider):
            # Set the location of the proxy
            request.meta['proxy'] = "http://YOUR_PROXY_IP:PORT"

            # Use the following lines only if your proxy requires authentication
            proxy_user_pass = "USERNAME:PASSWORD"
            # Set up basic authentication for the proxy.
            # base64.encodestring was removed in Python 3.9 and appended a newline;
            # b64encode works on bytes, so encode first and decode the result.
            encoded_user_pass = base64.b64encode(proxy_user_pass.encode('utf-8')).decode('ascii')
            request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass
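
The authentication step can be checked in isolation, without a running crawl. The following is a minimal sketch, not part of the original post: `basic_proxy_auth`, `apply_proxy`, and `FakeRequest` are hypothetical names introduced only for illustration, and `FakeRequest` merely stands in for Scrapy's request object so that the snippet runs without Scrapy installed. The proxy address and credentials are placeholders.

```python
import base64


def basic_proxy_auth(username, password):
    """Build the value of a Proxy-Authorization header (HTTP Basic scheme)."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    # b64encode returns bytes with no trailing newline; decode to get a str
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def apply_proxy(request, proxy_url, username=None, password=None):
    """Do what the middleware's process_request does, on any request-like object."""
    request.meta["proxy"] = proxy_url
    if username is not None:
        request.headers["Proxy-Authorization"] = basic_proxy_auth(username, password)


class FakeRequest(object):
    """Stand-in for a Scrapy request: only the two attributes used here."""

    def __init__(self):
        self.meta = {}
        self.headers = {}


request = FakeRequest()
apply_proxy(request, "http://127.0.0.1:8080", "user", "pass")  # placeholder values
print(request.meta["proxy"])
print(request.headers["Proxy-Authorization"])  # Basic dXNlcjpwYXNz
```

Building the header in a separate function makes it easy to confirm that the encoded value contains no stray newline, which is the usual cause of proxies rejecting the credentials.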

2. Add the following to the project configuration file settings.py:
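
    # "pythontab" is the project name used in the original post; replace it with your own.
    # scrapy.contrib.downloadermiddleware was renamed to scrapy.downloadermiddlewares in Scrapy 1.0.
    DOWNLOADER_MIDDLEWARES = {
        'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
        'pythontab.middlewares.ProxyMiddleware': 100,
    }
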
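
The numbers in this setting are priorities: Scrapy calls each middleware's process_request in increasing order, so the custom middleware (100) sets the proxy before the built-in HttpProxyMiddleware (110) sees the request. As a rough illustration of that ordering, here is a small sketch in plain Python. `request_order` is a hypothetical helper, not a Scrapy function, and it ignores the built-in defaults that Scrapy merges in from DOWNLOADER_MIDDLEWARES_BASE.

```python
DOWNLOADER_MIDDLEWARES = {
    # current module path in Scrapy 1.0 and later
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
    'pythontab.middlewares.ProxyMiddleware': 100,
}


def request_order(middlewares):
    """Return the enabled middleware paths in the order process_request is called."""
    # A value of None disables a middleware in Scrapy, so leave those out
    enabled = {path: prio for path, prio in middlewares.items() if prio is not None}
    return sorted(enabled, key=enabled.get)


for path in request_order(DOWNLOADER_MIDDLEWARES):
    print(path)
```

Run as is, this lists the custom ProxyMiddleware first and HttpProxyMiddleware second.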
Reposted from: http://my.oschina.net/jhao104/blog/639745

0 0