Python crawler demo
Source: Internet  Published: 区块链的共识机制 算法  Editor: 程序博客网  Time: 2024/06/05 09:08
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Python 2 script: it relies on urllib2 and print statements.
import urllib2
import json
import sys
import time

keyword = 'port:8080'  # search keyword
page = '1'  # number of result pages to fetch
f = open('result.txt', 'w')
# keyword = sys.argv[1]  # search keyword
# page = sys.argv[2]  # number of pages
# get_cookie = sys.argv[3]  # cookie value

for i in range(int(page)):
    # The query is a JSON object passed in the `info` parameter; `_` is a
    # timestamp-based cache buster.
    req = urllib2.Request(
        'https://www.oshadan.com:443/search?info={"c":"' + keyword + '","p":' + str(
            i + 1) + ',"q":0,"clear":false}&_=' + str(time.time()).replace('.', '') + '0')
    req.add_header('Host', 'www.oshadan.com')
    req.add_header('User-Agent', 'Mozilla/5.0 (X11; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0')
    req.add_header('Accept', 'application/json, text/javascript, */*; q=0.01')
    req.add_header('Accept-Language', 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3')
    req.add_header('X-Requested-With', 'XMLHttpRequest')
    req.add_header('Referer', 'https://www.oshadan.com/main')
    req.add_header('Cookie', 'sid=s%3Abojn6UmMsWcvTlf97yWtsHLM.BWamQyVwpPz1L4JwelKJqgrEoK0JXqRZF1xy19EN7Co')
    # req.add_header('Cookie', get_cookie)
    response = urllib2.urlopen(req)
    the_page = response.read()
    json_re = json.loads(the_page)
    # print json_re['result']['result']['recordNum']  # record count
    for j in json_re['result']['result']['data']:
        fields = j['notcomponentFields']
        if fields['url'] is not None:
            # Record the URL when the entry has one.
            print fields['url']
            f.write(fields['url'])
            f.write('\n')
        else:
            # Otherwise fall back to ip:port. The original wrote the IP twice
            # with no separator; the ':' separator and str() are assumptions.
            target = str(fields['ip']) + ':' + str(fields['port'])
            print target
            f.write(target)
            f.write('\n')
    print 'Page ' + str(i + 1) + ' crawled'

f.close()
print 'All crawl tasks finished'
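The script above is Python 2 and builds the query string by hand, leaving the JSON in the `info` parameter unencoded. As a rough sketch only, the two reusable pieces (building the URL and pulling targets out of a parsed response) might look like this in Python 3. The function names `build_search_url` and `extract_targets` are hypothetical, the millisecond timestamp for `_` is an assumption standing in for the original's string manipulation, and the response layout is taken from the keys the script reads rather than from any documented API.

```python
import json
import time
from urllib.parse import quote


def build_search_url(keyword, page_number, base="https://www.oshadan.com/search"):
    """Build the query URL with the JSON `info` parameter percent-encoded."""
    info = json.dumps(
        {"c": keyword, "p": page_number, "q": 0, "clear": False},
        separators=(",", ":"),  # compact form, matching the hand-built string
    )
    cache_buster = int(time.time() * 1000)  # assumed: millisecond timestamp
    return "{}?info={}&_={}".format(base, quote(info, safe=""), cache_buster)


def extract_targets(payload):
    """Return one 'url' or 'ip:port' string per record in a parsed response."""
    targets = []
    for record in payload["result"]["result"]["data"]:
        fields = record["notcomponentFields"]
        if fields.get("url") is not None:
            targets.append(fields["url"])
        else:
            # Fall back to ip:port when the record carries no URL.
            targets.append("{}:{}".format(fields["ip"], fields["port"]))
    return targets
```

Keeping these two steps free of network calls makes them easy to check against a saved response before sending any real requests; the fetch itself would still need the headers and cookie shown in the script.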