Crawler 05: First Experience with BeautifulSoup4
Source: Internet  Published: International chat software  Editor: 程序博客网  Time: 2024/06/05 03:54
# -*- coding: utf-8 -*-
# Python 2 script: fetch one page of Qiushibaike and save the text of each post.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")  # Python 2 only: lets unicode text be written without an explicit encode

import urllib2
from bs4 import BeautifulSoup

page = 1
url = 'http://www.qiushibaike.com/8hr/page/%d/?s=4908781' % page
user_agent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0"
headers = {'User-Agent': user_agent}  # send a browser-style User-Agent header

request = urllib2.Request(url, headers=headers)
response = urllib2.urlopen(request)
back = response.read()

soup = BeautifulSoup(back, 'html.parser', from_encoding='utf-8')
# The script treats each <div class="content"> as one post.
contents = soup.find_all("div", "content")

# The with block closes the output file once all posts are written.
with open("糗事百科" + str(page) + ".txt", "w") as f:
    for content in contents:
        print content.get_text()
        f.write(content.get_text())
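The parsing step in the script can be tried without any network access. The following is a minimal sketch, written for Python 3 rather than the Python 2 of the script, that runs the same `find_all("div", "content")` call on an inline HTML string. `SAMPLE_HTML` and `extract_posts` are hypothetical names introduced for illustration, and the markup is a made-up stand-in, not real Qiushibaike HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page; not real Qiushibaike markup.
SAMPLE_HTML = """
<div class="article">
  <div class="content"><span>first post</span></div>
  <div class="stats">12 likes</div>
</div>
<div class="article">
  <div class="content"><span>second post</span></div>
</div>
"""


def extract_posts(html):
    """Return the stripped text of every <div class="content"> in html."""
    soup = BeautifulSoup(html, "html.parser")
    # A string passed as the second positional argument of find_all
    # is matched against the CSS class of each tag.
    return [div.get_text(strip=True) for div in soup.find_all("div", "content")]


posts = extract_posts(SAMPLE_HTML)
```

Writing `soup.find_all("div", class_="content")` should select the same tags and makes the class filter explicit, which may be easier to read when the call has more arguments.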