Learning Python Web Scraping (1): Baidu Tieba

Source: Internet | Editor: 程序博客网 | Time: 2024/05/21 06:14

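My first attempt at web scraping: it uses only functions from urllib. No login has to be bypassed, so this is about the simplest scraper there is.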

It fetches three pages of a thread and saves each one as an HTML file. One thing to watch out for, though:

import urllib.request as url_req


def scrap(url, begin, end):
    # Download pages begin..end (inclusive), one file per page
    for i in range(begin, end + 1):
        filename = 'file{}.html'.format(str(i).zfill(3))
        print(url + str(i))
        response = url_req.urlopen(url + str(i))
        # Write the raw bytes, so no decoding step is needed
        with open(filename, 'wb') as f:
            f.write(response.read())


# URL prefixes; the value appended after "pn=" selects the page
burl_ = 'http://tieba.baidu.com/f?kw=%E7%94%B5%E5%AD%90%E7%A7%91%E6%8A%80%E5%A4%A7%E5%AD%A6&ie=utf-8&pn='
burl = 'http://tieba.baidu.com/p/4711180166?pn='
burl_2 = 'http://tieba.baidu.com/p/3138733512?pn='
begin = 1
end = 3
scrap(burl, begin, end)
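The URL and file-name handling above can be pulled out into small helpers, which makes it easier to check without touching the network. The following is only a sketch: the names `page_filename`, `forum_url` and `fetch` and the 10-second timeout are assumptions for illustration, not part of the original script. The `kw` value in the first URL is the percent-encoded UTF-8 form of the forum name 电子科技大学, which `urllib.parse.quote` reproduces.

```python
from urllib.parse import quote
from urllib.request import urlopen


def page_filename(page):
    """Return the zero-padded file name for a page, e.g. file001.html."""
    return 'file{:03d}.html'.format(page)


def forum_url(name):
    """Build the forum listing prefix; the page value is appended after 'pn='."""
    # quote() percent-encodes the UTF-8 bytes of the forum name
    return 'http://tieba.baidu.com/f?kw={}&ie=utf-8&pn='.format(quote(name))


def fetch(url, timeout=10):
    """Download one URL and return its raw bytes (the timeout is an assumed default)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()
```

With these in place, the loop body reduces to writing `fetch(url + str(i))` into `page_filename(i)`, still opened in `'wb'` mode because `fetch` returns bytes.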

0 0

Python爬虫学习(1):百度贴吧
Python爬虫学习2--百度贴吧
学习记录：python百度贴吧爬虫
python百度贴吧爬虫
python 百度贴吧爬虫
python- 百度贴吧爬虫
[python]百度贴吧爬虫
Python爬虫入门学习例子之百度贴吧
python爬虫：下载百度贴吧图片学习笔记
Python爬虫学习记录（1）——百度贴吧图片下载
python网络爬虫学习(二)一个爬取百度贴吧的爬虫程序
Python爬虫实例1-抓取百度贴吧
Python 爬虫获取百度贴吧图片
python爬虫百度贴吧标题数据
python实现百度贴吧爬虫
python爬虫实战2-百度贴吧
python爬虫--抓取百度贴吧
Python爬虫实战：百度贴吧帖子
OpenGLES 版本
MinGW for windows
redis常用命令、常见错误、配置技巧等分享
217. Contains Duplicate
CGAL+VS2013配置中遇到的几个问题
Python爬虫学习(1):百度贴吧
（不再黑屏里查看） Android Studio获取SHA1或MD5的方法
Gnu Global 识别C++头文件
配置SS
单点登录
(转)使用tar和split打包分割文件
第十二周项目4——利用遍历思想求解图问题2
第十一周项目2—用二叉树求解代数式
三张图搞懂JavaScript的原型对象和原型链