[Python] Crawler 2: Downloading all articles on the first page of Yi Shu's blog

Source: Internet | Published by: 创奇老照片修复软件 | Editor: 程序博客网 | Time: 2024/05/23 01:16

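#!/usr/bin/env python
# coding=utf-8
# Python 2 script: it relies on urllib.urlopen and the print statement.
import os
import time
from urllib import urlopen

MAX_ARTICLES = 40
urls = []

# Fetch the first page of the article list.
arti = urlopen('http://blog.sina.com.cn/s/articlelist_1227636382_0_1.html').read()

# Each article link has the form <a title=... href="....html">.
title = arti.find(r'<a title=')
href = arti.find(r'href=', title)
html = arti.find(r'.html', href)
while title != -1 and href != -1 and html != -1 and len(urls) < MAX_ARTICLES:
    # href + 6 skips past 'href="'; html + 5 keeps the '.html' suffix.
    urls.append(arti[href + 6:html + 5])
    print urls[-1]
    # Continue searching after the link that was just found.
    title = arti.find(r'<a title=', html)
    href = arti.find(r'href=', title)
    html = arti.find(r'.html', href)
print 'find end'

# Create the output directory if it does not exist yet.
if not os.path.isdir('yishu'):
    os.makedirs('yishu')

# Download only the links that were actually found.
for url in urls:
    content = urlopen(url).read()
    # The last 26 characters of the URL are used as the file name.
    filename = url[-26:]
    print 'downloading', url
    with open(os.path.join('yishu', filename), 'wb') as f:
        f.write(content)
    time.sleep(1)  # pause between requests
print 'download article finished'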
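The link-extraction step in the script can also be pulled out into a small function that is testable without touching the network. The following is only a sketch, written for Python 3 and using a regular expression instead of the repeated `find` calls. It assumes that each article link keeps `title` and `href` inside the same `<a>` tag and that the `href` value is double-quoted. The names `LINK_RE`, `extract_article_urls`, and `filename_for`, as well as the sample HTML and its URLs, are made up for illustration and do not come from the original post.

```python
import re

# Hypothetical pattern: an <a title=...> tag whose href value ends in .html.
# Assumes title and href live in the same tag and href is double-quoted.
LINK_RE = re.compile(r'<a title=[^>]*?href="([^"]+?\.html)"')


def extract_article_urls(page, limit=40):
    """Return up to `limit` article URLs found in `page`, in document order."""
    return LINK_RE.findall(page)[:limit]


def filename_for(url):
    """Use the last path component of the URL as the local file name."""
    return url.rsplit('/', 1)[-1]


# Made-up sample markup; the article IDs are placeholders, not real posts.
sample = (
    '<a href="http://blog.sina.com.cn/u/1227636382">home</a>\n'
    '<a title="First" target="_blank" '
    'href="http://blog.sina.com.cn/s/blog_0000000000000001.html">First</a>\n'
    '<a title="Second" target="_blank" '
    'href="http://blog.sina.com.cn/s/blog_0000000000000002.html">Second</a>\n'
)

urls = extract_article_urls(sample)
for u in urls:
    print(filename_for(u), u)
```

Taking the file name from the last path component rather than a fixed 26-character slice is a design choice of this sketch: it keeps working if the length of the article ID ever differs.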

0 0
