Python Crawler Notes, Part 1

Source: Internet | Published by: 素手浣花知乎 | Editor: 程序博客网 | Time: 2024/05/21 19:34

To learn web scraping, start by downloading pictures of pretty girls as practice.

#coding=utf-8
# Python 2 script (uses urllib.urlopen and the print statement)
import urllib
import re

FORBIDDEN = "403 Forbidden"

def getHtml(url):
    # Fetch the page and return its raw HTML
    page = urllib.urlopen(url)
    html = page.read()
    return html

def getImg(html):
    # Match .jpg addresses stored in fields of the form ...URL":"http...jpg",
    reg = r'URL":"(http.+?\.jpg)",'
    imgre = re.compile(reg)
    imglist = re.findall(imgre,html)
    x = 0
    for imgurl in imglist:
        # Only save images whose request returns HTTP 200
        status = urllib.urlopen(imgurl).code
        if status == 200:
            urllib.urlretrieve(imgurl,'%s.jpg' % x)
            x+=1
    return imglist

html = getHtml("http://image.baidu.com/search/index?tn=baiduimage&ipn=r&ct=201326592&cl=2&lm=-1&st=-1&fm=result&fr=&sf=1&fmq=1491787331416_R&pv=&ic=0&nc=1&z=&se=1&showtab=0&fb=0&width=&height=&face=0&istype=2&ie=utf-8&word=%E5%9B%BE%E7%89%87")
print getImg(html)
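
The script above targets Python 2. As a rough sketch only, the same idea might look like this in Python 3, assuming the page still embeds image addresses in fields matching the original regular expression. The function names extract_image_urls and download_images, the timeout parameter, and the sample string are made up for illustration and are not from the original post. Unlike the original, this version requests each image once and writes the bytes it already received.

```python
import re
import urllib.request

# Same pattern as the original script: fields of the form ...URL":"http...jpg",
IMG_PATTERN = re.compile(r'URL":"(http.+?\.jpg)",')


def extract_image_urls(html):
    """Return every .jpg URL matched by IMG_PATTERN, in page order."""
    return IMG_PATTERN.findall(html)


def download_images(urls, timeout=10):
    """Save each reachable image as 0.jpg, 1.jpg, ...; return how many were saved."""
    saved = 0
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status != 200:
                    continue
                data = resp.read()
        except OSError:
            # URLError, HTTPError and socket timeouts are all OSError subclasses
            continue
        with open('%d.jpg' % saved, 'wb') as f:
            f.write(data)
        saved += 1
    return saved


if __name__ == '__main__':
    # Hypothetical sample text; a real run would fetch and decode a page first
    sample = '"objURL":"http://example.com/a.jpg","thumbURL":"http://example.com/b.jpg",'
    print(extract_image_urls(sample))
```

Keeping the URL extraction separate from the downloading makes it possible to check the regular expression against a saved page without making any network requests.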

0 0

python 爬虫小记之一
python 爬虫小记
Python + Selenium 爬虫小记
python爬虫小记
Python爬虫入门小记
python爬虫实例之一
Python爬虫小记（一）
Python爬虫小记（二）
Python爬虫小记（三）
python爬虫之一：requests库
Python爬虫简述系列之一
c++转python知识小记之一
学习小记 - Python爬虫 (2) 爬虫闯关系列
Python爬虫入门之一-requests+BeautifulSoup
数据科学工程师面试宝典系列之一--Python爬虫实战
数据科学工程师面试宝典系列之一----Python爬虫
数据科学工程师面试宝典系列之一--Python爬虫实战
python 爬虫利器之一Request库的用法
[算法作业-动态规划][LeetCode] 97. Interleaving String
Redis在windows下安装过程
ssm连接sqlserver时通过端口1433连接到主机的TCP/IP失败，错误“Connection refused：connect……
【封装】使用okHttp发送网络请求及上传下载进度监听
binary 和 varbinary 用法全解
python 爬虫小记之一
树莓派开机不显示log和打印信息
CronExpression表达式
创建线程的三种方式优缺点
启动redis出现Creating Server TCP listening socket *:6379: bind: No such file or directory
现实世界里的 SOA
再次回到csdn
MMU详解 <一>
js剪切板使用