Python Web Scraping Study Notes, Day 3

Source: Internet  Editor: 程序博客网  Time: 2024/05/16 12:26

Today's topic is how to download the images on the Douban home page and save them to the local disk.

The Douban home page looks like this (screenshot not preserved):


The scraping code is as follows:

import os
import re
import urllib.request

# Directory where the downloaded images are saved
imagePath = '/Users/touna/Desktop/image'

# Build the local save path for an image URL
def saveFile(path):
    # Create the target directory if it does not exist yet
    os.makedirs(imagePath, exist_ok=True)
    # rindex() returns the position of the last occurrence of the substring
    pos = path.rindex('/')
    print('---%s' % pos)
    p = os.path.join(imagePath, path[pos+1:])
    print('++++%s' % p)
    print('++++%s' % path[pos+1:])
    return p

url = 'https://www.douban.com/'
header = {'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'}
req = urllib.request.Request(url=url, headers=header)
res = urllib.request.urlopen(req)
# Decode the response bytes so the regex runs on text, not on the repr of a bytes object
data = res.read().decode('utf-8', errors='ignore')
# Match https URLs ending in .jpg, .png or .gif; stop at whitespace, quotes and angle brackets
pattern = re.compile(r'(https:[^\s"\'<>]*?\.(jpg|png|gif))')
for imageUrl, t in set(re.findall(pattern, data)):
    print(imageUrl)
    # urlretrieve() downloads the remote data straight to a local file
    urllib.request.urlretrieve(imageUrl, saveFile(imageUrl))
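
To see what the regular expression picks up without touching the network, the extraction step can be tried on a small HTML string. This is only a sketch: the `sample_html` snippet and the `extract_image_urls` helper are made up for illustration and are not part of the script above. The pattern follows the same idea (an https URL ending in jpg, png or gif).

```python
import re

# Hypothetical helper, not part of the original script:
# pull image URLs out of a piece of HTML text.
IMAGE_URL = re.compile(r'https:[^\s"\'<>]*?\.(?:jpg|png|gif)')

def extract_image_urls(html):
    # findall returns the full matches because the only group is non-capturing;
    # sorted(set(...)) removes duplicates and gives a stable order
    return sorted(set(IMAGE_URL.findall(html)))

# Made-up sample page, assumed only for this demonstration
sample_html = '''
<a href="https://www.douban.com/about">About</a>
<img src="https://img3.doubanio.com/icon/g83759-2.jpg">
<img src="https://img3.doubanio.com/icon/g83759-2.jpg">
<img src="https://img1.doubanio.com/icon/g37688-27.jpg">
'''

for image_url in extract_image_urls(sample_html):
    print(image_url)
```

The ordinary link is skipped because it does not end in an image extension, and the duplicated image URL is reported only once.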


The printed log is as follows:

https://img3.doubanio.com/view/photo/albumcover/public/p2497540936.jpg
---54
++++/Users/touna/Desktop/image/p2497540936.jpg
++++p2497540936.jpg
https://img3.doubanio.com/icon/g83759-2.jpg
---30
++++/Users/touna/Desktop/image/g83759-2.jpg
++++g83759-2.jpg
https://img3.doubanio.com/icon/g109498-1.jpg
---30
++++/Users/touna/Desktop/image/g109498-1.jpg
++++g109498-1.jpg
https://img1.doubanio.com/view/dianpu_product_item/medium/public/p1982227.jpg
---64
++++/Users/touna/Desktop/image/p1982227.jpg
++++p1982227.jpg
https://img1.doubanio.com/view/photo/albumcover/public/p2498359159.jpg
---54
++++/Users/touna/Desktop/image/p2498359159.jpg
++++p2498359159.jpg
https://img3.doubanio.com/view/dianpu_product_item/medium/public/p270364.jpg
---64
++++/Users/touna/Desktop/image/p270364.jpg
++++p270364.jpg
https://img3.doubanio.com/view/dianpu_product_item/medium/public/p458880.jpg
---64
++++/Users/touna/Desktop/image/p458880.jpg
++++p458880.jpg
https://img1.doubanio.com/view/dianpu_product_item/medium/public/p509169.jpg
---64
++++/Users/touna/Desktop/image/p509169.jpg
++++p509169.jpg
https://img1.doubanio.com/icon/g37688-27.jpg
---30
++++/Users/touna/Desktop/image/g37688-27.jpg
++++g37688-27.jpg
https://img3.doubanio.com/view/dianpu_product_item/medium/public/p377790.jpg
---64
++++/Users/touna/Desktop/image/p377790.jpg
++++p377790.jpg
https://img3.doubanio.com/view/ark_article_cover/large/public/20165020.jpg
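
The `---` and `++++` lines in the log come from the file-name step: the position of the last `/` in the URL, then the text after it joined onto the image directory. As an alternative sketch, the standard library can do the same split with `urllib.parse.urlsplit` and `posixpath.basename`, which also drops any query string. The helper name `local_path_for` is hypothetical; the directory and URL are the ones used in this post.

```python
import os
import posixpath
from urllib.parse import urlsplit

# Hypothetical helper, not part of the original script
def local_path_for(image_url, directory):
    # urlsplit separates the path from any ?query or #fragment part
    url_path = urlsplit(image_url).path
    # basename keeps only the text after the last '/'
    filename = posixpath.basename(url_path)
    return os.path.join(directory, filename)

url = 'https://img3.doubanio.com/view/photo/albumcover/public/p2497540936.jpg'
print(url.rindex('/'))  # 54, the same value as the first '---' line in the log
print(local_path_for(url, '/Users/touna/Desktop/image'))
```
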
The images saved to the local disk look like this (screenshot not preserved):


If anything here is wrong, corrections from more experienced readers are very welcome.
