Downloading wallpapers with a crawler, and setting up automatic wallpaper switching


Code is pasted below (first draft; no cleanup, refactoring, or encapsulation yet):

1. The crawler

I started from Baidu's wallpaper search. I'm partial to snow scenes, so I wanted to batch-download them with a crawler, but it turned out the Baidu image page is rendered dynamically, so a plain request got me nowhere and I ended up using PhantomJS to fetch the page source. Once I had the source I parsed out the image URLs, but downloading them with urlretrieve got a 403 Forbidden from Baidu. I kept looking for a workaround: sending the request with browser headers failed, and fetching the response myself and writing it to a file failed too. In the end I remembered wget on Ubuntu: write the URLs to a file and run wget -i <filename> for a batch download. That works, except wget hits the server too fast and Baidu blocks it after a dozen or so images; checking the wget options, -w (--wait) pauses between requests, so the plan is to throttle the downloads with that.
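For reference, once the crawler has written the image URLs into a text file (test.txt in the code below), the throttled batch download is a single wget call; the two-second wait and the output directory here are only example values, adjust them as needed:

# -i reads the URLs from the file, -w 2 sleeps two seconds between downloads,
# -P sets the directory the images are saved into
wget -i test.txt -w 2 -P wallpapers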

2. Automatic background switching (Windows 10)

Personalization → Background → under "Background" choose "Slideshow", then pick the folder the images are stored in. Done.
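The slideshow setting covers the automatic switching, but if you would rather script it, here is a minimal sketch (assuming the wallpapers ended up in the same BDpictures folder used in the code below) that cycles the desktop background through the downloaded files via the Win32 SystemParametersInfo call:

import ctypes
import glob
import itertools
import os
import time

SPI_SETDESKWALLPAPER = 20
folder = r"C:\Users\Mr.Guo\Pictures\BDpictures"  # same folder the crawler saves into

pictures = glob.glob(os.path.join(folder, "*.jpg"))
for picture in itertools.cycle(pictures):
    # 3 = SPIF_UPDATEINIFILE | SPIF_SENDCHANGE: persist the change and notify the system
    ctypes.windll.user32.SystemParametersInfoW(SPI_SETDESKWALLPAPER, 0, os.path.abspath(picture), 3)
    time.sleep(600)  # switch wallpaper every ten minutes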

'''
# First attempt: plain urlopen -- useless here, the page is rendered dynamically
from urllib import request
from bs4 import BeautifulSoup as bs

url = "http://image.baidu.com/search/index?tn=baiduimage&ct=201326592&lm=-1&cl=2&ie=gbk&word=%D1%A9%BE%B0%D7%C0%C3%E6%B1%DA%D6%BD&fr=ala&ala=1&pos=0&alatpl=wallpaper&oriquery=%E9%9B%AA%E6%99%AF%E6%A1%8C%E9%9D%A2%E5%A3%81%E7%BA%B8"
response = request.urlopen(url).read()
content = str(response, encoding="utf-8")
bs_obj = bs(content, "html.parser")
print(bs_obj)
'''

# Enter PhantomJS: render the page so the image tags actually exist in the source
from selenium import webdriver

driver = webdriver.PhantomJS()
driver.set_window_size(25600, 14400)  # huge window so as many thumbnails as possible get rendered
driver.get("http://image.baidu.com/search/index?tn=baiduimage&ct=201326592&lm=-1&cl=2&ie=gbk&word=%D1%A9%BE%B0%B1%DA%D6%BD&fr=ala&ala=1&pos=0&alatpl=wallpaper&oriquery=%E9%9B%AA%E6%99%AF%E5%A3%81%E7%BA%B8s")
page_source = driver.page_source
# print(page_source)

'''
# A plain re attempt failed -- the img tags keep appearing and disappearing as the page re-renders
import re

pattern = re.compile(r'src="http://.*?\.jpg"', re.I)
img_src_list = pattern.findall(page_source)
url_pattern = re.compile(r'http://.*?\.jpg', re.I)
img_url_list = []
for i in img_src_list:
    img_url_list.append(url_pattern.search(i).group())
for i in img_url_list:
    print(i)
'''

# Back to the old friend bs4
from bs4 import BeautifulSoup
import requests          # only needed for the commented-out download attempt below
from urllib import request  # likewise
import os

# init download path
download_path = r"C:\Users\Mr.Guo\Pictures\BDpictures"

bs_obj = BeautifulSoup(page_source, "html.parser")
img_url_list = bs_obj.findAll("img", {"class": "main_img img-hover"})
final_url_list = []
for i in img_url_list:
    final_url_list.append(i.attrs["src"])
# print(final_url_list)

# Note: download_path + "\test.txt" would insert a tab character ("\t"), so join the path properly
f = open(os.path.join(download_path, "test.txt"), 'a')
for i in range(len(final_url_list)):
    print(final_url_list[i])
    try:
        '''
        # urlretrieve with a browser User-Agent -- still 403 Forbidden
        opener = request.build_opener()
        opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.75 Safari/537.36')]
        request.install_opener(opener)
        request.urlretrieve(final_url_list[i], os.path.join(download_path, "%s.jpg" % i))
        '''
        '''
        # fetching the response and writing it to a file failed as well
        r = requests.get(final_url_list[i])
        i_download_path = os.path.join(download_path, "%s.jpg" % i)
        with open(i_download_path, "wb") as code:
            code.write(r.content)
        '''
        # write the URL out for the later batch download with wget
        f.write(final_url_list[i] + '\n')
    except Exception as e:
        print(e)
f.close()
