Python Automation (2): Scraping Movie Download Links with Beautiful Soup

Source: Internet  Published: Introduction to Python Data Analysis  Editor: 程序博客网  Time: 2024/05/17 07:03
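# coding: utf-8
from bs4 import BeautifulSoup
import requests
import codecs

host = "http://www.poxiao.com"
url = "http://www.poxiao.com/mtype5.html"

# The script treats the listing page as GBK-encoded and decodes it explicitly.
html_doc = requests.get(url).content.decode("GBK")

# Save a local copy of the listing page for inspection.
with codecs.open("poxiao.html", "w", encoding="GBK") as f:
    f.write(html_doc)

poxiao = BeautifulSoup(html_doc, "lxml")
div_content = poxiao.find(name="div", attrs={"class": "content"})
movies = div_content.find_all("h3")

for movie in movies:
    print(movie.text)
    # Each <h3> wraps a relative link to the movie's detail page.
    movie_url = host + movie.a.get("href")
    movie_content = requests.get(movie_url).content
    movie_soup = BeautifulSoup(movie_content, "lxml")
    try:
        # The download link is read from the value of the "checkbox2" input.
        thunder_link = movie_soup.find("input", attrs={"name": "checkbox2"})
        print(thunder_link.get("value"))
    except AttributeError:
        # find() returned None, so the page has no such input.
        print("Failed to get the link")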
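The script above mixes network requests with parsing, which makes it hard to check without hitting the site. As a rough sketch, the two parsing steps can be pulled into pure functions and tried on sample HTML. This is not the original author's code: it targets Python 3 and uses the standard library's `html.parser` as a stand-in for Beautiful Soup so it runs with no extra packages, and it builds absolute URLs with `urljoin` rather than string concatenation. The class names, function names, and sample HTML are hypothetical, and the assumed page structure (`<h3><a href=...>` for titles, an `<input name="checkbox2">` holding the link) is inferred only from the script. For simplicity it does not restrict the search to the `div` with class `content`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

HOST = "http://www.poxiao.com"


class MovieListParser(HTMLParser):
    """Collect (title, absolute URL) pairs from <h3><a href=...> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.movies = []      # finished (title, url) pairs
        self._in_h3 = False
        self._href = None     # href of the <a> currently open inside an <h3>
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            # urljoin copes with both relative and absolute hrefs.
            self.movies.append((title, urljoin(self.base_url, self._href)))
            self._href = None
        elif tag == "h3":
            self._in_h3 = False


class ThunderLinkParser(HTMLParser):
    """Remember the value of the first <input name="checkbox2">."""

    def __init__(self):
        super().__init__()
        self.link = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name") == "checkbox2" and self.link is None:
            self.link = attrs.get("value")


def parse_movie_list(html, base_url=HOST):
    parser = MovieListParser(base_url)
    parser.feed(html)
    return parser.movies


def parse_thunder_link(html):
    """Return the link string, or None when the input is missing."""
    parser = ThunderLinkParser()
    parser.feed(html)
    return parser.link


# Made-up sample pages that mimic the structure the script expects.
SAMPLE_LIST = (
    '<div class="content">'
    '<h3><a href="/movie/1.html">Movie A</a></h3>'
    '<h3><a href="/movie/2.html">Movie B</a></h3>'
    "</div>"
)
SAMPLE_DETAIL = '<input type="checkbox" name="checkbox2" value="thunder://EXAMPLE" />'

if __name__ == "__main__":
    for title, url in parse_movie_list(SAMPLE_LIST):
        print(title, url)
    print(parse_thunder_link(SAMPLE_DETAIL))
```

Returning `None` for a missing link lets the caller decide how to report the failure, instead of relying on an exception from calling `.get` on a missing element.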