Python Web Scraping Notes (Part 2)

Source: Internet. Published by: php工程师. Editor: 程序博客网. Date: 2024/06/09 07:59
from urllib.request import urlopen  # missing in the original; urlopen is used below
from bs4 import BeautifulSoup

html = urlopen("http://dblp.uni-trier.de/db/conf/www/www2017.html")
bsObj = BeautifulSoup(html, 'lxml')
nameList = bsObj.findAll("div", {"id": "main"})

# for name in nameList:
#     print(name.div.a.find("img")['src'])

# Print the alt text of the first image link inside each matched div
for name in nameList:
    print(name.div.a.find("img")["alt"])

# Print the href of the <a> tag that wraps that image
for name in nameList:
    print(name.div.a.find("img").parent["href"])

Output:
dblp computer science bibliography
http://dblp.uni-trier.de
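
To see why the script prints these two values, the same navigation can be tried offline on a small inline snippet. This is only a sketch: SAMPLE_HTML is hypothetical markup loosely modeled on the page header (the real dblp page may be structured differently), extract_logo_info is a helper name made up for illustration, and the built-in "html.parser" is used in place of lxml so that no extra parser is required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption, not copied from the real dblp page).
SAMPLE_HTML = """
<div id="main">
  <div id="logo">
    <a href="http://dblp.uni-trier.de">
      <img alt="dblp computer science bibliography" src="logo.png">
    </a>
  </div>
</div>
"""


def extract_logo_info(markup):
    """Return (alt, href) pairs for the first image link in each div with id="main"."""
    soup = BeautifulSoup(markup, "html.parser")
    results = []
    for block in soup.find_all("div", {"id": "main"}):
        # .div and .a each step to the first matching descendant tag
        img = block.div.a.find("img")
        # .parent climbs back up to the <a> tag that wraps the image
        results.append((img["alt"], img.parent["href"]))
    return results


for alt, href in extract_logo_info(SAMPLE_HTML):
    print(alt)
    print(href)
```

Chaining .div.a only follows the first matching tag at each step, so on a real page it is worth checking that the first div and the first link are the intended ones.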
