Building a Scraper: Navigating the Tree

Source: Internet  Editor: 程序博客网  Time: 2024/05/20 18:16

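1. Working with children and other descendants
Children and descendants: a child sits exactly one level below its parent tag, while a descendant can sit at any depth below it. The loop below uses .children, so it visits only the direct children of the table.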

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
bsObj = BeautifulSoup(html, "html.parser")  # name the parser explicitly
for child in bsObj.find("table", {"id": "giftList"}).children:
    print(child)  # prints every product row in the giftList table
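To make the difference between children and descendants concrete, here is a minimal sketch that runs without a network connection. The inline HTML is a hypothetical stand-in for the giftList table, not the real page, and it is written without whitespace between tags so that no blank text nodes appear among the children.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical stand-in for the giftList table (not the real page content).
html = (
    '<table id="giftList">'
    '<tr><th>Item</th><th>Cost</th></tr>'
    '<tr><td>Vegetable Basket</td><td>$15.00</td></tr>'
    '</table>'
)
bs = BeautifulSoup(html, "html.parser")
table = bs.find("table", {"id": "giftList"})

# .children yields only the nodes directly below the table: the two rows.
child_names = [child.name for child in table.children]

# .descendants walks the whole subtree: rows, cells, and the text inside them.
# Keep only Tag objects so the text nodes are left out of the list.
descendant_tags = [node.name for node in table.descendants if isinstance(node, Tag)]

print(child_names)      # ['tr', 'tr']
print(descendant_tags)  # ['tr', 'th', 'th', 'tr', 'td', 'td']
```

On a real page the markup usually contains line breaks between tags, so both .children and .descendants will also yield whitespace-only text nodes.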

2. Working with siblings:

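from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
bsObj = BeautifulSoup(html, "html.parser")
# .tr selects the first row (the header); next_siblings yields everything
# after it, so the header row itself is not printed.
for sibling in bsObj.find("table", {"id": "giftList"}).tr.next_siblings:
    print(sibling)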
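The sibling behaviour can be checked offline with a small sketch. The HTML string and the row ids below are assumptions made for illustration; they only imitate the shape of the giftList table.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical stand-in table: one header row followed by two product rows.
html = (
    '<table id="giftList">'
    '<tr><th>Item</th></tr>'
    '<tr id="gift1"><td>Vegetable Basket</td></tr>'
    '<tr id="gift2"><td>Russian Nesting Dolls</td></tr>'
    '</table>'
)
bs = BeautifulSoup(html, "html.parser")
header_row = bs.find("table", {"id": "giftList"}).tr

# next_siblings starts after header_row, so the header itself is skipped.
following_ids = [row["id"] for row in header_row.next_siblings if isinstance(row, Tag)]

# previous_siblings walks backwards from the tag towards the first sibling.
last_row = bs.find("tr", {"id": "gift2"})
previous_ids = [row.get("id") for row in last_row.previous_siblings if isinstance(row, Tag)]

print(following_ids)                # ['gift1', 'gift2']
print(previous_ids)                 # ['gift1', None] (the header row has no id)
print(header_row.previous_sibling)  # None: nothing comes before the first row
```

The isinstance check matters on real pages, where whitespace between rows shows up as text-node siblings that have no attributes.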

3. Working with parents:

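from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
bsObj = BeautifulSoup(html, "html.parser")
# img -> parent <td> -> the <td> just before it (the price cell) -> its text.
# previous_sibling (singular) returns one node; previous_siblings (plural)
# is a generator and has no get_text() method.
print(bsObj.find("img", {"src": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text())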
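The chain of parent and sibling lookups is easier to follow when each step gets its own name. The sketch below does that on a hypothetical single-row table; the intermediate variable names are illustrative and the markup is an assumption, not a copy of the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in row: name cell, price cell, image cell.
html = (
    '<table id="giftList">'
    '<tr><td>Vegetable Basket</td><td>$15.00</td>'
    '<td><img src="../img/gifts/img1.jpg"></td></tr>'
    '</table>'
)
bs = BeautifulSoup(html, "html.parser")

img = bs.find("img", {"src": "../img/gifts/img1.jpg"})
image_cell = img.parent                    # the <td> that wraps the image
price_cell = image_cell.previous_sibling   # the <td> immediately before it
price = price_cell.get_text()

# The plural form is a generator, so it has to be iterated; it yields the
# earlier cells in reverse document order.
cells_before = [cell.get_text() for cell in image_cell.previous_siblings]

print(price)         # $15.00
print(cells_before)  # ['$15.00', 'Vegetable Basket']
```

If the markup has whitespace between the cells, previous_sibling may return a blank text node instead of the price cell, in which case find_previous_sibling("td") is the safer call.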