Python Web Scraping Notes (Part 2)

Source: Internet  Editor: 程序博客网  Time: 2024/05/30 22:50

Fetch a page and print every child node of the table whose id is giftList:

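    #! /usr/bin/env python
    # coding=utf-8
    # Python 2 script (urllib2); on Python 3 the equivalent call is urllib.request.urlopen.
    import urllib2
    from bs4 import BeautifulSoup

    html = urllib2.urlopen("http://www.pythonscraping.com/pages/page3.html")
    bsObj = BeautifulSoup(html)
    # .children yields only the direct children of the table
    for child in bsObj.find("table", {"id": "giftList"}).children:
        print(child)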
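
As a rough illustration of what "child nodes" means here, the sketch below contrasts .children with .descendants. It runs offline against a small inline table: the HTML string and the variable names are assumptions made up for this example, not the real contents of page3.html, and the "html.parser" argument is chosen only so the result does not depend on which parser is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the giftList table on the real page.
HTML = (
    '<table id="giftList">'
    '<tr><th>Item</th><th>Cost</th></tr>'
    '<tr id="gift1"><td>Vegetable Basket</td><td>$15.00</td></tr>'
    '<tr id="gift2"><td>Russian Nesting Dolls</td><td>$10,000.52</td></tr>'
    '</table>'
)

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find("table", {"id": "giftList"})

# .children yields only direct children: here, the three <tr> rows.
child_names = [child.name for child in table.children]

# .descendants walks the whole subtree: rows, cells, and the text inside them.
descendant_count = len(list(table.descendants))

print(child_names)
print(descendant_count)
```

On a real page the rows are usually separated by line breaks, so .children would also yield whitespace text nodes between the tr tags.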
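
To skip the header row and print only the rows that follow it, iterate over the next_siblings of the table's first tr:

    #! /usr/bin/env python
    # coding=utf-8
    import urllib2
    from bs4 import BeautifulSoup

    html = urllib2.urlopen("http://www.pythonscraping.com/pages/page3.html")
    bsObj = BeautifulSoup(html)
    # .tr is the first row (the header); next_siblings yields everything after it
    for sibling in bsObj.find("table", {"id": "giftList"}).tr.next_siblings:
        print(sibling)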
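
One detail worth knowing about next_siblings is that it yields text nodes as well as tags. The sketch below, again using a made-up inline table rather than the real page, filters the siblings down to Tag objects so that only the data rows remain. The HTML and the variable names are assumptions for illustration.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical inline HTML; the line breaks between rows become text nodes.
HTML = """<table id="giftList">
<tr><th>Item</th><th>Cost</th></tr>
<tr id="gift1"><td>Vegetable Basket</td><td>$15.00</td></tr>
<tr id="gift2"><td>Russian Nesting Dolls</td><td>$10,000.52</td></tr>
</table>"""

soup = BeautifulSoup(HTML, "html.parser")
header_row = soup.find("table", {"id": "giftList"}).tr

# Everything after the header row, including the "\n" text nodes.
all_siblings = list(header_row.next_siblings)

# Keep only real tags, which leaves the data rows.
data_row_ids = [s.get("id") for s in all_siblings if isinstance(s, Tag)]

print(len(all_siblings))
print(data_row_ids)
```

The header row itself never appears in the result, because a node is not its own sibling.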
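
To read the price of one gift, find its img tag, go up to the parent td, and take the text of the cell just before it (the price column on this page):

    #! /usr/bin/env python
    # coding=utf-8
    import urllib2
    from bs4 import BeautifulSoup

    html = urllib2.urlopen("http://www.pythonscraping.com/pages/page3.html")
    bsObj = BeautifulSoup(html)
    # img -> parent <td> -> previous sibling <td> -> its text
    print(bsObj.find("img", {"src": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text())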
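
The previous_sibling chain works when the two td cells are directly adjacent in the markup. If there is whitespace between them, previous_sibling returns a text node instead. The sketch below shows one possible variation using find_previous_sibling("td"), which skips text nodes. It is not the approach used in the note above, and the inline HTML and variable names are assumptions for illustration.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical inline HTML; the line breaks between cells become text nodes.
HTML = """<table id="giftList">
<tr id="gift1">
<td>Vegetable Basket</td>
<td>$15.00</td>
<td><img src="../img/gifts/img1.jpg"></td>
</tr>
</table>"""

soup = BeautifulSoup(HTML, "html.parser")
img = soup.find("img", {"src": "../img/gifts/img1.jpg"})
img_cell = img.parent  # the <td> that wraps the image

# With whitespace between cells, .previous_sibling is a text node, not a <td>.
raw_sibling = img_cell.previous_sibling
print(isinstance(raw_sibling, NavigableString))

# find_previous_sibling("td") skips text nodes and returns the nearest <td>.
price_cell = img_cell.find_previous_sibling("td")
price = price_cell.get_text(strip=True)
print(price)
```

Passing strip=True to get_text() also removes any surrounding line breaks inside the cell.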