Python Web Scraping: BeautifulSoup Usage (1)


Scraping the Sina News page


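import requests
from bs4 import BeautifulSoup

# Download the Sina domestic-news index page
res = requests.get('http://news.sina.com.cn/china/')
res.encoding = 'utf-8'  # decode the response body as UTF-8
soup = BeautifulSoup(res.text, 'html.parser')

# Every news entry is an element with the class "news-item"
for news in soup.select('.news-item'):
    # Only print entries that actually contain an <h2> headline
    if len(news.select('h2')) > 0:
        h2 = news.select('h2')[0].text
        print(h2)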
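
The same selection logic can be tried offline, without requesting the live page. Below is a minimal sketch that assumes a made-up HTML fragment. The markup in `SAMPLE_HTML` and the helper name `extract_headlines` are hypothetical and are not taken from Sina's actual page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates a list of news entries.
SAMPLE_HTML = """
<div class="news-item"><h2><a href="/a.shtml">First headline</a></h2></div>
<div class="news-item"></div>
<div class="news-item"><h2><a href="/b.shtml">Second headline</a></h2></div>
"""

def extract_headlines(html):
    """Return the text of the first <h2> in every .news-item that has one."""
    soup = BeautifulSoup(html, 'html.parser')
    headlines = []
    for item in soup.select('.news-item'):
        h2 = item.select('h2')
        if h2:  # skip entries without a headline
            headlines.append(h2[0].text.strip())
    return headlines

print(extract_headlines(SAMPLE_HTML))  # ['First headline', 'Second headline']
```

Returning a list instead of printing inside the loop makes the result easier to reuse, for example to store the headlines or to check them in a test.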




Extracting the news title

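import requests
from bs4 import BeautifulSoup

# Download a single article page
res = requests.get('http://news.sina.com.cn/china/xlxw/2017-12-09/doc-ifyppemf6082547.shtml')
res.encoding = 'utf-8'  # decode the response body as UTF-8
soup = BeautifulSoup(res.text, 'html.parser')

# The headline is the element with id="artibodyTitle"
title = soup.select('#artibodyTitle')[0].text
print(title)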
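
Note that `soup.select(...)[0]` raises an `IndexError` when no element matches, for instance if the page layout changes. A minimal sketch of a more defensive variant follows, using BeautifulSoup's `select_one`, which returns `None` when nothing matches. The helper name `extract_title` and the inline HTML strings are hypothetical and serve only as an illustration.

```python
from bs4 import BeautifulSoup

def extract_title(html, selector='#artibodyTitle'):
    """Return the stripped text of the first match, or None if nothing matches."""
    soup = BeautifulSoup(html, 'html.parser')
    node = soup.select_one(selector)  # None instead of IndexError on a miss
    return node.text.strip() if node is not None else None

# Hypothetical inputs: one with the expected id, one without it.
print(extract_title('<h1 id="artibodyTitle"> Sample title </h1>'))  # Sample title
print(extract_title('<h1 id="other">No match</h1>'))                # None
```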


