Installing and Using BeautifulSoup on Python 3, and Fixing the "Python 2 version of Beautiful Soup under Python 3" Error

Source: Internet   Editor: 程序博客网   Date: 2024/05/16 14:34

Installing and using BeautifulSoup on Python 3



BeautifulSoup does not always work out of the box on Python 3. After installing it, import it with: from bs4 import BeautifulSoup


1. Download BeautifulSoup from:

  https://www.crummy.com/software/BeautifulSoup/bs4/download/ 


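Unpack the archive and, from the unpacked directory, install it with: python setup.py install
If Python is installed on the C: drive, it is best to open the command prompt as administrator.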


2. Start the interpreter by typing python, then enter: from bs4 import BeautifulSoup

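This may raise an exception. On Windows, a freshly installed BeautifulSoup4 can report:

'You are trying to run the Python 2 version of Beautiful Soup under Python 3. This will not work. You need to convert the code, either by installing it (`python setup.py install`) or by running 2to3 (`2to3 -w bs4`).'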



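To fix it, copy the bs4 folder from the unpacked beautifulsoup4 directory (beautifulsoup4-4.6.0\bs4) and the 2to3.py script (D:\<Python install directory>\Tools\scripts\) into the Lib folder of the Python installation (D:\<Python install directory>\Lib).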

Then, from that Lib folder, run:

python 2to3.py -w bs4
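
After the conversion it can help to confirm which copy of bs4 Python will actually pick up, since an unconverted source folder that appears earlier on the import path would still trigger the same error. The snippet below is a minimal sketch using only the standard library; the helper name locate_package is made up for illustration and is not part of BeautifulSoup.

```python
import importlib.util
import sys


def locate_package(name):
    """Return the file a top-level package would be imported from, or None.

    locate_package is a hypothetical helper name. For a top-level name,
    find_spec only locates the package; it does not import it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("bs4 would be imported from:", locate_package("bs4"))
```

If the printed path points at the unpacked download directory rather than the Lib folder, Python is still finding the unconverted copy.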


Usage example:

#!/usr/bin/env python
# coding: utf-8
# Scrape price information for a product on yixun.com, given its product ID.
# By Tsing
# Python 3 (adapted from a Python 2.7.9 script)
import urllib.request as request

from bs4 import BeautifulSoup


def get_yixun(item_id):
    price_origin, price_sale = '0', '0'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'
    }
    # url = 'http://item.yixun.com/item-' + item_id + '.html'
    url = 'http://baidu.com'  # test URL used in place of the product page
    req = request.Request(url=url, headers=headers)
    html = request.urlopen(req).read().decode('utf-8')
    # print(html)
    # The 'lxml' parser requires the separate lxml package to be installed.
    soup = BeautifulSoup(html, 'lxml')
    print('soup')
    print(soup.prettify())
    print('class')
    print(soup.div)
    # title = request.unicode(soup.title.text.strip().strip(u'【价格_报价_图片_行情】-易迅网').replace(u'】','')).encode('utf-8').decode('utf-8')
    # print(title)
    try:
        # Original (list) price
        soup_origin = soup.find('dl', {'class': 'xbase_item xprice xprice_origin'})
        price_origin = soup_origin.find('span', {'class': 'mod_price xprice_val'}).contents[1].text
        print('Original price: ' + price_origin)
    except (AttributeError, IndexError, TypeError):
        # The expected elements are not present on this page.
        pass
    try:
        # Current (sale) price
        soup_sale = soup.find('dl', {'class': 'xbase_item xprice'})
        price_sale = soup_sale.find('span', {'class': 'mod_price xprice_val'}).contents[1]
        print('Current price: ' + price_sale)
    except (AttributeError, IndexError, TypeError):
        pass
    print(url)
    return None


if __name__ == '__main__':
    get_yixun('2189654')
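
Because the script above fetches a test URL, the two price lookups will normally find nothing. The sketch below runs the same find calls against an inline HTML string so the lookup logic can be tried offline. The class names come from the script, but the element layout and the prices in SAMPLE_HTML are invented for illustration, and extract_price is a hypothetical helper. It uses the built-in html.parser, so lxml is not needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the class names are taken from the script above;
# the nesting and the prices are assumptions made for this example.
SAMPLE_HTML = """
<dl class="xbase_item xprice xprice_origin">
  <dd><span class="mod_price xprice_val"><i>¥</i><b>199.00</b></span></dd>
</dl>
<dl class="xbase_item xprice">
  <dd><span class="mod_price xprice_val"><i>¥</i><b>149.00</b></span></dd>
</dl>
"""


def extract_price(soup, dl_class):
    """Return the price text inside the <dl> with this exact class string, or None."""
    dl = soup.find("dl", {"class": dl_class})
    if dl is None:
        return None
    span = dl.find("span", {"class": "mod_price xprice_val"})
    if span is None or len(span.contents) < 2:
        return None
    # contents[1] is the second child of the span (the <b> tag in SAMPLE_HTML).
    return span.contents[1].get_text()


if __name__ == "__main__":
    soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
    print("Original price:", extract_price(soup, "xbase_item xprice xprice_origin"))
    print("Current price:", extract_price(soup, "xbase_item xprice"))
```

Returning None for a missing element avoids the broad try/except and makes it clearer which lookup failed.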








