Python crawler: Case 2: Ctrip hotel price information
Source: Internet  Published by: 世界域名  Editor: 程序博客网  Time: 2024/04/29 07:23
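This example is not especially clever. A friend told me his company wanted him to scrape hotel prices from Ctrip (携程). When I looked at the site, it turned out to be awkward to crawl: the city is a required search field and the hotel name is optional, and the URL you are redirected to carries a number after the city. I don't know the rule that maps each city to its number, so I could only crawl one fixed city, or else simulate a browser, which seemed like a lot of work. The hotel page itself also has a lot on it that is painful to pick apart. I told him this would be troublesome and that the analysis would take a long time. He then explained that his company enters the URL of each hotel's price detail page into a database by hand, and the price data is fetched directly from that one page.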
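# coding=utf-8
# Targets Python 2 and an older Selenium release that still ships the PhantomJS driver.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")

from selenium import webdriver

# Stands in for a whole batch of hotel detail URLs
urls = ['http://hotels.ctrip.com/hotel/848702.html#ctm_ref=hod_sr_lst_dl_n_2_1']


class Xc():
    def pc(self):
        results = []
        for url in urls:
            driver = webdriver.PhantomJS()
            try:
                driver.get(url)
                # The first line of the expanded room block is the room type
                fangx_1 = driver.find_element_by_class_name('room_unfold').text.split('\n')[0]
                jiage_1 = driver.find_element_by_class_name('base_price').text
                results.append(fangx_1 + '|' + jiage_1)  # room type and its price
            finally:
                driver.quit()  # close the browser even if a lookup fails
        return results


s = Xc()
for line in s.pc():
    print line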
Result:
单人房(无窗)|¥237
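The code above is only a simple example, and every room type and price would have to be parsed out one by one, which is too tedious. Later I noticed that near the bottom of the page source there is a block of JSON containing exactly the room types, prices and so on, so I changed the code: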
# coding=utf-8
# Targets Python 2 and an older Selenium release that still ships the PhantomJS driver.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")

from selenium import webdriver

urls = ['http://hotels.ctrip.com/hotel/848702.html#ctm_ref=hod_sr_lst_dl_n_2_1']


class Xc():
    def pc(self):
        results = []
        for url in urls:
            driver = webdriver.PhantomJS()
            try:
                driver.get(url)
                # The element with this id carries the room list in its value attribute
                element = driver.find_element_by_xpath('//*[@id="htl_detail_htl_hotel"]')
                results.append(element.get_attribute('value'))
            finally:
                driver.quit()  # close the browser even if the lookup fails
        return results


s = Xc()
for value in s.pc():
    print value
Result (line breaks added between records for readability):
pageid=102003;ht=848702;checkin=2016-05-09;checkout=2016-05-10;rmlist=[
{"rm":"30665921","shadowid":"0","rpfq":"0.0","rpfh":"219","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"30265080","shadowid":"0","rpfq":"0.0","rpfh":"263","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"24125027","shadowid":"0","rpfq":"0.0","rpfh":"294","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"8684722","shadowid":"0","rpfq":"0.0","rpfh":"294","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},
{"rm":"30265081","shadowid":"0","rpfq":"0.0","rpfh":"219","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"8684723","shadowid":"0","rpfq":"0.0","rpfh":"294","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},
{"rm":"30265075","shadowid":"0","rpfq":"0.0","rpfh":"237","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"单人床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"24125024","shadowid":"0","rpfq":"0.0","rpfh":"265","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"单人床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"2890470","shadowid":"0","rpfq":"0.0","rpfh":"265","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"单人床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},
{"rm":"30265074","shadowid":"0","rpfq":"0.0","rpfh":"254","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"24125041","shadowid":"0","rpfq":"0.0","rpfh":"284","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"2890480","shadowid":"0","rpfq":"0.0","rpfh":"284","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},
{"rm":"30265072","shadowid":"0","rpfq":"0.0","rpfh":"280","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"24125016","shadowid":"0","rpfq":"0.0","rpfh":"313","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"2525661","shadowid":"0","rpfq":"0.0","rpfh":"313","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},
{"rm":"30265079","shadowid":"0","rpfq":"0.0","rpfh":"305","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"2525665","shadowid":"0","rpfq":"0.0","rpfh":"341","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},
{"rm":"30265077","shadowid":"0","rpfq":"0.0","rpfh":"305","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"24125021","shadowid":"0","rpfq":"0.0","rpfh":"341","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},
{"rm":"8684720","shadowid":"0","rpfq":"0.0","rpfh":"341","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"}]
rpfh is the price
bedtype is the room type
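To turn that string into usable room/price pairs, it has to be split into its key=value header and the JSON array after rmlist=. The following is only a rough sketch of how that could look, written in Python 3 rather than the Python 2 of the listings above. It assumes the value always has the shape seen in the output (semicolon-separated fields with rmlist last). The function name parse_hotel_value and the shortened sample string are made up for illustration, and prefixing the price with ¥ is my own choice to mimic the first result.

```python
import json


def parse_hotel_value(value):
    """Split 'key=value;...;rmlist=[...]' into header fields and room records.

    Assumption: rmlist is the last field and holds a JSON array.
    """
    header_part, _, rooms_part = value.partition(";rmlist=")
    header = dict(field.split("=", 1) for field in header_part.split(";") if field)
    rooms = json.loads(rooms_part) if rooms_part else []
    return header, rooms


# Shortened sample built from two records of the output above (most keys omitted)
sample = ('pageid=102003;ht=848702;checkin=2016-05-09;checkout=2016-05-10;'
          'rmlist=[{"rm":"30665921","rpfh":"219","bedtype":"大床","policy":"免费取消"},'
          '{"rm":"30265075","rpfh":"237","bedtype":"单人床","policy":"不可取消"}]')

header, rooms = parse_hotel_value(sample)
print(header["ht"], header["checkin"], header["checkout"])
for room in rooms:
    # rpfh is the price, bedtype is the room type
    print(room["bedtype"] + "|¥" + room["rpfh"])
```

Every value in the records is a string, including the price, so anything numeric (sorting by price, for instance) would need an explicit int(room["rpfh"]) first.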