PhantomJS Example 1
Source: Internet  Editor: 程序博客网  Time: 2024/05/16 13:50
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# created by fhqplzj on 2017/07/15 3:38 PM
# Note: this script targets Python 2 (urlparse module, print statement).
import os
import re
from urlparse import urljoin

from gensim.utils import to_utf8
from scrapy import Selector
from selenium import webdriver

out_path = '/Users/fhqplzj/WebstormProjects/untitled/haha.html'


def check():
    # Parse the page saved by load() and extract the menu link targets.
    with open(out_path) as fin:
        data = fin.read()
    selector = Selector(text=data)
    urls = selector.xpath('//*[@id="J-head-menu-alert"]/ul/li/div[2]/a/@href').extract()
    with open(os.path.join('/Users/fhqplzj/data/travel', 'domestic'), 'w') as fout:
        for url in urls:
            # Resolve against the base URL, append 'jingdian/', and switch https to http.
            abs_url = urljoin('https://lvyou.baidu.com/scene/', url) + 'jingdian/\n'
            abs_url = re.sub(r'^https', 'http', abs_url)
            fout.write(abs_url)
            print abs_url,
    print len(urls)


def load():
    # Render the page in PhantomJS, click the second menu item, and save the HTML.
    driver = webdriver.PhantomJS()
    driver.get('https://lvyou.baidu.com/scene/')
    button = driver.find_element_by_xpath('//*[@id="J-head-menu"]/li[2]')
    button.click()
    with open(out_path, 'w') as fout:
        fout.write(to_utf8(driver.page_source))
    driver.close()
    driver.quit()


if __name__ == '__main__':
    # Set flag to 1 to fetch the page with load(); leave it at 0 to parse the saved copy.
    flag = 0
    if flag:
        load()
    else:
        check()
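
The URL-building step in check() can also be tried on its own, without PhantomJS or a saved page. The following is only a minimal sketch for Python 3, assuming urllib.parse.urljoin in place of the Python 2 urlparse module; the helper name build_scene_url and the sample hrefs are hypothetical and do not come from the original script.

```python
import re
from urllib.parse import urljoin  # Python 3 location of urljoin

# Base URL taken from the script above.
BASE_URL = 'https://lvyou.baidu.com/scene/'


def build_scene_url(href, base=BASE_URL):
    """Hypothetical helper mirroring the loop body of check().

    Resolves href against base, appends 'jingdian/', and rewrites a
    leading 'https' to 'http'. The trailing newline that check() adds
    for file output is left to the caller.
    """
    abs_url = urljoin(base, href) + 'jingdian/'
    return re.sub(r'^https', 'http', abs_url)


if __name__ == '__main__':
    # Sample hrefs are made up for illustration only.
    for href in ['/beijing/', 'beijing/']:
        print(build_scene_url(href))
```

Note that urljoin treats an href with a leading slash as relative to the site root, while an href without one is resolved below /scene/, so the shape of the hrefs on the real page determines which form the output takes.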