Scrapy crawler (2): CSS selectors
Source: Internet | Published by: 电子印章软件 | Editor: 程序博客网 | Date: 2024/06/07 05:45
- CSS selectors do the same job as the XPath selectors from the previous post; use whichever you prefer.
    # css: (this fragment runs inside the spider's parse callback, where `response` is available)
    import re

    front_image_url = response.meta.get("front_image_url", "")  # article cover image
    title2 = response.css(".entry-header h1::text").extract()[0]
    # Remove the "·" separator first, then strip, so no trailing space is left behind
    create_data2 = response.css("p.entry-meta-hide-on-mobile::text").extract()[0].replace('·', '').strip()
    praise_nums2 = int(response.css(".href-style h10::text").extract()[0])

    # Bookmark count: take the first number in the button text, default to 0
    favor_nums2 = response.css(".bookmark-btn::text").extract()[0]
    match_re3 = re.match(r".*?(\d+).*", favor_nums2)
    if match_re3:
        favor_nums2 = int(match_re3.group(1))
    else:
        favor_nums2 = 0

    # Comment count: same pattern as the bookmark count
    comment_nums2 = response.css("span.hide-on-480::text").extract()[0]
    match_re4 = re.match(r".*?(\d+).*", comment_nums2)
    if match_re4:
        comment_nums2 = int(match_re4.group(1))
    else:
        comment_nums2 = 0

    content2 = response.css('div.entry').extract()[0]

    # Tags: drop the "N 评论" (comments) link; the filtered list must be assigned back
    tag_list2 = response.css("p.entry-meta-hide-on-mobile a::text").extract()
    tag_list2 = [element for element in tag_list2 if not element.strip().endswith("评论")]
    tags2 = ",".join(tag_list2)
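The bookmark and comment counts above repeat the same "find the first number, otherwise 0" logic, so it can be pulled into a small helper. The following is only a sketch using the standard library: the names `first_int` and `join_tags` are hypothetical and not part of Scrapy, and `re.search` is used in place of the original `re.match` pattern, so a leading newline in the text does not prevent a match. Unlike the original, `join_tags` also strips whitespace from each tag before joining.

```python
import re


def first_int(text, default=0):
    """Return the first run of digits in text as an int, or default if there is none."""
    # `text or ""` guards against None, e.g. when a selector matched nothing
    match = re.search(r"\d+", text or "")
    return int(match.group()) if match else default


def join_tags(tag_texts, skip_suffix="评论"):
    """Join tag strings with commas, dropping entries that end with skip_suffix."""
    # Hypothetical helper: strips each entry, then filters out the "N 评论" link text
    kept = [t.strip() for t in tag_texts if not t.strip().endswith(skip_suffix)]
    return ",".join(kept)
```

Inside the parse callback this would presumably be used as `favor_nums2 = first_int(response.css(".bookmark-btn::text").extract()[0])` and `tags2 = join_tags(tag_list2)`.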