Perl: scraping HTML with the findvalues method
Source: Internet   Editor: 程序博客网   Date: 2024/05/05 09:17
The input file, t500.html, is the pagination bar of the page being scraped:

node2:/root/pachong/yylc# cat t500.html
<p id="p-page">
<input type='submit' style='display:none' name='turnPage' id='turnPage'>
<input type='hidden' id='pageNum' name='pageNum' value='1'/>
<span onmouseover="this.className='cur-s-page'" onmouseout="this.className=''"><</span>
<span class='cur-s-page'>1</span>
<span onclick="document.getElementById('pageNum').value=2;document.getElementById('turnPage').click();" onmouseover="this.className='cur-s-page'" onmouseout="this.className=''">2</span>
<span onclick="document.getElementById('pageNum').value=3;document.getElementById('turnPage').click();" onmouseover="this.className='cur-s-page'" onmouseout="this.className=''">3</span>...
<span onclick="document.getElementById('pageNum').value=1749;document.getElementById('turnPage').click();" onmouseover="this.className='cur-s-page'" onmouseout="this.className=''">1749</span>
<span onclick="document.getElementById('pageNum').value=2;document.getElementById('turnPage').click();" onmouseover="this.className='cur-s-page'" onmouseout="this.className=''">></span>
</p>
</form>

t400.pl (not listed here) dumps the parsed tree and prints the same findvalues result. The parser adds the implicit html, head and body elements, and the "..." between the page links shows up as a bare text node under <p>, not as a <span>:

node2:/root/pachong/yylc# perl t400.pl
<html> @0 (IMPLICIT)
  <head> @0.0 (IMPLICIT)
  <body> @0.1 (IMPLICIT)
    <p id="p-page"> @0.1.0
      <input id="turnPage" name="turnPage" style="display:none" type="submit" /> @0.1.0.0
      <input id="pageNum" name="pageNum" type="hidden" value="1" /> @0.1.0.1
      <span onmouseout="this.className=''" onmouseover="this.className='cur-s-page'"> @0.1.0.2
        "<"
      <span class="cur-s-page"> @0.1.0.3
        "1"
      <span onclick="document.getElementById('pageNum').value=2;document.getElementById('turnPage').click();" onmouseout="this.className=''" onmouseover="this.className='cur-s-page'"> @0.1.0.4
        "2"
      <span onclick="document.getElementById('pageNum').value=3;document.getElementById('turnPage').click();" onmouseout="this.className=''" onmouseover="this.className='cur-s-page'"> @0.1.0.5
        "3"
      "...???"
      <span onclick="document.getElementById('pageNum').value=1749;document.getElementById('turnPage').click();" onmouseout="this.className=''" onmouseover="this.className='cur-s-page'"> @0.1.0.7
        "1749"
      <span onclick="document.getElementById('pageNum').value=2;document.getElementById('turnPage').click();" onmouseout="this.className=''" onmouseover="this.className='cur-s-page'"> @0.1.0.8
        ">"
@pageString is < 1 2 3 1749 >

t500.pl is the minimal script that extracts the text of each <span> in the pagination bar:

node2:/root/pachong/yylc# cat t500.pl
use LWP::UserAgent;
use POSIX;
use HTML::TreeBuilder::XPath;
use Encode;
use HTML::TreeBuilder;
use Data::Dumper;
use DBI;

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_file("t500.html");
# Collect the text value of every <span> directly under <p id="p-page">
my @pageString = $tree->findvalues('/html/body//p[@id="p-page"]/span');
print "\@pageString is @pageString\n";

node2:/root/pachong/yylc# perl t500.pl
@pageString is < 1 2 3 1749 >

The result contains one value per matching <span>. The "..." text is missing because it is a child of <p> itself, so the XPath expression does not select it.

The HTML::TreeBuilder::XPath documentation describes the method as follows:

findvalues ($path)
Returns the values of the matching nodes as a list. This is mostly the same as findnodes_as_strings, except that the elements of the list are objects (with overloaded stringification) instead of plain strings.
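As a follow-up, the list returned by findvalues can be filtered to find the last page number. This is only a minimal sketch under a few assumptions: the inline HTML is a simplified stand-in for t500.html (attributes removed, arrows written as entities), and the helper names page_labels and last_page are hypothetical and do not come from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max);
use HTML::TreeBuilder::XPath;

# Simplified stand-in for t500.html (assumption: same structure, no attributes).
my $html = <<'HTML';
<p id="p-page">
<span>&lt;</span><span class="cur-s-page">1</span><span>2</span><span>3</span>...
<span>1749</span><span>&gt;</span>
</p>
HTML

# Hypothetical helper: text of every <span> directly under <p id="p-page">.
sub page_labels {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_content($content);
    # Interpolate each value so the caller always gets plain strings,
    # whatever kind of scalar findvalues hands back.
    my @labels = map { "$_" } $tree->findvalues('//p[@id="p-page"]/span');
    $tree->delete;    # release the parse tree
    return @labels;
}

# Hypothetical helper: largest purely numeric label, or undef if there is none.
sub last_page {
    my ($content) = @_;
    my @numbers = grep { /^\d+$/ } page_labels($content);
    return @numbers ? max(@numbers) : undef;
}

my @labels = page_labels($html);
print "labels: @labels\n";

my $last = last_page($html);
print "last page: ", (defined $last ? $last : 'unknown'), "\n";
```

The arrows are dropped by the numeric filter, so for the sample markup the last page should come out as 1749, matching the transcript above. Filtering on the text is a design choice for this sketch; reading the page number out of each onclick attribute would be an alternative.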