A Great HTML Parser: Html Agility Pack
Source: Internet  Editor: 程序博客网  Time: 2024/06/08 18:13
What exactly is the Html Agility Pack (HAP)?
This is an agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT (you don't actually HAVE to understand XPath or XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant of "real world" malformed HTML. The object model is very similar to the one System.Xml offers, but for HTML documents (or streams).
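As a rough illustration of the tolerant parser and the XPath support described above, the sketch below loads a fragment and collects the `href` value of every link. It assumes the `HtmlAgilityPack` assembly is referenced; the `LinkScanner` class and its method name are made up for this example and are not part of the library.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper for illustration only.
public static class LinkScanner
{
    public static List<string> GetLinkTargets(string html)
    {
        var doc = new HtmlDocument();
        // LoadHtml accepts markup with unclosed tags or unquoted attributes.
        doc.LoadHtml(html);

        var targets = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return targets;
        }

        foreach (HtmlNode anchor in anchors)
        {
            targets.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return targets;
    }
}
```

A call such as `LinkScanner.GetLinkTargets("<body><a href=page1.html>One</a><div><a href='page2.html'>Two</div>")` exercises the parser on markup that an XML parser would reject.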
Html Agility Pack now supports LINQ to Objects (via a LINQ to XML-like interface). Check out the new beta to play with this feature.
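The LINQ-style interface might be used along these lines. This is a minimal sketch that assumes a build exposing `Descendants` on `HtmlNode`; `ImageScanner` and `GetImageSources` are hypothetical names chosen for the example.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration only.
public static class ImageScanner
{
    public static List<string> GetImageSources(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Descendants yields HtmlNode objects, so standard LINQ operators apply.
        return doc.DocumentNode
                  .Descendants("img")
                  .Select(img => img.GetAttributeValue("src", string.Empty))
                  .Where(src => src.Length > 0)
                  .ToList();
    }
}
```

Compared with an XPath query, this form keeps the filtering logic in C# and skips `img` elements that have no `src` attribute.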
Sample applications:
- Page fixing or generation. You can fix a page the way you want, modify the DOM, add nodes, copy nodes, well... you name it.
- Web scanners. You can easily get to img/src or a/href values with a handful of XPath queries.
- Web scrapers. You can easily scrape any existing web page into an RSS feed, for example, with just an XSLT file serving as the binding. An example of this is provided.
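The page-fixing use case in the list above could look roughly like the following sketch, which strips `script` elements and marks every link as `nofollow`. The `PageFixer` class and the choice of clean-up rules are assumptions made for illustration, not something the project prescribes.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration only.
public static class PageFixer
{
    public static string Clean(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Copy to a list first so that removing nodes does not disturb enumeration.
        foreach (HtmlNode script in doc.DocumentNode.Descendants("script").ToList())
        {
            script.Remove();
        }

        // Modify the DOM in place: add or overwrite the rel attribute on every link.
        foreach (HtmlNode anchor in doc.DocumentNode.Descendants("a").ToList())
        {
            anchor.SetAttributeValue("rel", "nofollow");
        }

        // Serialize the modified DOM back to HTML text.
        return doc.DocumentNode.OuterHtml;
    }
}
```

Because the DOM is read/write, the same pattern extends to adding or copying nodes before the document is serialized again.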
There is no dependency on anything other than .NET's XPath implementation. There is no dependency on Internet Explorer's MSHTML DLL, W3C's HTML Tidy, ActiveX / COM objects, or anything like that. There is also no adherence to XHTML or XML, although you can actually produce XML using the tool. The version posted here on CodePlex is for the .NET Framework 2.0. If you need the old version, please go to the old page or drop me a note.
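Producing XML from an HTML input can be sketched as below, using the `OptionOutputAsXml` switch on `HtmlDocument`. The `XmlExporter` class is a hypothetical wrapper, and the exact shape of the emitted XML (declaration, encoding name, handling of multiple root nodes) may differ between library versions.

```csharp
using System.IO;
using HtmlAgilityPack;

// Hypothetical helper for illustration only.
public static class XmlExporter
{
    public static string ToXml(string html)
    {
        var doc = new HtmlDocument();
        // Ask the document to serialize itself as XML instead of HTML.
        doc.OptionOutputAsXml = true;
        doc.LoadHtml(html);

        using (var writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

For instance, `XmlExporter.ToXml("<div><img src=a.png><br>text</div>")` returns a string that other XML tooling can consume.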
Examples - Code Examples
The home page of the open-source project is http://htmlagilitypack.codeplex.com/