HtmlAgilityPack: A Handy Tool for Parsing HTML

Source: Internet  Editor: 程序博客网  Date: 2024/05/21 19:32
Use HtmlAgilityPack (download the DLL yourself from the web) to fetch the page and extract the data:
C# code
    // Requires: using System.IO; using System.Net; using System.Text; using HtmlAgilityPack;

    // Download the page and decode it as GB2312.
    HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(@"http://www.sooker.com/xuexiao/");
    string s;
    using (HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse())
    using (Stream stream = httpWebResponse.GetResponseStream())
    using (StreamReader reader = new StreamReader(stream, Encoding.GetEncoding("gb2312")))
    {
        s = reader.ReadToEnd();
    }

    // Parse the downloaded markup.
    HtmlDocument htmlDoc = new HtmlDocument();
    htmlDoc.LoadHtml(s);

    // SelectNodes returns null when nothing matches, so check before iterating.
    HtmlNodeCollection imgs = htmlDoc.DocumentNode.SelectNodes(@"//ul[@class='curriculumUl']/li//div[@class='pic']/a/img");
    if (imgs != null)
    {
        foreach (HtmlNode img in imgs)
            Response.Write(img.GetAttributeValue("src", "") + "<br/>");
    }

    HtmlNodeCollection anchors = htmlDoc.DocumentNode.SelectNodes(@"//ul[@class='curriculumUl']/li//a[@class='school-name']");
    if (anchors != null)
    {
        foreach (HtmlNode anchor in anchors)
        {
            Response.Write(anchor.GetAttributeValue("href", "") + "<br/>");
            Response.Write(anchor.InnerHtml + "<br/>");
        }
    }
    Response.End();
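The XPath extraction step can also be tried without any network access. The following is a minimal sketch, assuming the HtmlAgilityPack assembly is referenced; the `SchoolExtractor` class, its method names, and the sample markup are hypothetical and only mirror the class names used in the XPath queries above. It uses `InnerText` rather than `InnerHtml` to get the link text without nested tags.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class SchoolExtractor
{
    // Hypothetical sample markup; the real page structure may differ.
    public const string SampleHtml = @"
<ul class='curriculumUl'>
  <li>
    <div class='pic'><a href='/school/1'><img src='/img/1.jpg'></a></div>
    <a class='school-name' href='/school/1'>School One</a>
  </li>
  <li>
    <div class='pic'><a href='/school/2'><img></a></div>
    <a class='school-name' href='/school/2'>School Two</a>
  </li>
</ul>";

    // Returns the src of every matching <img>, skipping images without one.
    public static List<string> ImageSources(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var result = new List<string>();
        HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes(
            "//ul[@class='curriculumUl']/li//div[@class='pic']/a/img");
        if (imgs == null) return result; // SelectNodes yields null on no match
        foreach (HtmlNode img in imgs)
        {
            // GetAttributeValue falls back to the default instead of throwing.
            string src = img.GetAttributeValue("src", "");
            if (src.Length > 0) result.Add(src);
        }
        return result;
    }

    // Returns (href, link text) pairs for every school-name anchor.
    public static List<KeyValuePair<string, string>> SchoolLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var result = new List<KeyValuePair<string, string>>();
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes(
            "//ul[@class='curriculumUl']/li//a[@class='school-name']");
        if (anchors == null) return result;
        foreach (HtmlNode a in anchors)
        {
            result.Add(new KeyValuePair<string, string>(
                a.GetAttributeValue("href", ""), a.InnerText.Trim()));
        }
        return result;
    }

    public static void Main()
    {
        foreach (string src in ImageSources(SampleHtml))
            Console.WriteLine(src);
        foreach (var link in SchoolLinks(SampleHtml))
            Console.WriteLine(link.Key + " " + link.Value);
    }
}
```

Returning plain lists instead of writing to `Response` keeps the parsing logic separate from the ASP.NET page, so it can be reused or checked in a console program.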
http://zhoufoxcn.blog.51cto.com/792419/595344