Scraping web pages with Mathematica

Source: Internet · Editor: 程序博客网 · Date: 2024/06/07 13:41

A simple notebook (.nb) program for fetching web pages and extracting each article's link and title:

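list = List[];
url = "http://blog.csdn.net/gl486546/article/category/6389727/";

(* Fetch page n of the category listing and append each {link, title} pair to list *)
catchPageElem[n_] := Block[{xml, len, d, i, temp},
  (* Import the page as a symbolic XML tree *)
  xml = Import[url <> ToString[n], "XMLObject"];
  (* Find every <span class="link_title"> whose first child is an <a> element *)
  d = Cases[xml,
    XMLElement["span", {"class" -> "link_title"},
      {XMLElement["a", {"shape" -> "rect", "href" -> href_}, {title_}], __}] :>
      {"http://blog.csdn.net" <> href, StringTrim[title]},
    {0, Infinity}];
  len = Length[d];
  Do[AppendTo[list, d[[i]]], {i, 1, len}]
  ]

(* Walk pages 1 through 13, then display the collected results *)
Do[catchPageElem[i], {i, 1, 13}];
list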

Output:
[Figure: screenshot of the run result]
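
To see how the `Cases` pattern in the program behaves without touching the network, it can be tried on a small hand-built tree. This is only a sketch: the `sample` tree and its `href` value are made up for illustration and are not taken from the real page, whose markup may differ.

```mathematica
(* Hypothetical stand-in for the tree returned by Import[..., "XMLObject"] *)
sample = XMLElement["div", {}, {
    XMLElement["span", {"class" -> "link_title"}, {
      XMLElement["a", {"shape" -> "rect", "href" -> "/example/article/1"},
        {"  First post  "}],
      "\n"}]}];

(* Same pattern as in catchPageElem: capture href and title at any depth *)
pairs = Cases[sample,
   XMLElement["span", {"class" -> "link_title"},
     {XMLElement["a", {"shape" -> "rect", "href" -> href_}, {title_}], __}] :>
     {"http://blog.csdn.net" <> href, StringTrim[title]},
   {0, Infinity}]
(* → {{"http://blog.csdn.net/example/article/1", "First post"}} *)
```

Note that the trailing `__` requires at least one more child after the `a` element (here the `"\n"` string), and the attribute list must match exactly, so a `span` or `a` with extra or reordered attributes would be skipped.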