Google Web Search

Source: Internet  Editor: 程序博客网  Time: 2024/06/08 03:08


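The project I am working on with my professor this semester has to start with a Google search. After looking around, I found that the Google Web Search API meets all of my requirements ( https://developers.google.com/web-search/docs/?csw=1#fonje_snippets_java ), but it is deprecated. It still works, but the results are a bit odd: the estimated result count and the number of results actually returned are completely different...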


So I started looking into the Custom Search JSON/Atom API. I haven't quite figured it out yet...
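As a note to self, here is a rough sketch of how a request URL for the Custom Search JSON API could be put together. It is not working project code: the class name `CustomSearchUrl`, the method `build`, and the placeholder values `YOUR_API_KEY` and `YOUR_ENGINE_ID` are made up for illustration. Only the endpoint and the `key`, `cx` and `q` parameters come from the API documentation. It assumes Java 10 or later for the `URLEncoder.encode(String, Charset)` overload.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any Google client library.
class CustomSearchUrl {
    // Endpoint of the Custom Search JSON API.
    static final String ENDPOINT = "https://www.googleapis.com/customsearch/v1";

    // Builds the request URL: key = API key, cx = search engine ID, q = query text.
    // Each value is URL-encoded, so spaces become '+' and quotes become %22.
    static String build(String apiKey, String engineId, String query) {
        return ENDPOINT
                + "?key=" + URLEncoder.encode(apiKey, StandardCharsets.UTF_8)
                + "&cx=" + URLEncoder.encode(engineId, StandardCharsets.UTF_8)
                + "&q=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder credentials; replace with real values before sending a request.
        System.out.println(build("YOUR_API_KEY", "YOUR_ENGINE_ID", "George Bush \"new haven\""));
    }
}
```

As far as I can tell from the documentation, the JSON response reports the estimated total under `searchInformation.totalResults` and the hits under `items`, each with `title`, `link` and `snippet`, so the parsing should end up looking much like the Web Search version.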

For now, this is just a record of the current code, which still uses the deprecated Web Search API.

package searchTeat1;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.*;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.json.JSONArray;
import org.json.JSONObject;

public class test1 {
    public static void main(String[] args) throws Exception {
        ////////////////////////////////////////////////////////////
        // Earlier attempt, kept for reference: scrape the normal results page with Jsoup.
        //String google = "http://www.google.com/search?q=";
        //String search = "Barack Obama  Honolulu, Hawaii";
        //String charset = "UTF-8";
        //String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!
        //
        //Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select("li.g>h3>a");
        //
        //for (Element link : links)
        //{
        //    String title = link.text();
        //    String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
        //    url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), "UTF-8");
        //
        //    if (!url.startsWith("http")) {
        //        continue; // Ads/news/etc.
        //    }
        //
        //    System.out.println("Title: " + title);
        //    System.out.println("URL: " + url);
        //    System.out.println("=============================\n");
        //}
        ////////////////////////////////////////////////////////////

        // The request also includes the userip parameter which provides the end
        // user's IP address. Doing so will help distinguish this legitimate
        // server-side traffic from traffic which doesn't come from an end-user.
        //strQuery = "q=George+Bush+%22new+haven%22&userip=10.109.3.60";

        //function1
        String entity = "George+Bush";
        String attrValue = "new+haven";
        //everyday 900

        //function2
        // %%22 in the format string produces %22, the URL-encoded double quote,
        // so the attribute value is searched as an exact phrase.
        String strQuery = String.format("q=%s+%%22%s%%22&userip=10.109.3.60", entity, attrValue);

        URL url = new URL(
                "https://ajax.googleapis.com/ajax/services/search/web?v=1.0&" + strQuery);
        URLConnection connection = url.openConnection();
        connection.addRequestProperty("Referer", "www.google.com");

        // Read the whole JSON response into one string; the reader is closed automatically.
        String line;
        StringBuilder builder = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            while ((line = reader.readLine()) != null) {
                builder.append(line);
            }
        }

        JSONObject json = new JSONObject(builder.toString());

        // now have some fun with the results...
        System.out.println("Total results = "
                + json.getJSONObject("responseData")
                        .getJSONObject("cursor")
                        .getString("estimatedResultCount"));

        JSONArray ja = json.getJSONObject("responseData").getJSONArray("results");
        System.out.println(" Results:");
        for (int i = 0; i < ja.length(); i++) {
            JSONObject j = ja.getJSONObject(i);
            System.out.println(j.getString("titleNoFormatting"));
            System.out.println(j.getString("url"));
            System.out.println(j.get("content"));
            System.out.println("\n");
        }
    }
}

In a couple of days the code will probably look completely different again...
