Google Web Search
Source: Internet  Editor: 程序博客网  Time: 2024/06/08 03:08
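The project I'm doing with my professor this semester needs to start with a Google search. I looked around, and the Google Web Search API happens to meet all of my requirements ( https://developers.google.com/web-search/docs/?csw=1#fonje_snippets_java), but it has been deprecated. It can still be used, but the results are a bit odd: the estimated result count and the number of results actually returned are completely different...
So I started looking into the Custom Search JSON/Atom API. I don't fully understand it yet...
This post is just a record of where things stand.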
package searchTeat1;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.*;

// The Jsoup imports are only needed by the commented-out scraping attempt below.
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import org.json.JSONArray;
import org.json.JSONObject;

public class test1 {
    public static void main(String[] args) throws Exception {
        // ---- Earlier attempt: scrape the regular results page with Jsoup ----
        // String google = "http://www.google.com/search?q=";
        // String search = "Barack Obama Honolulu, Hawaii";
        // String charset = "UTF-8";
        // String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!
        //
        // Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select("li.g>h3>a");
        //
        // for (Element link : links) {
        //     String title = link.text();
        //     String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
        //     url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), "UTF-8");
        //
        //     if (!url.startsWith("http")) {
        //         continue; // Ads/news/etc.
        //     }
        //
        //     System.out.println("Title: " + title);
        //     System.out.println("URL: " + url);
        //     System.out.println("=============================\n");
        // }
        // ----------------------------------------------------------------------

        // The request also includes the userip parameter which provides the end
        // user's IP address. Doing so will help distinguish this legitimate
        // server-side traffic from traffic which doesn't come from an end-user.
        // strQuery = "q=George+Bush+\"new+haven\"&userip=10.109.3.60";

        // function1
        String entity = "George+Bush";
        String attrValue = "new+haven";
        // everyday 900

        // function2
        // The inner quotes must be escaped; they make attrValue an exact-phrase match.
        String strQuery = String.format("q=%s+\"%s\"&userip=10.109.3.60", entity, attrValue);

        URL url = new URL("https://ajax.googleapis.com/ajax/services/search/web?v=1.0&" + strQuery);
        URLConnection connection = url.openConnection();
        connection.addRequestProperty("Referer", "www.google.com");

        // Read the whole JSON response into a single string.
        String line;
        StringBuilder builder = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
            while ((line = reader.readLine()) != null) {
                builder.append(line);
            }
        }

        JSONObject json = new JSONObject(builder.toString());

        // now have some fun with the results...
        System.out.println("Total results = "
                + json.getJSONObject("responseData")
                      .getJSONObject("cursor")
                      .getString("estimatedResultCount"));

        JSONArray ja = json.getJSONObject("responseData").getJSONArray("results");
        System.out.println(" Results:");
        for (int i = 0; i < ja.length(); i++) {
            JSONObject j = ja.getJSONObject(i);
            System.out.println(j.getString("titleNoFormatting"));
            System.out.println(j.getString("url"));
            System.out.println(j.get("content"));
            System.out.println("\n");
        }
    }
}
In a couple of days this code will probably be unrecognizable again...
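Since the plan is to move to the Custom Search JSON API, here is a minimal sketch of how the same query might be built for it. It only constructs the request URL and sends nothing. The class name CustomSearchUrl, the build helper, and the placeholder values YOUR_API_KEY and YOUR_ENGINE_ID are all made up for illustration; the key, cx, q and start parameters are the ones I understand the API to take, so check them against the official docs before relying on this.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class CustomSearchUrl {
    // Endpoint of the Custom Search JSON API.
    static final String ENDPOINT = "https://www.googleapis.com/customsearch/v1";

    // Builds a request URL. apiKey and engineId are placeholders you would
    // obtain from your own Google account; start is the 1-based result index.
    static String build(String apiKey, String engineId, String query, int start) {
        return ENDPOINT
                + "?key=" + encode(apiKey)
                + "&cx=" + encode(engineId)
                + "&q=" + encode(query)
                + "&start=" + start;
    }

    // Form-encodes a parameter value: spaces become '+', quotes become %22.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same entity + exact-phrase attribute as in the code above.
        System.out.println(build("YOUR_API_KEY", "YOUR_ENGINE_ID",
                "George Bush \"new haven\"", 1));
    }
}
```

Letting URLEncoder do the escaping avoids hand-writing + and quote characters into the query string, as the code above does. As far as I can tell, the response reports the total under searchInformation.totalResults and the hits in an items array with title, link and snippet fields, which would take the place of estimatedResultCount, titleNoFormatting, url and content.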