Introduction to HttpClient

author:madding.lip

date:2010.06.24

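Overview

A wrapper around the HTTP protocol

  • A similar implementation in Java

    • java.net.HttpURLConnection

Similarities to and differences from a browser

  • Similarities

    • Sends and receives HTTP messages

  • Differences

    • Does not cache content

    • Does not run embedded JavaScript code

    • Does not try to guess the content type

    • Does not reformat request or redirect URIs

    • Does not handle anything else unrelated to the HTTP protocol

  • Comparison diagram: [figure omitted]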


Features

  • http://jakarta.apache.org/commons/httpclient/features.html

Components

  • HTTP Message

  • HTTP Request

  • HTTP Response

  • Method

  • Header Fields

  • Entity

  • Session

  • Cookies

Class diagram

  • [figure omitted]


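Uses

  • Simulating login, posting, and replying on a website

  • Simple automated testing

  • Uploading files over HTTP

  • Batch downloading of network resources

  • Small crawlers

  • Consuming JSON and other HTTP-based interfaces from Java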

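Programming

  • Basic setup

    • Simple GET request

      • // HttpClient 3.x API (org.apache.commons.httpclient)
        HttpClient client = new HttpClient();
        HttpMethod method = new GetMethod("http://www.cascadetg.com/");
        try {
            client.executeMethod(method);
            System.out.println(method.getStatusCode());
            System.out.println(method.getResponseBodyAsString());
        } finally {
            // Always return the connection to the connection manager
            method.releaseConnection();
        }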

    • Building the request URL

      • // The snippets in the rest of this subsection use the HttpClient 4.x API (org.apache.http).
        // 1. Pass the complete URL as a string.
        HttpGet httpget = new HttpGet(
            "http://www.google.com/search?hl=en&q=httpclient&btnG=Google+Search&aq=f&oq=");

        // 2. Assemble the URI from its components.
        URI uri = URIUtils.createURI("http", "www.google.com", -1, "/search",
            "q=httpclient&btnG=Google+Search&aq=f&oq=", null);
        HttpGet httpget2 = new HttpGet(uri);
        System.out.println(httpget2.getURI());

        // 3. Build and encode the query string from name/value pairs.
        List<NameValuePair> qparams = new ArrayList<NameValuePair>();
        qparams.add(new BasicNameValuePair("q", "httpclient"));
        qparams.add(new BasicNameValuePair("btnG", "Google Search"));
        qparams.add(new BasicNameValuePair("aq", "f"));
        qparams.add(new BasicNameValuePair("oq", null));
        URI uri2 = URIUtils.createURI("http", "www.google.com", -1, "/search",
            URLEncodedUtils.format(qparams, "UTF-8"), null);
        HttpGet httpget3 = new HttpGet(uri2);
        System.out.println(httpget3.getURI());

      • http://www.google.com/search?q=httpclient&btnG=Google+Search&aq=f&oq=
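
The encoding step that URLEncodedUtils.format performs can be approximated with the JDK alone. The following is a minimal sketch, not HttpClient's implementation; the class QueryStringSketch and its methods enc and encode are hypothetical names introduced only for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HttpClient.
class QueryStringSketch {

    /** Form-encodes one name or value; null becomes the empty string. */
    static String enc(String s) {
        if (s == null) {
            return "";
        }
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    /** Joins the pairs as name=value separated by '&', in insertion order. */
    static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "httpclient");
        params.put("btnG", "Google Search");
        params.put("aq", "f");
        params.put("oq", null);
        System.out.println("http://www.google.com/search?" + encode(params));
    }
}
```

A LinkedHashMap is used so that the parameters keep the order in which they were added, matching the URL shown above.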

    • Response

      • HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, HttpStatus.SC_OK, "OK");
        System.out.println(response.getProtocolVersion());
        System.out.println(response.getStatusLine().getStatusCode());
        System.out.println(response.getStatusLine().getReasonPhrase());
        System.out.println(response.getStatusLine().toString());

      • HTTP/1.1
        200
        OK
        HTTP/1.1 200 OK

    • Setting message headers

      • HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, HttpStatus.SC_OK, "OK");
        response.addHeader("Set-Cookie", "c1=a; path=/; domain=localhost");
        response.addHeader("Set-Cookie", "c2=b; path=\"/\", c3=c; domain=\"localhost\"");
        Header h1 = response.getFirstHeader("Set-Cookie");
        System.out.println(h1);
        Header h2 = response.getLastHeader("Set-Cookie");
        System.out.println(h2);
        Header[] hs = response.getHeaders("Set-Cookie");
        System.out.println(hs.length);

      • Set-Cookie: c1=a; path=/; domain=localhost
        Set-Cookie: c2=b; path="/", c3=c; domain="localhost"
        2

      • HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, HttpStatus.SC_OK, "OK");
        response.addHeader("Set-Cookie", "c1=a; path=/; domain=localhost");
        response.addHeader("Set-Cookie", "c2=b; path=\"/\", c3=c; domain=\"localhost\"");
        // Iterate over the individual elements of every Set-Cookie header
        HeaderElementIterator it = new BasicHeaderElementIterator(response.headerIterator("Set-Cookie"));
        while (it.hasNext()) {
            HeaderElement elem = it.nextElement();
            System.out.println(elem.getName() + " = " + elem.getValue());
            NameValuePair[] params = elem.getParameters();
            for (int i = 0; i < params.length; i++) {
                System.out.println(" " + params[i]);
            }
        }

      • c1 = a
        path=/
        domain=localhost
        c2 = b
        path=/
        c3 = c
        domain=localhost

  • Examples

    • Reading a web page

      • 1. Reading web page content (HTTP/HTTPS)

        The simplest HTTP client, demonstrating how to access a page with GET or POST.

        package http.demo;

        import java.io.IOException;
        import org.apache.commons.httpclient.*;
        import org.apache.commons.httpclient.methods.*;

        public class SimpleClient {
            public static void main(String[] args) throws IOException {
                HttpClient client = new HttpClient();
                // Set the proxy server address and port
                // client.getHostConfiguration().setProxy("proxy_host_addr", proxy_port);
                // Use GET; if the server requires HTTPS, just change http to https in the URL below
                HttpMethod method = new GetMethod("http://java.sun.com");
                // Use POST
                // HttpMethod method = new PostMethod("http://java.sun.com");
                client.executeMethod(method);
                // Print the status line returned by the server
                System.out.println(method.getStatusLine());
                // Print the response body
                System.out.println(method.getResponseBodyAsString());
                // Release the connection
                method.releaseConnection();
            }
        }

    • Submitting content to a web page

      • 2. Submitting parameters to a page with GET or POST

        package http.demo;

        import java.io.IOException;
        import org.apache.commons.httpclient.*;
        import org.apache.commons.httpclient.methods.*;

        /**
         * Parameter submission demo.
         * Connects to a page that looks up where a mobile phone number is registered,
         * and queries the province and city of the number prefix 1330227.
         */
        public class SimpleHttpClient {
            public static void main(String[] args) throws IOException {
                HttpClient client = new HttpClient();
                client.getHostConfiguration().setHost("www.imobile.com.cn", 80, "http");
                HttpMethod method = getPostMethod(); // submit the data with POST
                client.executeMethod(method);
                // Print the status line returned by the server
                System.out.println(method.getStatusLine());
                // Re-decode the result page: recover the raw bytes via ISO-8859-1,
                // then decode them with the platform default charset
                String response = new String(method.getResponseBodyAsString().getBytes("8859_1"));
                // Print the response body
                System.out.println(response);
                method.releaseConnection();
            }

            /**
             * Submits the data with GET.
             */
            private static HttpMethod getGetMethod() {
                return new GetMethod("/simcard.php?simcard=1330227");
            }

            /**
             * Submits the data with POST.
             */
            private static HttpMethod getPostMethod() {
                PostMethod post = new PostMethod("/simcard.php");
                NameValuePair simcard = new NameValuePair("simcard", "1330227");
                post.setRequestBody(new NameValuePair[] { simcard });
                return post;
            }
        }
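
The getBytes("8859_1") step in the example above relies on ISO-8859-1 mapping every byte to exactly one character, so text decoded with the wrong charset can be turned back into the original bytes and decoded again. The sketch below illustrates only that round trip with the JDK. It assumes the server actually sent UTF-8, and CharsetRepairSketch, misdecode, and repair are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing the ISO-8859-1 round trip.
class CharsetRepairSketch {

    /** Simulates a client that decodes UTF-8 bytes as ISO-8859-1 by mistake. */
    static String misdecode(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    /** Recovers the raw bytes from the garbled text and decodes them as UTF-8. */
    static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u7701\u4efd"; // two Chinese characters, 3 UTF-8 bytes each
        String garbled = misdecode(original);
        System.out.println(garbled.length());                 // one char per byte
        System.out.println(repair(garbled).equals(original)); // round trip is lossless
    }
}
```

The repair only works if the charset passed to the second decoding step matches what the server really used; naming it explicitly is safer than depending on the platform default.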

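    • Handling page redirects

      • 3. Handling page redirects

        Details:

        Status code  HttpServletResponse constant  Meaning
        301          SC_MOVED_PERMANENTLY          The page has moved permanently to a new address
        302          SC_MOVED_TEMPORARILY          The page has moved temporarily to a new address
        303          SC_SEE_OTHER                  The requested resource must be accessed through another URL
        307          SC_TEMPORARY_REDIRECT         Same as SC_MOVED_TEMPORARILY

        The following snippet shows how to handle a page redirect.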

      • client.executeMethod(post);
        System.out.println(post.getStatusLine().toString());
        post.releaseConnection();
        // Check whether the response is a redirect
        int statuscode = post.getStatusCode();
        if ((statuscode == HttpStatus.SC_MOVED_TEMPORARILY) ||
            (statuscode == HttpStatus.SC_MOVED_PERMANENTLY) ||
            (statuscode == HttpStatus.SC_SEE_OTHER) ||
            (statuscode == HttpStatus.SC_TEMPORARY_REDIRECT)) {
            // Read the new URL
            Header header = post.getResponseHeader("location");
            if (header != null) {
                String newuri = header.getValue();
                if ((newuri == null) || (newuri.equals(""))) {
                    newuri = "/";
                }
                // Follow the redirect with a GET request
                GetMethod redirect = new GetMethod(newuri);
                client.executeMethod(redirect);
                System.out.println("Redirect: " + redirect.getStatusLine().toString());
                redirect.releaseConnection();
            } else {
                System.out.println("Invalid redirect");
            }
        }
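
A Location header may hold a relative reference rather than a full URL, in which case it has to be resolved against the URI of the request that produced it. The sketch below shows one way to do that with java.net.URI, under the assumption that the request URI is known. RedirectSketch, isRedirect, and resolveLocation are hypothetical names and not part of httpclient.

```java
import java.net.URI;

// Hypothetical helper for working out the redirect target.
class RedirectSketch {

    /** True for the four redirect status codes listed in the table above. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307;
    }

    /** Resolves a Location value against the URI of the original request. */
    static URI resolveLocation(URI requestUri, String location) {
        if (location == null || location.isEmpty()) {
            // Fall back to the site root, as the snippet above does.
            return requestUri.resolve("/");
        }
        return requestUri.resolve(location);
    }

    public static void main(String[] args) {
        URI request = URI.create("http://localhost:8080/app/login.jsp");
        System.out.println(resolveLocation(request, "main.jsp"));
        System.out.println(resolveLocation(request, "/index.jsp"));
        System.out.println(resolveLocation(request, "http://example.com/a"));
    }
}
```
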
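    • Simulating a login with a username and password

      • 4. Simulating a login with a username and password

        This is probably the most common problem in HTTP client programming. Many sites show their content only to registered users, so you must log in with a valid username and password before you can view the pages you want. HTTP is stateless: each request stands on its own, and the server does not by itself remember anything from one request to the next. Keeping a user logged in therefore relies on cookies. Taking JSP/Servlet as an example, when a browser requests a JSP or Servlet page, the application server returns a cookie named jsessionid (the name varies between application servers) whose value is a long unique string. That string is the session ID of the current visit to the site. The browser sends this cookie with every later request to the site, and the application server reads the session ID from it to look up the matching session data.

        For sites that require login, the user's details are usually stored in the server-side session after a successful login. When another page is requested, the application server reads the session ID from the cookie sent by the browser, loads the session, and checks whether the user's details are present. If they are, access is allowed; otherwise the user is redirected to the login page and asked for an account name and password. This is the usual way JSP-based sites handle user login.

        An HTTP client that wants to access a protected page must imitate what the browser does: first request the login page and read the cookie value; then request the login page again, adding every parameter the login form requires; finally request the page you actually want. Every request after the first must carry the cookie so that the server can tell whether the request has been authenticated.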

        package http.demo;

        import org.apache.commons.httpclient.*;
        import org.apache.commons.httpclient.cookie.*;
        import org.apache.commons.httpclient.methods.*;

        /**
         * Demonstrates logging in through a form.
         */
        public class FormLoginDemo {
            static final String LOGON_SITE = "localhost";
            static final int    LOGON_PORT = 8080;

            public static void main(String[] args) throws Exception {
                HttpClient client = new HttpClient();
                client.getHostConfiguration().setHost(LOGON_SITE, LOGON_PORT);
                // Simulate the login page: login.jsp -> main.jsp
                PostMethod post = new PostMethod("/main.jsp");
                NameValuePair name = new NameValuePair("name", "ld");
                NameValuePair pass = new NameValuePair("password", "ld");
                post.setRequestBody(new NameValuePair[] { name, pass });
                int status = client.executeMethod(post);
                System.out.println(post.getResponseBodyAsString());
                post.releaseConnection();
                // Inspect the cookies
                CookieSpec cookiespec = CookiePolicy.getDefaultSpec();
                Cookie[] cookies = cookiespec.match(LOGON_SITE, LOGON_PORT, "/", false,
                        client.getState().getCookies());
                if (cookies.length == 0) {
                    System.out.println("None");
                } else {
                    for (int i = 0; i < cookies.length; i++) {
                        System.out.println(cookies[i].toString());
                    }
                }
                // Request the page we actually want: main2.jsp
                GetMethod get = new GetMethod("/main2.jsp");
                client.executeMethod(get);
                System.out.println(get.getResponseBodyAsString());
                get.releaseConnection();
            }
        }
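
In the example above httpclient stores the cookies in its state and sends them back automatically. To make that mechanism visible, the sketch below uses the JDK's java.net.HttpCookie to turn Set-Cookie values received from a server into the Cookie header a client would send next. It is a simplification that ignores path, domain, and expiry matching; SessionCookieSketch, toCookieHeader, and the sample cookie values are hypothetical.

```java
import java.net.HttpCookie;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating the cookie round trip.
class SessionCookieSketch {

    /** Builds one Cookie request header value from Set-Cookie response values. */
    static String toCookieHeader(List<String> setCookieValues) {
        StringBuilder sb = new StringBuilder();
        for (String value : setCookieValues) {
            for (HttpCookie cookie : HttpCookie.parse(value)) {
                if (sb.length() > 0) {
                    sb.append("; ");
                }
                // Only the name and value go back to the server, not the attributes.
                sb.append(cookie.getName()).append('=').append(cookie.getValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample values invented for illustration.
        List<String> received = Arrays.asList(
                "JSESSIONID=ABC123; Path=/",
                "theme=dark; Path=/");
        System.out.println("Cookie: " + toCookieHeader(received));
    }
}
```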

    • Submitting parameters in XML format

      • 5. Submitting XML parameters

        Submitting XML parameters is simple; it is only a matter of the Content-Type used for the request. The example below reads XML from a file and posts it to the server, which can be used to test web services.

        import java.io.File;
        import java.io.FileInputStream;
        import org.apache.commons.httpclient.HttpClient;
        import org.apache.commons.httpclient.methods.EntityEnclosingMethod;
        import org.apache.commons.httpclient.methods.PostMethod;

        /**
         * Demonstrates posting XML data.
         */
        public class PostXMLClient {
            public static void main(String[] args) throws Exception {
                File input = new File("test.xml");
                PostMethod post = new PostMethod("http://localhost:8080/httpclient/xml.jsp");
                // Read the request body directly from the file
                post.setRequestBody(new FileInputStream(input));
                if (input.length() < Integer.MAX_VALUE)
                    post.setRequestContentLength(input.length());
                else
                    post.setRequestContentLength(EntityEnclosingMethod.CONTENT_LENGTH_CHUNKED);
                // Specify the content type of the request
                post.setRequestHeader("Content-type", "text/xml; charset=GBK");
                HttpClient httpclient = new HttpClient();
                int result = httpclient.executeMethod(post);
                System.out.println("Response status code: " + result);
                System.out.println("Response body: ");
                System.out.println(post.getResponseBodyAsString());
                post.releaseConnection();
            }
        }

    • Uploading files

      • 6. Uploading a file over HTTP

        httpclient uses a dedicated HttpMethod subclass for file uploads, MultipartPostMethod. It encapsulates the details of uploading; all we have to do is tell it the full path of the file. The following snippet shows how to use it.

      • MultipartPostMethod filePost = new MultipartPostMethod(targetURL);
        // Pass a java.io.File so the file content is uploaded;
        // a plain String would be sent as a text field instead
        filePost.addParameter("fileName", new File(targetFilePath));
        HttpClient client = new HttpClient();
        // The file to upload may be large, so set the connection timeout here
        client.getHttpConnectionManager().getParams().setConnectionTimeout(5000);
        int status = client.executeMethod(filePost);

        In the code above, targetFilePath is the path of the file to upload.

    • Accessing pages protected by authentication

      • 7. Accessing pages protected by authentication

        We often run into pages that make the browser pop up a dialog asking for a username and password before access is granted. This kind of user authentication is different from the form-based authentication described earlier.

        This is HTTP's own authentication mechanism. httpclient supports three schemes: Basic, Digest, and NTLM. Basic is the simplest and most widely supported but also the least secure; Digest was added in HTTP/1.1; NTLM was defined by Microsoft rather than as a general standard, and its latest version is more secure than Digest.

      • import org.apache.commons.httpclient.HttpClient;
        import org.apache.commons.httpclient.UsernamePasswordCredentials;
        import org.apache.commons.httpclient.methods.GetMethod;

        public class BasicAuthenticationExample {

            public BasicAuthenticationExample() {
            }

            public static void main(String[] args) throws Exception {
                HttpClient client = new HttpClient();
                client.getState().setCredentials("www.verisign.com",
                                                 "realm",
                                                 new UsernamePasswordCredentials("username", "password"));
                GetMethod get = new GetMethod("https://www.verisign.com/products/index.html");
                get.setDoAuthentication(true);
                int status = client.executeMethod(get);
                System.out.println(status + "" + get.getResponseBodyAsString());
                get.releaseConnection();
            }
        }
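
To show why Basic authentication is described as the least secure scheme, the sketch below builds the Authorization header by hand using only the JDK: the credentials are merely Base64-encoded, not encrypted, so anyone who sees the header can decode them. This is an illustration of the scheme rather than httpclient's own code; BasicAuthSketch and its methods are hypothetical names, and UTF-8 is an assumed charset (it makes no difference for ASCII credentials).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper illustrating the Basic scheme.
class BasicAuthSketch {

    /** Builds the Authorization header value: "Basic " + base64(user ":" password). */
    static String basicAuthHeader(String user, String password) {
        byte[] pair = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(pair);
    }

    /** Reverses the encoding, which requires no secret at all. */
    static String decodeCredentials(String headerValue) {
        String encoded = headerValue.substring("Basic ".length());
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String header = basicAuthHeader("username", "password");
        System.out.println("Authorization: " + header);
        System.out.println(decodeCredentials(header));
    }
}
```

Because the header is trivially reversible, Basic authentication is normally combined with HTTPS, as in the example URL above.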

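    • Using httpclient in multi-threaded mode

      • 8. Using httpclient in multi-threaded mode

        When several threads use httpclient at the same time, for example to download multiple files from one site concurrently, only one thread may use a given HttpConnection at any moment. To avoid conflicts in a multi-threaded environment, httpclient provides a multi-threaded connection manager class, MultiThreadedHttpConnectionManager. Using it is simple: just pass an instance to the HttpClient constructor, as follows:

        MultiThreadedHttpConnectionManager connectionManager = new MultiThreadedHttpConnectionManager();
        HttpClient client = new HttpClient(connectionManager);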

References

  • RFC 1945 Hypertext Transfer Protocol -- HTTP/1.0

  • RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1

  • RFC 2617 HTTP Authentication: Basic and Digest Access Authentication

  • RFC 2109 HTTP State Management Mechanism (Cookies)

  • RFC 2965 HTTP State Management Mechanism (Cookies v2)

