HttpClient Fundamentals

Source: Internet  Editor: 程序博客网  Date: 2024/05/18 03:00

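1.1    Request execution
The most essential function of HttpClient is to execute HTTP methods. Executing an HTTP method involves one or more HTTP request / HTTP response exchanges, which HttpClient usually handles internally. The user provides a request object to execute; HttpClient sends the request to the target server and returns the corresponding response object, or throws an exception if execution fails. The main entry point of the HttpClient API is therefore the HttpClient interface, which defines the contract described above.
Here is an example of request execution in its simplest form: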
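import java.io.InputStream;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;

HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet("http://localhost/");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null) {
    InputStream instream = entity.getContent();
    int l;
    byte[] tmp = new byte[2048];
    // Read the stream to the end; the bytes are simply discarded here
    while ((l = instream.read(tmp)) != -1) {
    }
}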
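1.1.1       HTTP requests
Every HTTP request has a request line consisting of a method name, a request URI and an HTTP protocol version.
HttpClient supports all HTTP methods defined in the HTTP/1.1 specification: GET, HEAD, POST, PUT, DELETE, TRACE and OPTIONS. There is a dedicated class for each method: HttpGet, HttpHead, HttpPost, HttpPut, HttpDelete, HttpTrace and HttpOptions. All of these classes implement the HttpUriRequest interface, so any of them can be passed to execute. The request URI is a Uniform Resource Identifier that identifies the resource to which the request applies. An HTTP request URI consists of a protocol scheme, a host name, an optional port, a resource path, an optional query and an optional fragment.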
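As a rough illustration of these URI components, the sketch below uses the JDK's java.net.URI rather than any HttpClient class. The class name UriParts and the sample URI (including port 8080 and the #top fragment) are made up for this example.

```java
import java.net.URI;

final class UriParts {
    /** Lists the components of an absolute HTTP URI, one per line. */
    static String describe(String text) {
        URI uri = URI.create(text);
        return "scheme=" + uri.getScheme()
                + "\nhost=" + uri.getHost()
                + "\nport=" + uri.getPort()          // -1 when no port is given
                + "\npath=" + uri.getPath()
                + "\nquery=" + uri.getQuery()        // null when absent
                + "\nfragment=" + uri.getFragment(); // null when absent
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.google.com:8080/search?q=httpclient#top"));
    }
}
```

Note that java.net.URI reports a missing port as -1, the same convention the URIUtils.createURI calls later in this section use for "no explicit port".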

HttpGet httpget = new HttpGet(
   "http://www.google.com/search?hl=en&q=httpclient&btnG=Google+Search&aq=f&oq=");
HttpClient provides a number of utility methods that simplify creating and modifying request URIs.
A URI can be assembled programmatically:
URI uri = URIUtils.createURI("http", "www.google.com", -1, "/search",
    "q=httpclient&btnG=Google+Search&aq=f&oq=", null);
HttpGet httpget = new HttpGet(uri);
System.out.println(httpget.getURI());
stdout >
http://www.google.com/search?q=httpclient&btnG=Google+Search&aq=f&oq=
The query string can also be generated from a list of parameters:
List<NameValuePair> qparams = new ArrayList<NameValuePair>();
qparams.add(new BasicNameValuePair("q", "httpclient"));
qparams.add(new BasicNameValuePair("btnG", "Google Search"));
qparams.add(new BasicNameValuePair("aq", "f"));
qparams.add(new BasicNameValuePair("oq", null));
URI uri = URIUtils.createURI("http", "www.google.com", -1, "/search",
    URLEncodedUtils.format(qparams, "UTF-8"), null);
HttpGet httpget = new HttpGet(uri);
System.out.println(httpget.getURI());
stdout >
http://www.google.com/search?q=httpclient&btnG=Google+Search&aq=f&oq=
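For comparison, a similar query string can be assembled with JDK classes alone. This is only a sketch under that assumption: QueryBuilder and its methods are hypothetical names, not part of HttpClient, and java.net.URLEncoder applies HTML form encoding (a space becomes +), which matches the output shown above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryBuilder {
    /** Form-encodes a single token as UTF-8. */
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Joins name/value pairs with '&'; a null value yields "name=". */
    static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=');
            if (e.getValue() != null) {
                sb.append(enc(e.getValue()));
            }
        }
        return sb.toString();
    }

    /** Builds a URI by concatenation, because the query is already encoded. */
    static URI build(String scheme, String host, String path, Map<String, String> params) {
        return URI.create(scheme + "://" + host + path + "?" + encode(params));
    }

    public static void main(String[] args) {
        Map<String, String> qparams = new LinkedHashMap<>(); // keeps insertion order
        qparams.put("q", "httpclient");
        qparams.put("btnG", "Google Search");
        qparams.put("aq", "f");
        qparams.put("oq", null);
        System.out.println(build("http", "www.google.com", "/search", qparams));
    }
}
```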
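1.1.2       HTTP responses
An HTTP response is the message the server sends back to the client after receiving and interpreting a request message. The first line of that message consists of the protocol version, followed by a numeric status code and its associated reason phrase.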
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
    HttpStatus.SC_OK, "OK");

System.out.println(response.getProtocolVersion());
System.out.println(response.getStatusLine().getStatusCode());
System.out.println(response.getStatusLine().getReasonPhrase());
System.out.println(response.getStatusLine().toString());
stdout >
HTTP/1.1
200
OK
HTTP/1.1 200 OK
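To make the structure of the status line concrete, here is a small sketch that splits one into its three parts by hand. ParsedStatusLine is a hypothetical helper written for this illustration; HttpClient does this parsing internally and exposes the result through getStatusLine().

```java
final class ParsedStatusLine {
    final String protocolVersion;
    final int statusCode;
    final String reasonPhrase;

    private ParsedStatusLine(String protocolVersion, int statusCode, String reasonPhrase) {
        this.protocolVersion = protocolVersion;
        this.statusCode = statusCode;
        this.reasonPhrase = reasonPhrase;
    }

    /** Splits "VERSION CODE REASON"; the reason phrase may contain spaces. */
    static ParsedStatusLine parse(String line) {
        String[] parts = line.trim().split(" ", 3); // at most three parts
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed status line: " + line);
        }
        int code = Integer.parseInt(parts[1]);
        String reason = parts.length == 3 ? parts[2] : "";
        return new ParsedStatusLine(parts[0], code, reason);
    }

    public static void main(String[] args) {
        ParsedStatusLine s = parse("HTTP/1.1 404 Not Found");
        System.out.println(s.protocolVersion);
        System.out.println(s.statusCode);
        System.out.println(s.reasonPhrase);
    }
}
```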
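1.1.3       Working with headers
An HTTP message can contain a number of headers describing properties of the message, such as the content length, the content type and so on. HttpClient provides methods to retrieve, add, remove and enumerate headers.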
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
    HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie", "c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie", "c2=b; path=\"/\", c3=c; domain=\"localhost\"");
Header h1 = response.getFirstHeader("Set-Cookie");
System.out.println(h1);
Header h2 = response.getLastHeader("Set-Cookie");
System.out.println(h2);
Header[] hs = response.getHeaders("Set-Cookie");
System.out.println(hs.length);
stdout >
Set-Cookie: c1=a; path=/; domain=localhost
Set-Cookie: c2=b; path="/", c3=c; domain="localhost"
2
The most efficient way to obtain all headers of a given type is to use the HeaderIterator interface.
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
    HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie",
    "c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
    "c2=b; path=\"/\", c3=c; domain=\"localhost\"");
HeaderIterator it = response.headerIterator("Set-Cookie");
while (it.hasNext()) {
    System.out.println(it.next());
}
stdout >
Set-Cookie: c1=a; path=/; domain=localhost
Set-Cookie: c2=b; path="/", c3=c; domain="localhost"
HttpClient also provides convenience methods for parsing HTTP messages into individual header elements:
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
    HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie",
    "c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
    "c2=b; path=\"/\", c3=c; domain=\"localhost\"");

HeaderElementIterator it = new BasicHeaderElementIterator(
    response.headerIterator("Set-Cookie"));

while (it.hasNext()) {
    HeaderElement elem = it.nextElement();
    System.out.println(elem.getName() + " = " + elem.getValue());
    NameValuePair[] params = elem.getParameters();
    for (int i = 0; i < params.length; i++) {
        // Print each parameter, not the array reference
        System.out.println(" " + params[i]);
    }
}
stdout >
c1 = a
path=/
domain=localhost
c2 = b
path=/
c3 = c
domain=localhost
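1.1.4 HTTP entity
HTTP messages can carry a content entity associated with the request or the response. Entities are optional, so they appear in some requests and in some responses. Requests that carry an entity are called entity enclosing requests. The HTTP specification defines two entity enclosing methods: POST and PUT. Responses are usually expected to enclose a content entity. The exceptions are responses to the HEAD method and 204 No Content, 304 Not Modified and 205 Reset Content responses.
HttpClient distinguishes three kinds of entities, depending on where their content originates:
streamed: the content is received from a stream or generated on the fly. In particular, this category includes entities received from HTTP responses. Streamed entities are generally not repeatable.
self-contained: the content is held in memory or obtained by means that are independent of a connection or another entity. Self-contained entities are generally repeatable. This type of entity is mostly used for entity enclosing HTTP requests.
wrapping: the content is obtained from another entity.
This distinction matters for connection management when streaming content out of an HTTP response. For request entities that are created by the application and only sent using HttpClient, the difference between streamed and self-contained is of little importance. In that case it is suggested to treat non-repeatable entities as streamed and repeatable ones as self-contained.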
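1.1.4.1          Repeatable entities
An entity can be repeatable, meaning that its content can be read more than once. This is only possible with self-contained entities (such as ByteArrayEntity or StringEntity).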
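1.1.4.2          Using HTTP entities
Since an entity can represent both binary and character content, it supports character encodings (for the latter, i.e. character content).
The entity is created when executing a request with enclosed content, or when the request succeeds and the response body is used to send the result back to the client.
To read the content of an entity, one can either retrieve the input stream via the HttpEntity#getContent() method, which returns a java.io.InputStream, or supply an output stream to the HttpEntity#writeTo(OutputStream) method, which returns once all content has been written to the given stream.
When the entity has been received with an incoming message, the HttpEntity#getContentType() and HttpEntity#getContentLength() methods can be used to read common metadata such as the Content-Type and Content-Length headers (if they are available). The Content-Type header can carry a character encoding for text MIME types such as text/plain or text/html; that charset can be read with EntityUtils.getContentCharSet(), whereas HttpEntity#getContentEncoding() returns the Content-Encoding header (for example gzip) rather than the charset. If the headers are not available, a length of -1 is returned and the content type is null. If the Content-Type header is available, a Header object is returned.
When creating an entity for an outgoing message, this metadata has to be supplied by the creator of the entity.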
StringEntity myEntity = new StringEntity("important message",
"UTF-8");

System.out.println(myEntity.getContentType());
System.out.println(myEntity.getContentLength());
System.out.println(EntityUtils.getContentCharSet(myEntity));
System.out.println(EntityUtils.toString(myEntity));
System.out.println(EntityUtils.toByteArray(myEntity).length);
stdout >
Content-Type: text/plain; charset=UTF-8
17
UTF-8
important message
17
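1.1.5 Ensuring release of resources
When finished with a response entity, it is important to make sure that all entity content has been fully consumed, so that the connection can be safely returned to the connection pool and reused by the connection manager for subsequent requests. The easiest way to do so is to call the HttpEntity#consumeContent() method, which consumes any content still available on the stream. HttpClient automatically releases the underlying connection back to the connection manager as soon as it detects that the end of the content stream has been reached. The HttpEntity#consumeContent() method is safe to call more than once.
There are situations, however, when only a small portion of the response content needs to be retrieved and the performance penalty for consuming the remaining content to make the connection reusable is too high. In that case one can simply terminate the request by calling the HttpUriRequest#abort() method.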
HttpGet httpget = new HttpGet("http://localhost/");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null) {
    InputStream instream = entity.getContent();
    int byteOne = instream.read();
    int byteTwo = instream.read();
    // Do not need the rest
    httpget.abort();
}
The connection will not be reused, but all resources held by it will be released.
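1.1.6 Consuming entity content
The recommended way to consume the content of an entity is through its HttpEntity#getContent() or HttpEntity#writeTo(OutputStream) methods. HttpClient also comes with the EntityUtils class, which exposes several static methods that make it easier to read the content or information from an entity. Instead of reading the java.io.InputStream directly, one can use the methods of this class to retrieve the whole content body as a string or byte array. However, using EntityUtils is strongly discouraged unless the response entity comes from a trusted HTTP server and is known to be of limited length.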
HttpGet httpget = new HttpGet("http://localhost/");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null) {
    long len = entity.getContentLength();
    if (len != -1 && len < 2048) {
        System.out.println(EntityUtils.toString(entity));
    } else {
        // Stream content out
    }
}
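When the content length is unknown (-1), one possible way to keep memory use bounded is to cap the number of bytes read from the stream. The following is a sketch using only java.io; BoundedReader and the 2048-byte limit are assumptions for illustration, not HttpClient API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class BoundedReader {
    /** Reads the whole stream, failing as soon as more than maxBytes would be kept. */
    static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            if (out.size() + n > maxBytes) {
                throw new IOException("Content exceeds " + maxBytes + " bytes");
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A ByteArrayInputStream stands in for entity.getContent()
        InputStream content = new ByteArrayInputStream(
                "important message".getBytes(StandardCharsets.UTF_8));
        byte[] body = readAtMost(content, 2048);
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

Failing fast on oversized content is a design choice of this sketch; truncating or spilling to disk would be equally valid alternatives depending on the application.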
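In some situations it may be necessary to read the entity content more than once. In that case the content must be buffered in some way, either in memory or on disk. The simplest way to do this is to wrap the original entity with the BufferedHttpEntity class, which causes the content of the original entity to be read into an in-memory buffer. In all other respects the wrapper behaves like the original entity.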
HttpGet httpget = new HttpGet("http://localhost/");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null) {
    entity = new BufferedHttpEntity(entity);
}
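Conceptually, buffering turns a read-once stream into a byte array that can hand out a fresh stream on every call. The sketch below shows that idea with plain java.io classes; BufferedContent is a hypothetical name, and this is not necessarily how BufferedHttpEntity is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class BufferedContent {
    private final byte[] data;

    /** Drains the one-shot source stream into memory. */
    BufferedContent(InputStream source) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[2048];
        int n;
        while ((n = source.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        this.data = out.toByteArray();
    }

    /** Returns a new stream over the buffered bytes on every call. */
    InputStream getContent() {
        return new ByteArrayInputStream(data);
    }

    long getContentLength() {
        return data.length;
    }

    public static void main(String[] args) throws IOException {
        BufferedContent c = new BufferedContent(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        // Both reads start from the beginning of the buffered content
        System.out.println(c.getContent().read());
        System.out.println(c.getContent().read());
    }
}
```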
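1.1.7 Producing entity content
HttpClient provides several classes that can be used to stream content out efficiently through HTTP connections. Instances of these classes can be associated with entity enclosing requests such as POST and PUT in order to enclose entity content in outgoing HTTP requests. HttpClient provides classes for the most common data containers, namely string, byte array, input stream and file: StringEntity, ByteArrayEntity, InputStreamEntity and FileEntity.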
File file = new File("somefile.txt");
FileEntity entity = new FileEntity(file, "text/plain; charset=\"UTF-8\"");

HttpPost httppost = new HttpPost("http://localhost/action.do");
httppost.setEntity(entity);
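Please note that InputStreamEntity is not repeatable, because it can only read from the underlying data stream once. In general it is recommended to implement a custom, self-contained HttpEntity class instead of using the generic InputStreamEntity. FileEntity can be a good starting point.
1.1.7.1          Dynamic content entities
HTTP entities often need to be generated dynamically, based on a particular execution context. HttpClient supports dynamic entities through the EntityTemplate entity class and the ContentProducer interface. Content producers are objects that produce their content on demand by writing it to an output stream, and they are expected to be able to do so every time they are asked. Entities created with EntityTemplate are therefore generally self-contained and repeatable.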
ContentProducer cp = new ContentProducer() {
    public void writeTo(OutputStream outstream) throws IOException {
        Writer writer = new OutputStreamWriter(outstream, "UTF-8");
        writer.write("<response>");
        writer.write("  <content>");
        writer.write(" important stuff");
        writer.write("  </content>");
        writer.write("</response>");
        writer.flush();
    }
};
HttpEntity entity = new EntityTemplate(cp);
HttpPost httppost = new HttpPost("http://localhost/handler.do");
httppost.setEntity(entity);
1.1.7.2          HTML forms
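Many applications need to simulate the submission of an HTML form, for instance to log in to a web application or to submit input data. HttpClient provides the UrlEncodedFormEntity entity class to help with this.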
List<NameValuePair> formparams = new ArrayList<NameValuePair>();
formparams.add(new BasicNameValuePair("param1", "value1"));
formparams.add(new BasicNameValuePair("param2", "value2"));
UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formparams, "UTF-8");
HttpPost httppost = new HttpPost("http://localhost/handler.do");
httppost.setEntity(entity);
This UrlEncodedFormEntity instance will URL-encode the parameters and produce the following content:
param1=value1&param2=value2
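1.1.7.3          Content chunking
In general it is recommended to let HttpClient choose the most appropriate transfer encoding based on the properties of the HTTP message being sent. It is possible, however, to tell HttpClient that chunk coding is preferred by setting HttpEntity#setChunked() to true. Please note that HttpClient uses this flag as a hint only. The value is ignored when the HTTP protocol version in use does not support chunk coding, as is the case with HTTP/1.0.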
// The second constructor argument is the charset name, not a content type
StringEntity entity = new StringEntity("important message",
    "UTF-8");
entity.setChunked(true);
HttpPost httppost = new HttpPost("http://localhost/action.do");
httppost.setEntity(entity);
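For reference, chunked transfer coding frames the body as a series of chunks, each prefixed by its size in hexadecimal and terminated by CRLF, and ends with a zero-length chunk. The sketch below produces that framing for an in-memory body, without optional trailers or chunk extensions. ChunkedWriter and the chunk size of 10 are illustrative assumptions and say nothing about how HttpClient sizes its chunks.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedWriter {
    /** Encodes body as chunks of at most chunkSize bytes plus the final zero chunk. */
    static byte[] encode(byte[] body, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < body.length; off += chunkSize) {
            int len = Math.min(chunkSize, body.length - off);
            writeAscii(out, Integer.toHexString(len) + "\r\n"); // chunk size line
            out.write(body, off, len);                          // chunk data
            writeAscii(out, "\r\n");
        }
        writeAscii(out, "0\r\n\r\n"); // last chunk, no trailers
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.US_ASCII);
        out.write(b, 0, b.length);
    }

    public static void main(String[] args) {
        byte[] body = "important message".getBytes(StandardCharsets.US_ASCII);
        System.out.print(new String(encode(body, 10), StandardCharsets.US_ASCII));
    }
}
```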
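1.1.8 Response handlers
The simplest and most convenient way to handle responses is to use the ResponseHandler interface. This approach completely relieves the user from having to worry about connection management. When a ResponseHandler is used, HttpClient automatically takes care of releasing the connection back to the connection manager, regardless of whether the request execution succeeds or causes an exception.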
HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet("http://localhost/");

ResponseHandler<byte[]> handler = new ResponseHandler<byte[]>() {
    public byte[] handleResponse(
            HttpResponse response) throws ClientProtocolException, IOException {
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            return EntityUtils.toByteArray(entity);
        } else {
            return null;
        }
    }
};

byte[] response = httpclient.execute(httpget, handler);
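The guarantee described above boils down to a try/finally around the handler call. Below is a minimal sketch of that pattern, using a made-up FakeConnection in place of a real pooled connection; none of these names exist in HttpClient.

```java
final class HandlerDemo {
    /** Hypothetical stand-in for a pooled connection. */
    static final class FakeConnection {
        boolean released;

        String read() {
            return "payload";
        }

        void release() {
            released = true;
        }
    }

    /** Hypothetical counterpart of a response handler. */
    interface Handler<T> {
        T handle(String responseBody);
    }

    /** Runs the handler and releases the connection whether or not it throws. */
    static <T> T execute(FakeConnection conn, Handler<T> handler) {
        try {
            return handler.handle(conn.read());
        } finally {
            conn.release();
        }
    }

    public static void main(String[] args) {
        FakeConnection conn = new FakeConnection();
        Integer length = execute(conn, body -> body.length());
        System.out.println(length + " bytes handled, released=" + conn.released);
    }
}
```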
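1.2    HTTP execution context
HTTP was originally designed as a stateless, request/response oriented protocol. Real-world applications, however, often need to persist state information across several logically related request/response exchanges. To let applications maintain such state, HttpClient allows HTTP requests to be executed within a particular execution context, referred to as the HTTP context. Multiple logically related requests can take part in a logical session if the same context is reused between consecutive requests. An HTTP context works much like a java.util.Map<String, Object>: it is simply a collection of arbitrary named values. The application can populate context attributes before a request is executed, or examine the context after execution has completed.
In the course of HTTP request execution, HttpClient adds the following attributes to the execution context:
'http.connection': HttpConnection instance representing the actual connection to the target server.
'http.target_host': HttpHost instance representing the connection target.
'http.proxy_host': HttpHost instance representing the connection proxy, if one is used.
'http.request': HttpRequest instance representing the actual HTTP request.
'http.response': HttpResponse instance representing the actual HTTP response.
'http.request_sent': java.lang.Boolean object representing the flag that indicates whether the actual request has been fully transmitted to the connection target.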

For instance, to determine the final redirect target, one can examine the value of the http.target_host attribute after the request has been executed:

DefaultHttpClient httpclient = new DefaultHttpClient();

HttpContext localContext = new BasicHttpContext();
HttpGet httpget = new HttpGet("http://www.google.com/");

HttpResponse response = httpclient.execute(httpget, localContext);

HttpHost target = (HttpHost) localContext.getAttribute(
    ExecutionContext.HTTP_TARGET_HOST);

System.out.println("Final target: " + target);

HttpEntity entity = response.getEntity();
if (entity != null) {
    entity.consumeContent();
}
stdout >
Final target: http://www.google.ch
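Since the context behaves like a Map<String, Object>, the idea of sharing state between consecutive requests can be sketched with a plain map. Everything here is hypothetical: the execute method only simulates a request, and demo.request_count and demo.last_host are invented attribute names, not attributes HttpClient sets.

```java
import java.util.HashMap;
import java.util.Map;

final class ContextDemo {
    /** Simulates one request execution that reads and updates shared context state. */
    static void execute(String host, Map<String, Object> context) {
        Integer count = (Integer) context.get("demo.request_count");
        context.put("demo.request_count", count == null ? 1 : count + 1);
        context.put("demo.last_host", host);
    }

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        // Reusing the same context ties both requests into one logical session
        execute("www.example.com", context);
        execute("www.example.org", context);
        System.out.println(context.get("demo.request_count"));
        System.out.println(context.get("demo.last_host"));
    }
}
```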
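1.3    Exception handling
HttpClient can throw two types of exceptions: java.io.IOException in case of an I/O failure such as a socket timeout or a socket reset, and HttpException, which signals an HTTP failure such as a violation of the HTTP protocol. I/O errors are usually considered non-fatal and recoverable, whereas HTTP protocol errors are considered fatal and cannot be recovered from automatically.
1.3.1       HTTP transport safety
The HTTP protocol is not well suited to every type of application. HTTP is a simple request/response oriented protocol that was initially designed to support retrieval of static or dynamically generated content. It was never intended to support transactional operations. For instance, the HTTP server considers its part of the contract fulfilled if it succeeds in receiving and processing the request, generating a response and sending a status code back to the client. The server makes no attempt to roll back the transaction if the client fails to receive the response in its entirety because of a read timeout, a request cancellation or a system crash. If the client decides to retry the same request, the server will inevitably end up executing the same transaction more than once. In some cases this may lead to application data corruption or an inconsistent application state.
Even though HTTP was never designed to support transactional processing, it can still be used as a transport protocol for mission-critical applications. To ensure HTTP transport layer safety, the system must ensure the idempotency of HTTP methods at the application layer.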
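One common way to obtain idempotency at the application layer is to have the client attach a unique request identifier and have the server remember the outcome per identifier, so that a retried request is not executed twice. The sketch below assumes such an identifier exists. IdempotentProcessor, the req-1 identifier and the in-memory map are illustrative only; a real service would need persistent storage and some expiry policy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class IdempotentProcessor {
    // Outcome of each request, keyed by the client-supplied request id
    private final Map<String, String> results = new ConcurrentHashMap<>();
    private final AtomicInteger executions = new AtomicInteger();

    /** Executes the operation once per request id; retries get the stored result. */
    String process(String requestId, String payload) {
        return results.computeIfAbsent(requestId, id -> {
            executions.incrementAndGet(); // the actual side effect would happen here
            return "processed:" + payload;
        });
    }

    int executionCount() {
        return executions.get();
    }

    public static void main(String[] args) {
        IdempotentProcessor server = new IdempotentProcessor();
        server.process("req-1", "debit 10");
        server.process("req-1", "debit 10"); // retry after a lost response
        System.out.println(server.executionCount());
    }
}
```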