[Repost] Cookies, HTTP headers, and other details to watch when simulating a browser with HttpClient

Source: Internet  Published by: 淘宝网txt  Editor: 程序博客网  Date: 2024/05/20 04:13
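commons-httpclient is an open-source Apache project that provides an HTTP client implemented in pure Java. It makes it easy to send HTTP requests, receive HTTP responses, manage cookies automatically, and so on.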

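The contact-list library needs the following features: automatic cookie management, setting HTTP headers, sending HTTP requests, receiving HTTP responses, following HTTP redirects, and logging HTTP requests and responses. The sections below explain how each is implemented: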

1. Automatic cookie management
public EmailImporter(String email, String password, String encoding) {
    ......
    client = new HttpClient();
    // Handle cookies the way mainstream browsers do
    client.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);
    // Send all cookies in a single Cookie request header
    client.getParams().setParameter("http.protocol.single-cookie-header", true);
}

Setting the HttpClient cookie policy to CookiePolicy.BROWSER_COMPATIBILITY means that the Java client handles cookies automatically, the way a browser would. You can also adjust cookies by hand at run time, for example:

Hotmail expects a cookie carrying the current time to be set before login:
client.getState().addCookie(new Cookie("login.live.com", "CkTst", "G" + new Date().getTime()));

However, HttpClient does not seem to offer a way to delete individual cookies, so I added two cookie-management helpers: one keeps only the named cookies, the other removes the named cookies:
// Requires java.util.ArrayList, Arrays, HashSet, List and Set.
// Keeps only the cookies whose names are listed in cookieNames.
protected void retainCookies(String[] cookieNames) {
    // A set lookup works whether or not cookieNames is sorted
    Set<String> names = new HashSet<String>(Arrays.asList(cookieNames));
    List<Cookie> kept = new ArrayList<Cookie>();
    for (Cookie cookie : client.getState().getCookies()) {
        if (names.contains(cookie.getName())) {
            kept.add(cookie);
        }
    }
    client.getState().clearCookies();
    client.getState().addCookies(kept.toArray(new Cookie[0]));
}

// Removes the cookies whose names are listed in cookieNames and keeps the rest.
protected void removeCookies(String[] cookieNames) {
    Set<String> names = new HashSet<String>(Arrays.asList(cookieNames));
    List<Cookie> kept = new ArrayList<Cookie>();
    for (Cookie cookie : client.getState().getCookies()) {
        if (!names.contains(cookie.getName())) {
            kept.add(cookie);
        }
    }
    client.getState().clearCookies();
    client.getState().addCookies(kept.toArray(new Cookie[0]));
}
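
The name filtering itself does not depend on HttpClient and can be tried on its own. The following is only a minimal sketch: plain strings stand in for cookie names, and the class name, method names and sample names other than CkTst are made up for illustration. It uses a HashSet lookup, which, unlike Arrays.binarySearch, does not require the name array to be sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone demo; strings stand in for cookie names.
class CookieNameFilter {

    /** Returns the names that appear in wanted, preserving the input order. */
    static List<String> retain(List<String> names, String[] wanted) {
        Set<String> wantedSet = new HashSet<>(Arrays.asList(wanted));
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (wantedSet.contains(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    /** Returns the names that do not appear in unwanted, preserving the input order. */
    static List<String> remove(List<String> names, String[] unwanted) {
        Set<String> unwantedSet = new HashSet<>(Arrays.asList(unwanted));
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (!unwantedSet.contains(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> jar = Arrays.asList("CkTst", "lang", "session");
        String[] selected = {"session", "CkTst"}; // deliberately not sorted
        System.out.println(retain(jar, selected)); // [CkTst, session]
        System.out.println(remove(jar, selected)); // [lang]
    }
}
```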

2. Setting HTTP headers:

Setting browser-like HTTP headers makes the mail server believe it is talking to a browser, which reduces the chance of the request being refused:
private void setHeaders(HttpMethod method) {
    method.setRequestHeader("Accept", "text/html,application/xhtml+xml,application/xml");
    method.setRequestHeader("Accept-Language", "zh-cn");
    // Pose as Firefox 3.0.3 on Windows XP
    method.setRequestHeader("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3");
    // encoding is the charset name passed to the EmailImporter constructor
    method.setRequestHeader("Accept-Charset", encoding);
    method.setRequestHeader("Keep-Alive", "300");
    method.setRequestHeader("Connection", "Keep-Alive");
    method.setRequestHeader("Cache-Control", "no-cache");
}

In addition, the Referer header is set on both GET and POST requests, and the Content-Type header is set on POST requests:
protected String doPost(String actionUrl, NameValuePair[] params, String referer) throws HttpException, IOException {
    ......
    method.setRequestHeader("Referer", referer);
    method.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
    ......
}
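
With the application/x-www-form-urlencoded content type, the request body is a list of name=value pairs joined by &, with reserved characters percent-encoded. HttpClient builds this body itself from the NameValuePair parameters, so the following is only an illustrative sketch of the wire format using the standard library. The class name, method names and sample values are hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that shows what a form-urlencoded body looks like.
class FormBodySketch {

    /** Joins the parameters as name=value pairs separated by '&'. */
    static String encode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(entry.getKey())).append('=').append(enc(entry.getValue()));
        }
        return body.toString();
    }

    // UTF-8 is assumed here; every JVM is required to support it.
    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("email", "user@example.com");
        params.put("password", "p&ss word");
        // prints email=user%40example.com&password=p%26ss+word
        System.out.println(encode(params));
    }
}
```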

3. Sending HTTP requests and receiving HTTP responses. contact-list only uses GET and POST requests, which I wrapped in two simple helpers:
protected String doGet(String url, String referer) throws HttpException, IOException {
    GetMethod method = new GetMethod(url);
    setHeaders(method);
    method.setRequestHeader("Referer", referer);
    String responseStr;
    try {
        // log request
        client.executeMethod(method);
        responseStr = readInputStream(method.getResponseBodyAsStream());
        // log response
    } finally {
        // Always hand the connection back, even if the request fails
        method.releaseConnection();
    }
    lastUrl = method.getURI().toString();
    return responseStr;
}

protected String doPost(String actionUrl, NameValuePair[] params, String referer) throws HttpException, IOException {
    PostMethod method = new PostMethod(actionUrl);
    setHeaders(method);
    method.setRequestHeader("Referer", referer);
    method.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
    method.setRequestBody(params);
    String responseStr;
    try {
        // log request
        client.executeMethod(method);
        responseStr = readInputStream(method.getResponseBodyAsStream());
        // log response
    } finally {
        // Release before any follow-up request reuses the connection
        method.releaseConnection();
    }
    if (method.getResponseHeader("Location") != null) {
        // do redirect (see section 4); PostMethod does not follow redirects automatically
    } else {
        lastUrl = method.getURI().toString();
        return responseStr;
    }
}
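
The readInputStream helper used by both wrappers is not shown in the post. The following is only a guess at what such a helper might look like: the class name is hypothetical, and the explicit encoding parameter is an assumption (the wrappers above call it with the stream alone). It collects all bytes before decoding, so a multi-byte character split across two reads is still decoded correctly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the readInputStream helper, which the post does not show.
class ResponseBodyReader {

    /** Reads the whole stream and decodes it with the given charset name. */
    static String readInputStream(InputStream in, String encoding) throws IOException {
        if (in == null) {
            return ""; // no response body available
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        // Decode once, after all bytes have been collected
        return out.toString(encoding);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("<html>ok</html>".getBytes("UTF-8"));
        System.out.println(readInputStream(in, "UTF-8")); // <html>ok</html>
    }
}
```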

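4. HTTP redirects. There are mainly two kinds. The first follows the Location header of the HTTP response:
String location = method.getResponseHeader("Location").getValue();
// referer: assumed here to be the value passed to the enclosing doPost
if (location.startsWith("http")) {
    // Absolute URL: follow it as-is
    return doGet(location, referer);
} else {
    // Relative URL: prefix the host of the current response
    // (assumes plain http and a path that starts with "/")
    return doGet("http://" + getResponseHost(method) + location, referer);
}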
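
A Location value can also be resolved against the URL of the request that produced it, which covers absolute URLs, host-relative paths and path-relative references in one call. A minimal sketch using java.net.URI from the standard library; the class name, method name and example URLs are hypothetical and not part of contact-list.

```java
import java.net.URI;

// Hypothetical helper: resolves a Location header value against the request URL.
class RedirectResolver {

    /** Returns the absolute URL that the Location value points to. */
    static String resolve(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        String requestUrl = "http://mail.example.com/login/start";
        // Absolute URL: returned unchanged
        System.out.println(resolve(requestUrl, "https://other.example.com/a"));
        // Host-relative path: keeps the scheme and host of the request
        System.out.println(resolve(requestUrl, "/inbox?x=1")); // http://mail.example.com/inbox?x=1
        // Path-relative reference: replaces the last path segment
        System.out.println(resolve(requestUrl, "done"));       // http://mail.example.com/login/done
    }
}
```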

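The second kind is triggered by window.location.replace in JavaScript. Since HttpClient does not run scripts, the target URL has to be extracted from the response body.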
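
One way to pull that target out of the response body is a regular expression. A minimal sketch, assuming the URL is passed to window.location.replace as a single quoted string literal; the class name, method name and sample page are hypothetical, and pages that build the URL dynamically would need more than this.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the URL passed to window.location.replace(...).
class JsRedirectParser {

    // Matches window.location.replace('...') or window.location.replace("...")
    private static final Pattern REPLACE_CALL = Pattern.compile(
            "window\\.location\\.replace\\(\\s*['\"]([^'\"]+)['\"]\\s*\\)");

    /** Returns the redirect target, or null if the body contains no such call. */
    static String findRedirectTarget(String body) {
        Matcher matcher = REPLACE_CALL.matcher(body);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String body = "<script>window.location.replace('http://mail.example.com/home');</script>";
        System.out.println(findRedirectTarget(body)); // http://mail.example.com/home
    }
}
```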

5. Logging requests and responses, which is very important for debugging:
private void logGetRequest(GetMethod method) throws URIException {
    logger.debug("do get request: " + method.getURI().toString());
    logger.debug("header:\n" + getHeadersStr(method.getRequestHeaders()));
    logger.debug("cookie:\n" + getCookieStr());
}

private void logGetResponse(GetMethod method, String responseStr) throws URIException {
    logger.debug("do get response: " + method.getURI().toString());
    logger.debug("header:\n" + getHeadersStr(method.getResponseHeaders()));
    logger.debug("body:\n" + responseStr);
}

private void logPostRequest(PostMethod method) throws URIException {
    logger.debug("do post request: " + method.getURI().toString());
    logger.debug("header:\n" + getHeadersStr(method.getRequestHeaders()));
    logger.debug("body:\n" + getPostBody(method.getParameters()));
    logger.debug("cookie:\n" + getCookieStr());
}

private void logPostResponse(PostMethod method, String responseStr) throws URIException {
    logger.debug("do post response: " + method.getURI().toString());
    logger.debug("header:\n" + getHeadersStr(method.getResponseHeaders()));
    logger.debug("body:\n" + responseStr);
}

// One "Name: value" line per header
private String getHeadersStr(Header[] headers) {
    StringBuilder builder = new StringBuilder();
    for (Header header : headers) {
        builder.append(header.getName()).append(": ").append(header.getValue()).append("\n");
    }
    return builder.toString();
}

// One "name:value" line per POST parameter
private String getPostBody(NameValuePair[] postValues) {
    StringBuilder builder = new StringBuilder();
    for (NameValuePair pair : postValues) {
        builder.append(pair.getName()).append(":").append(pair.getValue()).append("\n");
    }
    return builder.toString();
}

// One "domain:name=value;path;expiry;secure;" line per cookie held by the client
private String getCookieStr() {
    Cookie[] cookies = client.getState().getCookies();
    StringBuilder builder = new StringBuilder();
    for (Cookie cookie : cookies) {
        builder.append(cookie.getDomain()).append(":")
               .append(cookie.getName()).append("=").append(cookie.getValue()).append(";")
               .append(cookie.getPath()).append(";")
               .append(cookie.getExpiryDate()).append(";")
               .append(cookie.getSecure()).append(";\n");
    }
    return builder.toString();
}