A multithreaded Java crawler for JD.com product search results

Source: Internet  Editor: 程序博客网  Date: 2024/05/17 22:58
Step 1

First, let's work out which parameters this crawl needs.

The entry URL is:

https://search.jd.com/Search?keyword=%E7%AC%94%E8%AE%B0%E6%9C%AC%E7%94%B5%E8%84%91&enc=utf-8&wq=%E7%AC%94%E8%AE%B0%E6%9C%AC%E7%94%B5%E8%84%91&pvid=0b09350ac3df4f24886bb7a35d3b69ff
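The keyword and wq values in the URL above are percent-encoded UTF-8. Here is a minimal sketch, using only the JDK, that splits a query string and decodes each value so the parameters can be read directly. The class and method names are made up for illustration and are not part of the project.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryStringSketch {

    /** Splits the query part of a URL into decoded name/value pairs, keeping their order. */
    static Map<String, String> parseQuery(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = url.indexOf('?');
        if (q < 0) {
            return params; // the URL has no query string
        }
        for (String pair : url.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available on the JVM
        }
    }

    public static void main(String[] args) {
        String url = "https://search.jd.com/Search"
                + "?keyword=%E7%AC%94%E8%AE%B0%E6%9C%AC%E7%94%B5%E8%84%91"
                + "&enc=utf-8"
                + "&wq=%E7%AC%94%E8%AE%B0%E6%9C%AC%E7%94%B5%E8%84%91"
                + "&pvid=0b09350ac3df4f24886bb7a35d3b69ff";
        for (Map.Entry<String, String> e : parseQuery(url).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Run against the entry URL, keyword and wq both decode to 笔记本电脑.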


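Locating the elements

id="J_goodsList"
    The container that holds all the products.

data-sku="5025518"
    The product ID (SKU).

class="p-price"
    The product price.

class="p-name p-name-type-2"
    The product name.

class="err-product" src
    The img element whose src attribute holds the product image.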
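To make the analysis concrete, here is a small JDK-only sketch that pulls every data-sku value out of a piece of HTML with a regular expression. The sample markup and the second SKU are invented for illustration, and the class name is hypothetical. For a real page, an HTML parser such as jsoup (which the POM below declares) is likely the more robust choice, since a regex does not understand the document structure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SkuExtractorSketch {

    // Matches data-sku="<digits>"; the attribute name comes from the analysis above.
    private static final Pattern SKU = Pattern.compile("data-sku=\"(\\d+)\"");

    /** Returns every data-sku value found in the given HTML, in document order. */
    static List<String> extractSkus(String html) {
        List<String> skus = new ArrayList<>();
        Matcher m = SKU.matcher(html);
        while (m.find()) {
            skus.add(m.group(1));
        }
        return skus;
    }

    public static void main(String[] args) {
        // Invented sample markup that only mimics the attributes listed above.
        String html = "<div id=\"J_goodsList\"><ul>"
                + "<li data-sku=\"5025518\"><div class=\"p-price\">...</div></li>"
                + "<li data-sku=\"1234567\"><div class=\"p-price\">...</div></li>"
                + "</ul></div>";
        System.out.println(extractSkus(html));
    }
}
```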


We also need to get the total number of pages.


The entry URL for a specific page is:

https://search.jd.com/Search?keyword=笔记本电脑&enc=utf-8&qrst=1&rt=1&stop=1&vt=2&wq=笔记本电脑&page=3&s=57&click=0


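Parameter breakdown

Parameter   Value        Meaning
keyword     笔记本电脑   The search keyword
enc         utf-8        The encoding
wq          笔记本电脑   The search keyword
qrst        1            Purpose unknown; the request also works without it
rt          1            -
stop        1            -
vt          2            Possibly a step size (my guess)
page        3            Always an odd number; I don't know why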
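The parameters above can be assembled into a search URL as in the sketch below. It rests on one loud assumption: that the n-th visible result page corresponds to page = 2n - 1, which would explain why page is always odd, but this is not confirmed by the analysis above. The class and method names are hypothetical, and only keyword, enc, wq and page are sent, since the other parameters appear to be optional.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class SearchUrlSketch {

    static final String BASE = "https://search.jd.com/Search";

    /** ASSUMPTION: visible page n is requested as page = 2n - 1 (1, 3, 5, ...). */
    static int toPageParam(int visiblePage) {
        if (visiblePage < 1) {
            throw new IllegalArgumentException("visiblePage must be >= 1");
        }
        return 2 * visiblePage - 1;
    }

    /** Builds the search URL with every name and value percent-encoded as UTF-8. */
    static String build(String keyword, int visiblePage) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("keyword", keyword);
        params.put("enc", "utf-8");
        params.put("wq", keyword);
        params.put("page", String.valueOf(toPageParam(visiblePage)));

        StringBuilder sb = new StringBuilder(BASE);
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator).append(encode(e.getKey()))
              .append('=').append(encode(e.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available on the JVM
        }
    }

    public static void main(String[] args) {
        System.out.println(build("笔记本电脑", 2));
    }
}
```

Under that assumption, the second visible page is requested with page=3, which matches the example URL above.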



Step 2

Straight to the code.

1. Create the parent project, which mainly manages jar versions, plugin versions, and the like.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.jianqiao.clawer</groupId>
    <artifactId>clawer-system</artifactId>
    <packaging>pom</packaging>
    <version>1.0-SNAPSHOT</version>
    <modules>
        <module>clawer-jd-product</module>
    </modules>
    <name>clawer-system Maven Webapp</name>
    <url>http://maven.apache.org</url>

    <!-- Dependency versions defined in one place -->
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <junit.version>4.12</junit.version>
        <spring.version>4.1.3.RELEASE</spring.version>
        <mybatis.version>3.4.1</mybatis.version>
        <mybatis.spring.version>1.3.1</mybatis.spring.version>
        <mybatis.paginator.version>1.2.15</mybatis.paginator.version>
        <mysql.version>5.1.32</mysql.version>
        <slf4j.version>1.6.4</slf4j.version>
        <jackson.version>2.4.2</jackson.version>
        <druid.version>1.0.9</druid.version>
        <jolbox.version>0.8.0.RELEASE</jolbox.version>
        <jstl.version>1.2</jstl.version>
        <servlet-api.version>2.5</servlet-api.version>
        <jsp-api.version>2.0</jsp-api.version>
        <joda-time.version>2.5</joda-time.version>
        <commons-lang3.version>3.3.2</commons-lang3.version>
        <commons-io.version>1.3.2</commons-io.version>
        <commons-net.version>3.3</commons-net.version>
        <pagehelper.version>5.0.3</pagehelper.version>
        <mapper.version>2.3.4</mapper.version>
        <jsqlparser.version>0.9.1</jsqlparser.version>
        <commons-fileupload.version>1.3.1</commons-fileupload.version>
        <commons-codec.version>1.9</commons-codec.version>
        <jedis.version>2.7.2</jedis.version>
        <solrj.version>4.10.3</solrj.version>
        <dubbo.version>2.5.3</dubbo.version>
        <zookeeper.version>3.4.7</zookeeper.version>
        <zkclient.version>0.1</zkclient.version>
        <activemq.version>5.12.0</activemq.version>
        <freemarker.version>2.3.23</freemarker.version>
        <!-- quartz -->
        <quartz.version>2.2.2</quartz.version>
        <uediter.version>1.1.1</uediter.version>
        <json.version>20160212</json.version>
        <fastdfs_client.version>1.25</fastdfs_client.version>
        <spring-rabbit.version>1.4.0.RELEASE</spring-rabbit.version>
        <httpclient.version>4.3.5</httpclient.version>
        <rabbitmq.version>3.4.1</rabbitmq.version>
        <jsoup.version>1.10.3</jsoup.version>
    </properties>

    <dependencyManagement>
        <dependencies>
            <!-- Unit testing -->
            <dependency>
                <groupId>junit</groupId>
                <artifactId>junit</artifactId>
                <version>${junit.version}</version>
                <scope>test</scope>
            </dependency>
            <!-- Spring -->
            <dependency>
                <groupId>org.springframework</groupId>
                <artifactId>spring-webmvc</artifactId>
                <version>${spring.version}</version>
            </dependency>
            <dependency>
                <groupId>org.springframework</groupId>
                <artifactId>spring-jdbc</artifactId>
                <version>${spring.version}</version>
            </dependency>
            <dependency>
                <groupId>org.springframework</groupId>
                <artifactId>spring-aspects</artifactId>
                <version>${spring.version}</version>
            </dependency>
            <dependency>
                <groupId>org.springframework</groupId>
                <artifactId>spring-context-support</artifactId>
                <version>${spring.version}</version>
            </dependency>
            <!-- Common Mapper -->
            <dependency>
                <groupId>com.github.abel533</groupId>
                <artifactId>mapper</artifactId>
                <version>${mapper.version}</version>
            </dependency>
            <!-- Mybatis -->
            <dependency>
                <groupId>org.mybatis</groupId>
                <artifactId>mybatis</artifactId>
                <version>${mybatis.version}</version>
            </dependency>
            <dependency>
                <groupId>org.mybatis</groupId>
                <artifactId>mybatis-spring</artifactId>
                <version>${mybatis.spring.version}</version>
            </dependency>
            <!-- Pagination helper -->
            <dependency>
                <groupId>com.github.pagehelper</groupId>
                <artifactId>pagehelper</artifactId>
                <version>${pagehelper.version}</version>
            </dependency>
            <dependency>
                <groupId>com.github.jsqlparser</groupId>
                <artifactId>jsqlparser</artifactId>
                <version>${jsqlparser.version}</version>
            </dependency>
            <!-- MySql -->
            <dependency>
                <groupId>mysql</groupId>
                <artifactId>mysql-connector-java</artifactId>
                <version>${mysql.version}</version>
            </dependency>
            <!-- Logging -->
            <dependency>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
                <version>${slf4j.version}</version>
            </dependency>
            <!-- Jackson JSON toolkit -->
            <dependency>
                <groupId>com.fasterxml.jackson.core</groupId>
                <artifactId>jackson-databind</artifactId>
                <version>${jackson.version}</version>
            </dependency>
            <!-- Connection pool -->
            <dependency>
                <groupId>com.jolbox</groupId>
                <artifactId>bonecp-spring</artifactId>
                <version>${jolbox.version}</version>
            </dependency>
            <!-- JSP related -->
            <dependency>
                <groupId>jstl</groupId>
                <artifactId>jstl</artifactId>
                <version>${jstl.version}</version>
            </dependency>
            <dependency>
                <groupId>javax.servlet</groupId>
                <artifactId>servlet-api</artifactId>
                <version>${servlet-api.version}</version>
                <scope>provided</scope>
            </dependency>
            <dependency>
                <groupId>javax.servlet</groupId>
                <artifactId>jsp-api</artifactId>
                <version>${jsp-api.version}</version>
                <scope>provided</scope>
            </dependency>
            <!-- Date and time utilities -->
            <dependency>
                <groupId>joda-time</groupId>
                <artifactId>joda-time</artifactId>
                <version>${joda-time.version}</version>
            </dependency>
            <!-- Apache utility components -->
            <dependency>
                <groupId>org.apache.commons</groupId>
                <artifactId>commons-lang3</artifactId>
                <version>${commons-lang3.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.commons</groupId>
                <artifactId>commons-io</artifactId>
                <version>${commons-io.version}</version>
            </dependency>
            <!-- File upload component -->
            <dependency>
                <groupId>commons-fileupload</groupId>
                <artifactId>commons-fileupload</artifactId>
                <version>${commons-fileupload.version}</version>
            </dependency>
            <!-- dubbo related -->
            <dependency>
                <groupId>com.alibaba</groupId>
                <artifactId>dubbo</artifactId>
                <version>${dubbo.version}</version>
                <exclusions>
                    <exclusion>
                        <groupId>org.springframework</groupId>
                        <artifactId>spring</artifactId>
                    </exclusion>
                    <exclusion>
                        <groupId>org.jboss.netty</groupId>
                        <artifactId>netty</artifactId>
                    </exclusion>
                </exclusions>
            </dependency>
            <dependency>
                <groupId>org.apache.zookeeper</groupId>
                <artifactId>zookeeper</artifactId>
                <version>${zookeeper.version}</version>
            </dependency>
            <dependency>
                <groupId>com.github.sgroschupf</groupId>
                <artifactId>zkclient</artifactId>
                <version>${zkclient.version}</version>
            </dependency>
            <!-- Encoding and decoding -->
            <dependency>
                <groupId>commons-codec</groupId>
                <artifactId>commons-codec</artifactId>
                <version>${commons-codec.version}</version>
            </dependency>
            <!-- Quartz scheduled tasks -->
            <dependency>
                <groupId>org.quartz-scheduler</groupId>
                <artifactId>quartz</artifactId>
                <version>${quartz.version}</version>
            </dependency>
            <!-- ActiveMQ dependencies -->
            <dependency>
                <groupId>org.apache.activemq</groupId>
                <artifactId>activemq-all</artifactId>
                <version>${activemq.version}</version>
            </dependency>
            <dependency>
                <groupId>org.springframework</groupId>
                <artifactId>spring-jms</artifactId>
                <version>${spring.version}</version>
            </dependency>
            <!-- RabbitMq dependencies -->
            <dependency>
                <groupId>org.springframework.amqp</groupId>
                <artifactId>spring-rabbit</artifactId>
                <version>${spring-rabbit.version}</version>
            </dependency>
            <dependency>
                <groupId>com.rabbitmq</groupId>
                <artifactId>amqp-client</artifactId>
                <version>${rabbitmq.version}</version>
            </dependency>
            <!-- freemarker for static page generation -->
            <dependency>
                <groupId>org.freemarker</groupId>
                <artifactId>freemarker</artifactId>
                <version>${freemarker.version}</version>
            </dependency>
            <!-- Redis client -->
            <dependency>
                <groupId>redis.clients</groupId>
                <artifactId>jedis</artifactId>
                <version>${jedis.version}</version>
            </dependency>
            <!-- solr client -->
            <dependency>
                <groupId>org.apache.solr</groupId>
                <artifactId>solr-solrj</artifactId>
                <version>${solrj.version}</version>
            </dependency>
            <!-- Baidu editor -->
            <dependency>
                <groupId>com.baidu</groupId>
                <artifactId>ueditor</artifactId>
                <version>${uediter.version}</version>
            </dependency>
            <dependency>
                <groupId>org.json</groupId>
                <artifactId>json</artifactId>
                <version>${json.version}</version>
            </dependency>
            <dependency>
                <groupId>com.alibaba.fastdfs</groupId>
                <artifactId>fastdfs_client</artifactId>
                <version>${fastdfs_client.version}</version>
            </dependency>
            <!-- httpclient -->
            <dependency>
                <groupId>org.apache.httpcomponents</groupId>
                <artifactId>httpclient</artifactId>
                <version>${httpclient.version}</version>
            </dependency>
            <dependency>
                <groupId>org.jsoup</groupId>
                <artifactId>jsoup</artifactId>
                <version>${jsoup.version}</version>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <build>
        <finalName>${project.artifactId}</finalName>
        <plugins>
            <!-- Resource copy plugin -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-resources-plugin</artifactId>
                <version>2.7</version>
                <configuration>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
            <!-- Java compiler plugin -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
        </plugins>
        <pluginManagement>
            <plugins>
                <!-- Tomcat plugin configuration -->
                <plugin>
                    <groupId>org.apache.tomcat.maven</groupId>
                    <artifactId>tomcat7-maven-plugin</artifactId>
                    <version>2.2</version>
                </plugin>
            </plugins>
        </pluginManagement>
        <resources>
            <!-- When deploying with Maven, also deploy the xml and properties config files to Tomcat -->
            <resource>
                <directory>src/main/java</directory>
                <includes>
                    <include>**/*.properties</include>
                    <include>**/*.xml</include>
                    <include>**/*.cnf</include>
                </includes>
                <filtering>false</filtering>
            </resource>
            <!-- The following is the default configuration -->
            <resource>
                <directory>src/main/resources</directory>
                <includes>
                    <include>**/*.properties</include>
                    <include>**/*.xml</include>
                    <include>**/*.cnf</include>
                </includes>
                <filtering>false</filtering>
            </resource>
        </resources>
    </build>
</project>

2. Create a submodule, clawer-jd-product

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <parent>
        <artifactId>clawer-system</artifactId>
        <groupId>com.jianqiao.clawer</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>
    <artifactId>clawer-jd-product</artifactId>
    <packaging>war</packaging>
    <name>clawer-jd-product Maven Webapp</name>

    <!-- Versions come from the parent's dependencyManagement section -->
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
        </dependency>
        <!-- httpclient -->
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
        </dependency>
        <!-- Apache utility components -->
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-lang3</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-io</artifactId>
        </dependency>
        <!-- File upload component -->
        <dependency>
            <groupId>commons-fileupload</groupId>
            <artifactId>commons-fileupload</artifactId>
        </dependency>
        <!-- Jackson JSON toolkit -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
        </dependency>
        <!-- Spring related -->
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-webmvc</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-jdbc</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-aspects</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-context-support</artifactId>
        </dependency>
        <!-- Common Mapper -->
        <dependency>
            <groupId>com.github.abel533</groupId>
            <artifactId>mapper</artifactId>
        </dependency>
        <!-- Mybatis -->
        <dependency>
            <groupId>org.mybatis</groupId>
            <artifactId>mybatis</artifactId>
        </dependency>
        <dependency>
            <groupId>org.mybatis</groupId>
            <artifactId>mybatis-spring</artifactId>
        </dependency>
        <!-- Pagination helper -->
        <dependency>
            <groupId>com.github.pagehelper</groupId>
            <artifactId>pagehelper</artifactId>
        </dependency>
        <dependency>
            <groupId>com.github.jsqlparser</groupId>
            <artifactId>jsqlparser</artifactId>
        </dependency>
        <!-- MySql -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
        </dependency>
        <!-- Connection pool -->
        <dependency>
            <groupId>com.jolbox</groupId>
            <artifactId>bonecp-spring</artifactId>
        </dependency>
        <!-- JSP related -->
        <dependency>
            <groupId>jstl</groupId>
            <artifactId>jstl</artifactId>
        </dependency>
        <dependency>
            <groupId>javax.servlet</groupId>
            <artifactId>servlet-api</artifactId>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>javax.servlet</groupId>
            <artifactId>jsp-api</artifactId>
            <scope>provided</scope>
        </dependency>
        <!-- Logging -->
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </dependency>
        <!-- HTML parser -->
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
        </dependency>
    </dependencies>

    <build>
        <finalName>clawer-jd-product</finalName>
        <plugins>
            <!-- Resource copy plugin -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-resources-plugin</artifactId>
                <version>2.7</version>
                <configuration>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
            <!-- Java compiler plugin -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
            <!-- Tomcat plugin configuration -->
            <plugin>
                <groupId>org.apache.tomcat.maven</groupId>
                <artifactId>tomcat7-maven-plugin</artifactId>
                <configuration>
                    <port>8081</port>
                    <path>/</path>
                </configuration>
            </plugin>
        </plugins>
        <resources>
            <!-- When deploying with Maven, also deploy the xml and properties config files to Tomcat -->
            <resource>
                <directory>src/main/java</directory>
                <includes>
                    <include>**/*.properties</include>
                    <include>**/*.xml</include>
                    <include>**/*.cnf</include>
                </includes>
                <filtering>false</filtering>
            </resource>
            <!-- The following is the default configuration -->
            <resource>
                <directory>src/main/resources</directory>
                <includes>
                    <include>**/*.properties</include>
                    <include>**/*.xml</include>
                    <include>**/*.cnf</include>
                </includes>
                <filtering>false</filtering>
            </resource>
        </resources>
    </build>
</project>

3.1 Prepare the basic utility class: my own wrapper around httpclient

package com.jianqiao.util;

import com.jianqiao.pojo.HttpResult;
import org.apache.http.NameValuePair;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.*;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * @Author: Alone_XuXu
 * @Description: A small helper for sending HTTP requests
 * @Date: Created in 19:24 - 27 - 10 -2017
 * @Modified By:
 */
@Component
public class HttpClientUtilImpl {

    @Autowired
    private CloseableHttpClient httpClient;

    @Autowired
    private RequestConfig config;

    /**
     * doGet request with parameters, built by plain string concatenation.
     * Note: keys and values are appended as-is, without URL encoding.
     *
     * @param url    request address
     * @param params request parameters, may be null
     * @return the page content if the response is 200, otherwise null
     * @throws Exception
     */
    public String doGet1(String url, Map<String, Object> params) throws Exception {
        StringBuilder sb = new StringBuilder(url);
        sb.append("?");
        // if the caller passed parameters
        if (params != null && params.size() > 0) {
            // set the request parameters
            Set<Map.Entry<String, Object>> entries = params.entrySet();
            // iterate over the parameters and append them
            for (Map.Entry<String, Object> entry : entries) {
                sb.append(entry.getKey() + "=" + entry.getValue() + "&");
            }
            // drop the trailing '&'
            url = sb.substring(0, sb.length() - 1);
        }
        // build the URI
        URIBuilder uriBuilder = new URIBuilder(url);
        URI uriBuild = uriBuilder.build();
        // declare the request
        HttpGet httpGet = new HttpGet(uriBuild);
        // execute the request
        CloseableHttpResponse executeResponse = null;
        try {
            executeResponse = httpClient.execute(httpGet);
            if (executeResponse.getStatusLine().getStatusCode() == 200) {
                return EntityUtils.toString(executeResponse.getEntity(), "UTF-8");
            }
        } finally {
            if (executeResponse != null) {
                executeResponse.close();
            }
        }
        return null;
    }

    /**
     * doGet request with parameters
     *
     * @param url    request address
     * @param params request parameters, may be null
     * @return the page content if the response is 200, otherwise null
     * @throws Exception
     */
    public String doGet(String url, Map<String, Object> params) throws Exception {
        // build the URI
        URIBuilder uriBuilder = new URIBuilder(url);
        // if the caller passed parameters
        if (params != null && params.size() > 0) {
            // set the request parameters
            Set<Map.Entry<String, Object>> entries = params.entrySet();
            // iterate over the parameters; URIBuilder takes care of encoding
            for (Map.Entry<String, Object> entry : entries) {
                uriBuilder.setParameter(entry.getKey(), entry.getValue().toString());
            }
        }
        URI uriBuild = uriBuilder.build();
        // declare the request
        HttpGet httpGet = new HttpGet(uriBuild);
        // execute the request
        CloseableHttpResponse executeResponse = null;
        try {
            executeResponse = httpClient.execute(httpGet);
            if (executeResponse.getStatusLine().getStatusCode() == 200) {
                return EntityUtils.toString(executeResponse.getEntity(), "UTF-8");
            }
        } finally {
            if (executeResponse != null) {
                executeResponse.close();
            }
        }
        return null;
    }

    /**
     * doPost request with parameters
     *
     * @throws IOException
     * @throws ClientProtocolException
     */
    public HttpResult doPost(String url, Map<String, Object> params) throws IOException, ClientProtocolException {
        // declare the request
        HttpPost httpPost = new HttpPost(url);
        // collect the parameter list
        List<NameValuePair> paramterList = getNameValuePairs(params);
        // put the request entity (the form parameters) into the httpPost object
        UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(paramterList, "utf-8");
        httpPost.setEntity(formEntity);
        httpPost.setConfig(config);
        // execute
        return executePostOrPutOrDeleteMethod(httpPost);
    }

    /**
     * doPost request whose body is JSON
     *
     * @param url
     * @param json request body
     * @return status code and response body
     * @throws IOException
     */
    public HttpResult doPostJson(String url, String json) throws IOException {
        // create the http POST request
        HttpPost httpPost = new HttpPost(url);
        httpPost.setConfig(this.config);
        // attach the json body, if any
        if (json != null) {
            // declare the content type of the entity
            StringEntity stringEntity = new StringEntity(json, ContentType.APPLICATION_JSON);
            // set the entity on the request
            httpPost.setEntity(stringEntity);
        }
        // execute
        return executePostOrPutOrDeleteMethod(httpPost);
    }

    /**
     * PUT request with parameters
     *
     * @param url
     * @param params request parameters
     * @return status code and response body
     * @throws IOException
     */
    public HttpResult doPut(String url, Map<String, Object> params) throws IOException {
        // construct the httpPut request
        HttpPut httpPut = new HttpPut(url);
        // apply the request config
        httpPut.setConfig(config);
        // collect the parameter list
        List<NameValuePair> paramterList = getNameValuePairs(params);
        // put the request entity (the form parameters) into the httpPut object
        UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(paramterList, "utf-8");
        httpPut.setEntity(formEntity);
        // execute
        return executePostOrPutOrDeleteMethod(httpPut);
    }

    /**
     * DELETE request submitted through POST; _method names the real request method
     *
     * @param url
     * @param param request parameters
     * @return status code and response body
     * @throws IOException
     */
    public HttpResult doDelete(String url, Map<String, Object> param) throws Exception {
        param.put("_method", "DELETE");
        return this.doPost(url, param);
    }

    /**
     * doGet request without parameters
     *
     * @param url request address
     * @return the page content if the response is 200, otherwise null
     * @throws Exception
     */
    public String doGet(String url) throws Exception {
        // delegate to the doGet that takes parameters
        return doGet(url, null);
    }

    /**
     * doPost without parameters
     *
     * @throws Exception
     */
    public HttpResult doPost(String url) throws Exception {
        // delegate to the doPost that takes parameters
        return doPost(url, null);
    }

    /**
     * PUT request without parameters
     *
     * @param url
     * @return status code and response body
     * @throws IOException
     */
    public HttpResult doPut(String url) throws IOException {
        // delegate to the doPut that takes parameters
        return doPut(url, null);
    }

    /**
     * Executes a DELETE request (a real DELETE request)
     *
     * @param url
     * @return status code and response body
     * @throws IOException
     */
    public HttpResult doDelete(String url) throws Exception {
        // create the http DELETE request
        HttpDelete httpDelete = new HttpDelete(url);
        httpDelete.setConfig(config);
        // execute
        return executePostOrPutOrDeleteMethod(httpDelete);
    }

    /**
     * Executes a POST, PUT or DELETE request and returns the result
     *
     * @param postOrPutOrDelete the POST, PUT or DELETE request to execute
     * @return status code and response body (body is null if there is no entity)
     * @throws IOException
     */
    private HttpResult executePostOrPutOrDeleteMethod(HttpUriRequest postOrPutOrDelete) throws IOException {
        CloseableHttpResponse closeableHttpResponse = null;
        try {
            closeableHttpResponse = httpClient.execute(postOrPutOrDelete);
            if (closeableHttpResponse.getEntity() != null) {
                return new HttpResult(closeableHttpResponse.getStatusLine().getStatusCode(),
                        EntityUtils.toString(closeableHttpResponse.getEntity(), "utf-8"));
            }
//            int status  = closeableHttpResponse.getStatusLine().getStatusCode();
//            if ( status == 200) {
//                return new HttpResult(status, EntityUtils.toString(closeableHttpResponse.getEntity(), "utf-8"));
//            }
            // no entity: return just the status code
            return new HttpResult(closeableHttpResponse.getStatusLine().getStatusCode(), null);
        } finally {
            if (closeableHttpResponse != null) {
                closeableHttpResponse.close();
            }
        }
    }

    /**
     * Converts the parameters carried by a POST or PUT request into the type httpclient needs
     *
     * @param params
     * @return the parameters as a list of name/value pairs (empty if params is null)
     */
    private List<NameValuePair> getNameValuePairs(Map<String, Object> params) {
        List<NameValuePair> paramterList = new ArrayList<>();
        // iterate over the parameters, if there are any
        if (params != null) {
            for (Map.Entry<String, Object> entry : params.entrySet()) {
                NameValuePair nameValuePair = new BasicNameValuePair(entry.getKey(), entry.getValue().toString());
                paramterList.add(nameValuePair);
            }
        }
        return paramterList;
    }
}
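Since the goal is a multithreaded crawler, here is a rough, JDK-only sketch of how a shared fetch method such as doGet might be driven from a thread pool, one task per result page. PageFetcher is a hypothetical stand-in for the real HTTP call, and the pool size and page count are arbitrary. It also assumes the injected CloseableHttpClient is backed by a pooling connection manager so that it is safe to share between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelFetchSketch {

    /** Hypothetical stand-in for a call such as doGet(url, params) for one page. */
    interface PageFetcher {
        String fetch(int page) throws Exception;
    }

    /** Fetches pages 1..pages on a fixed pool; a failed page yields null, like doGet. */
    static List<String> fetchAll(final PageFetcher fetcher, int pages, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int p = 1; p <= pages; p++) {
                final int page = p;
                futures.add(pool.submit(new Callable<String>() {
                    @Override
                    public String call() throws Exception {
                        return fetcher.fetch(page);
                    }
                }));
            }
            // Collect in submission order, so index i holds page i + 1.
            List<String> bodies = new ArrayList<>();
            for (Future<String> future : futures) {
                try {
                    bodies.add(future.get());
                } catch (ExecutionException e) {
                    bodies.add(null); // the fetch itself threw
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    bodies.add(null);
                }
            }
            return bodies;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Fake fetcher: returns a label instead of doing any network I/O.
        List<String> bodies = fetchAll(new PageFetcher() {
            @Override
            public String fetch(int page) {
                return "body of page " + page;
            }
        }, 3, 2);
        System.out.println(bodies);
    }
}
```

Collecting the futures in submission order keeps the results aligned with the page numbers even though the pages finish in arbitrary order.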

3.2 Use the httpclient connection pool to manage all httpclient connections

package com.jianqiao.util;

import org.apache.http.conn.HttpClientConnectionManager;

/**
 * @Author: Alone_XuXu
 * @Description: A thread that cleans up connections that are no longer in use
 * @Date: Created in 19:53 - 27 - 10 -2017
 * @Modified By:
 */
public class IdleConnectionEvictor extends Thread {

    // the connection manager to clean up
    private HttpClientConnectionManager httpClientConnectionManager;

    // stop flag
    private volatile boolean shutdown;

    // constructor: starts the thread immediately
    public IdleConnectionEvictor(HttpClientConnectionManager httpClientConnectionManager) {
        this.httpClientConnectionManager = httpClientConnectionManager;
        this.start();
    }

    @Override
    public void run() {
        while (!shutdown) {
            try {
                synchronized (this) {
                    wait(5000);
                    // close connections that have expired
                    httpClientConnectionManager.closeExpiredConnections();
                }
            } catch (InterruptedException e) {
                // ignored: the loop re-checks the shutdown flag
            }
        }
    }

    public void shutdown() {
        shutdown = true;
        synchronized (this) {
            notifyAll();
        }
    }
}
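The wait/notify pattern in the evictor can be tried without httpclient. The sketch below is a simplified stand-in, not the class above: the cleanup is a plain Runnable placeholder, the monitor is a private lock object instead of the thread itself, and the caller starts the thread rather than the constructor. All names are hypothetical.

```java
class IdleCleanerSketch {

    /** Runs a cleanup task at a fixed interval until shutdown() is called. */
    static class Cleaner extends Thread {
        private final Object lock = new Object();
        private final Runnable cleanup;
        private final long intervalMillis;
        private volatile boolean shutdown;

        Cleaner(Runnable cleanup, long intervalMillis) {
            this.cleanup = cleanup;
            this.intervalMillis = intervalMillis;
            setDaemon(true); // do not keep the JVM alive just for cleanup
        }

        @Override
        public void run() {
            try {
                while (!shutdown) {
                    synchronized (lock) {
                        if (shutdown) {
                            break; // shutdown() was called before we started waiting
                        }
                        lock.wait(intervalMillis); // sleeps, but wakes early on shutdown()
                    }
                    if (!shutdown) {
                        cleanup.run();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // treat interruption as a stop request
            }
        }

        void shutdown() {
            shutdown = true;
            synchronized (lock) {
                lock.notifyAll();
            }
        }
    }

    /** Starts a cleaner with a long interval, stops it, and reports whether it exited. */
    static boolean startAndStop() {
        Cleaner cleaner = new Cleaner(new Runnable() {
            @Override
            public void run() {
                System.out.println("closing expired connections (placeholder)");
            }
        }, 60000L);
        cleaner.start();
        cleaner.shutdown();
        try {
            cleaner.join(2000L);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !cleaner.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped promptly: " + startAndStop());
    }
}
```

Because shutdown() notifies the waiting thread, the cleaner exits right away instead of sleeping out the rest of its interval.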

4 Prepare the pojo objects and the vo objects (the objects that interact with the page)

4.1 Prepare the vo

package com.jianqiao.vo;

/**
 * @Author: Alone_XuXu
 * @Description: <p>
 *     Mainly holds the search keyword and related query parameters
 * </p>
 * @Date: Created in 6:41 - 27 - 11 -2017
 * @Modified By:
 */
public class KeyWord {

    private String keyword;
    private String enc;
    private String wq;
    private String page;

    public String getKeyword() {
        return keyword;
    }

    public void setKeyword(String keyword) {
        this.keyword = keyword;
    }

    public String getEnc() {
        return enc;
    }

    public void setEnc(String enc) {
        this.enc = enc;
    }

    public String getWq() {
        return wq;
    }

    public void setWq(String wq) {
        this.wq = wq;
    }

    public String getPage() {
        return page;
    }

    public void setPage(String page) {
        this.page = page;
    }
}

4.2 Prepare the pojo

package com.jianqiao.pojo;

import java.io.Serializable;

public class Product implements Serializable {

    private Long id;
    private String title;
    private String sellpoint;
    private String price;
    private Integer num;
    private String image;
    private Long cid;
    private Boolean status = true;

    // ignore this property when mapping to the database table

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getSellpoint() {
        return sellpoint;
    }

    public void setSellpoint(String sellpoint) {
        this.sellpoint = sellpoint;
    }

    public String getPrice() {
        return price;
    }

    public void setPrice(String price) {
        this.price = price;
    }

    public Integer getNum() {
        return num;
    }

    public void setNum(Integer num) {
        this.num = num;
    }

    public String getImage() {
        return image;
    }

    public void setImage(String image) {
        this.image = image;
    }

    public Long getCid() {
        return cid;
    }

    public void setCid(Long cid) {
        this.cid = cid;
    }

    public Boolean getStatus() {
        return status;
    }

    public void setStatus(Boolean status) {
        this.status = status;
    }

    @Override
    public String toString() {
        return "Product [id=" + id + ", title=" + title + ", sellPoint="
                + sellpoint + ", price=" + price + ", num=" + num + ", image="
                + image + ", cid=" + cid + ", status=" + status + "]";
    }
}

package com.jianqiao.pojo;

public class HttpResult {

    // HTTP status code
    private Integer code;

    // Response body
    private String body;

    public HttpResult() {
        super();
    }

    public HttpResult(Integer code, String body) {
        this.code = code;
        this.body = body;
    }

    public Integer getCode() {
        return code;
    }

    public void setCode(Integer code) {
        this.code = code;
    }

    public String getBody() {
        return body;
    }

    public void setBody(String body) {
        this.body = body;
    }
}


5.0 Prepare the database layer

package com.jianqiao.mapper;

import com.github.abel533.mapper.Mapper;
import com.jianqiao.pojo.Product;

public interface ProductMapper extends Mapper<Product> {
}


6.0 Prepare the service layer

package com.jianqiao.service;

import com.github.abel533.entity.Example;
import com.github.abel533.mapper.Mapper;
import com.github.pagehelper.PageHelper;
import org.springframework.beans.factory.annotation.Autowired;

import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

public class BaseServiceImpl<T> {

    @Autowired
    protected Mapper<T> mapper;

    Class<T> clazz;

    @SuppressWarnings("unchecked")
    public BaseServiceImpl() {
        // Resolve T from the concrete subclass, e.g. ProductServiceImpl extends BaseServiceImpl<Product>
        Type type = this.getClass().getGenericSuperclass();
        ParameterizedType ptype = (ParameterizedType) type;
        this.clazz = (Class<T>) ptype.getActualTypeArguments()[0];
    }

    public T queryById(Long id) {
        return this.mapper.selectByPrimaryKey(id);
    }

    public List<T> queryAll() {
        // Only when the data cannot be found in a cache do we need to query the database
        return this.mapper.select(null);
    }

    public List<T> queryByWhere(T t) {
        return this.mapper.select(t);
    }

    public Integer queryByWhereCount(T t) {
        return this.mapper.selectCount(t);
    }

    public List<T> queryByPage(Integer page, Integer rows) {
        // First argument: current page; second argument: records per page
        PageHelper.startPage(page, rows);
        List<T> list = this.mapper.select(null);
        return list;
    }

    public T queryOne(T t) {
        return this.mapper.selectOne(t);
    }

    public void save(T t) {
        this.mapper.insert(t);
    }

    public void saveSelective(T t) {
        this.mapper.insertSelective(t);
    }

    public void update(T t) {
        this.mapper.updateByPrimaryKey(t);
    }

    public void updateSelective(T t) {
        this.mapper.updateByPrimaryKeySelective(t);
    }

    public void deleteById(Long id) {
        this.mapper.deleteByPrimaryKey(id);
    }

    public void deleteByIds(List<Object> ids) {
        Example example = new Example(this.clazz);
        example.createCriteria().andIn("id", ids);
        // Batch delete
        this.mapper.deleteByExample(example);
    }
}
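The constructor of BaseServiceImpl relies on reflection to find out which entity type a subclass was declared with. The following is a minimal, self-contained sketch of that technique only. The names TypedBase and StringService are hypothetical stand-ins for BaseServiceImpl and ProductServiceImpl, and the extra checks are an addition that the original code does not have.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical stand-in for BaseServiceImpl<T>; keeps only the type-resolution logic.
abstract class TypedBase<T> {

    private final Class<T> entityClass;

    @SuppressWarnings("unchecked")
    protected TypedBase() {
        // For "class StringService extends TypedBase<String>" this is TypedBase<String>
        Type superType = getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            // Happens for raw subclasses ("extends TypedBase"); the original code
            // would fail here with a ClassCastException instead.
            throw new IllegalStateException(
                    getClass().getName() + " must extend TypedBase with a concrete type argument");
        }
        Type arg = ((ParameterizedType) superType).getActualTypeArguments()[0];
        if (!(arg instanceof Class)) {
            throw new IllegalStateException("Type argument is not a plain class: " + arg);
        }
        this.entityClass = (Class<T>) arg;
    }

    Class<T> entityClass() {
        return entityClass;
    }
}

// Hypothetical subclass, playing the role of ProductServiceImpl.
final class StringService extends TypedBase<String> {
}

final class GenericTypeDemo {
    public static void main(String[] args) {
        // Prints "java.lang.String"
        System.out.println(new StringService().entityClass().getName());
    }
}
```

This only works when the type argument is fixed in the class declaration of the direct subclass, which is the case for ProductServiceImpl.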


package com.jianqiao.service;

import com.jianqiao.pojo.Product;
import org.springframework.stereotype.Service;

@Service
public class ProductServiceImpl extends BaseServiceImpl<Product> {
}

This is our main service:

package com.jianqiao.service;

import com.jianqiao.constant.AppConstants;
import com.jianqiao.pojo.Product;
import com.jianqiao.util.HttpClientUtilImpl;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * @Author: Alone_XuXu
 * @Description: Fetches JD search result pages and extracts the products
 * @Date: Created in 6:59 - 27 - 11 -2017
 * @Modified By:
 */
@Service
public class ClawerService {

    private Logger logger = LoggerFactory.getLogger(ClawerService.class);

    // HTTP helper
    @Autowired
    private HttpClientUtilImpl httpClientUtil;

    /**
     * Gets the total number of result pages.
     *
     * @param url search URL
     * @return total page count, or 0 if it cannot be determined
     */
    public Integer getTotalPage(String url) {
        try {
            String html = httpClientUtil.doGet(url);
            if (html != null) {
                // Parse the document
                Document document = Jsoup.parse(html);
                // The element with id="J_topPage" holds the pager text
                String jtopPageText = document.select("#J_topPage").text();
                // Split on non-digit characters; the second number is the total page count
                String[] strings = jtopPageText.split("\\D+");
                System.out.println("Total pages: " + strings[1]);
                return Integer.parseInt(strings[1]);
            }
        } catch (Exception e) {
            logger.error("Failed to read the total page count from " + url, e);
        }
        return 0;
    }

    /**
     * Fetches one result page and extracts its products.
     *
     * @param url    base search URL
     * @param params query parameters, including the page number
     * @return products keyed by SKU; empty if the page could not be fetched
     */
    public Map<String, Product> findProductByPage(final String url, final Map<String, Object> params) {
        Map<String, Product> maps = new ConcurrentHashMap<>();
        try {
            String doGetHtml = httpClientUtil.doGet1(url, params);
            if (doGetHtml != null) {
                // Strip line breaks, tabs and other stray control characters
                doGetHtml = doGetHtml.replaceAll("\r\n|\r|\n|\t|\b|~|\f", "");
                getProductList(maps, doGetHtml);
            }
            return maps;
        } catch (Exception e) {
            logger.error("Failed to fetch products from " + url, e);
        }
        return maps;
    }

    /**
     * Extracts the product information from the HTML.
     *
     * @param maps      target map, keyed by SKU
     * @param doGetHtml HTML of one result page
     */
    private void getProductList(Map<String, Product> maps, String doGetHtml) {
        if (doGetHtml != null) {
            // Parse into a document
            Document rootDocument = Jsoup.parse(doGetHtml);
            // Select the whole product list
            Elements listElement = rootDocument.select("ul[class=gl-warp clearfix]")
                    .select(".gl-item");
            for (Element element : listElement) {
                String data_sku = element.attr("data-sku");
                if (data_sku.isEmpty()) {
                    // Skip items without a SKU so that one bad item does not discard the page
                    continue;
                }
                Product product = new Product();
                Element childDiv = element.child(0);
                String p_name = childDiv.select(".p-name").text();
                String image_src = element.select(".p-img").select("a  img").attr("src");
                String price = element.select(".p-price strong").select("i").text();
                product.setId(Long.parseLong(data_sku));
                product.setTitle(p_name);
                product.setImage(AppConstants.HTTPS + image_src);
                product.setPrice(price);
                // Add the product to the result map
                maps.put(data_sku, product);
            }
        }
    }
}
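getTotalPage splits the pager text on non-digits and takes index 1, which depends on the text having exactly the shape "1/100". As a hedged alternative, the sketch below takes the last number found in the text. It assumes that the total is always the last number in the #J_topPage text, which is an assumption about JD's markup rather than something the article confirms; PageCountParser is a hypothetical helper name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the total page count from pager text such as "1/100".
final class PageCountParser {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    /** Returns the last integer found in the text, or 0 if there is none. */
    static int parseTotalPages(String pagerText) {
        if (pagerText == null) {
            return 0;
        }
        Matcher matcher = DIGITS.matcher(pagerText);
        String last = null;
        while (matcher.find()) {
            last = matcher.group();
        }
        if (last == null) {
            return 0;
        }
        try {
            return Integer.parseInt(last);
        } catch (NumberFormatException e) {
            // Digit run too long for an int
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseTotalPages("1/100")); // 100
        System.out.println(parseTotalPages(""));      // 0
    }
}
```

Unlike indexing into the split result, this does not throw on empty text and is not thrown off by leading non-digit characters.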


7.0 Prepare the controller layer

package com.jianqiao.controller;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.jianqiao.constant.AppConstants;
import com.jianqiao.pojo.Product;
import com.jianqiao.service.ClawerService;
import com.jianqiao.service.ProductServiceImpl;
import com.jianqiao.vo.KeyWord;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/**
 * @Author: Alone_XuXu
 * @Description: Entry point that crawls JD search results
 * @Date: Created in 6:39 - 27 - 11 -2017
 * @Modified By:
 */
@Controller
public class JDClawerController {

    // Total number of records
    private static Long count = 0L;

    // Total number of pages
    private static Integer totalPage = 0;

    // Final result, keyed by SKU
    private Map<String, Product> finalMaps = new ConcurrentHashMap<>();

    // JSON conversion helper
    private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();

    @Autowired
    private ThreadPoolTaskExecutor threadPoolTaskExecutor;

    @Autowired
    private ClawerService clawerService;

    @Autowired
    private ProductServiceImpl productService;

    /**
     * Crawls JD search data.
     *
     * @param keyWord the request parameters bound into one object
     */
    @RequestMapping("/jd/clawer")
    public void clawerJD(KeyWord keyWord) {
        // First fill in all the placeholders in the URL template
        String url = "https://search.jd.com/Search?keyword={keyword}&enc={enc}&qrst=1&rt=1&stop=1&vt=2&wq={wq}&page={page}&s=57&click=0";
        String operationUrl = url.replace("{keyword}", keyWord.getKeyword());
        operationUrl = operationUrl.replace("{enc}", keyWord.getEnc());
        operationUrl = operationUrl.replace("{wq}", keyWord.getWq());
        if (keyWord.getPage() != null) {
            operationUrl = operationUrl.replace("{page}", keyWord.getPage());
        } else {
            operationUrl = operationUrl.replace("{page}", "1");
        }
        totalPage = clawerService.getTotalPage(operationUrl);
        // JD search uses a step of 2 for the page parameter, so the last page value is about totalPage * 2
        Integer vtPage = totalPage * 2;
        // One count per page, so the main thread can wait for every page task
        final CountDownLatch countDownLatch = new CountDownLatch(totalPage);
        long startTime = System.currentTimeMillis();
        // Step by 2 (1, 3, 5, ...): this submits exactly totalPage tasks
        for (int i = 1; i < vtPage; i += 2) {
            System.out.println("Page " + i);
            final Map<String, Object> params = new ConcurrentHashMap<>();
            params.put("keyword", keyWord.getKeyword());
            params.put("enc", keyWord.getEnc());
            // The original used the key "wc" here; the JD parameter is "wq"
            params.put("wq", keyWord.getWq());
            params.put("page", i + "");
            threadPoolTaskExecutor.submit(new Runnable() {
                @Override
                public void run() {
                    try {
                        Map<String, Product> productByPage = clawerService.findProductByPage(AppConstants.BASE_URL, params);
                        finalMaps.putAll(productByPage);
                    } finally {
                        // Count down once per finished page, even on failure
                        countDownLatch.countDown();
                    }
                }
            });
        }
        // Make the main thread wait for all page tasks
        try {
            countDownLatch.await();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        long endTime = System.currentTimeMillis();
        // Save every collected product
        for (Map.Entry<String, Product> entry : finalMaps.entrySet()) {
            productService.saveSelective(entry.getValue());
        }
        // The saving step could also be done with multiple threads
        System.out.println("Elapsed time (ms): " + (endTime - startTime));
        // Elapsed time: 19094  measured with three threads
        // Elapsed time: 6337   measured with ten threads
    }
}
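The core of the controller is a fan-out over the odd page values, with a CountDownLatch to wait for all of them. The sketch below isolates that pattern with plain java.util.concurrent so it can be run without Spring or a network connection. PageFanOut and the fetchPage function are hypothetical; fetchPage stands in for clawerService.findProductByPage, and the fixed-size pool stands in for the configured ThreadPoolTaskExecutor.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntFunction;

// Hypothetical, network-free version of the controller's crawl loop.
final class PageFanOut {

    /** Maps the 1-based logical page n to the odd value used in the URL: 1, 3, 5, ... */
    static int toRequestPage(int logicalPage) {
        return 2 * logicalPage - 1;
    }

    /** Runs fetchPage once per page on a pool and merges the results. */
    static Map<String, String> crawlAll(int totalPages, int threads,
                                        IntFunction<Map<String, String>> fetchPage) {
        Map<String, String> results = new ConcurrentHashMap<>();
        if (totalPages <= 0) {
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch latch = new CountDownLatch(totalPages);
        try {
            for (int n = 1; n <= totalPages; n++) {
                final int requestPage = toRequestPage(n);
                pool.submit(() -> {
                    try {
                        results.putAll(fetchPage.apply(requestPage));
                    } catch (RuntimeException e) {
                        System.err.println("Page " + requestPage + " failed: " + e);
                    } finally {
                        // Always count down, otherwise await() below would never return
                        latch.countDown();
                    }
                });
            }
            latch.await();
        } catch (InterruptedException e) {
            // Restore the interrupt flag and return what has been collected so far
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> products = crawlAll(5, 3,
                page -> Collections.singletonMap("sku-" + page, "product from page " + page));
        System.out.println(products.size()); // 5
    }
}
```

Keeping the result map local to the call, rather than in a controller field, avoids results from one request leaking into the next.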

System constants class

package com.jianqiao.constant;

/**
 * @Author: Alone_XuXu
 * @Description: Application-wide constants
 * @Date: Created in 6:46 - 27 - 11 -2017
 * @Modified By:
 */
public interface AppConstants {

    // Default charset
    String DEFAULT_CHARSET = "utf-8";

    // Entry point of the site to crawl
    // https://search.jd.com/Search?keyword=笔记本电&enc=utf-8&qrst=1&rt=1&stop=1&vt=2&wq=笔记本电脑&page=3&s=57&click=0
    String BASE_URL = "https://search.jd.com/Search";

    String HTTPS = "https:";

    /**
     * Browser request header names
     */
    interface Header {
        String ACCEPT = "Accept";
        String ACCEPT_ENCODING = "Accept-Encoding";
        String ACCEPT_LANGUAGE = "Accept-Language";
        String CACHE_CONTROL = "Cache-Control";
        String COOKIE = "Cookie";
        String HOST = "Host";
        String PROXY_CONNECTION = "Proxy-Connection";
        String REFERER = "Referer";
        String USER_AGENT = "User-Agent";
    }
}
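The search keyword is Chinese text, so it has to be percent-encoded before it goes into the query string, as in the first entry URL of this article. If the HTTP helper does not already do this, a sketch along the following lines could build the URL from BASE_URL. SearchUrlBuilder is a hypothetical helper, not part of the project, and it uses only java.net.URLEncoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that appends UTF-8 encoded query parameters to a base URL.
final class SearchUrlBuilder {

    static final String BASE_URL = "https://search.jd.com/Search";

    /** Appends the parameters in iteration order, percent-encoding names and values. */
    static String build(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            sb.append(separator)
              .append(encode(entry.getKey()))
              .append('=')
              .append(encode(entry.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        // "laptop" in Chinese, the keyword from the article's example URL
        params.put("keyword", "\u7B14\u8BB0\u672C\u7535\u8111");
        params.put("enc", "utf-8");
        params.put("page", "3");
        System.out.println(build(BASE_URL, params));
    }
}
```

A LinkedHashMap keeps the parameters in insertion order, which makes the generated URL easy to compare with the one seen in the browser.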


8.0 Prepare the configuration files

8.1  web.xml

<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"
         version="3.1">
    <display-name>Archetype Created Web Application</display-name>

    <!-- Listener that starts the Spring container -->
    <context-param>
        <param-name>contextConfigLocation</param-name>
        <param-value>classpath*:spring/spring-*.xml</param-value>
    </context-param>
    <listener>
        <listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
    </listener>

    <!-- Front controller -->
    <servlet>
        <servlet-name>DispatcherServlet</servlet-name>
        <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
        <init-param>
            <param-name>contextConfigLocation</param-name>
            <param-value>classpath:spring/springmvc-*.xml</param-value>
        </init-param>
        <load-on-startup>1</load-on-startup>
    </servlet>
    <servlet-mapping>
        <servlet-name>DispatcherServlet</servlet-name>
        <url-pattern>/</url-pattern>
    </servlet-mapping>

    <!-- Character-encoding filter that fixes garbled POST request bodies -->
    <!-- Note: forceRequestEncoding and forceResponseEncoding were added in Spring 4.3;
         older versions only know forceEncoding -->
    <filter>
        <filter-name>CharacterEncodingFilter</filter-name>
        <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>utf-8</param-value>
        </init-param>
        <init-param>
            <param-name>forceRequestEncoding</param-name>
            <param-value>true</param-value>
        </init-param>
        <init-param>
            <param-name>forceResponseEncoding</param-name>
            <param-value>true</param-value>
        </init-param>
    </filter>
    <filter-mapping>
        <filter-name>CharacterEncodingFilter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>

    <!-- Spring MVC filter for REST-style HTTP methods -->
    <filter>
        <filter-name>HiddenHttpMethodFilter</filter-name>
        <filter-class>org.springframework.web.filter.HiddenHttpMethodFilter</filter-class>
    </filter>
    <filter-mapping>
        <filter-name>HiddenHttpMethodFilter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
</web-app>

8.2 mybatis

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE configuration
        PUBLIC "-//mybatis.org//DTD Config 3.0//EN"
        "http://mybatis.org/dtd/mybatis-3-config.dtd">
<configuration>
    <!-- Configure the generic mapper -->
    <!--
        When the generic mapper and PageHelper are used together,
        the PageHelper plugin should be declared first; otherwise the application does not start correctly
    -->
    <plugins>
        <plugin interceptor="com.github.pagehelper.PageInterceptor">
            <!-- Enable "reasonable" pagination (out-of-range page numbers are corrected) -->
            <property name="reasonable" value="true"/>
        </plugin>
        <plugin interceptor="com.github.abel533.mapperhelper.MapperInterceptor">
            <!-- How auto-increment keys are written back; the default is MYSQL, see the mapper documentation for details -->
            <property name="IDENTITY" value="MYSQL"/>
            <!-- Generic mapper interfaces; separate multiple interfaces with commas -->
            <property name="mappers" value="com.github.abel533.mapper.Mapper"/>
        </plugin>
    </plugins>
</configuration>

8.3 properties

JDBC settings

jdbc.username=root
jdbc.password=1230
jdbc.url=jdbc:mysql://localhost:3306/clawerDB?rewriteBatchedStatements=true&useUnicode=true&characterEncoding=utf8
jdbc.driver=com.mysql.jdbc.Driver


HttpClient connection pool settings

httpclient.maxTotal = 200
httpclient.DefaultMaxPerRoute = 20
httpclient.connectTimeout = 1000
httpclient.connectionRequestTimeout = 500
httpclient.socketTimeout = 10000
httpclient.staleConnectionCheckEnabled = true

8.4 Spring configuration

8.4.1 spring-beans.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd">

    <!-- Service-related beans -->
    <context:component-scan base-package="com.jianqiao.util"/>
    <context:component-scan base-package="com.jianqiao.service"/>

    <!-- Asynchronous thread pool -->
    <bean id="taskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
        <!-- Minimum number of threads kept in the pool -->
        <property name="corePoolSize" value="10" />
        <!-- Maximum number of threads in the pool -->
        <property name="maxPoolSize" value="100" />
        <!-- Capacity of the queue that buffers waiting tasks -->
        <property name="queueCapacity" value="1000" />
        <!-- How long (in seconds) an idle thread above the core size is kept alive -->
        <property name="keepAliveSeconds" value="3000" />
        <!-- What to do with a task when no thread and no queue slot is available.
             CallerRunsPolicy, used here, runs the task on the submitting thread;
             AbortPolicy would throw RejectedExecutionException instead. -->
        <property name="rejectedExecutionHandler">
            <bean class="java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy" />
        </property>
    </bean>
</beans>
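ThreadPoolTaskExecutor wraps a java.util.concurrent.ThreadPoolExecutor, and that executor only starts threads beyond corePoolSize once its queue is full. With a core size of 10 and a queue capacity of 1000, a crawl of a few dozen pages therefore never uses more than 10 threads, whatever maxPoolSize says. The sketch below demonstrates this with deliberately small numbers; PoolGrowthDemo and its parameter values are hypothetical and chosen only to make the effect visible.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of how core size, queue capacity and max size interact.
final class PoolGrowthDemo {

    static ThreadPoolExecutor newPool(int core, int max, int queueCapacity) {
        return new ThreadPoolExecutor(core, max, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Submits blocking tasks and reports how many worker threads the pool has started. */
    static int threadsStartedFor(int core, int max, int queueCapacity, int tasks) {
        if (tasks > max + queueCapacity) {
            // One more task would run on the calling thread (CallerRunsPolicy) and block it forever
            throw new IllegalArgumentException("tasks must not exceed max + queueCapacity");
        }
        ThreadPoolExecutor pool = newPool(core, max, queueCapacity);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until the measurement is done
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // core=2, max=4, queue=3
        System.out.println(threadsStartedFor(2, 4, 3, 5)); // 2: two running, three queued
        System.out.println(threadsStartedFor(2, 4, 3, 6)); // 3: queue full, one extra thread
    }
}
```

If more parallelism is wanted for the crawl, raising corePoolSize has a direct effect, while raising maxPoolSize alone does not until the queue fills up.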


8.4.2 spring-dao.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:content="http://www.springframework.org/schema/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd">

    <!-- Load the properties files -->
    <content:property-placeholder location="classpath:properties/*.properties"/>

    <!-- DAO-related beans -->
    <!-- Data source -->
    <bean class="com.jolbox.bonecp.BoneCPDataSource" id="dataSource" destroy-method="close">
        <!-- Database driver -->
        <property name="driverClass" value="${jdbc.driver}"/>
        <!-- JDBC URL for that driver -->
        <property name="jdbcUrl" value="${jdbc.url}"/>
        <!-- Database user name -->
        <property name="username" value="${jdbc.username}"/>
        <!-- Database password -->
        <property name="password" value="${jdbc.password}"/>
        <!-- Interval for testing idle connections in the pool, in minutes; default 240, set to 0 to disable -->
        <property name="idleConnectionTestPeriod" value="60"/>
        <!-- Maximum lifetime of unused connections in the pool, in minutes; default 60, set to 0 to keep them forever -->
        <property name="idleMaxAge" value="30"/>
        <!-- Maximum number of connections per partition -->
        <property name="maxConnectionsPerPartition" value="150"/>
        <!-- Minimum number of connections per partition -->
        <property name="minConnectionsPerPartition" value="5"/>
    </bean>

    <!-- Transaction manager (the original comment called this the SqlSessionFactory, which is configured in spring-mybatis.xml) -->
    <bean class="org.springframework.jdbc.datasource.DataSourceTransactionManager" id="transactionManager">
        <property name="dataSource" ref="dataSource"></property>
    </bean>
</beans>

spring-httpclient.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd">

    <!-- Load the external properties files -->
    <context:property-placeholder location="classpath:properties/*.properties" />

    <!-- Connection manager -->
    <bean id="connectionManager"
          class="org.apache.http.impl.conn.PoolingHttpClientConnectionManager">
        <!-- Maximum total number of connections -->
        <property name="maxTotal" value="${httpclient.maxTotal}" />
        <!-- Maximum number of concurrent connections per host -->
        <property name="defaultMaxPerRoute" value="${httpclient.DefaultMaxPerRoute}" />
    </bean>

    <!-- Create the HttpClientBuilder -->
    <bean id="httpClientBuilder" class="org.apache.http.impl.client.HttpClientBuilder">
        <!-- Set the connection manager -->
        <property name="connectionManager" ref="connectionManager" />
    </bean>

    <!-- HttpClient -->
    <bean id="httpClient" class="org.apache.http.impl.client.CloseableHttpClient"
          factory-bean="httpClientBuilder" factory-method="build" scope="prototype">
    </bean>

    <bean id="requestConfigBuilder" class="org.apache.http.client.config.RequestConfig.Builder">
        <!-- Maximum time to establish a connection -->
        <property name="connectTimeout" value="${httpclient.connectTimeout}"/>
        <!-- Maximum time to obtain a connection from the pool -->
        <property name="connectionRequestTimeout" value="${httpclient.connectionRequestTimeout}"/>
        <!-- Maximum time for data transfer -->
        <property name="socketTimeout" value="${httpclient.socketTimeout}"/>
        <!-- Test whether the connection is still usable before sending a request -->
        <property name="staleConnectionCheckEnabled" value="${httpclient.staleConnectionCheckEnabled}"/>
    </bean>

    <!-- Request configuration -->
    <bean id="requestConfig" class="org.apache.http.client.config.RequestConfig"
          factory-bean="requestConfigBuilder" factory-method="build"></bean>

    <!-- Periodically evict idle connections -->
    <bean class="com.jianqiao.util.IdleConnectionEvictor" destroy-method="shutdown">
        <constructor-arg index="0" ref="connectionManager"/>
    </bean>
</beans>


spring-mybatis.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">

    <!-- Spring and MyBatis integration -->
    <!-- SqlSessionFactory -->
    <bean class="org.mybatis.spring.SqlSessionFactoryBean" id="sqlSessionFactory">
        <property name="dataSource" ref="dataSource"/>
        <!-- Global MyBatis configuration file -->
        <property name="configLocation" value="classpath:mybatis/mybatis-config.xml"/>
        <!-- Scans the mappers directory and its subdirectories for XML files; not needed here because we use the generic mapper -->
        <!-- <property name="mapperLocations" value="classpath:mappers/**/*.xml"/>-->
        <!-- Type aliases -->
        <property name="typeAliasesPackage" value="com.jianqiao.pojo"/>
    </bean>

    <!-- Mapper scanning -->
    <bean class="org.mybatis.spring.mapper.MapperScannerConfigurer">
        <!-- Package to scan for mappers -->
        <property name="basePackage" value="com.jianqiao.mapper"/>
    </bean>
</beans>


springmvc-config.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:mvc="http://www.springframework.org/schema/mvc"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd http://www.springframework.org/schema/mvc http://www.springframework.org/schema/mvc/spring-mvc.xsd">

    <!-- Why can a Controller only read these properties when they are loaded in this Spring MVC file?
         The DispatcherServlet has its own child context, and a property placeholder only
         applies to beans defined in the context where it is declared. -->
    <context:property-placeholder location="classpath:properties/*.properties"/>

    <!-- Spring MVC configuration -->
    <!-- Package to scan for controllers -->
    <context:component-scan base-package="com.jianqiao.controller"/>

    <!-- View resolver -->
    <bean class="org.springframework.web.servlet.view.InternalResourceViewResolver">
        <property name="prefix" value="/WEB-INF/views/"/>
        <property name="suffix" value=".jsp"/>
    </bean>

    <!-- Annotation-driven MVC -->
    <mvc:annotation-driven/>
    <mvc:default-servlet-handler/>

    <!-- File upload resolver -->
    <bean class="org.springframework.web.multipart.commons.CommonsMultipartResolver" id="multipartResolver">
        <property name="defaultEncoding" value="utf-8"/>
        <property name="maxUploadSize" value="5242880"/>
    </bean>
</beans>




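Author's note:

This article implements a basic product search and saves the results to a database.

Its shortcoming is that much of the code still needs tidying up.

The main open problem: when jsoup parses the page, the image URL sometimes cannot be extracted. If you are reading this and know how to fix the bug, I would appreciate the help. Thanks.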
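One possible cause of the missing images, offered here as an assumption rather than a verified diagnosis, is lazy loading: for items further down the page the img tag may carry its URL in a data attribute instead of src, so attr("src") returns an empty string and the stored value is just "https:". The sketch below shows a fallback chain in plain Java. ImageUrlResolver is a hypothetical helper, and the attribute names data-lazy-img and source-data-lazy-img mentioned in the comments are assumptions about JD's markup that should be checked against the actual HTML.

```java
// Hypothetical helper: picks the first usable image URL from several attribute values.
final class ImageUrlResolver {

    /** Returns the first candidate that is neither null nor blank, or "" if there is none. */
    static String firstNonBlank(String... candidates) {
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        return "";
    }

    /** Turns a protocol-relative URL ("//host/path") into an https URL. */
    static String toAbsolute(String url) {
        if (url.startsWith("//")) {
            return "https:" + url;
        }
        return url;
    }

    /**
     * With jsoup the arguments would come from calls such as img.attr("src"),
     * img.attr("data-lazy-img") and img.attr("source-data-lazy-img")
     * (the last two attribute names are assumptions). Element.attr returns ""
     * for a missing attribute, so no null checks are needed at the call site.
     */
    static String resolve(String src, String lazy, String sourceLazy) {
        return toAbsolute(firstNonBlank(src, lazy, sourceLazy));
    }

    public static void main(String[] args) {
        // img.example.com is a placeholder host
        System.out.println(resolve("", "//img.example.com/a.jpg", ""));
        // prints https://img.example.com/a.jpg
    }
}
```

An empty result then means that no image was found, which is easier to detect in the database than a bare "https:" prefix.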

