Scraping data with HttpClient + jsoup
Source: Internet  Editor: 程序博客网  Time: 2024/05/03 05:39
POST request:
import org.apache.http.HttpEntity;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Created by chl on 2017/7/28.
 */
public class CatchDataUtils {
    public static Integer catchData(String url) {
        int noConsumeNum = 0;
        // Create the client; try-with-resources closes it when the request is done
        try (CloseableHttpClient client = HttpClients.createDefault()) {
            // Create the HttpPost instance and attach the form parameters
            HttpPost httpPost = new HttpPost(url);
            List<NameValuePair> list = new ArrayList<NameValuePair>();
            list.add(new BasicNameValuePair("groupName", "memberGroup"));
            UrlEncodedFormEntity entity = new UrlEncodedFormEntity(list, "UTF-8");
            httpPost.setEntity(entity);
            // Execute the POST request
            try (CloseableHttpResponse response = client.execute(httpPost)) {
                String result = "";
                HttpEntity resEntity = response.getEntity();
                if (resEntity != null) {
                    result = EntityUtils.toString(resEntity, "UTF-8");
                }
                Document doc = Jsoup.parse(result);
                Elements ps = doc.select("p"); // selector: pick the elements that carry the target data
                // The value is expected in the second <p>, between '=' and the last ':'
                if (ps.size() > 1) {
                    String data = ps.get(1).toString();
                    noConsumeNum = Integer.valueOf(
                            data.substring(data.indexOf("=") + 1, data.lastIndexOf(":")).trim());
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return noConsumeNum;
    }
}
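
The extraction step in catchData takes whatever sits between the first "=" and the last ":" and parses it as an integer, which throws if the page layout changes. Below is a minimal sketch of a more defensive version of that one step, using only the JDK. The class name CountParser, the method parseCount, the fallback parameter, and the sample strings are all hypothetical and not part of the original post.

```java
// Hypothetical helper (not from the original post): extracts the integer
// located between the first '=' and the last ':' of a string.
final class CountParser {

    private CountParser() {
    }

    /** Returns the parsed number, or fallback when the expected pattern is missing. */
    static int parseCount(String data, int fallback) {
        if (data == null) {
            return fallback;
        }
        int start = data.indexOf('=');
        int end = data.lastIndexOf(':');
        // Both delimiters must be present, and ':' must come after '='
        if (start < 0 || end <= start) {
            return fallback;
        }
        try {
            return Integer.parseInt(data.substring(start + 1, end).trim());
        } catch (NumberFormatException e) {
            // The text between the delimiters was not a number
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Sample strings are invented for illustration only
        System.out.println(parseCount("<p>noConsume=42:ok</p>", 0));
        System.out.println(parseCount("<p>no delimiters here</p>", 0));
    }
}
```

In catchData this could stand in for the Integer.valueOf(...) line, for example parseCount(ps.get(1).toString(), 0), so that a changed page yields the default value instead of an exception.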
GET request:
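import org.apache.http.HttpEntity;
import org.apache.http.HttpStatus;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;

public class StockUtils {
    // Fetch the page source for the given URL
    public static String getHtmlByUrl(String url) throws IOException {
        String html = null;
        CloseableHttpClient httpClient = HttpClients.createDefault(); // create the HttpClient object
        HttpGet httpGet = new HttpGet(url);
        try (CloseableHttpResponse response = httpClient.execute(httpGet)) {
            int statusCode = response.getStatusLine().getStatusCode();
            if (statusCode == HttpStatus.SC_OK) {
                HttpEntity entity = response.getEntity();
                if (entity != null) {
                    html = EntityUtils.toString(entity); // get the HTML source
                }
            }
        } catch (Exception e) {
            System.out.println("Error while requesting [" + url + "]!");
            e.printStackTrace();
        } finally {
            // Release the connection
            httpClient.close();
        }
        return html;
    }
}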
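
Note that the GET example calls EntityUtils.toString(entity) without a charset, while the POST example passes "UTF-8". To my knowledge, HttpClient 4.x then uses the charset declared in the response's Content-Type header and falls back to ISO-8859-1 when none is declared, which garbles Chinese pages served as UTF-8. The sketch below only illustrates that effect with the JDK. The class name CharsetDemo, the method decode, and the sample markup are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo (not from the original post): shows what happens when
// UTF-8 response bytes are decoded with the wrong charset.
final class CharsetDemo {

    private CharsetDemo() {
    }

    /** Decodes raw body bytes with the given charset. */
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // "\u6570\u636e" is the Chinese word for "data"; the markup is invented
        String original = "<p>\u6570\u636e=1:ok</p>";
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        String asUtf8 = decode(body, StandardCharsets.UTF_8);
        String asLatin1 = decode(body, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 decoding matches original: " + asUtf8.equals(original));
        System.out.println("ISO-8859-1 decoding matches original: " + asLatin1.equals(original));
    }
}
```

If the target site does not declare its charset, passing one explicitly, as the POST example does with EntityUtils.toString(resEntity, "UTF-8"), is probably the safer choice.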
Maven dependencies:
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.3</version>
</dependency>
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.7.2</version>
</dependency>