Implementing the Book Search Feature (Library Client)

Source: Internet. Editor: 程序博客网. Time: 2024/04/20 05:27

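Today I finished the book search feature. It is fairly involved, because the HTML of the search results page is not well formed and parsing it takes a lot of patience.

The first step is to fetch the result HTML for a given query. Many query conditions are possible; to keep things practical and simple, I deliberately restricted the query to three: keyword, East Campus, and available for loan.

The method that fetches the result HTML is as follows: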

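    // Imports needed by this method (it is a static method of LibraryUtil)
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.http.HttpResponse;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.ClientProtocolException;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.utils.URLEncodedUtils;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.message.BasicNameValuePair;
    import org.apache.http.protocol.HTTP;
    import org.apache.http.util.EntityUtils;

    /**
     * Search for books by keyword.
     *
     * The search works both with and without a login. At the moment a new
     * HttpClient is created here, so no login is required. To allow searching
     * only after login, use the global HttpClient instead of creating a new one.
     *
     * @param keyword
     *            the search keyword
     * @return the HTML of the search results, or "" on failure
     */
    public static String serchBook(String keyword) {
        HttpGet httpGet = null;
        HttpClient httpclient = new DefaultHttpClient();

        /**
         * The order of the fields matters.
         *
         * Query conditions: keyword, East Campus, available for loan.
         */
        List<NameValuePair> params = new ArrayList<NameValuePair>();
        params.add(new BasicNameValuePair("searchtype", "X"));
        params.add(new BasicNameValuePair("searcharg", keyword)); // search keyword
        params.add(new BasicNameValuePair("searchscope", "1"));   // 1 means East Campus
        params.add(new BasicNameValuePair("sortdropdown", "-"));
        params.add(new BasicNameValuePair("SORT", "DZ"));         // sort by date, newest first
        params.add(new BasicNameValuePair("extended", "0"));
        params.add(new BasicNameValuePair("SUBMIT", "检索"));      // the search button
        params.add(new BasicNameValuePair("availlim", "1"));      // only items available for loan
        params.add(new BasicNameValuePair("searchlimits", ""));
        params.add(new BasicNameValuePair("searchorigarg", ""));  // previous keyword and sort order

        // Encode the parameters
        String param = URLEncodedUtils.format(params, "UTF-8");
        System.out.println(param);

        try {
            // Join the URL and the parameters
            // http://innopac.lib.xjtu.edu.cn/search~S1*chx/
            String test_url = "http://innopac.lib.xjtu.edu.cn/search~S1*chx/";
            httpGet = new HttpGet(test_url + "?" + param);
            httpGet.setHeader("Host", "innopac.lib.xjtu.edu.cn");
            httpGet.setHeader("Referer", test_url + "?" + param);

            HttpResponse response = httpclient.execute(httpGet);
            int code = response.getStatusLine().getStatusCode();
            System.out.println("---------------searchbook------------------------");
            System.out.println(response.getStatusLine());

            if (code == 200) {
                return EntityUtils.toString(response.getEntity(), HTTP.UTF_8);
            }
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // httpGet is still null if the request could not be constructed
            if (httpGet != null) {
                httpGet.abort();
            }
        }
        return "";
    }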
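To make the "field order matters" point concrete, here is a minimal sketch that builds the same query string with only the JDK, so the encoded result can be inspected without sending a request. The class name `QueryStringSketch` and the method `buildQuery` are hypothetical, and `URLEncoder` stands in for `URLEncodedUtils`; the two may differ in how they encode a few special characters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original client.
class QueryStringSketch {

    static String buildQuery(String keyword) {
        // LinkedHashMap keeps insertion order, which preserves the field order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("searchtype", "X");
        params.put("searcharg", keyword);
        params.put("searchscope", "1");
        params.put("sortdropdown", "-");
        params.put("SORT", "DZ");
        params.put("extended", "0");
        // Label of the search button, written as Unicode escapes
        params.put("SUBMIT", "\u68C0\u7D22");
        params.put("availlim", "1");
        params.put("searchlimits", "");
        params.put("searchorigarg", "");

        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("Java"));
    }
}
```

Printing the string this way is a quick check that the keyword and the button label are percent-encoded as UTF-8 before they are appended to the search URL.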

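This gives us the HTML of the search results. As before, the next step is to parse it with jsoup and wrap the data in objects.

First, here is how the results look on the page:

[Image: search results page]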



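Based on the information we need to extract, two bean classes are required.

One class holds the bibliographic information of a book, the other holds its holdings information. The two classes are as follows:


1. Class BookInfo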

    package com.ali.login.bean;

    import java.util.List;

    /**
     * Details of one book in the search results
     *
     * @author shuyan
     */
    public class BookInfo {
        private String imgLink; // cover image link
        // brief title, e.g. "Java JDK 7实例宝典 Java JDK 7 shi li bao dian / 韩雪, 郭天娇编著"
        private String briefTitle;
        private String year; // e.g. "2014 文字印刷资料" (2014, printed text)
        private List<BookAddress> bookAddresses; // holdings information
        private String reserveLink; // reservation link

        public BookInfo() {
            super();
        }

        public BookInfo(String imgLink, String briefTitle, String year,
                List<BookAddress> bookAddresses, String reserveLink) {
            super();
            this.imgLink = imgLink;
            this.briefTitle = briefTitle;
            this.year = year;
            this.bookAddresses = bookAddresses;
            this.reserveLink = reserveLink;
        }

        public String getImgLink() {
            return imgLink;
        }

        public void setImgLink(String imgLink) {
            this.imgLink = imgLink;
        }

        public String getBriefTitle() {
            return briefTitle;
        }

        public void setBriefTitle(String briefTitle) {
            this.briefTitle = briefTitle;
        }

        public String getYear() {
            return year;
        }

        public void setYear(String year) {
            this.year = year;
        }

        public List<BookAddress> getBookAddresses() {
            return bookAddresses;
        }

        public void setBookAddresses(List<BookAddress> bookAddresses) {
            this.bookAddresses = bookAddresses;
        }

        public String getReserveLink() {
            return reserveLink;
        }

        public void setReserveLink(String reserveLink) {
            this.reserveLink = reserveLink;
        }

        @Override
        public String toString() {
            return "BookInfo [imgLink=" + imgLink + ", briefTitle=" + briefTitle
                    + ", year=" + year + ", bookAddresses=" + bookAddresses
                    + ", reserveLink=" + reserveLink + "]";
        }
    }


2. Class BookAddress

    package com.ali.login.bean;

    /**
     * Holdings information of a book
     *
     * @author shuyan
     */
    public class BookAddress {
        private String holdLand;   // holding location
        private String callNumber; // call number
        private String status;     // status

        public BookAddress() {
            super();
        }

        public BookAddress(String holdLand, String callNumber, String status) {
            this.holdLand = holdLand;
            this.callNumber = callNumber;
            this.status = status;
        }

        public String getHoldLand() {
            return holdLand;
        }

        public void setHoldLand(String holdLand) {
            this.holdLand = holdLand;
        }

        public String getCallNumber() {
            return callNumber;
        }

        public void setCallNumber(String callNumber) {
            this.callNumber = callNumber;
        }

        public String getStatus() {
            return status;
        }

        public void setStatus(String status) {
            this.status = status;
        }

        @Override
        public String toString() {
            return "BookAddress [holdLand=" + holdLand + ", callNumber="
                    + callNumber + ", status=" + status + "]";
        }
    }

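With these two classes in place, we can parse the HTML and fill them.

This is where things get a bit tedious, because the markup on this page is quite irregular.

The code is as follows: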

    // Imports needed by this method
    import java.util.ArrayList;
    import java.util.List;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    /**
     * Parse the HTML of the search results
     *
     * @param searchResultHtml
     *            the HTML string
     *
     * @return the list of book entries
     */
    public static List<BookInfo> getSearchResult(String searchResultHtml) {
        List<BookInfo> bookInfos = new ArrayList<BookInfo>();
        Document document = Jsoup.parse(searchResultHtml);
        Elements items = document.getElementsByClass("briefCitRow"); // all book entries

        for (Element item : items) {
            List<BookAddress> bookAddresses = new ArrayList<>();

            Element ele_par = item.select("a[href]").get(0);
            // http://202.117.24.227/bibimage/zycover.php?isbn=9787121217074
            String imgLink = ele_par.child(0).attr("src"); // cover image link

            // Link for reserving the book
            Element ele_reserve = item.getElementsByClass("briefcitRequest").get(0);
            Element ele_ahref = ele_reserve.select("a[href]").get(0);
            String reserveLink = ele_ahref.attr("href"); // reservation link
            // The host has to be prepended to this link:
            // /availlim/search~S1*chx?/XJava&searchscope=1&SORT=DZ/XJava&searchscope=1&SORT=DZ&extended=0&SUBKEY=Java/1%2C2973%2C2973%2CC/requestbrowse~b3838346&FF=XJava&searchscope=1&SORT=DZ&1%2C1%2C

            // Note: there are two elements with class = briefcitDetail
            Elements ele_briefcitDetails = item.getElementsByClass("briefcitDetail");

            // Handle the first one
            String briefTitle = ele_briefcitDetails.get(0)
                    .getElementsByClass("briefcitTitle").get(0).text(); // brief title

            // Handle the second one
            String year = ele_briefcitDetails.get(1).text(); // year

            // Holdings information of the book
            Elements ele_addresses = item.getElementsByClass("briefcitItems").get(0)
                    .getElementsByClass("bibItems").get(0)
                    .getElementsByClass("bibItemsEntry");

            /**
             * Reservation is handled here as well
             */
            for (Element ele_add : ele_addresses) {
                // Each entry has 3 td tags
                Elements ele_tds = ele_add.getElementsByTag("td");
                String bookStore = ele_tds.get(0).text();
                String callNumber = ele_tds.get(1).text();
                String status = ele_tds.get(2).text();
                bookAddresses.add(new BookAddress(bookStore, callNumber, status));
            }

            bookInfos.add(new BookInfo(imgLink, briefTitle, year, bookAddresses,
                    reserveLink));
        }
        return bookInfos;
    }
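The comment in the parsing code notes that the reservation link is relative and needs the host prepended. One way to do that, sketched below with only the JDK, is to resolve the link against a base URI. The class `ReserveLinkSketch` and the method `toAbsolute` are hypothetical, and the base address is an assumption: it reuses the host from the search URL, which may not be the right host for reservation links.

```java
import java.net.URI;

// Hypothetical helper, not part of the original client.
class ReserveLinkSketch {

    // Assumption: reservation links live on the same host as the search page.
    static final URI BASE = URI.create("http://innopac.lib.xjtu.edu.cn/");

    /** Turns a site-relative href into an absolute URL; absolute hrefs pass through unchanged. */
    static String toAbsolute(String href) {
        // resolve() throws IllegalArgumentException if href contains
        // characters that are illegal in a URI, such as spaces.
        return BASE.resolve(href).toString();
    }

    public static void main(String[] args) {
        System.out.println(toAbsolute("/availlim/search~S1*chx?/XJava&searchscope=1&SORT=DZ"));
    }
}
```

jsoup can also do this directly: if the document is parsed with a base URI via `Jsoup.parse(html, baseUri)`, then `element.absUrl("href")` returns the absolute form of the link.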


This gives us the list of populated book entries. A quick test:

    public static void main(String[] args) {
        String searchResultHtml = LibraryUtil.serchBook("Java");
        List<BookInfo> bookInfos = getSearchResult(searchResultHtml);
        int i = 0;
        for (BookInfo bookInfo : bookInfos) {
            if (i < 5) {
                System.out.println(bookInfo.toString());
            }
            i++;
        }
    }

The result is:


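With that, the book search feature is working.

Of course, several things still need to be considered, such as the total number of results, how many records to show per page, filtering out meaningless results, and the reservation feature.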
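As a starting point for the "records per page" item, the sketch below slices an already parsed result list into pages on the client side. It is only an illustration: the class `ResultPagingSketch` and its methods are hypothetical, and it pages over the entries that were parsed from a single result page, not over the total result count reported by the library site.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original client.
class ResultPagingSketch {

    /** Number of pages needed to show total items, pageSize per page. */
    static int pageCount(int total, int pageSize) {
        if (total < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and pageSize > 0");
        }
        return (total + pageSize - 1) / pageSize; // ceiling division
    }

    /** Returns the items of the zero-based page pageIndex, or an empty list past the end. */
    static <T> List<T> page(List<T> all, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize; // long avoids int overflow
        if (from >= all.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + pageSize, (long) all.size());
        // Copy so the page does not depend on later changes to the source list
        return new ArrayList<T>(all.subList((int) from, to));
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        System.out.println(pageCount(titles.size(), 5));
        System.out.println(page(titles, 1, 5));
    }
}
```

With a helper like this, the counter in the test method could be replaced by a call such as `page(bookInfos, 0, 5)`.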






