Building an Index with Lucene Myself
Source: Internet  Editor: 程序博客网  Date: 2024/06/05 07:11
This builds on the pages already collected by my spider, and it still connects to a MySQL database.
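import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

class LinkToDb {
    protected Connection con;
    protected PreparedStatement preCount;
    protected PreparedStatement preSelect;

    // Loads the JDBC driver, opens the connection, and prepares both queries.
    LinkToDb(String driver, String sqlurl) {
        try {
            Class.forName(driver);
            con = DriverManager.getConnection(sqlurl);
            preCount = con.prepareStatement("SELECT count(*) as qty FROM visited_tab;");
            preSelect = con.prepareStatement("SELECT * FROM visited_tab;");
        } catch (Exception e) {
            e.printStackTrace(); // report the failure instead of silently ignoring it
        }
    }

    // Returns the number of rows in visited_tab, or 0 if the query fails.
    public int GetTableNum() {
        int count = 0;
        try (ResultSet rs = preCount.executeQuery()) {
            if (rs.next()) {
                count = rs.getInt("qty");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return count;
    }

    // Returns a ResultSet over every row of visited_tab, or null if the query fails.
    // The caller is responsible for closing it.
    public ResultSet GetResult() {
        ResultSet rs = null;
        try {
            rs = preSelect.executeQuery();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return rs;
    }
}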
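The GetResult() method fetches every row in the table. (One thing I'm not sure about: is rs just a reference, or an object that holds all the data? If it holds everything, then what happens when the database is very large...)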
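On the question above: in Java, rs is always a reference to a ResultSet object that acts as a cursor, but whether the rows themselves sit in memory is up to the JDBC driver. MySQL Connector/J buffers the whole result on the client by default, and streams rows one at a time only when the statement is forward-only, read-only, and has a fetch size of Integer.MIN_VALUE. Below is a minimal sketch of that setup; the class name StreamingQuery and the helper prepareStreaming are hypothetical and not part of the original code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, not from the original post.
class StreamingQuery {
    // Prepares a forward-only, read-only statement and asks the driver to
    // stream rows instead of buffering the whole result set in memory.
    static PreparedStatement prepareStreaming(Connection con, String sql) {
        try {
            PreparedStatement ps = con.prepareStatement(
                    sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            // Integer.MIN_VALUE is the MySQL Connector/J signal for row-by-row
            // streaming; other drivers may treat this value differently.
            ps.setFetchSize(Integer.MIN_VALUE);
            return ps;
        } catch (SQLException e) {
            throw new IllegalStateException("could not prepare streaming query", e);
        }
    }
}
```

With this in place, preSelect could be created as prepareStreaming(con, "SELECT * FROM visited_tab") and read with the usual while (rs.next()) loop. While a streaming result is open, Connector/J does not allow other statements on the same connection, so the count query would have to run first.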
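Create the indexer object: creatIndex ci = new creatIndex();
Then open the writer: IndexWriter writer = new IndexWriter(dir, new CJKAnalyzer(), true); I went with CJKAnalyzer here, heh. The final argument true tells Lucene to create a fresh index, replacing any existing one at dir. After that, Lucene builds the index: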
ci.createConnection();
count = ci.getTableNum();
if (count < 1) {
    System.out.println("no record in database");
} else {
    rs = ci.getResult();
    while (rs.next()) {
        Document doc = new Document();
        // Keyword: stored and indexed as a single untokenized term
        doc.add(Field.Keyword("url", rs.getString("url")));
        // Text: tokenized, indexed, and stored
        doc.add(Field.Text("title", rs.getString("title")));
        // UnStored: tokenized and indexed, but not stored in the index
        doc.add(Field.UnStored("text", rs.getString("text")));
        // UnIndexed: stored for display only, not searchable
        doc.add(Field.UnIndexed("encode", rs.getString("encode")));
        doc.add(Field.UnIndexed("last_modify_time", rs.getString("last_modify_time")));
        writer.addDocument(doc);
        System.out.println(rs.getString("url") + " has been indexed");
    }
    rs.close();
    writer.optimize(); // merge segments once all documents are added
    writer.close();
    System.out.println("complete");
}
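The search code is actually finished too, but because the spider does not use any page-analysis algorithm, searches return a lot of unnecessary content. I want to look into the PageRank algorithm and use it to improve the spider.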
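For reference, here is a minimal sketch of the basic PageRank iteration on a small in-memory link graph. It is not the author's spider code: the class name PageRankSketch, the adjacency-array representation, the damping factor 0.85, and the three-page example graph are all assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch, not part of the original spider.
class PageRankSketch {
    // links[i] lists the pages that page i links to.
    // Returns one score per page; the scores sum to 1.
    static double[] pageRank(int[][] links, double damping, int iterations) {
        int n = links.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            double dangling = 0.0; // rank held by pages with no out-links
            for (int i = 0; i < n; i++) {
                if (links[i].length == 0) {
                    dangling += rank[i];
                } else {
                    // Split this page's rank evenly among its out-links.
                    double share = rank[i] / links[i].length;
                    for (int target : links[i]) {
                        next[target] += share;
                    }
                }
            }
            for (int i = 0; i < n; i++) {
                // Random jump plus damped link contributions; dangling rank
                // is spread evenly over all pages.
                next[i] = (1 - damping) / n + damping * (next[i] + dangling / n);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Assumed example graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
        int[][] links = { {1, 2}, {2}, {0} };
        System.out.println(Arrays.toString(pageRank(links, 0.85, 50)));
    }
}
```

In a spider, scores like these could be stored next to each URL in visited_tab and used to rank or filter search hits, though how to combine them with Lucene's own scoring is a separate design decision.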