The basic steps of regain
Source: Internet  Editor: 程序博客网  Time: 2024/04/18 21:40
Technical Details
Technology
regain is based on Jakarta Lucene, a library for creating and searching search indices.
regain itself is 100% pure Java. The only non-Java parts are the plugins that read the Excel, PowerPoint and Word formats. For Excel and Word, however, there are alternatives written in 100% pure Java.
Searching with regain
The work of regain is split into two parts: creating the search index and searching that index.
The following image gives an overview of how regain searches.
The creation of the search index
The crawler searches a website or a directory tree for documents. In the configuration you may specify what exactly should be crawled. From each document the actual text is extracted using so-called preparators. The text is added to the search index.
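The crawl step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `CrawlSketch`, `Preparator`, `PlainTextPreparator` and `HtmlPreparator` are hypothetical names, not regain's actual classes, and a plain map stands in for the Lucene search index.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: none of these names are regain's real classes.
class CrawlSketch {

    /** A preparator extracts the plain text from one document format. */
    interface Preparator {
        boolean accepts(Path file);
        String extractText(Path file) throws IOException;
    }

    /** Plain text files: the extracted text is simply the file content. */
    static class PlainTextPreparator implements Preparator {
        public boolean accepts(Path file) {
            return file.getFileName().toString().toLowerCase().endsWith(".txt");
        }
        public String extractText(Path file) throws IOException {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        }
    }

    /** HTML files: tags are stripped with a crude regex (illustration only). */
    static class HtmlPreparator implements Preparator {
        public boolean accepts(Path file) {
            return file.getFileName().toString().toLowerCase().endsWith(".html");
        }
        public String extractText(Path file) throws IOException {
            String html = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
        }
    }

    /** Walks the directory tree and maps each supported file to its extracted text. */
    static Map<String, String> crawl(Path root, List<Preparator> preparators) throws IOException {
        Map<String, String> index = new LinkedHashMap<>();
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        for (Path file : files) {
            for (Preparator p : preparators) {
                if (p.accepts(file)) {
                    // The first matching preparator handles the document.
                    index.put(root.relativize(file).toString(), p.extractText(file));
                    break;
                }
            }
        }
        return index;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("crawl-sketch");
        Files.write(root.resolve("a.txt"), "hello index".getBytes(StandardCharsets.UTF_8));
        Files.write(root.resolve("b.html"), "<p>hello <b>web</b></p>".getBytes(StandardCharsets.UTF_8));
        List<Preparator> preparators = new ArrayList<>();
        preparators.add(new PlainTextPreparator());
        preparators.add(new HtmlPreparator());
        System.out.println(crawl(root, preparators));
    }
}
```

Files that no preparator accepts are skipped, which loosely mirrors the idea that the configuration decides what gets crawled.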
The search on the search index
After you've created a search index, you are able to perform searches. The search index is built in such a way that searching is very fast.
And this is already the whole trick of search engines: by using a cleverly structured search index, the time needed for a full-text search is moved from the actual search (where a user waits for the results) to the index creation (which runs automatically in the background).
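To make the trade-off concrete, here is a minimal sketch of an inverted index, the kind of structure that lets a search avoid rescanning every document. It is an assumption-laden toy, not Lucene's or regain's implementation: `InvertedIndexSketch` and its methods are hypothetical names, and tokenization is a simple split on non-word characters.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index: not the data structure Lucene actually uses on disk.
class InvertedIndexSketch {

    // Maps each term to the ids of the documents that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Index time: the expensive part, every word of every document is visited once. */
    void add(String docId, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Search time: a single map lookup, independent of the total amount of text. */
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("doc1", "The crawler reads documents");
        index.add("doc2", "Documents are indexed once");
        System.out.println(index.search("documents"));
        System.out.println(index.search("crawler"));
    }
}
```

All the scanning happens in `add`, which corresponds to the background index creation; `search` only looks up a prepared entry.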
Rating the search results
The search results are rated by the relative frequency of the search terms in the document. The more often a search term appears in a document, the higher that document ranks in the results. The length of the document is taken into account as well: a document with 100 words that contains a search term 5 times (5% of its words) is rated as a better hit than a document with 1000 words that contains the search term 10 times (1% of its words).
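The rating rule above can be illustrated with a simple relative-frequency score: occurrences of the term divided by the total number of words. This is a deliberately simplified sketch, not the formula Lucene actually applies (its real scoring is more involved), and `RelativeFrequencySketch`, `score` and `makeDoc` are hypothetical names.

```java
// Hypothetical sketch of rating by relative term frequency.
class RelativeFrequencySketch {

    /** Returns occurrences of term divided by the number of words in text. */
    static double score(String text, String term) {
        String[] words = text.toLowerCase().trim().split("\\s+");
        if (words.length == 0 || words[0].isEmpty()) {
            return 0.0; // empty document: nothing to rate
        }
        String needle = term.toLowerCase();
        int hits = 0;
        for (String word : words) {
            if (word.equals(needle)) {
                hits++;
            }
        }
        return (double) hits / words.length;
    }

    /** Builds a test document of totalWords words that contains term exactly hits times. */
    static String makeDoc(String term, int hits, int totalWords) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < totalWords; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(i < hits ? term : "filler");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String shortDoc = makeDoc("lucene", 5, 100);   // 5 hits in 100 words
        String longDoc = makeDoc("lucene", 10, 1000);  // 10 hits in 1000 words
        System.out.println("short: " + score(shortDoc, "lucene"));
        System.out.println("long:  " + score(longDoc, "lucene"));
    }
}
```

With this score the 100-word document rates higher than the 1000-word one, even though the longer document contains the term more often in absolute terms.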