Indexing

Source: Internet; Editor: 程序博客网; Time: 2024/05/16 10:15

Adding documents to an index
1.Keyword, UnIndexed, UnStored, Text
2.Heterogeneous Documents
3.Appendable Fields

Removing Documents from an index
1.delete()
2.hasDeletions()
3.isDeleted()

Undeleting Documents
1.undeleteAll()

Updating Documents in an index
1.update
2.updating by batching deletions

Boosting Documents and Fields
1.setBoost()
2.Document and Field

Indexing Dates
1.Keyword()
2.DateField
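
As a rough illustration of why a date indexed as a keyword string works for sorting and range comparison, the sketch below encodes dates as fixed-width yyyyMMdd strings, so that plain string order matches chronological order. The class and method names are hypothetical and use only the JDK; this is not Lucene API and not how DateField encodes its values.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper (not part of Lucene): turns a date into a fixed-width
// string whose lexicographic order matches chronological order.
class DateKeyword {
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyyMMdd");

    // Encodes a date as an 8-character string such as "20040309".
    static String encode(LocalDate date) {
        return date.format(FORMAT);
    }

    public static void main(String[] args) {
        String earlier = encode(LocalDate.of(2004, 3, 9));
        String later = encode(LocalDate.of(2004, 11, 21));
        // Plain string comparison is enough to order the two dates.
        System.out.println(earlier + " sorts before " + later + ": " + (earlier.compareTo(later) < 0));
    }
}
```

A string like this would then be stored in a keyword-style field, which is indexed as a single untokenized term.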

Indexing numbers
1.WhitespaceAnalyzer, StandardAnalyzer: keep numbers as tokens
2.SimpleAnalyzer, StopAnalyzer: discard numbers
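
The choice of analyzer matters for numbers because of how text is split into tokens. The sketch below is a simplified stand-in written with the JDK only: one method splits on whitespace (similar in spirit to WhitespaceAnalyzer), the other keeps only lowercased runs of letters (similar in spirit to SimpleAnalyzer). The class and method names are hypothetical, and the real analyzers do more than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for two tokenization strategies; not Lucene classes.
class NumberTokens {
    // Splits on whitespace only, so digit-only tokens survive.
    static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Keeps only runs of letters, lowercased, so digits never become tokens.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Room 42 costs 100";
        System.out.println(whitespaceTokens(text)); // numbers are kept
        System.out.println(letterTokens(text));     // numbers are dropped
    }
}
```

With letter-only tokenization a query for a number can never match, because the number was never put into the index.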

Indexing Fields used for sorting
1.Keyword
2.Integers, Floats and Strings

Indexing tuning
1.mergeFactor
2.maxMergeDocs
3.minMergeDocs

In-memory indexing: RAMDirectory

Batch indexing by using RAMDirectory as a buffer
1.Create an FSDirectory index
2.Create a RAMDirectory index
3.Add Documents to RAMDirectory index
4.Every so often, flush everything buffered in RAMDirectory into FSDirectory
5.Go to step 3.
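
The steps above can be sketched without Lucene as a small buffering loop. In the sketch below, two in-memory lists stand in for the RAMDirectory and FSDirectory indexes, and the flush threshold is an assumed parameter; the class and its methods are hypothetical. In Lucene itself the flush step would typically merge the RAM index into the disk index through IndexWriter.addIndexes(Directory[]) and then start a fresh RAMDirectory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the buffer-then-flush pattern; not Lucene API.
class BufferedIndexer {
    private final List<String> durable = new ArrayList<>(); // stands in for the FSDirectory index
    private final List<String> buffer = new ArrayList<>();  // stands in for the RAMDirectory index
    private final int flushEvery;                           // assumed threshold, in documents
    private int flushCount = 0;

    BufferedIndexer(int flushEvery) {
        if (flushEvery < 1) {
            throw new IllegalArgumentException("flushEvery must be at least 1");
        }
        this.flushEvery = flushEvery;
    }

    // Step 3: add to the in-memory buffer. Step 4: flush once the buffer is full.
    void add(String document) {
        buffer.add(document);
        if (buffer.size() >= flushEvery) {
            flush();
        }
    }

    // Moves everything buffered in memory into the durable index.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        durable.addAll(buffer);
        buffer.clear();
        flushCount++;
    }

    int durableSize() { return durable.size(); }
    int bufferedSize() { return buffer.size(); }
    int flushCount() { return flushCount; }
}
```

A final flush() is needed after the last document, since whatever is still buffered has not yet reached the durable index.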

Parallelizing indexing by working with multiple indexes

Limiting Field sizes:maxFieldLength
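
The effect of a field-length limit can be pictured as keeping only the first N tokens of a field and ignoring the rest. The sketch below shows that idea with the JDK only; the class and method are hypothetical, and the limit is passed in as a parameter rather than read from an IndexWriter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of truncating a field to a maximum number of tokens.
class FieldLimiter {
    // Returns at most maxFieldLength tokens, taken from the start of the field.
    static List<String> limit(List<String> tokens, int maxFieldLength) {
        int end = Math.min(tokens.size(), Math.max(0, maxFieldLength));
        return new ArrayList<>(tokens.subList(0, end));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "b", "c", "d");
        // Tokens past the limit are not indexed, so they cannot be searched.
        System.out.println(limit(tokens, 2));
    }
}
```

Anything past the limit is simply absent from the index, so a search for a term that appears only late in a long document will not find it.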

Optimizing an index only affects the speed of searches against that index, and doesn't
affect the speed of indexing. Optimizing speeds up searches by minimizing the number of
index files that need to be opened.

Concurrency rules
1.Any number of read-only operations may be executed concurrently.
2.Any number of read-only operations may be executed while an index is
being modified.
3.Only a single index-modifying operation may execute at a time.

Index locking
1.org.apache.lucene.lockDir
2.IndexReader.isLocked(directory)
3.IndexReader.unlock(directory)
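
File-based index locking can be pictured as the atomic creation of a marker file: whoever creates the file holds the lock, and everyone else sees that it already exists. The sketch below is a minimal illustration with the JDK only. The LockFile class, its method names, and the use of java.io.tmpdir as the lock directory are assumptions made for the example, not Lucene's implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of a lock based on a marker file; not Lucene API.
class LockFile {
    private final Path path;

    LockFile(Path path) {
        this.path = path;
    }

    // Places the lock file in the JVM temp directory (an assumed lock directory).
    static LockFile inTempDir(String name) {
        return new LockFile(Paths.get(System.getProperty("java.io.tmpdir"), name));
    }

    // Atomically creates the lock file; returns false if it already exists.
    boolean obtain() {
        try {
            Files.createFile(path);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isLocked() {
        return Files.exists(path);
    }

    // Removes the lock file. Forcing this on a lock held by a live process is unsafe.
    void release() {
        try {
            Files.deleteIfExists(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same caution applies to forcibly unlocking a real index: it is only safe when no other process is still modifying it, for example after a crash left a stale lock behind.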

Debugging indexing
writer.infoStream = System.out