The "there are more terms than documents in field "name", but it's impossible to sort on tokenized fields" exception in Solr or Lucene


Sorting in Solr threw an exception like the following:
there are more terms than documents in field "name", but it's impossible to sort on tokenized fields
In the Solr schema, name is a text field.
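To make the message more concrete, here is a minimal sketch of how a tokenized field can end up with more distinct terms than documents. It is a simplified model and not Lucene's actual code: the class TokenizedSortDemo, the helpers distinctTerms and checkSortable, the whitespace "analyzer", and the sample values are all hypothetical and exist only for illustration.

```java
import java.util.*;

public class TokenizedSortDemo {
    // Hypothetical helper: collects the distinct terms a field would contribute to the index.
    static Set<String> distinctTerms(List<String> values, boolean tokenized) {
        Set<String> terms = new TreeSet<>();
        for (String value : values) {
            if (tokenized) {
                // Crude stand-in for an analyzer: lowercase, then split on whitespace.
                for (String token : value.toLowerCase().split("\\s+")) {
                    if (!token.isEmpty()) {
                        terms.add(token);
                    }
                }
            } else {
                // Untokenized: the whole value is indexed as a single term.
                terms.add(value);
            }
        }
        return terms;
    }

    // Simplified model of the check (an assumption, not Lucene's implementation):
    // sorting expects at most one term per document in the sort field.
    static void checkSortable(String field, List<String> values, boolean tokenized) {
        int termCount = distinctTerms(values, tokenized).size();
        if (termCount > values.size()) {
            throw new RuntimeException("there are more terms than documents in field \""
                    + field + "\", but it's impossible to sort on tokenized fields");
        }
    }

    public static void main(String[] args) {
        // Two documents, each with one value in the "name" field.
        List<String> names = Arrays.asList("Apache Solr", "Apache Lucene Core");

        // Untokenized: 2 terms for 2 documents, so the check passes.
        checkSortable("name", names, false);

        // Tokenized: apache, solr, lucene, core = 4 terms for 2 documents.
        try {
            checkSortable("name", names, true);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the untokenized field passes because each document contributes exactly one term, while the tokenized field fails because two documents produce four distinct terms.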
The Javadoc for Lucene's Sort class explains why:
http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/search/Sort.html

Encapsulates sort criteria for returned hits.

The fields used to determine sort order must be carefully chosen. Documents must contain a single term in such a field, and the value of the term should indicate the document's relative position in a given sort order. The field must be indexed, but should not be tokenized, and does not need to be stored (unless you happen to want it back with the rest of your document data). In other words:

document.add (new Field ("byNumber", Integer.toString(x), Field.Store.NO, Field.Index.NOT_ANALYZED));


As this description says, a sort field "should not be tokenized". The Solr schema here, however, tokenizes the text field, so sorting on it raises an exception like the following:
there are more terms than documents in field "name", but it's impossible to sort on tokenized fields
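One common way to avoid the exception is to sort on an untokenized copy of the field rather than on the text field itself. The schema.xml fragment below is only a sketch under assumptions: the field name name_sort is hypothetical, and the text and string type names are assumed to match the field types defined in your schema (in the stock example schema, string maps to solr.StrField).

```xml
<!-- Existing tokenized field, still used for full-text search -->
<field name="name" type="text" indexed="true" stored="true"/>

<!-- Hypothetical untokenized copy, used only for sorting -->
<field name="name_sort" type="string" indexed="true" stored="false"/>

<!-- Fill name_sort with the raw value of name at index time -->
<copyField source="name" dest="name_sort"/>
```

After reindexing, the query would sort with sort=name_sort asc instead of sort=name asc.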

 

A user on Nabble asked about the same problem here:
http://old.nabble.com/Exception-when-field-sort.-td21302894.html