Configuring clustering in Solr

Source: Internet · Editor: 程序博客网 · Date: 2024/06/08 02:56

First, add the following to solrconfig.xml:

 <searchComponent
    name="clusteringComponent"
    enable="${solr.clustering.enabled:true}"
    class="org.apache.solr.handler.clustering.ClusteringComponent" >
    <!-- Declare an engine -->
    <lst name="engine">
      <!-- The name, only one can be named "default" -->
      <str name="name">default</str>
      <!--
           Class name of Carrot2 clustering algorithm. Currently available algorithms are:
          
           * org.carrot2.clustering.lingo.LingoClusteringAlgorithm
           * org.carrot2.clustering.stc.STCClusteringAlgorithm
          
           See http://project.carrot2.org/algorithms.html for the algorithm's characteristics.
        -->
      <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
      <!--
           Overriding values for Carrot2 default algorithm attributes. For a description
           of all available attributes, see: http://download.carrot2.org/stable/manual/#chapter.components.
           Use attribute key as name attribute of str elements below. These can be further
           overridden for individual requests by specifying attribute key as request
           parameter name and attribute value as parameter value.
        -->
      <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
    </lst>
    <lst name="engine">
      <str name="name">stc</str>
      <str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
    </lst>
  </searchComponent>
  <requestHandler name="/clustering" class="solr.SearchHandler">
     <lst name="defaults">
       <bool name="clustering">true</bool>
       <str name="clustering.engine">default</str>
       <bool name="clustering.results">true</bool>
       <!-- The title field -->
       <str name="carrot.title">name</str>
       <str name="carrot.url">id</str>
       <!-- The field to cluster on -->
       <str name="carrot.snippet">features</str>
       <!-- produce summaries -->
       <bool name="carrot.produceSummary">true</bool>
       <!-- the maximum number of labels per cluster -->
       <!--<int name="carrot.numDescriptions">5</int>-->
       <!-- produce sub clusters -->
       <bool name="carrot.outputSubClusters">false</bool>
    </lst>    
    <arr name="last-components">
      <str>clusteringComponent</str>
    </arr>
  </requestHandler>
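
The comments in the configuration say that engine attributes can be overridden per request by passing the attribute key as a request parameter. A minimal Python sketch of building such a request URL follows. The helper name `clustering_url` is hypothetical, and the base URL is assumed to be the one used later in this post.

```python
from urllib.parse import urlencode


def clustering_url(base, query, rows=10, **overrides):
    """Build a request URL for the /clustering handler.

    `overrides` are sent as ordinary request parameters, for example a
    different engine name or a Carrot2 attribute key.
    """
    params = {"q": query, "rows": rows}
    params.update(overrides)
    return base.rstrip("/") + "/clustering?" + urlencode(params)


# Parameter names contain dots, so they are passed through a dict.
# "stc" is the second engine declared in the configuration above.
url = clustering_url(
    "http://localhost:8080/solr",
    "*:*",
    **{"clustering.engine": "stc"},
)
print(url)
```

Selecting the engine per request keeps `default` as the fallback in solrconfig.xml while still letting you compare both algorithms on the same query.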

 

 

Next, add the extension JARs to the %solr_home%/lib directory.

From the downloaded Solr distribution, copy the following into %solr_home%/lib:

dist/apache-solr-clustering-*.jar

all JAR files under contrib/clustering

all JAR files under contrib/clustering/downloads
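
The copy step can also be scripted. The following is a rough sketch only: the function name `copy_clustering_jars` is hypothetical, and the glob patterns are taken directly from the three locations listed above, relative to the root of the Solr download.

```python
import shutil
from pathlib import Path

# Patterns from the list above, relative to the Solr download directory.
JAR_PATTERNS = [
    "dist/apache-solr-clustering-*.jar",
    "contrib/clustering/*.jar",
    "contrib/clustering/downloads/*.jar",
]


def copy_clustering_jars(solr_dist, solr_lib):
    """Copy every JAR matching JAR_PATTERNS into solr_lib.

    Returns the names of the copied files.
    """
    lib = Path(solr_lib)
    lib.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in JAR_PATTERNS:
        for jar in sorted(Path(solr_dist).glob(pattern)):
            shutil.copy2(jar, lib / jar.name)
            copied.append(jar.name)
    return copied
```

Call it with the path of the unpacked Solr download and the path of your %solr_home%/lib directory. The returned list makes it easy to see whether the downloads directory contributed anything.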

 

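While adding the JARs I ran into a problem: the contrib/clustering/downloads directory of the downloaded Solr distribution contained no JAR files. They are fetched by running the build.xml in the contrib/clustering directory.

So install Ant first, then open a command prompt (cmd), change to the contrib/clustering directory, and run the ant command.

This downloads the required JARs, four in total:

simple-xml-1.7.3.jar, pcj-1.2.jar, colt-1.2.0.jar, nni.jar

However, the download path that build.xml specifies for nni.jar appears to be wrong, so that file was not downloaded. You have to search for it online and download it yourself.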
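
Because one of the downloads can fail silently, it may help to check the downloads directory after running Ant. A small sketch, assuming the four file names listed above are exactly what the build should produce (the helper `missing_jars` is hypothetical):

```python
from pathlib import Path

# The four JARs this post says the Ant build should fetch.
EXPECTED_JARS = [
    "simple-xml-1.7.3.jar",
    "pcj-1.2.jar",
    "colt-1.2.0.jar",
    "nni.jar",
]


def missing_jars(downloads_dir):
    """Return the expected JAR names that are not present in downloads_dir."""
    present = {p.name for p in Path(downloads_dir).glob("*.jar")}
    return [name for name in EXPECTED_JARS if name not in present]
```

Running `missing_jars("contrib/clustering/downloads")` from the root of the Solr download should return an empty list once every JAR is in place.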

 

 

 

 

Start Solr and request the clustering handler:

http://localhost:8080/solr/clustering?q=*:*&rows=10
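
To inspect the result programmatically, the response can be requested as JSON (for example by appending `wt=json` to the URL) and then summarized. The sketch below assumes the response carries a top-level `clusters` list whose entries have `labels` and `docs` keys. Verify that shape against your own Solr version. The sample string is hand-written for illustration and is not real Solr output.

```python
import json


def summarize_clusters(response_text):
    """Return (label, document count) pairs from a JSON clustering response."""
    data = json.loads(response_text)
    return [
        (", ".join(c.get("labels", [])), len(c.get("docs", [])))
        for c in data.get("clusters", [])
    ]


# Hand-written sample in the assumed shape, not real Solr output.
sample = (
    '{"clusters": ['
    '{"labels": ["DDR"], "docs": ["a", "b"]}, '
    '{"labels": ["Other Topics"], "docs": ["c"]}'
    "]}"
)
print(summarize_clusters(sample))
```

A response without a `clusters` key yields an empty list, which is a quick hint that the clustering component did not run for that request.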

 

 
