Configuring Solr with a database: an introduction

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 02:45

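This article uses MySQL as the example and walks through the setup in some detail. It relies on Solr's “dataimport” feature (the DataImportHandler).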

1. Add the following to conf\solrconfig.xml to enable data import:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

2. Add a data source configuration file, data-config.xml, under the conf\ directory with the following content:

<dataConfig>
    <dataSource type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://172.0.0.1:3306/cmntadmin"
                user="root"
                password=""/>
    <document name="content">
        <entity name="node" query="select id,username,creator from forbiduser">
            <field column="id" name="id" />
            <field column="username" name="name" />
            <field column="creator" name="contents" />
        </entity>
    </document>
</dataConfig>

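This configures the data source. The content of the entity comes from the result of its “query”. Each field element corresponds to one column returned by that query: “column” is the database column name, and “name” must match a field name configured in “schema.xml”.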
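
The rule that every “name” in data-config.xml must match a field in schema.xml can be checked mechanically before starting an import. The following is only a minimal sketch: it inlines trimmed-down copies of the two files, the helper name unmapped_fields is hypothetical, and it assumes explicit field declarations only (it does not account for dynamicField patterns).

```python
import xml.etree.ElementTree as ET

# Trimmed-down copies of the two files, inlined so the sketch is self-contained.
DATA_CONFIG = """
<dataConfig>
  <document name="content">
    <entity name="node" query="select id,username,creator from forbiduser">
      <field column="id" name="id" />
      <field column="username" name="name" />
      <field column="creator" name="contents" />
    </entity>
  </document>
</dataConfig>
"""

SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" />
    <field name="name" type="text_general" />
    <field name="contents" type="text_ik" />
  </fields>
</schema>
"""


def unmapped_fields(data_config_xml, schema_xml):
    """Return the field names used in data-config.xml that schema.xml does not declare."""
    schema_names = {f.get("name") for f in ET.fromstring(schema_xml).iter("field")}
    import_names = [f.get("name") for f in ET.fromstring(data_config_xml).iter("field")]
    return [name for name in import_names if name not in schema_names]


if __name__ == "__main__":
    # An empty list means every imported field has a matching schema field.
    print(unmapped_fields(DATA_CONFIG, SCHEMA))
```

In practice the two strings would be read from the files in the conf\ directory instead of being inlined.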

3. Create schema.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.5">
  <fields>
    <!-- If you remove this field, you must _also_ disable the update log in solrconfig.xml
         or Solr won't start. _version_ and update log are required for SolrCloud
    -->
    <field name="_version_" type="long" indexed="true" stored="true"/>

    <!-- points to the root document of a block of nested documents. Required for nested
         document support, may be removed otherwise
    -->
    <field name="_root_" type="string" indexed="true" stored="false"/>
    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
    <field name="name" type="text_general" indexed="true" stored="true"/>
    <!-- multiValued: contents holds its own value plus the value copied from name;
         a single-valued copyField destination that already has a value is rejected at index time -->
    <field name="contents" type="text_ik" indexed="true" stored="true" multiValued="true"/>
  </fields>
  <!-- Field to use to determine and enforce document uniqueness.
       Unless this field is marked with required="false", it will be a required field
  -->
  <uniqueKey>id</uniqueKey>
  <!-- DEPRECATED: The defaultSearchField is consulted by various query parsers when
       parsing a query string that isn't explicit about the field.  Machine (non-user)
       generated queries are best made explicit, or they can use the "df" request parameter
       which takes precedence over this.
       Note: Un-commenting defaultSearchField will be insufficient if your request handler
       in solrconfig.xml defines "df", which takes precedence. That would need to be removed.-->
  <defaultSearchField>contents</defaultSearchField>
  <copyField source="name" dest="contents"/>
  <solrQueryParser defaultOperator="OR"/>
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <fieldType name="text_ik" class="solr.TextField">
      <analyzer class="org.wltea.analyzer.lucene.IKAnalyzer"/>
    </fieldType>
  </types>
</schema>

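Important elements in schema.xml:

Solr needs copyField in order to search the values of several fields with one query. With the settings used here, a query that does not name a field searches contents, which also holds the values copied from name (id is not copied, so it is only covered if a copyField is added for it as well):
<defaultSearchField>contents</defaultSearchField>

copyField copies the value of one field into another field. For example, copying name into the default search field means that Solr also matches text from name when it searches that field:
<copyField source="name" dest="contents"/>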
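
To make the effect of copyField concrete, here is a small simulation in plain Python. It is not Solr code: the function apply_copy_fields and the sample document are hypothetical, and the sketch only mimics the idea that the raw source value is appended to the destination field before analysis. Note that the destination ends up with more than one value, which Solr accepts only when the destination field is declared multiValued.

```python
def apply_copy_fields(doc, copy_fields):
    """Mimic copyField: append each source field's values to the destination field.

    doc maps field names to lists of values; copy_fields is a list of
    (source, dest) pairs, like <copyField source="name" dest="contents"/>.
    """
    # Copy the lists so the caller's document is left untouched.
    result = {key: list(values) for key, values in doc.items()}
    for source, dest in copy_fields:
        result.setdefault(dest, []).extend(doc.get(source, []))
    return result


if __name__ == "__main__":
    # Hypothetical row: username -> name, creator -> contents.
    row = {"id": ["1"], "name": ["alice"], "contents": ["admin"]}
    print(apply_copy_fields(row, [("name", "contents")]))
```

A search against contents now sees both the creator value and the copied username, which is why a query on the default field also finds text that came from name.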

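4. Add the required JAR files

Because this article uses MySQL as the data source, the JDBC driver JAR (mysql-connector.jar) is required. The dataimport feature also needs solr-dataimporthandler-4.7.2.jar and solr-dataimporthandler-extras-4.7.2.jar. These two do not need to be downloaded separately; they are already in the \dist directory.

Copy these three JARs into the lib directory of the solr web application under Tomcat (webapps\solr\WEB-INF\lib).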
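
If the copy step is repeated often (for example on several machines), it can be scripted. This is a rough sketch only: the function name copy_jars is hypothetical, and the example paths in the comment are assumptions that depend on where Solr and Tomcat are installed.

```python
import shutil
from pathlib import Path


def copy_jars(jar_paths, lib_dir):
    """Copy each JAR into lib_dir (created if missing) and return the copied file names."""
    lib = Path(lib_dir)
    lib.mkdir(parents=True, exist_ok=True)
    copied = []
    for jar in jar_paths:
        target = lib / Path(jar).name
        shutil.copy2(jar, target)  # copy2 also preserves the file's timestamps
        copied.append(target.name)
    return copied


# Example call (all paths are assumptions; adjust them to the local installation):
# copy_jars(
#     [
#         r"C:\solr-4.7.2\dist\solr-dataimporthandler-4.7.2.jar",
#         r"C:\solr-4.7.2\dist\solr-dataimporthandler-extras-4.7.2.jar",
#         r"C:\downloads\mysql-connector.jar",
#     ],
#     r"C:\tomcat\webapps\solr\WEB-INF\lib",
# )
```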

5. Build the index

Restart Tomcat.

A) Trigger a full index build with a URL:

http://localhost:8080/solr/dataimport?command=full-import

B) Use the “dataimport” module on the admin page:
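
The URL from option A can also be built and requested from a script, which is convenient for scheduled imports. The sketch below assumes the base URL http://localhost:8080/solr used in this article; the helper names dataimport_url and run_command are hypothetical. Besides full-import, the handler also accepts commands such as status, which reports on the progress of a running import.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def dataimport_url(base_url, command, **params):
    """Build a /dataimport URL such as .../solr/dataimport?command=full-import."""
    query = {"command": command}
    query.update(params)
    return base_url.rstrip("/") + "/dataimport?" + urlencode(query)


def run_command(base_url, command, **params):
    """Send the command to a running Solr instance and return the response body."""
    with urlopen(dataimport_url(base_url, command, **params)) as response:
        return response.read().decode("utf-8")


# Example calls (they need a running Solr at the assumed address):
# run_command("http://localhost:8080/solr", "full-import")
# run_command("http://localhost:8080/solr", "status")
```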

Reposted from: http://flyingsnail.blog.51cto.com/5341669/1575075. Thanks for sharing; my main interest here was the change to the default search field.

0 0