Solr Documentation Study Notes: Documents, Fields, and Schema Design
Source: Internet. Editor: 程序博客网. Date: 2024/05/16 09:07
Overview of Documents, Fields, and Schema Design
The fundamental premise of Solr is simple. You give it a lot of information, then later you can ask it questions and find the piece of information you want. The part where you feed in all the information is called indexing or updating. When you ask a question, it's called a query.
In short, Solr does two things: it builds an index and it answers queries against that index.
Solr’s Schema File
Solr stores details about the field types and fields it is expected to understand in a schema file. The name and location of this file may vary depending on how you initially configured Solr or if you modified it later.
- managed-schema is the name for the schema file Solr uses by default to support making schema changes at runtime via the Schema API or Schemaless Mode features. You may explicitly configure the managed schema features to use an alternative filename if you choose, but the contents of the file are still updated automatically by Solr.
- schema.xml is the traditional name for a schema file which can be edited manually by users who use the ClassicIndexSchemaFactory.
- If you are using SolrCloud you may not be able to find any file by these names on the local filesystem. You will only be able to see the schema through the Schema API (if enabled) or through the Solr Admin UI's Cloud Screens.
Solr's field definitions live in its schema definition file.
managed-schema
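managed-schema is Solr's default schema file. It can be changed at runtime through the Schema API or through the Schemaless Mode features.
Let's create a collection named test.
Solr automatically generates a managed-schema file for it.
The file begins with this comment: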
<!-- Solr managed schema - automatically generated - DO NOT EDIT -->
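In other words, we should not define fields by editing managed-schema by hand.
Instead, we can change the schema through the Schema API.
First, look at the fields currently defined in the schema.
Then use the Schema API to create a field called name.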
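The two Schema API steps above (listing the current fields, then adding name) can be sketched in Java with the JDK's built-in HTTP client. This is only a sketch, not the author's original commands: the base URL http://localhost:8983/solr/test, the field type string, the class name SchemaApiSketch, and the --send argument are all assumptions made for illustration. The /schema/fields endpoint and the add-field command are part of Solr's Schema API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class SchemaApiSketch {
    // Assumed address of the "test" collection on a local Solr instance.
    static final String BASE = "http://localhost:8983/solr/test";

    // Builds the JSON body of an add-field command.
    // No escaping is done: inputs are assumed to be plain identifiers.
    static String addFieldJson(String name, String type, boolean stored) {
        return "{\"add-field\":{\"name\":\"" + name + "\",\"type\":\"" + type
                + "\",\"stored\":" + stored + "}}";
    }

    // GET request that lists the fields currently defined in the schema.
    static HttpRequest listFields() {
        return HttpRequest.newBuilder(URI.create(BASE + "/schema/fields")).GET().build();
    }

    // POST request that sends a Schema API command to the collection.
    static HttpRequest postCommand(String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/schema"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) throws Exception {
        String body = addFieldJson("name", "string", true); // "string" is an assumed type
        HttpRequest list = listFields();
        HttpRequest add = postCommand(body);
        System.out.println(list.method() + " " + list.uri());
        System.out.println(add.method() + " " + add.uri() + " " + body);

        // Only contact Solr when asked to, since that needs a running server.
        if (args.length > 0 && args[0].equals("--send")) {
            HttpClient http = HttpClient.newHttpClient();
            System.out.println(http.send(list, HttpResponse.BodyHandlers.ofString()).body());
            System.out.println(http.send(add, HttpResponse.BodyHandlers.ofString()).body());
        }
    }
}
```

Run without arguments, the sketch only prints the requests it would send; with --send it talks to the assumed local Solr and prints the raw JSON responses.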
Go back to the admin console and check again.
Add a record.
Then query for it.
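Adding a record and querying it can also be done over plain HTTP. The sketch below only builds the two requests; the base URL, the sample document values, and the class name UpdateAndQuerySketch are assumptions, and the original post does not show which record or query was used.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class UpdateAndQuerySketch {
    // Assumed address of the "test" collection on a local Solr instance.
    static final String BASE = "http://localhost:8983/solr/test";

    // JSON array holding one document with the id and name fields.
    // No escaping is done: values are assumed to contain no quotes.
    static String docJson(String id, String name) {
        return "[{\"id\":\"" + id + "\",\"name\":\"" + name + "\"}]";
    }

    // POST to /update; commit=true makes the document searchable right away.
    static HttpRequest addDocument(String id, String name) {
        return HttpRequest.newBuilder(URI.create(BASE + "/update?commit=true"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(docJson(id, name), StandardCharsets.UTF_8))
                .build();
    }

    // GET to /select with the query string URL-encoded as UTF-8.
    static HttpRequest query(String q) {
        String encoded = URLEncoder.encode(q, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(BASE + "/select?wt=json&q=" + encoded))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Illustrative values only.
        HttpRequest add = addDocument("1", "solr");
        HttpRequest find = query("name:solr");
        System.out.println(add.method() + " " + add.uri() + " " + docJson("1", "solr"));
        System.out.println(find.method() + " " + find.uri());
    }
}
```

To actually send either request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) against a running Solr.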
Now do the same thing from code.
Define a User class:
import org.apache.solr.client.solrj.beans.Field;
import org.springframework.data.annotation.Id;

public class User {
    @Id
    @Field
    private String id;

    @Field
    private String name;

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "User [id=" + id + ", name=" + name + "]";
    }
}
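Main program:
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;

// saveSolrResource and client are defined elsewhere in the author's program
// and are not shown in this post.
User user = new User();
user.setId("123456");
user.setName("程高伟");
saveSolrResource(user);

// Search for the name that was just indexed and map the hits back to User beans.
SolrQuery query = new SolrQuery();
query.setQuery("程高伟");
QueryResponse rsp = client.query(query);
List<User> userList = rsp.getBeans(User.class);
System.out.println(userList);
Result: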
schema.xml
With ClassicIndexSchemaFactory, users can edit schema.xml by hand.
Before configuring schema.xml, you need to understand how Solr defines fields.
SolrCloud is not covered here for now.
Field Type Definitions and Properties
A field type definition can include four types of information:
- The name of the field type (mandatory)
- An implementation class name (mandatory)
- If the field type is TextField, a description of the field analysis for the field type
- Field type properties - depending on the implementation class, some properties may be mandatory.
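In short: a field type needs a name and an implementation class (both required). If the type is a TextField, it also carries a field analysis description, which drives tokenization. Any further properties depend on what the implementation class requires.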
Field types are defined in schema.xml. Each field type is defined between fieldType elements.
That is, every field type lives in schema.xml inside its own fieldType tag.
<fieldType name="ancestor_path" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/"/>
  </analyzer>
</fieldType>
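The first line gives the name, ancestor_path, and the implementation class, solr.TextField. The analyzer definitions sit in between the opening and closing tags. I will cover analyzers in a later post.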
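To make the parts of the ancestor_path definition concrete, the sketch below reads that same fieldType element with the JDK's DOM parser and pulls out the name, the implementation class, and the tokenizer used by each analyzer. It only illustrates the structure of the XML and is not how Solr itself loads a schema. The class name FieldTypeReader and its helper methods are hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FieldTypeReader {
    // The ancestor_path field type from the text, as a Java string.
    static final String ANCESTOR_PATH =
            "<fieldType name=\"ancestor_path\" class=\"solr.TextField\">"
          + "<analyzer type=\"index\">"
          + "<tokenizer class=\"solr.KeywordTokenizerFactory\"/>"
          + "</analyzer>"
          + "<analyzer type=\"query\">"
          + "<tokenizer class=\"solr.PathHierarchyTokenizerFactory\" delimiter=\"/\"/>"
          + "</analyzer>"
          + "</fieldType>";

    // Parses an XML string and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    // Maps each analyzer type ("index", "query") to its tokenizer class,
    // in document order. Assumes every analyzer has a tokenizer child.
    static Map<String, String> tokenizers(Element fieldType) {
        Map<String, String> result = new LinkedHashMap<>();
        NodeList analyzers = fieldType.getElementsByTagName("analyzer");
        for (int i = 0; i < analyzers.getLength(); i++) {
            Element analyzer = (Element) analyzers.item(i);
            Element tokenizer = (Element) analyzer.getElementsByTagName("tokenizer").item(0);
            result.put(analyzer.getAttribute("type"), tokenizer.getAttribute("class"));
        }
        return result;
    }

    public static void main(String[] args) {
        Element fieldType = parse(ANCESTOR_PATH);
        System.out.println("name  = " + fieldType.getAttribute("name"));
        System.out.println("class = " + fieldType.getAttribute("class"));
        tokenizers(fieldType).forEach(
                (type, cls) -> System.out.println(type + " tokenizer = " + cls));
    }
}
```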
The implementing class is responsible for making sure the field is handled correctly. In the class names in schema.xml, the string solr is shorthand for org.apache.solr.analysis or org.apache.solr.schema. Therefore, solr.TextField is really org.apache.solr.schema.TextField.
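In other words, the implementation class ensures the field is processed correctly, and solr in a schema.xml class name abbreviates org.apache.solr.analysis or org.apache.solr.schema, so solr.TextField really means org.apache.solr.schema.TextField.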
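The shorthand rule can be illustrated with a few lines of Java that expand a solr. name into candidate fully qualified class names. This is a simplified sketch that covers only the two packages named in the text; Solr's own class loading may search additional packages. The class name SolrShorthand and the method candidates are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SolrShorthand {
    // Only the two packages named in the text above.
    static final String[] PACKAGES = {
        "org.apache.solr.schema",
        "org.apache.solr.analysis"
    };

    // Expands "solr.X" into one candidate per package; any other name is
    // treated as already fully qualified and returned unchanged.
    static List<String> candidates(String className) {
        List<String> out = new ArrayList<>();
        if (!className.startsWith("solr.")) {
            out.add(className);
            return out;
        }
        String simpleName = className.substring("solr.".length());
        for (String pkg : PACKAGES) {
            out.add(pkg + "." + simpleName);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [org.apache.solr.schema.TextField, org.apache.solr.analysis.TextField]
        System.out.println(candidates("solr.TextField"));
    }
}
```

A loader would then try each candidate in turn and use the first one that exists; for solr.TextField that is org.apache.solr.schema.TextField.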
Properties
Field Types Included with Solr
These are the data types bundled with Solr.
The following table lists the field types that are available in Solr. The package org.apache.solr.schema includes all the classes listed in this table.
Field Properties by Use Case
Which properties to use for which use case.
Meaning of the superscript markers
Here is the overall shape of schema.xml:
<schema>
  <types>
  <fields>
  <uniqueKey>
  <copyField>
</schema>
Overall this still feels rather disorganized.
Consider it a record of my notes.
References
http://101.110.118.72/archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-6.1.pdf