Elasticsearch: Installation, Terminology, Indices, Query API, and Query DSL
Source: Internet  Editor: 程序博客网  Date: 2024/06/06 04:51
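Inverted index
Elasticsearch uses a structure called an inverted index for fast full-text search. An inverted index consists of a list of all the unique words that appear in any document and, for each word, the documents (and the positions within them) in which it appears.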
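As a rough illustration of the idea, the following Python sketch builds a toy inverted index that maps each term to the documents and positions where it occurs. The function names and the naive lowercase-and-split analysis are assumptions made for this example; Elasticsearch itself relies on Lucene and configurable analyzers.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each unique term to {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive analysis (an assumption for this sketch): lowercase, split on whitespace.
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return dict(index)


def search(index, term):
    """Return the sorted ids of the documents that contain the term."""
    return sorted(index.get(term.lower(), {}))


docs = {
    1: "The quick brown fox",
    2: "The lazy dog",
    3: "Quick quick dog",
}
index = build_inverted_index(docs)
print(index["quick"])        # {1: [1], 3: [0, 1]}
print(search(index, "dog"))  # [2, 3]
```

Looking up a term is a single dictionary access, which is why this layout suits full-text search better than scanning every document.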
Download and installation
Install or upgrade the JDK (Java SE Development Kit 8), then check the Java version and JAVA_HOME:
    java -version
    echo $JAVA_HOME
Download and unpack the Elasticsearch archive, using 2.2.1 as an example:
    curl -L -O https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.2.1/elasticsearch-2.2.1.tar.gz
    tar -xvf elasticsearch-2.2.1.tar.gz
    cd elasticsearch-2.2.1
Start Elasticsearch as a daemon (this requires a non-root account):
    bin/elasticsearch -d
Start Logstash:
    bin/logstash -f config/jdbc.conf
Here jdbc.conf is the configuration file.
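Terminology
Index (index)
Comparable to a database in an SQL database system.
Type (type)
Comparable to a table in an SQL database.
Document (document)
Comparable to a record or row in an SQL database.
Field (field)
Comparable to a column in an SQL database.
Mapping (mapping)
Comparable to a schema in an SQL database.
Boosting (boosting)
Increases the weight of a field; for example, title^5 multiplies the score contribution of title by 5.
Analyzer (analyzer)
Used to analyze text for indexing; it contains exactly one tokenizer.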
Elasticsearch ships with a wide range of built-in analyzers, which can be used in any index without further configuration.
Analyzers are composed of a single Tokenizer and zero or more TokenFilters. The tokenizer may be preceded by one or more CharFilters. The analysis module allows you to register Analyzers under logical names which can then be referenced either in mapping definitions or in certain APIs.
    POST _analyze
    {
      "analyzer": "whitespace",
      "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
    }
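The composition described above (char filters first, then a single tokenizer, then token filters) can be sketched in Python as follows. All function names here are hypothetical stand-ins written for this example; they only imitate the behaviour of the built-in html_strip, whitespace, and lowercase components in a simplified way.

```python
import re


def html_strip(text):
    # Char filter stand-in: remove anything that looks like an HTML tag.
    return re.sub(r"<[^>]+>", "", text)


def whitespace_tokenizer(text):
    # Tokenizer stand-in: split on runs of whitespace.
    return text.split()


def lowercase(tokens):
    # Token filter stand-in: lowercase every token.
    return [token.lower() for token in tokens]


def make_analyzer(tokenizer, char_filters=(), token_filters=()):
    """Compose zero or more char filters, one tokenizer, zero or more token filters."""
    def analyze(text):
        for char_filter in char_filters:
            text = char_filter(text)
        tokens = tokenizer(text)
        for token_filter in token_filters:
            tokens = token_filter(tokens)
        return tokens
    return analyze


analyzer = make_analyzer(whitespace_tokenizer, [html_strip], [lowercase])
print(analyzer("The 2 QUICK <b>Brown-Foxes</b>"))
# ['the', '2', 'quick', 'brown-foxes']
```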
Tokenizer (tokenizer)
Breaks a string into terms or tokens.
A tokenizer receives a stream of characters, breaks it up into individual tokens (usually individual words), and outputs a stream of tokens.
Tokenizers are used to break a string down into a stream of terms or tokens. A simple tokenizer might split the string up into terms wherever it encounters whitespace or punctuation.
    POST _analyze
    {
      "tokenizer": "whitespace",
      "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
    }
Filter (filter)
Suggester (suggester)
Used for search suggestions.
There are four kinds:
+ Term suggester
+ Phrase Suggester
+ Completion Suggester
+ Context Suggester
Index operations
Create an index
    PUT test
    {
      "settings": {
        "number_of_shards": 1
      },
      "mappings": {
        "type1": {
          "properties": {
            "field1": { "type": "text" }
          }
        }
      }
    }
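For readers who want to issue the same request from a script, here is a minimal sketch using only the Python standard library. It assumes a node listening on http://localhost:9200 (an assumption, not something stated above), and the helper name create_index_request is made up for this example. The request is only constructed here, not sent.

```python
import json
import urllib.request


def create_index_request(base_url, index, shards, properties, doc_type="type1"):
    """Build (but do not send) the PUT request that creates an index."""
    body = {
        "settings": {"number_of_shards": shards},
        "mappings": {doc_type: {"properties": properties}},
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# base_url is an assumed local node address.
req = create_index_request(
    "http://localhost:9200", "test", 1, {"field1": {"type": "text"}}
)
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
```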
Delete an index
    DELETE /twitter
Get an index
    GET /twitter
    GET twitter/_settings,_mappings
Check whether an index exists
    HEAD twitter
Open and close an index
    POST /my_index/_close
    POST /my_index/_open
Set a mapping (Put Mapping)
    PUT twitter
    {
      "mappings": {
        "tweet": {
          "properties": {
            "message": { "type": "text" }
          }
        }
      }
    }
Get a mapping (Get Mapping)
    GET /_mapping/tweet,kimchy
    GET /_all/_mapping/tweet,book
    GET /twitter,kimchy/_mapping/field/message
    GET /_all/_mapping/tweet,book/field/message,user.id
    GET /_all/_mapping/tw*/field/*.id
Check whether a type exists (Types Exists)
    HEAD twitter/_mapping/tweet
Update index settings (the index must be closed before its analysis settings can be updated)
    POST /twitter/_close
    PUT /twitter/_settings
    {
      "analysis": {
        "analyzer": {
          "content": {
            "type": "custom",
            "tokenizer": "whitespace"
          }
        }
      }
    }
    POST /twitter/_open
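The ordering matters here: close, update, reopen. A small Python sketch of that sequence as data is given below; the helper name analysis_update_plan is hypothetical, and the tuples would still have to be sent by whatever HTTP client is in use.

```python
def analysis_update_plan(index, analysis):
    """Return the (method, path, body) steps, in order, for updating analysis settings."""
    return [
        ("POST", f"/{index}/_close", None),                     # 1. close the index
        ("PUT", f"/{index}/_settings", {"analysis": analysis}),  # 2. change the settings
        ("POST", f"/{index}/_open", None),                      # 3. reopen the index
    ]


plan = analysis_update_plan(
    "twitter",
    {"analyzer": {"content": {"type": "custom", "tokenizer": "whitespace"}}},
)
for method, path, body in plan:
    print(method, path)
```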
Get settings (Get Settings)
    GET /twitter,kimchy/_settings
Analyze (Analyze)
    GET _analyze
    {
      "tokenizer": "keyword",
      "filter": ["lowercase"],
      "char_filter": ["html_strip"],
      "text": "this is a <b>test</b>"
    }
Detailed analyzer output (Explain Analyze)
    GET _analyze
    {
      "tokenizer": "standard",
      "filter": ["snowball"],
      "text": "detailed output",
      "explain": true,
      "attributes": ["keyword"]
    }
Index status information (Indices Stats, Segments, Recovery, Shard Stores)
    GET /index1,index2/_stats
    GET twitter/_stats?level=shards
    GET /index1,index2/_segments
    GET index1,index2/_recovery?human
    GET index1,index2/_shard_stores?status=green
Clear the cache (Clear Cache)
    POST /twitter/_cache/clear
    POST /kimchy,elasticsearch/_cache/clear
    POST /_cache/clear
Flush in-memory data to index storage (Flush)
    POST twitter/_flush
    POST kimchy,elasticsearch/_flush
    POST _flush
Refresh (Refresh)
    POST /twitter/_refresh
    POST /kimchy,elasticsearch/_refresh
    POST /_refresh
Force merge (Force Merge), which can reduce the number of segments
    POST /twitter/_forcemerge
    POST /kimchy,elasticsearch/_forcemerge
    POST /_forcemerge
Query API
search
    GET /twitter/_search?q=user:kimchy
    GET /twitter/tweet,user/_search?q=user:kimchy
    GET /kimchy,elasticsearch/tweet/_search?q=tag:wow
    GET /_all/tweet/_search?q=tag:wow
    GET /_search?q=tag:wow
URI Search
    GET twitter/tweet/_search?q=user:kimchy
q
The query string (maps to the query_string query, see Query String Query for more details).
Request body search
    GET /twitter/tweet/_search
    {
      "explain": true,
      "version": true,
      "query": { "term": { "user": "kimchy" } },
      "from": 0,
      "size": 10,
      "sort": [
        { "post_date": { "order": "asc" } },
        "user",
        { "name": "desc" },
        { "age": "desc" },
        "_score"
      ],
      "_source": [ "obj1.*", "obj2.*" ],
      "script_fields": {
        "test1": { "script": "params['_source']['message']" }
      },
      "post_filter": { "term": { "color": "red" } },
      "highlight": {
        "pre_tags": [ "<tag1>", "<tag2>" ],
        "post_tags": [ "</tag1>", "</tag2>" ],
        "fields": { "_all": {} }
      }
    }
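The from and size parameters in the request body are what drive pagination. The sketch below shows one way to derive them from a 1-based page number; the helper names page_window and search_body are invented for this example.

```python
def page_window(page, page_size):
    """Translate a 1-based page number into the from/size pair of a search body."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {"from": (page - 1) * page_size, "size": page_size}


def search_body(query, page, page_size):
    """Combine a query with the pagination window."""
    body = {"query": query}
    body.update(page_window(page, page_size))
    return body


body = search_body({"term": {"user": "kimchy"}}, page=3, page_size=10)
print(body)
# {'query': {'term': {'user': 'kimchy'}}, 'from': 20, 'size': 10}
```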
Inner hits
Query DSL
Full text queries
match
The standard query for full-text search.
    GET /_search
    {
      "query": {
        "match": { "message": "this is a test" }
      }
    }
match_phrase
A phrase query; an analyzer can be specified for the query text.
    GET /_search
    {
      "query": {
        "match_phrase": {
          "message": {
            "query": "this is a test",
            "analyzer": "my_analyzer"
          }
        }
      }
    }
match_phrase_prefix
    GET /_search
    {
      "query": {
        "match_phrase_prefix": {
          "message": {
            "query": "quick brown f",
            "max_expansions": 10
          }
        }
      }
    }
multi_match
The multi-field version of the match query.
    GET /_search
    {
      "query": {
        "multi_match": {
          "query": "this is a test",
          "fields": [ "subject^3", "message" ]
        }
      }
    }
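The field^boost notation used in fields can be read as "field name, optional multiplier". The sketch below parses that notation and applies the multiplier to made-up per-field scores. The raw scores are invented for illustration, and how Elasticsearch then combines the per-field scores depends on the multi_match type, which is not modeled here.

```python
def parse_field(spec):
    """Split 'subject^3' into ('subject', 3.0); a missing boost defaults to 1.0."""
    name, caret, boost = spec.partition("^")
    return name, float(boost) if caret else 1.0


fields = ["subject^3", "message"]
boosts = dict(parse_field(spec) for spec in fields)
print(boosts)  # {'subject': 3.0, 'message': 1.0}

# Hypothetical raw per-field scores for a single document.
raw_scores = {"subject": 0.5, "message": 1.2}
boosted = {name: raw_scores[name] * boost for name, boost in boosts.items()}
print(boosted)  # {'subject': 1.5, 'message': 1.2}
```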
common_terms
An alternative to stopword removal: very common terms (stopwords) are given less importance than rarer terms.
    GET /_search
    {
      "query": {
        "common": {
          "body": {
            "query": "this is bonsai cool",
            "cutoff_frequency": 0.001
          }
        }
      }
    }
query_string
Parses the query text with a query parser, supporting operators such as AND and OR.
    GET /_search
    {
      "query": {
        "query_string": {
          "default_field": "content",
          "query": "this AND that OR thus"
        }
      }
    }
simple_query_string
Uses SimpleQueryParser to parse the query text.
    GET /_search
    {
      "query": {
        "simple_query_string": {
          "query": "\"fried eggs\" +(eggplant | potato) -frittata",
          "analyzer": "snowball",
          "fields": [ "body^5", "_all" ],
          "default_operator": "and"
        }
      }
    }
Term-level queries
Lower-level queries that operate on the exact terms stored in the index.
- term
- terms
- range
- exists
- prefix
- wildcard
- regexp
- fuzzy
- type
- ids
    GET /_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "terms": { "user": [ "kimchy", "elasticsearch" ] }
          }
        }
      }
    }
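One practical consequence of term-level queries being "lower level" is that the search value is not analyzed, whereas a match query analyzes its text first. The toy Python sketch below imitates that difference. The set of indexed terms and both helper functions are assumptions for illustration, with a lowercasing analyzer assumed at index time.

```python
# Terms assumed to have been produced at index time by a lowercasing analyzer.
indexed_terms = {"quick", "brown", "fox"}


def term_matches(value):
    """Term-level lookup: the value is compared as-is, with no analysis."""
    return value in indexed_terms


def match_matches(text):
    """Full-text lookup: analyze the text first, then match any resulting term."""
    return any(token in indexed_terms for token in text.lower().split())


print(term_matches("Quick"))   # False
print(term_matches("quick"))   # True
print(match_matches("Quick"))  # True
```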
Compound queries
- constant_score
- bool
- dis_max
- function_score
- boosting
- indices
    GET /_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "term": { "user": "kimchy" }
          },
          "boost": 1.2
        }
      }
    }
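Of the compound queries listed above, bool is the one most often used to combine other queries. The sketch below assembles a bool query body as a Python dictionary; the helper name bool_query is hypothetical, and the clause contents reuse fields from the earlier examples.

```python
import json


def bool_query(must=(), filters=(), should=(), must_not=()):
    """Build a bool query body, leaving out clause lists that are empty."""
    clauses = {
        "must": list(must),
        "filter": list(filters),
        "should": list(should),
        "must_not": list(must_not),
    }
    return {"bool": {name: items for name, items in clauses.items() if items}}


query = bool_query(
    must=[{"match": {"message": "this is a test"}}],
    filters=[{"term": {"user": "kimchy"}}],
)
print(json.dumps({"query": query}, indent=2))
```

The printed JSON can be used as the body of a GET /_search request.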
Joining queries
- Nested
- Has Child
- Has Parent
- Parent Id
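As a hedged illustration of the first two entries, the sketch below builds nested and has_child query bodies as Python dictionaries. The helper names, the comments path, the comments.author field, and the blog_tag child type are all hypothetical; a real index would need a nested mapping or a parent/child relationship defined for these to match anything.

```python
import json


def nested_query(path, query, score_mode="avg"):
    """Wrap a query so that it runs against nested objects under the given path."""
    return {"nested": {"path": path, "score_mode": score_mode, "query": query}}


def has_child_query(child_type, query):
    """Match parent documents whose children of the given type match the query."""
    return {"has_child": {"type": child_type, "query": query}}


# Hypothetical field and type names, used only for illustration.
nested_body = {"query": nested_query("comments", {"match": {"comments.author": "kimchy"}})}
child_body = {"query": has_child_query("blog_tag", {"term": {"tag": "wow"}})}
print(json.dumps(nested_body, indent=2))
print(json.dumps(child_body, indent=2))
```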