Elasticsearch: Must-Know Issues for Installation and Use

Source: Internet. Editor: 程序博客网. Time: 2024/06/06 01:03


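There are plenty of introductory Elasticsearch tutorials online. Here I only cover the problems you are bound to run into on Windows, in the hope of saving newcomers to ES some time.

I wrote this note because these two problems stump most people who are just starting with Elasticsearch. Linux users hit similar problems, and reports of them can be found online for reference.

Item 1: Read before installing

Before you install: JDK 9 is too new for elasticsearch-rtf, so you must use JDK 8. I use JDK 8u152 (jdk-8u152-windows-x64.exe). Running elasticsearch-rtf (v5.1.1) on JDK 9 produces the error shown below, so take special care.
Elasticsearch 6.0, on the other hand, only installed for me with JDK 9; otherwise the msi downloaded from the official site failed to install. I have not looked into the cause.

The problem when running elasticsearch-rtf on JDK 9: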

$> elasticsearch
Java HotSpot(TM) 64-Bit Server VM warning: Option UseParNewGC was deprecated in version 9.0 and will likely be removed in a future release.
Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:190)
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
Caused by: java.lang.UnsupportedOperationException: Boot class path mechanism is not supported
        at java.management/sun.management.RuntimeImpl.getBootClassPath(RuntimeImpl.java:99)
        at org.elasticsearch.monitor.jvm.JvmInfo.<clinit>(JvmInfo.java:77)
        ...

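The two warnings appear because JDK 9 deprecates the UseParNewGC and UseConcMarkSweepGC options. The fatal error is the UnsupportedOperationException in the trace, raised because JDK 9 no longer supports the boot class path mechanism. For background on the GC options:

Deprecated GC options have been removed (JEP 214). Some GC options and option combinations were already deprecated in JDK 8 (JEP 173). These are no longer recognized and cause the JVM to abort at startup. The options to watch for are: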
-XX:-UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-Xincgc
-XX:+CMSIncrementalMode -XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode -XX:+UseConcMarkSweepGC -XX:-UseParNewGC
-XX:+UseCMSCompactAtFullCollection
-XX:+CMSFullGCsBeforeCompaction
-XX:+UseCMSCollectionPassing
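In JDK 9, the incremental mode of concurrent mark sweep (iCMS) has been removed, and at the time the plan was to remove CMS entirely in JDK 10 (it was eventually removed in JDK 14)...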

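Item 2: Read before using

The second caveat: countless short tutorials online use an ordinary command such as

$ curl -XPUT http://localhost:9200/test?pretty  -d '{  "settings": {  "number_of_shards" : 2, "number_of_replicas" : 0  } }'

On Windows, however, this is bound to fail with an error like the following: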

{
  "error" : "ElasticsearchParseException[failed to parse source for create index]; nested: JsonParseException[Unexpected end-of-input: expected close marker for OBJECT (from [Source: [B@401789d1; line: 1, column: 0])\n at [Source: [B@401789d1; line: 1, column: 3]]; ",
  "status" : 400
}
curl: (6) Could not resolve host: settings
curl: (3) [globbing] unmatched brace in column 1
curl: (6) Could not resolve host: number_of_shards
curl: (3) Bad URL, colon is first character
curl: (6) Could not resolve host: 2,
curl: (6) Could not resolve host: number_of_replicas
curl: (3) Bad URL, colon is first character
curl: (6) Could not resolve host: 0
curl: (3) [globbing] unmatched close brace/bracket in column 1
curl: (3) [globbing] unmatched close brace/bracket in column 1
...

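As shown above, the cause is how Windows handles quotes. cmd does not treat single quotes as quoting characters, so the JSON body is split on whitespace: curl receives only the first fragment as the request body and tries to treat the remaining fragments as URLs, which explains the "Could not resolve host" messages. The correct form wraps the body in double quotes instead of single quotes and writes each inner double quote as an escaped double quote (\"):

$ curl -XPUT http://localhost:9200/test?pretty  -d "{  \"settings\": {  \"number_of_shards\" : 2, \"number_of_replicas\" : 0  } }"

If you don't mind the clutter, you can also write each inner double quote as three consecutive double quotes (""" in place of "). Some users report similar problems on Linux, where the JSON parameters are sometimes not recognized and need escaping as well; I have not tried this myself.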
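Escaping every inner double quote by hand is error-prone, so the rule above can be applied mechanically. The following is a minimal Python sketch, not part of the original tutorial; the helper name cmd_quote_json is hypothetical, and it assumes the JSON body contains no backslashes and no characters that cmd expands, such as %.

```python
import json


def cmd_quote_json(body):
    # Serialize to compact JSON, then wrap it in double quotes and
    # escape every inner double quote with a backslash for cmd + curl.
    compact = json.dumps(body, separators=(",", ":"))
    if "\\" in compact:
        # Backslashes interact with the quote-escaping rule; this sketch
        # deliberately refuses such bodies instead of guessing.
        raise ValueError("body needs escapes this sketch does not handle")
    return '"' + compact.replace('"', '\\"') + '"'


settings = {"settings": {"number_of_shards": 2, "number_of_replicas": 0}}
print("curl -XPUT http://localhost:9200/test?pretty -d " + cmd_quote_json(settings))
```

The printed line can be pasted into a cmd window. Bodies with non-ASCII text or embedded quotes are rejected by this sketch, and for those sending the body from a file is the simpler route.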

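Item 3: Concepts

To understand the basics you need to know what the terms index, type, document and field actually mean. See the table below.

Traditional relational database (e.g. MySQL) compared with Elasticsearch

Relational DB    Elasticsearch    Meaning
Databases        Indices          An index (noun) corresponds to a database
Tables           Types            A type corresponds to a table name
Rows             Documents        A document corresponds to one row of data
Columns          Fields           A field corresponds to a column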

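Item 4: The main text

Getting started with Elasticsearch

Elasticsearch exposes its RESTful API on TCP port 9200, and you can talk to it with any client tool you like, including the most popular one, curl. The general format of a curl request to Elasticsearch is shown below.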

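curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'

The parameters are as follows:

VERB: the HTTP request method; the common ones are GET, POST, PUT, HEAD and DELETE
PROTOCOL: the protocol, http or https
HOST: the hostname of any node in the ES cluster
PORT: the port the ES service listens on, 9200 by default
PATH: the API endpoint path, for example an index, a type and a document ID
QUERY_STRING: query parameters, for example ?pretty to print easy-to-read JSON
BODY: the request body in JSON format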
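To make the role of each part concrete, here is a minimal Python sketch that assembles the same pieces into a request object without sending it. The function name build_request and its defaults are assumptions for illustration; only the standard library is used. The Content-Type header is an addition that newer Elasticsearch releases (6.0 and later) expect when a body is present.

```python
import json
import urllib.request


def build_request(verb, host, path, port=9200, protocol="http", query="", body=None):
    # Assemble <PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING> from its parts.
    url = f"{protocol}://{host}:{port}/{path.lstrip('/')}"
    if query:
        url += "?" + query
    # <BODY> is sent as UTF-8 encoded JSON; no body means no data at all.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, method=verb, headers=headers)


req = build_request("GET", "localhost", "", query="pretty")
print(req.get_method(), req.full_url)  # GET http://localhost:9200/?pretty
```

Passing the returned object to urllib.request.urlopen would actually send it, which of course requires a running node. Because the body is passed as bytes rather than through a shell, none of the quoting problems from Item 2 apply.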

For example, to check whether Elasticsearch is working properly, use the following command:

curl 'http://localhost:9200/?pretty'

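I consulted many online resources and cannot credit them all individually. The address below is the source of the curl examples used here and is worth a look. Keep in mind, though, that the commands given there cannot be run as-is on Windows, and some of the results differ considerably from those shown in the original post.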

(refer to https://www.cnblogs.com/austinspark-jessylu/p/6797060.html)

Create the index:

curl -XPUT "http://localhost:9200/music?pretty"

There is nothing unusual about this command. It returns

{  "acknowledged" : true}

Next, add some data.

Note that the following form fails; see Item 2 for the reason.

curl -XPUT "http://localhost:9200/music/songs/1" -d '{ "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }'

Returns:

{"error":"MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=0, length=1): [39]]; ","status":400}
curl: (3) [globbing] unmatched brace in column 1
curl: (6) Could not resolve host: name
curl: (6) Could not resolve host: Deck the Halls,
curl: (6) Could not resolve host: year
curl: (6) Could not resolve host: 1885,
curl: (6) Could not resolve host: lyrics
curl: (6) Could not resolve host: Fa la la la la
curl: (3) [globbing] unmatched close brace/bracket in column 1

The form that works is:

$ curl -XPUT "http://localhost:9200/music/songs/1" -d "{ \"name\": \"Deck the Halls\", \"year\": 1885, \"lyrics\": \"Fa la la la la\" }"

Returns:

{"_index":"music","_type":"songs","_id":"1","_version":1,"created":true}

Administrator@WIN10-711171017 MSYS ~
$ curl -XGET "http://localhost:9200/music/songs/1"
{"_index":"music","_type":"songs","_id":"1","_version":1,"found":true, "_source" : { "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }}

Doesn't it look nicer this way?

$ curl -XGET "http://localhost:9200/music/songs/1?pretty"

Returns:

{
  "_index" : "music",
  "_type" : "songs",
  "_id" : "1",
  "_version" : 1,
  "found" : true, "_source" : { "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }
}

As you may have noticed, the URL http://localhost:9200/music/songs/1?pretty is recognized correctly with or without double quotes, as long as it contains no characters that are special to the shell, such as &, < or >.

View the document

To view the document, use a simple GET command. The correct usage is:

$ curl -XGET "http://localhost:9200/music/songs/1"

Returns:

{"_index":"music","_type":"songs","_id":"1","_version":1,"found":true, "_source" : { "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }}

Again, look at the formatting effect of ?pretty:

$ curl -XGET "http://localhost:9200/music/songs/1?pretty"

Returns:

{
  "_index" : "music",
  "_type" : "songs",
  "_id" : "1",
  "_version" : 1,
  "found" : true, "_source" : { "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }
}

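Update the document

Command:

curl -XPUT "http://localhost:9200/music/lyrics/1" -d '{ "name": "Deck the Halls", "year": 1886, "lyrics": "Fa la la la la" }'

Of course, when you actually run it you must escape the double quotes, otherwise Windows is certain to report an error. Note also that this command targets the lyrics type rather than songs, so the first PUT creates /music/lyrics/1 and later PUTs to the same URL update it.

curl -XPUT http://localhost:9200/music/lyrics/1 -d "{ \"name\": \"Deck the Halls\", \"year\": 1886, \"lyrics\": \"Fa la la la la\" }"

Then change the year from 1886 to 1887:

curl -XPUT http://localhost:9200/music/lyrics/1 -d "{ \"name\": \"Deck the Halls\", \"year\": 1887, \"lyrics\": \"Fa la la la la\" }"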

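Delete the document (but don't delete it just yet)

Try the command below. Remember that any command carrying a JSON body must be escaped as described above; for simple cases like this one I will not repeat the reminder from here on.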

curl -XDELETE "http://localhost:9200/music/lyrics/1"

Insert a document from a file

Command:

curl -XPUT http://localhost:9200/music/lyrics/2 -d @caseyjones.json

This adds a document for the traditional song "Ballad of Casey Jones". Copy Listing 1 into a file named caseyjones.json and put the file wherever it is convenient to run cURL against it. I run the command from D:\cmder, so the file is at D:\cmder\caseyjones.json.

Listing 1. JSON document for "Ballad of Casey Jones" (caseyjones.json)
{
  "artist": "Wallace Saunders",
  "year": 1909,
  "styles": ["traditional"],
  "album": "Unknown",
  "name": "Ballad of Casey Jones",
  "lyrics": "Come all you rounders if you want to hearThe story of a brave engineerCasey Jones was the rounder's name....Come all you rounders if you want to hearThe story of a brave engineerCasey Jones was the rounder's nameOn the six-eight wheeler, boys, he won his fameThe caller called Casey at half past fourHe kissed his wife at the station doorHe mounted to the cabin with the orders in his handAnd he took his farewell trip to that promis'd landChorus:Casey Jones--mounted to his cabinCasey Jones--with his orders in his handCasey Jones--mounted to his cabinAnd he took his... land"
}

The run and its result:

d:\cmder
λ curl -XPUT http://localhost:9200/music/lyrics/2 -d @caseyjones.json
{"_index":"music","_type":"lyrics","_id":"2","_version":1,"created":true}
d:\cmder

Listing 2. "Walking Boss" JSON (walking.json)
{
  "artist": "Clarence Ashley",
  "year": 1920,
  "name": "Walking Boss",
  "styles": ["folk","protest"],
  "album": "Traditional",
  "lyrics": "Walkin' bossWalkin' bossWalkin' bossI don't belong to youI belongI belongI belongTo that steel driving crewWell you work one dayWork one dayWork one dayThen go lay around the shanty two"
}

Push this document into the index:

curl -XPUT "http://localhost:9200/music/lyrics/3" -d @walking.json

The run and its result:

d:\cmder
λ curl -XPUT http://localhost:9200/music/lyrics/3 -d @walking.json
{"_index":"music","_type":"lyrics","_id":"3","_version":1,"created":true}
d:\cmder
λ
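Sending the body from a file sidesteps the quoting problem entirely, because the shell never sees the JSON. As a minimal sketch of how such a file could be produced programmatically, the Python below writes a document with the standard json module. The helper write_json_body and the file name walking_min.json are hypothetical, and the document is an abbreviated stand-in for walking.json with the lyrics omitted.

```python
import json
import tempfile
from pathlib import Path


def write_json_body(path, document):
    """Write document as UTF-8 JSON to path and return the text written."""
    text = json.dumps(document, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")
    return text


# Abbreviated stand-in for walking.json (lyrics omitted for brevity).
song = {"artist": "Clarence Ashley", "year": 1920, "name": "Walking Boss",
        "styles": ["folk", "protest"], "album": "Traditional"}

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "walking_min.json"  # hypothetical file name
    write_json_body(target, song)
    # The file parses back to the same document, so it is valid JSON on disk.
    assert json.loads(target.read_text(encoding="utf-8")) == song
    print(f"curl -XPUT http://localhost:9200/music/lyrics/3 -d @{target.name}")
```

Note that curl's -d @file strips line breaks from the file before sending. That is harmless here because json.dumps escapes any newline inside a string value, but --data-binary @file is the safer choice if the file must arrive byte for byte.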

After all these steps, here is a screenshot of the current state:

[Screenshot: image missing from the source]

Search REST API

The document URL has a built-in _search endpoint for this purpose. To find all songs whose lyrics contain the word you:

curl -XGET "http://localhost:9200/music/lyrics/_search?q=lyrics:'you'"

The q parameter denotes a query.

The run and its result (note that in my run it returned no hits):

d:\cmder
λ curl -XGET "http://localhost:9200/music/lyrics/_search?q=lyrics:'you'"
{"took":21,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
d:\cmder
λ

Using other comparison operators

For example, to find all songs written before 1900:

curl -XGET http://localhost:9200/music/lyrics/_search?q=year:"<1900"

In my run this query returns the "Deck the Halls" document (year 1887), the only one written before 1900. "Casey Jones" (1909) and "Walking Boss" (1920) do not match.

d:\cmder
λ curl -XGET http://localhost:9200/music/lyrics/_search?q=year:"<1900"
{"took":20,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"music","_type":"lyrics","_id":"1","_score":1.0, "_source" : { "name": "Deck the Halls", "year": 1887, "lyrics": "Fa la la la la" }}]}}
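Characters such as < and > are special both to the shell and inside URLs, so another way to stay out of trouble is to percent-encode the query before it reaches the command line. The Python sketch below is an illustration only; the function name search_url and its defaults are assumptions, while urlencode comes from the standard library.

```python
from urllib.parse import urlencode


def search_url(index, doc_type, query, host="localhost", port=9200, **params):
    """Build a URI-search URL with the q parameter percent-encoded."""
    qs = urlencode({"q": query, **params})
    return f"http://{host}:{port}/{index}/{doc_type}/_search?{qs}"


print(search_url("music", "lyrics", "year:<1900"))
# http://localhost:9200/music/lyrics/_search?q=year%3A%3C1900
```

Extra keyword arguments become additional query parameters, so search_url("music", "lyrics", "year:>1900", fields="year") would add fields=year after an & separator. A URL containing & still needs double quotes on cmd.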

Limiting fields

To limit the fields you see in the results, add the fields parameter to your query:

curl -XGET "http://localhost:9200/music/lyrics/_search?q=year:>1900&fields=year"

The result:

d:\cmder
λ curl -XGET "http://localhost:9200/music/lyrics/_search?q=year:>1900&fields=year"
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":1.0,"hits":[{"_index":"music","_type":"lyrics","_id":"2","_score":1.0,"fields":{"year":[1909]}},{"_index":"music","_type":"lyrics","_id":"3","_score":1.0,"fields":{"year":[1920]}}]}}
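The response is plain JSON, so pulling the requested field out of each hit takes only a few lines. The following Python sketch parses the exact response body from the run above; the helper name field_values is hypothetical, and it assumes every hit carries the requested field as a one-element list, as in this output.

```python
import json

# Response body copied from the fields=year run above.
raw = (
    '{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},'
    '"hits":{"total":2,"max_score":1.0,"hits":['
    '{"_index":"music","_type":"lyrics","_id":"2","_score":1.0,"fields":{"year":[1909]}},'
    '{"_index":"music","_type":"lyrics","_id":"3","_score":1.0,"fields":{"year":[1920]}}]}}'
)


def field_values(response_text, field):
    """Return (document id, first value of field) for every hit."""
    hits = json.loads(response_text)["hits"]["hits"]
    return [(hit["_id"], hit["fields"][field][0]) for hit in hits]


print(field_values(raw, "year"))  # [('2', 1909), ('3', 1920)]
```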

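Did you notice that I deliberately used a different format in each of the two queries? As mentioned earlier, the http part is recognized correctly with or without double quotes. What matters on cmd is that characters such as <, > and & end up inside double quotes.

More advanced queries with the basic query DSL

DSL stands for Domain Specific Language. Create a file named query.json:

{
    "query" : {
        "match" : {
            "album" : "Traditional"
        }
    }
}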

Run the command:

curl -XGET "http://localhost:9200/music/lyrics/_search" -d @query.json

The result:

d:\cmder
λ curl -XGET http://localhost:9200/music/lyrics/_search -d @query.json
{"took":17,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.30685282,"hits":[{"_index":"music","_type":"lyrics","_id":"3","_score":0.30685282, "_source" : {  "artist": "Clarence Ashley",  "year": 1920,  "name": "Walking Boss",  "styles": ["folk","protest"],  "album": "Traditional",  "lyrics": "Walkin' bossWalkin' bossWalkin' bossI don't belong to youI belongI belongI belongTo that steel driving crewWell you work one dayWork one dayWork one dayThen go lay around the shanty two"}}]}}
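A query body like the one in query.json is an ordinary nested JSON object, so it can also be generated instead of typed by hand. Here is a minimal Python sketch under that assumption; the helper name match_query is hypothetical and covers only the simple match form used above.

```python
import json


def match_query(field, text):
    """Build the body of a basic match query on a single field."""
    return {"query": {"match": {field: text}}}


body = match_query("album", "Traditional")
# Print the body in a form that can be saved as query.json.
print(json.dumps(body, indent=4))
```

Saving the printed text to a file and passing it with -d @query.json keeps the JSON away from the shell's quoting rules, in the same way as the file-based inserts earlier.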

Let's take a break here; I'm getting a little tired from all the typing.
