Test usage: bulk-adding data to an ES index, plus a summary of ES usage.

Source: Internet. Published by: java方法中布尔型变量. Editor: 程序博客网. Time: 2024/06/05 04:15

# encoding: utf8
from datetime import datetime
import random

from elasticsearch import Elasticsearch
import elasticsearch.helpers

# Connect to the cluster nodes.
es = Elasticsearch(['172.18.1.22:9200', '172.18.1.23:9200', '172.18.1.24:9200', '172.18.1.25:9200', '172.18.1.26:9200'])

# Create the index; ignore=400 suppresses the error if it already exists.
es.indices.create(index='test_index', ignore=400)
# es.index(index="skynet_social_twitter_v6", doc_type="test-type", id=42, body={"any": "data", "timestamp": datetime.now()})

# Build ten sample rows, each with a timestamp and a random count.
package = []
for i in range(10):
    row = {
        "@timestamp": datetime.now().strftime("%Y-%m-%dT%H:%M:%S.000+0800"),
        "count": random.randint(1, 100),
    }
    package.append(row)

# Wrap each row in a bulk "index" action.
actions = [
    {
        '_op_type': 'index',
        '_index': "test_index",
        '_type': "test-type",
        '_source': d,
    }
    for d in package
]

# Send all actions in one bulk request.
elasticsearch.helpers.bulk(es, actions)
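The script builds every action in memory before sending them. Below is a minimal sketch of the same action shape produced lazily with generators. `make_rows` and `to_actions` are hypothetical helper names, not part of the original script; the index and type names are the ones used in the script. It runs without a cluster, and the resulting iterable could be handed to `elasticsearch.helpers.bulk(es, ...)` in place of the list.

```python
import random
from datetime import datetime


def make_rows(n, now=None):
    """Yield n sample rows shaped like the ones in the script."""
    now = now or datetime.now()
    for _ in range(n):
        yield {
            "@timestamp": now.strftime("%Y-%m-%dT%H:%M:%S.000+0800"),
            "count": random.randint(1, 100),
        }


def to_actions(rows, index="test_index", doc_type="test-type"):
    """Wrap each row in the action dict that the bulk helper consumes."""
    for row in rows:
        yield {
            "_op_type": "index",
            "_index": index,
            "_type": doc_type,
            "_source": row,
        }


# Materialized here only so the result can be inspected locally.
actions = list(to_actions(make_rows(10)))
print(len(actions), actions[0]["_index"])
```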

Summarized from another blog: someone else's notes on using ES

Give the index an alias, so that users only need to be told the alias.

curl -XPOST 'http://172.18.1.22:9200/_aliases' -d '
{
  "actions": [
    {"add": {"index": "info-test", "alias": "wyl"}}
  ]
}'

Remove an alias:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
  "actions": [
    {"remove": {"index": "test1", "alias": "alias1"}}
  ]
}'

Renaming an alias is just a remove followed by an add in a single call to the same API. The operation is atomic.

Rename:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
  "actions": [
    {"remove": {"index": "test1", "alias": "alias1"}},
    {"add": {"index": "test1", "alias": "alias2"}}
  ]
}'
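Because the rename is one request carrying two actions, it can help to build the body in one place. A minimal sketch, assuming a hypothetical helper named `rename_alias_body`; it only builds the JSON body and does not talk to a cluster.

```python
import json


def rename_alias_body(index, old_alias, new_alias):
    """Build one _aliases request body that removes old_alias and adds new_alias."""
    return {
        "actions": [
            {"remove": {"index": index, "alias": old_alias}},
            {"add": {"index": index, "alias": new_alias}},
        ]
    }


body = rename_alias_body("test1", "alias1", "alias2")
print(json.dumps(body, indent=2))
```

Both actions travel in the same `actions` array, which is what makes the swap a single atomic call rather than two separate requests.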

Associate one alias with several indices:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
  "actions": [
    {"add": {"index": "test1", "alias": "alias1"}},
    {"add": {"index": "test2", "alias": "alias1"}}
  ]
}'

Indexing a document through an alias that points to more than one index raises an error.

1. List all nodes in the cluster

http://172.24.5.149:9200/_cat/nodes?v

2. Check the health of the cluster

http://172.24.5.149:9200/_cat/health?v

3. List all indices in the cluster

http://172.24.5.149:9200/_cat/indices?v

4. Delete the info-test index

curl -XDELETE 'http://172.24.5.149:9200/info-test'

5. Create the info-test index

curl -XPUT 'http://172.24.5.149:9200/info-test'

6. Insert a document with ID 1 into the index

curl -XPUT 'localhost:9200/info-test/people/1' -d '
{
  "name": "John Doe"
}'

7. Insert a document without an ID; ES generates a random ID:

curl -XPOST 'localhost:9200/info-test/people' -d '
{
  "name": "John Doe"
}'

8. Fetch a document by ID

curl -XGET 'localhost:9200/info-test/people/1'

9. Update the document with ID 1, changing the name field to Jane Doe

curl -XPOST 'localhost:9200/info-test/people/1/_update' -d '
{
  "doc": { "name": "Jane Doe" }
}'

10. Update the document with ID 1, changing the name field to Jane Doe and adding an age field

curl -XPOST 'localhost:9200/info-test/people/1/_update' -d '
{
  "doc": { "name": "Jane Doe", "age": 20 }
}'

11. Use a script to add 5 to the age field of the document with ID 1

curl -XPOST 'localhost:9200/info-test/people/1/_update' -d '
{
  "script": "ctx._source.age += 5"
}'

In the example above, ctx._source refers to the document that is about to be updated.
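To make the effect of the three update requests concrete, here is a local sketch that imitates them on a plain dict. `apply_partial_doc` is a hypothetical helper, not an Elasticsearch API; it assumes that a partial `doc` is merged into the stored source with nested objects merged recursively, and the script is imitated by ordinary Python arithmetic.

```python
import copy


def apply_partial_doc(source, doc):
    """Return a copy of source with doc merged in; nested objects merge recursively."""
    merged = copy.deepcopy(source)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_partial_doc(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


stored = {"name": "John Doe"}
# Items 9 and 10: partial update that renames and adds an age field.
stored = apply_partial_doc(stored, {"name": "Jane Doe", "age": 20})
# Item 11: local stand-in for the script "ctx._source.age += 5".
stored["age"] += 5
print(stored)
```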

12. Delete the document with ID 2

curl -XDELETE 'localhost:9200/info-test/people/2'

A timeout can be set:

curl -XDELETE 'http://localhost:9200/twitter/tweet/1?timeout=5m'

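13. Delete all documents whose name contains "John"

curl -XDELETE 'localhost:9200/info-test/people/_query' -d '
{
  "query": { "match": { "name": "John" } }
}'

This _query endpoint exists only in older Elasticsearch releases; item 45 uses _delete_by_query instead.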

14. Bulk-insert the documents with ID 1 and ID 2 (each action and each document goes on its own line)

curl -XPOST 'localhost:9200/info-test/people/_bulk' -d '
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'

15. In one bulk request, update the document with ID 1 and delete the document with ID 2

curl -XPOST 'localhost:9200/customer/external/_bulk' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'
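The bulk body is newline-delimited JSON: one line for the action, an optional line for the document, and a newline after the last line. A minimal sketch of producing that payload from Python follows; `to_bulk_body` is a hypothetical helper name, and the operations are the ones from item 15.

```python
import json


def to_bulk_body(operations):
    """Serialize (action, document) pairs as newline-delimited JSON.

    `document` is None for actions such as delete that carry no body line.
    """
    lines = []
    for action, document in operations:
        lines.append(json.dumps(action))
        if document is not None:
            lines.append(json.dumps(document))
    # Every line, including the last one, is terminated by a newline.
    return "\n".join(lines) + "\n"


payload = to_bulk_body([
    ({"update": {"_id": "1"}}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ({"delete": {"_id": "2"}}, None),
])
print(payload, end="")
```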

16. Search all documents in the info-test index

curl 'localhost:9200/info-test/_search?q=*'

17. Use a POST request body to search all documents in the info-test index

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "query": { "match_all": {} }
}'

18. Use a POST request body to search all documents in the info-test index, but return only one document (10 are returned by default)

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "query": { "match_all": {} },
  "size": 1
}'

19. Use a POST request body to search all documents in the info-test index and return documents 11 through 20

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}'

If from is not specified, it defaults to 0.
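Since from is a zero-based offset, page numbers have to be converted before they go into the request. A small sketch of that arithmetic, with hypothetical helper names `page_to_from_size` and `search_body`:

```python
def page_to_from_size(page, page_size=10):
    """Translate a 1-based page number into a from/size pair."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"from": (page - 1) * page_size, "size": page_size}


def search_body(page, page_size=10):
    """match_all search body for the requested page."""
    body = {"query": {"match_all": {}}}
    body.update(page_to_from_size(page, page_size))
    return body


# Page 2 with ten hits per page covers documents 11 through 20.
print(search_body(2))
```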

20. Use a POST request body to search all documents in the info-test index, sorted by the name field in descending order

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "query": { "match_all": {} },
  "sort": { "name": { "order": "desc" } }
}'

21. Use a POST request body to search all documents in the info-test index, returning only some of the fields

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "query": { "match_all": {} },
  "_source": ["age", "name"]
}'

22. Use a POST request body to search the info-test index for documents whose age is 20

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "query": { "match": { "age": 20 } }
}'

23. Use a POST request body to search the info-test index for documents whose address contains mill lane (mill lane is treated as a single phrase)

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "query": { "match_phrase": { "address": "mill lane" } }
}'

24. Use a POST request body to search the info-test index for documents whose address contains both "mill" and "lane"

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}'

must: AND. should: OR. must_not: NOT.
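The three clause types differ only in the key under bool, so one builder can cover all of them. A minimal sketch, with `bool_query` and `match` as hypothetical helper names; it builds request bodies only.

```python
def match(field, text):
    """A single match clause."""
    return {"match": {field: text}}


def bool_query(must=(), should=(), must_not=()):
    """Combine clauses into a bool query, omitting empty sections."""
    sections = {"must": must, "should": should, "must_not": must_not}
    return {"bool": {name: list(clauses) for name, clauses in sections.items() if clauses}}


clauses = [match("address", "mill"), match("address", "lane")]
both = {"query": bool_query(must=clauses)}         # mill AND lane
either = {"query": bool_query(should=clauses)}     # mill OR lane
neither = {"query": bool_query(must_not=clauses)}  # neither mill nor lane
print(both)
```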

25. Use a POST request body to search the info-test index for documents whose balance is greater than or equal to 20000 and less than or equal to 30000

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}'
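The filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0; on newer versions the same search is usually written as a bool query with a filter clause. A sketch of that form, assuming a hypothetical helper named `balance_range_query`:

```python
def balance_range_query(low, high):
    """Search body selecting documents with low <= balance <= high.

    Uses bool + filter in place of the removed "filtered" query.
    """
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {"range": {"balance": {"gte": low, "lte": high}}},
            }
        }
    }


print(balance_range_query(20000, 30000))
```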

26. Use a POST request body to search the documents in the info-test index, grouped by the state field

curl -XPOST 'localhost:9200/info-test/_search' -d '
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      }
    }
  }
}'

Part of the response:

  "hits" : {
    "total" : 1000,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_state" : {
      "buckets" : [ {
        "key" : "al",
        "doc_count" : 21
      }, {
        "key" : "tx",
        "doc_count" : 17
      }, {
        "key" : "id",
        "doc_count" : 15
      }, {
        "key" : "ma",
        "doc_count" : 15
      }, {
        "key" : "md",
        "doc_count" : 15
      }, {
        "key" : "pa",
        "doc_count" : 15
      }, {
        "key" : "dc",
        "doc_count" : 14
      }, {
        "key" : "me",
        "doc_count" : 14
      }, {
        "key" : "mo",
        "doc_count" : 14
      }, {
        "key" : "nd",
        "doc_count" : 14
      } ]
    }
  }
}
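On the client side the buckets are usually flattened into a simple mapping. A minimal sketch, assuming the response has already been parsed into a dict; `buckets_to_counts` is a hypothetical helper and `sample` is a shortened copy of the response shown above.

```python
def buckets_to_counts(response, agg_name="group_by_state"):
    """Turn the buckets of a terms aggregation into a {key: doc_count} dict."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Shortened copy of the response: only the first three buckets.
sample = {
    "hits": {"total": 1000, "max_score": 0.0, "hits": []},
    "aggregations": {
        "group_by_state": {
            "buckets": [
                {"key": "al", "doc_count": 21},
                {"key": "tx", "doc_count": 17},
                {"key": "id", "doc_count": 15},
            ]
        }
    },
}

counts = buckets_to_counts(sample)
print(counts)
```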

27. Building on the previous aggregation, this example computes the average account balance for each state

curl -XPOST 'localhost:9200/bank/_search' -d '
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}'

28. Building on the previous aggregation, now sort the states by average balance in descending order:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}'

29. Group by age bracket (20-29, 30-39, 40-49), then by gender, and compute the average account balance for each gender within each age bracket:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          { "from": 20, "to": 30 },
          { "from": 30, "to": 40 },
          { "from": 40, "to": 50 }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}'
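The bracket 20-29 is written as from 20 to 30 because each range includes its from value and excludes its to value. A sketch that generates the same nested body, with `age_ranges` and `balance_by_age_and_gender` as hypothetical helper names:

```python
def age_ranges(start, stop, step):
    """Half-open brackets: `from` is inclusive and `to` is exclusive."""
    return [{"from": low, "to": low + step} for low in range(start, stop, step)]


def balance_by_age_and_gender():
    """Range aggregation on age, then terms on gender, then average balance."""
    return {
        "size": 0,
        "aggs": {
            "group_by_age": {
                "range": {"field": "age", "ranges": age_ranges(20, 50, 10)},
                "aggs": {
                    "group_by_gender": {
                        "terms": {"field": "gender"},
                        "aggs": {"average_balance": {"avg": {"field": "balance"}}},
                    }
                },
            }
        },
    }


print(age_ranges(20, 50, 10))
```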

30. Add a new field to an existing mapping

POST /information/_mapping/email1
{
  "properties": {
    "name": {
      "type": "text",
      "index": "analyzed"
    }
  }
}

31. Update an index's settings

PUT /atom/_settings
{
  "settings": {
    "index.mapping.total_fields.limit": 4000
  },
  "index": {
    "refresh_interval": "30s",
    "number_of_replicas": "0"
  }
}

32. View the mapping of a given type (if no type is given, the mappings of all types under the index are shown)

GET /atom/_mapping/人类

33. Conditional update with _update_by_query

POST /index/type/_update_by_query?conflicts=proceed
{
  "script": {
    "inline": "ctx._source.ontology_type=(params.tag)",
    "lang": "painless",
    "params": {
      "tag": "event"
    }
  },
  "query": {
    "match_all": {}
  }
}
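The request keeps the value out of the script text by passing it through params. A sketch of a builder that follows the same pattern; `set_field_by_query` is a hypothetical helper, and the script keys ("inline", "lang", "params") are copied from the request above rather than checked against any particular Elasticsearch version.

```python
def set_field_by_query(field, value, query=None):
    """Body that sets `field` to `value` on every document matching `query`."""
    # The field name is spliced into the script text, so restrict it to a
    # plain identifier; the value itself travels separately in params.
    if not field.isidentifier():
        raise ValueError("field must be a plain identifier")
    return {
        "script": {
            "inline": "ctx._source.%s=(params.tag)" % field,
            "lang": "painless",
            "params": {"tag": value},
        },
        "query": query or {"match_all": {}},
    }


body = set_field_by_query("ontology_type", "event")
print(body["script"]["inline"])
```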

34. Query all data under a given type

POST /atom/欧洲排球锦标赛/_search
{
  "query": {
    "match_all": {}
  }
}

35. Create a document with a version number

PUT twitter/tweet/1?version=2
{
  "message" : "elasticsearch now has versioning support, double cool!"
}

Version types: internal, external (also called external_gt), and external_gte.
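The version types differ in how the supplied version is compared with the stored one. The sketch below is a simplified local model of that comparison as I understand it, not the server's code: internal expects an exact match, external expects a strictly greater version, and external_gte also accepts an equal one. `version_accepted` is a hypothetical name.

```python
def version_accepted(version_type, stored, given):
    """Return True if a write carrying `given` would pass the version check.

    `stored` is the current document version, or None if the document
    does not exist yet. Simplified model for illustration only.
    """
    if version_type == "internal":
        return stored is not None and given == stored
    if version_type in ("external", "external_gt"):
        return stored is None or given > stored
    if version_type == "external_gte":
        return stored is None or given >= stored
    raise ValueError("unknown version type: %s" % version_type)


for vtype in ("internal", "external", "external_gte"):
    print(vtype, version_accepted(vtype, stored=2, given=2))
```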

36. Create a document with the op_type parameter

PUT twitter/tweet/1?op_type=create
{
  "user" : "kimchy",
  "post_date" : "2011-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

Or:

PUT twitter/tweet/1/_create
{
  "user" : "kimchy",
  "post_date" : "2011-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

37. Create a document with an automatically generated id

POST twitter/tweet/
{
  "user" : "kimchy",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

38. Create a document with a routing value

POST twitter/tweet?routing=kimchy
{
  "user" : "kimchy",
  "post_date" : "2011-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

39. Set a timeout when creating a document

PUT twitter/tweet/1?timeout=5m
{
  "user" : "kimchy",
  "post_date" : "2011-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

40. Fetch a document without the _source field

GET twitter/tweet/0?_source=false

41. Select which fields of _source to return

GET twitter/tweet/0?_source_include=*.id&_source_exclude=entities

Or:

GET twitter/tweet/0?_source=*.id,retweeted

42. Fetch only the _source of a document

GET twitter/tweet/1/_source

A subset of the fields in _source can also be selected:

GET twitter/tweet/1/_source?_source_include=*.id&_source_exclude=entities

43. Custom routing

GET twitter/tweet/2?routing=user1

If a routing value was specified when the document was created, the same routing value must be passed when fetching it.
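The reason is that the routing value decides which shard holds the document: the shard is chosen as a hash of the routing value modulo the number of primary shards, and the document ID is used when no routing is given. The sketch below illustrates only that idea. It is an assumption-laden toy: `zlib.crc32` stands in for the hash function Elasticsearch really uses, so the shard numbers will not match a real cluster, and `pick_shard` and `effective_routing` are hypothetical names.

```python
import zlib


def effective_routing(doc_id, routing=None):
    """The document ID is the routing value unless one is given explicitly."""
    return routing if routing is not None else doc_id


def pick_shard(routing, num_primary_shards):
    """Map a routing value to a shard number.

    Assumption: zlib.crc32 is only a stand-in hash for illustration.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


shards = 5
indexed_on = pick_shard(effective_routing("2", routing="user1"), shards)
get_with_routing = pick_shard(effective_routing("2", routing="user1"), shards)
get_without_routing = pick_shard(effective_routing("2"), shards)

# The same routing value always lands on the shard that holds the document;
# without it, the lookup may be sent to a different shard.
print(indexed_on, get_with_routing, get_without_routing)
```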

44. Create a mapping for a given type

POST /information/_mapping/email1
{
  "properties": {
    "name": {
      "type": "text",
      "index": "analyzed"
    }
  }
}

45. delete_by_query

POST atom_v3/news/_delete_by_query?conflicts=proceed
{
  "query": {
    "match": {
      "docType": "news"
    }
  }
}

46. Force-merge the segments of an index

POST atom_v3/_forcemerge?max_num_segments=5

47. View the segments of an index

http://172.24.8.83:9200/atom_v3/_segments

Or:

http://172.24.8.83:9200/_cat/segments/atom_v3

48. Create the mappings together with the index

PUT my_index
{
  "mappings": {
    "user": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "title": { "type": "text" },
        "name": { "type": "text" },
        "age": { "type": "integer" }
      }
    },
    "blogpost": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "title": { "type": "text" },
        "body": { "type": "text" },
        "user_id": { "type": "keyword" },
        "created": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        }
      }
    }
  }
}

49. reindex: copy data from one index to another

POST _reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "new_twitter"
  }
}
