Solutions for fielddata using too much memory

Source: Internet  Published by: 数据的分析  Editor: 程序博客网  Time: 2024/06/06 01:24

Reference articles

Support in the Wild: My Biggest Elasticsearch Problem at Scale

http://blog.csdn.net/jiao_fuyou/article/details/50478198

Understanding Fielddata

By default, fielddata is loaded on demand, which means that you will not see it until you are using it. Also, by being loaded per segment, it means that new segments that get created will slowly add to your overall memory usage until the field’s fielddata is evicted from memory. Eviction happens in only a few ways:

1. Deleting the index or indices that contain it.
2. Closing the index or indices that contain it.
3. Segment fielddata is removed when segments are removed (e.g., background merging).
   * This usually just means that the problem is moving rather than going away.
4. Restarting the node containing the fielddata.
5. Clearing the relevant fielddata cache.
6. Automatically evicting the fielddata to make room for other fielddata.
   * This will not happen with default settings.
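
Before relying on any of these eviction paths, it can help to see which fields actually hold the memory. The following is a minimal Python sketch that totals per-field fielddata from an already-parsed response of `GET /_nodes/stats/indices/fielddata?fields=*`. The sample response, node IDs, and field names are hypothetical, and the response layout is assumed here and may differ between Elasticsearch versions.

```python
from collections import defaultdict


def summarize_fielddata(node_stats):
    """Sum fielddata bytes per field across all nodes, largest first."""
    totals = defaultdict(int)
    for node in node_stats.get("nodes", {}).values():
        fielddata = node.get("indices", {}).get("fielddata", {})
        for field, info in fielddata.get("fields", {}).items():
            totals[field] += info.get("memory_size_in_bytes", 0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, trimmed response used only for illustration.
sample = {
    "nodes": {
        "node-1": {"indices": {"fielddata": {
            "memory_size_in_bytes": 3500,
            "evictions": 0,
            "fields": {"tag": {"memory_size_in_bytes": 3000},
                       "user": {"memory_size_in_bytes": 500}},
        }}},
        "node-2": {"indices": {"fielddata": {
            "memory_size_in_bytes": 1000,
            "evictions": 0,
            "fields": {"tag": {"memory_size_in_bytes": 1000}},
        }}},
    }
}

for field, size in summarize_fielddata(sample):
    print(f"{field}: {size} bytes")
```

Fields at the top of this list are the first candidates for doc values or for a targeted cache clear.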

Use Doc Values

The solution to this fielddata problem is to avoid it altogether. Fortunately, you can avoid the use of fielddata by manually mapping all of your fields to use doc values. Without repeating too much from the guide, doc values offload this burden by writing the fielddata to disk at index time, thereby allowing Elasticsearch to load the values outside of your Java heap as they are needed.

You might be asking: How do I enable these doc values? It’s as simple as adding this to every field’s mapping, except analyzed string fields:

"doc_values" : true

Unfortunately, you must do this before you index any data. This means that, for any existing index that is not already using doc values, you cannot flip the switch to enable them. You would have to reindex.
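
Since the setting has to be in place before indexing, one option is to generate the mapping for the new index programmatically. The sketch below is a minimal Python example, not an official tool: `with_doc_values` is a hypothetical helper, the field names are invented, and it assumes the pre-2.0 mapping style in which a `string` field is analyzed unless it is explicitly `not_analyzed`.

```python
import copy
import json


def with_doc_values(properties):
    """Return a copy of a mapping's "properties" with doc_values enabled
    on every field except analyzed strings."""
    result = copy.deepcopy(properties)
    for field in result.values():
        if "properties" in field:
            # Object field: recurse into its sub-fields.
            field["properties"] = with_doc_values(field["properties"])
            continue
        is_string = field.get("type") == "string"
        # Assumption: a string field counts as analyzed unless it is
        # explicitly mapped as "not_analyzed".
        if is_string and field.get("index") != "not_analyzed":
            continue
        field["doc_values"] = True
    return result


# Hypothetical mapping for illustration only.
song_properties = {
    "title": {"type": "string"},
    "tag": {"type": "string", "index": "not_analyzed"},
    "plays": {"type": "long"},
    "artist": {"properties": {
        "name": {"type": "string", "index": "not_analyzed"},
    }},
}

print(json.dumps(with_doc_values(song_properties), indent=2))
```

The resulting properties would go into the mapping of a newly created index, into which the existing data is then reindexed.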

Performance Considerations for Elasticsearch Indexing

http://blog.csdn.net/jiao_fuyou/article/details/50480663

Limiting Memory Usage

https://www.elastic.co/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html#fielddata-size

Enabled Doc Values

https://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html

Limiting the memory available to fielddata

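Limit the amount of memory fielddata may use; once it reaches 40% of the heap, the oldest cache entries are evicted automatically:

curl -XPUT http://localhost:9200/_cluster/settings -d '{
    "persistent" : {
        "indices.fielddata.cache.size" : "40%"
    }
}'

Note that the Limiting Memory Usage guide linked above sets indices.fielddata.cache.size in config/elasticsearch.yml, and current Elasticsearch documentation lists it as a static node setting, so this cluster-settings request may not take effect on every version.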

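If the old cache is no longer needed, for example with log data where only the most recent day is queried, you can force it to be cleared: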

POST /_cache/clear
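
The request above clears every cache on every index. As a narrower variant, the minimal Python sketch below builds a clear-cache request for a single index and for fielddata only. `cache_clear_request` is a hypothetical helper, the index name is invented, and the `fielddata=true` query parameter is assumed to be available on the version in use (older releases may spell it differently).

```python
import urllib.request


def cache_clear_request(base_url, index=None, fielddata_only=True):
    """Build (but do not send) a POST request for the clear-cache API."""
    path = f"/{index}/_cache/clear" if index else "/_cache/clear"
    if fielddata_only:
        # Assumption: this parameter name matches the cluster's version.
        path += "?fielddata=true"
    return urllib.request.Request(base_url.rstrip("/") + path, method="POST")


# Hypothetical daily log index that is no longer queried.
req = cache_clear_request("http://localhost:9200", index="logstash-2016.01.07")
print(req.get_method(), req.full_url)

# Sending it requires a reachable cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it keeps the sketch checkable without a running node.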

Enabling doc_values on fields

For fields whose fielddata occupies a lot of memory, set "doc_values" : true. From ES 2.0 onward, this property is enabled by default.

PUT /music/_mapping/song
{
  "properties" : {
    "tag": {
      "type":       "string",
      "index" :     "not_analyzed",
      "doc_values": true
    }
  }
}