Elasticsearch Setup: Basic Configuration

Source: Internet. Editor: 程序博客网. Date: 2024/05/11 23:07

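Configuration

Environment variables

The startup script passes the JAVA_OPTS option to the JVM that runs Elasticsearch. The most important setting is -Xmx, which controls the maximum memory the Elasticsearch process may use. -Xms controls the minimum memory allocated to the process (in general, the more memory allocated to the process, the better).

In most cases it is best to leave JAVA_OPTS at its default and use the ES_JAVA_OPTS environment variable to set or change JVM settings. The ES_HEAP_SIZE environment variable sets the heap memory allocated to the Elasticsearch Java process. It sets the minimum and the maximum to the same value, although the two can be set separately with ES_MIN_MEM (default 256m) and ES_MAX_MEM (default 1g).

Setting the minimum and maximum memory to the same value is recommended.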
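The heap rules above can be sketched as a small shell helper. This is only an illustration: `heap_opts` is a hypothetical name that does not exist in the Elasticsearch distribution, and the defaults are the ones quoted above (256m and 1g).

```shell
#!/bin/sh
# Hypothetical helper: derive -Xms/-Xmx flags from the environment.
# ES_HEAP_SIZE, when set, overrides both ES_MIN_MEM and ES_MAX_MEM.
heap_opts() {
    min=${ES_MIN_MEM:-256m}
    max=${ES_MAX_MEM:-1g}
    if [ -n "${ES_HEAP_SIZE:-}" ]; then
        min=$ES_HEAP_SIZE
        max=$ES_HEAP_SIZE
    fi
    echo "-Xms$min -Xmx$max"
}

# Example use (not executed here):
#   ES_HEAP_SIZE=2g; heap_opts
```

In practice it is usually enough to export ES_HEAP_SIZE before running the startup script; the helper only makes the precedence explicit.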

The elasticsearch startup script:

#!/bin/sh

# OPTIONS:
#   -d: daemonize, start in the background
#   -p <filename>: log the pid to a file (useful to kill it later)

# CONTROLLING STARTUP:
#
# This script relies on few environment variables to determine startup
# behavior, those variables are:
#
#   ES_CLASSPATH -- A Java classpath containing everything necessary to run.
#   JAVA_OPTS    -- Additional arguments to the JVM for heap size, etc
#   ES_JAVA_OPTS -- External Java Opts on top of the defaults set
#
#
# Optionally, exact memory values can be set using the following values, note,
# they can still be set using the `ES_JAVA_OPTS`. Sample format include "512m", and "10g".
#
#   ES_HEAP_SIZE -- Sets both the minimum and maximum memory to allocate (recommended)
#
# As a convenience, a fragment of shell is sourced in order to set one or
# more of these variables. This so-called `include' can be placed in a
# number of locations and will be searched for in order. The lowest
# priority search path is the same directory as the startup script, and
# since this is the location of the sample in the project tree, it should
# almost work Out Of The Box.
#
# Any serious use-case though will likely require customization of the
# include. For production installations, it is recommended that you copy
# the sample to one of /usr/share/elasticsearch/elasticsearch.in.sh,
# /usr/local/share/elasticsearch/elasticsearch.in.sh, or
# /opt/elasticsearch/elasticsearch.in.sh and make your modifications there.
#
# Another option is to specify the full path to the include file in the
# environment. For example:
#
#   $ ES_INCLUDE=/path/to/in.sh elasticsearch -p /var/run/es.pid
#
# Note: This is particularly handy for running multiple instances on a
# single installation, or for quick tests.
#
# If you would rather configure startup entirely from the environment, you
# can disable the include by exporting an empty ES_INCLUDE, or by
# ensuring that no include files exist in the aforementioned search list.
# Be aware that you will be entirely responsible for populating the needed
# environment variables.

# Maven will replace the project.name with elasticsearch below. If that
# hasn't been done, we assume that this is not a packaged version and the
# user has forgotten to run Maven to create a package.
IS_PACKAGED_VERSION='elasticsearch'
if [ "$IS_PACKAGED_VERSION" != "elasticsearch" ]; then
    cat >&2 << EOF
Error: You must build the project with Maven or download a pre-built package
before you can run Elasticsearch. See 'Building from Source' in README.textile
or visit http://www.elasticsearch.org/download to get a pre-built package.
EOF
    exit 1
fi

CDPATH=""
SCRIPT="$0"

# SCRIPT may be an arbitrarily deep series of symlinks. Loop until we have the concrete path.
while [ -h "$SCRIPT" ] ; do
  ls=`ls -ld "$SCRIPT"`
  # Drop everything prior to ->
  link=`expr "$ls" : '.*-> \(.*\)$'`
  if expr "$link" : '/.*' > /dev/null; then
    SCRIPT="$link"
  else
    SCRIPT=`dirname "$SCRIPT"`/"$link"
  fi
done

# determine elasticsearch home
ES_HOME=`dirname "$SCRIPT"`/..

# make ELASTICSEARCH_HOME absolute
ES_HOME=`cd "$ES_HOME"; pwd`

# If an include wasn't specified in the environment, then search for one...
if [ "x$ES_INCLUDE" = "x" ]; then
    # Locations (in order) to use when searching for an include file.
    for include in /usr/share/elasticsearch/elasticsearch.in.sh \
                   /usr/local/share/elasticsearch/elasticsearch.in.sh \
                   /opt/elasticsearch/elasticsearch.in.sh \
                   ~/.elasticsearch.in.sh \
                   "`dirname "$0"`"/elasticsearch.in.sh; do
        if [ -r "$include" ]; then
            . "$include"
            break
        fi
    done
# ...otherwise, source the specified include.
elif [ -r "$ES_INCLUDE" ]; then
    . "$ES_INCLUDE"
fi

if [ -x "$JAVA_HOME/bin/java" ]; then
    JAVA="$JAVA_HOME/bin/java"
else
    JAVA=`which java`
fi

if [ ! -x "$JAVA" ]; then
    echo "Could not find any executable java binary. Please install java in your PATH or set JAVA_HOME"
    exit 1
fi

if [ -z "$ES_CLASSPATH" ]; then
    echo "You must set the ES_CLASSPATH var" >&2
    exit 1
fi

# Special-case path variables.
case `uname` in
    CYGWIN*)
        ES_CLASSPATH=`cygpath -p -w "$ES_CLASSPATH"`
        ES_HOME=`cygpath -p -w "$ES_HOME"`
    ;;
esac

launch_service()
{
    pidpath=$1
    daemonized=$2
    props=$3
    es_parms="-Delasticsearch"

    if [ "x$pidpath" != "x" ]; then
        es_parms="$es_parms -Des.pidfile=$pidpath"
    fi

    # The es-foreground option will tell Elasticsearch not to close stdout/stderr, but it's up to us not to daemonize.
    if [ "x$daemonized" = "x" ]; then
        es_parms="$es_parms -Des.foreground=yes"
        exec "$JAVA" $JAVA_OPTS $ES_JAVA_OPTS $es_parms -Des.path.home="$ES_HOME" -cp "$ES_CLASSPATH" $props \
                org.elasticsearch.bootstrap.Elasticsearch
        # exec without running it in the background, makes it replace this shell, we'll never get here...
        # no need to return something
    else
        # Startup Elasticsearch, background it, and write the pid.
        exec "$JAVA" $JAVA_OPTS $ES_JAVA_OPTS $es_parms -Des.path.home="$ES_HOME" -cp "$ES_CLASSPATH" $props \
                    org.elasticsearch.bootstrap.Elasticsearch <&- &
        return $?
    fi
}

# Parse any long getopt options and put them into properties before calling getopt below
# Be dash compatible to make sure running under ubuntu works
ARGV=""
while [ $# -gt 0 ]
do
    case $1 in
      --*=*) properties="$properties -Des.${1#--}"
           shift 1
           ;;
      --*) properties="$properties -Des.${1#--}=$2"
           shift 2
           ;;
      *) ARGV="$ARGV $1" ; shift
    esac
done

# Parse any command line options.
args=`getopt vdhp:D:X: $ARGV`
eval set -- "$args"

while true; do
    case $1 in
        -v)
            "$JAVA" $JAVA_OPTS $ES_JAVA_OPTS $es_parms -Des.path.home="$ES_HOME" -cp "$ES_CLASSPATH" $props \
                    org.elasticsearch.Version
            exit 0
        ;;
        -p)
            pidfile="$2"
            shift 2
        ;;
        -d)
            daemonized="yes"
            shift
        ;;
        -h)
            echo "Usage: $0 [-d] [-h] [-p pidfile]"
            exit 0
        ;;
        -D)
            properties="$properties -D$2"
            shift 2
        ;;
        -X)
            properties="$properties -X$2"
            shift 2
        ;;
        --)
            shift
            break
        ;;
        *)
            echo "Error parsing argument $1!" >&2
            exit 1
        ;;
    esac
done

# Start up the service
launch_service "$pidfile" "$daemonized" "$properties"

exit $?
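The long-option handling in the script can be isolated into a small function to see what it does. The sketch below repeats the same two `case` patterns; the function name `to_properties` is hypothetical, and unlike the script it simply discards short options instead of collecting them for getopt.

```shell
#!/bin/sh
# Hypothetical helper: turn "--name=value" and "--name value" arguments
# into "-Des.name=value" JVM system properties, as the startup script does.
to_properties() {
    properties=""
    while [ $# -gt 0 ]; do
        case $1 in
            --*=*) properties="$properties -Des.${1#--}"
                   shift 1
                   ;;
            --*)   properties="$properties -Des.${1#--}=$2"
                   shift 2
                   ;;
            *)     shift   # short options are ignored in this sketch
                   ;;
        esac
    done
    # Strip the single leading space left by the concatenation above.
    echo "${properties# }"
}

# Example use (not executed here):
#   to_properties --network.host=10.0.0.4 --node.name node1
```

The sketch assumes well-formed input: a trailing `--name` with no value would make `shift 2` fail, just as it would in the script itself.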

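System configuration

File descriptors

Make sure to increase the number of open file descriptors available on the machine; 32k to 64k is recommended. To check how many files the process can open, start Elasticsearch with -Des.max-open-files set to true. This prints the number of file descriptors the process can open.
Alternatively, you can retrieve max_file_descriptors for each node using the Nodes Info API:

curl 'localhost:9200/_nodes/process?pretty'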
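Before starting a node, the current limit can also be checked from the shell with `ulimit -n`. The sketch below compares a limit against the lower end of the recommended range (32k, taken here as 32768); `fd_limit_ok` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given open-files limit is at least 32768.
# The argument is expected to be the output of `ulimit -n`.
fd_limit_ok() {
    limit=$1
    if [ "$limit" = "unlimited" ]; then
        return 0
    fi
    [ "$limit" -ge 32768 ]
}

# Example use (not executed here):
#   fd_limit_ok "$(ulimit -n)" || echo "open file limit is too low" >&2
```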

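Memory settings

The Linux kernel tries to use as much memory as possible for file system caches and eagerly swaps out unused application memory. This can cause the Elasticsearch process to be swapped out. Swapping is very bad for Elasticsearch performance and stability, so it should be avoided. There are three options:

Disable swap

The simplest option is to disable swap completely. Elasticsearch is usually the only service running on a machine, and its memory usage is controlled by the ES_HEAP_SIZE environment variable, so there should be no need to have swap enabled. On Linux, you can disable swap temporarily:

sudo swapoff -a

To disable it permanently, edit /etc/fstab and comment out every line that contains the word swap.

Configure swappiness

Setting vm.swappiness to 0 means the kernel will not swap the Elasticsearch process's memory under normal circumstances, while still allowing swapping in emergencies.
From kernel version 3.5-rc1 onward, a swappiness of 0 causes the OOM killer to kill the process instead of allowing swapping. On those kernels, set swappiness to 1 so that swapping is still possible in emergencies.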
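The kernel-version rule can be written down as a short shell sketch: 0 on kernels older than 3.5, 1 from 3.5 onward. `recommended_swappiness` is a hypothetical helper name, and the version parsing assumes a release string shaped like the output of `uname -r` (for example 3.13.0-24-generic).

```shell
#!/bin/sh
# Hypothetical helper: print the swappiness value suggested above
# for a given kernel release string.
recommended_swappiness() {
    release=$1
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; }; then
        echo 1
    else
        echo 0
    fi
}

# Example use (not executed here; requires root):
#   sysctl -w vm.swappiness="$(recommended_swappiness "$(uname -r)")"
```

A value set with `sysctl -w` lasts until reboot; making it permanent is typically done in /etc/sysctl.conf.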
mlockall

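This option applies only to Linux/Unix systems. It uses mlockall to lock the memory used by the Elasticsearch process, which also prevents that memory from being swapped out. To enable it, add the following to config/elasticsearch.yml:
bootstrap.mlockall: true
After starting Elasticsearch, you can check whether the memory is locked by looking at the mlockall field in the output of:
curl 'localhost:9200/_nodes/process?pretty'
If mlockall is false, the setting was not applied. The usual reason is that the user running Elasticsearch does not have permission to lock memory. This can be granted by running ulimit -l unlimited as root before starting Elasticsearch. Another possible reason is that the temporary directory /tmp is mounted with the noexec option. In that case, specify a different temporary directory for Elasticsearch:
./bin/elasticsearch -Djna.tmpdir=/path/to/new/dir
Note that mlockall may cause the JVM or the shell session to exit if it tries to allocate more memory than is available.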
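The mlockall check can be scripted. The following is a rough sketch that searches the JSON text for the mlockall field with grep rather than parsing it properly; `mlockall_enabled` is a hypothetical helper name, and the pattern assumes the field appears as "mlockall" : true.

```shell
#!/bin/sh
# Hypothetical helper: read a nodes-info JSON response on stdin and
# succeed if it contains "mlockall": true (whitespace around ':' is allowed).
mlockall_enabled() {
    grep -Eq '"mlockall"[[:space:]]*:[[:space:]]*true'
}

# Example use (not executed here; needs a running node):
#   curl -s 'localhost:9200/_nodes/process?pretty' | mlockall_enabled \
#       || echo "memory is not locked" >&2
```

With several nodes in the response, this succeeds as soon as any one of them reports true, so a per-node check would need a real JSON parser.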

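Elasticsearch settings

The Elasticsearch configuration files are in the ES_HOME/config directory. It contains two files: elasticsearch.yml, which configures the various Elasticsearch modules, and logging.yml, which configures Elasticsearch logging.

The configuration format is YAML.

Paths

In real deployments you will almost certainly want to change the paths for data files and log files:

path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch

Cluster name

Do not forget to give your cluster a name. The name uniquely identifies the cluster and is used to discover and automatically join nodes:

cluster:
  name: <NAME OF YOUR CLUSTER>

Node name

You may also want to set a name for each node, for example its hostname. By default, Elasticsearch picks a random node name.

node:
  name: <NAME OF YOUR NODE>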

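Configuration styles

Internally, all of the settings above are collapsed into namespaced form, for example node.name, path.logs, and cluster.name. This makes it easy to support other configuration formats such as JSON. To use JSON, rename elasticsearch.yml to elasticsearch.json and configure it like this:

{
    "network" : {
        "host" : "10.0.0.4"
    }
}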

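It also means that settings are easy to pass in from outside, for example:

$ elasticsearch -Des.network.host=10.0.0.4

Another option is to use the es.default. prefix instead of the es. prefix. The value then acts as a default and is used only if the setting is not explicitly set in the configuration file. A further option is to use the ${...} notation inside the configuration file, which resolves to the value of an environment variable, for example:

{
    "network" : {
        "host" : "${ES_NET_HOST}"
    }
}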
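The relationship between the es. prefix, the configuration file, and the es.default. prefix can be pictured as a simple precedence rule. The sketch below is a model, not Elasticsearch code: `resolve_setting` is a hypothetical function, and it assumes that a value passed with the es. prefix takes priority over the configuration file, which the text above does not state explicitly.

```shell
#!/bin/sh
# Hypothetical model of setting precedence:
#   $1 = value from -Des.<name>          (assumed to win over the file)
#   $2 = value from the configuration file
#   $3 = value from -Des.default.<name>  (used only if nothing else is set)
resolve_setting() {
    cli_value=$1
    file_value=$2
    default_value=$3
    if [ -n "$cli_value" ]; then
        echo "$cli_value"
    elif [ -n "$file_value" ]; then
        echo "$file_value"
    else
        echo "$default_value"
    fi
}

# Example use (not executed here):
#   resolve_setting "" "10.0.0.4" "127.0.0.1"
```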

The location of the configuration file can be set externally with a system property:

$ elasticsearch -Des.config=/path/to/config/file

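Index settings

Indices created in the cluster can provide their own settings. For example, the following creates an index that uses memory-based storage instead of the default file system storage (the request body can be either YAML or JSON):

$ curl -XPUT http://localhost:9200/kimchy/ -d \
'
index :
    store:
        type: memory
'

Index settings can also be set at the node level. With the following in the configuration file, every index created on that node is stored in memory unless the index explicitly sets its own store type:

index :
    store:
        type: memory

In other words, index-level settings override node-level settings. The same setting can also be passed on the command line:

$ elasticsearch -Des.index.store.type=memory

Logging

Elasticsearch uses log4j internally for logging, and the log4j configuration can be written in YAML to keep it simple.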

0 0