Syncing MySQL Data to Elasticsearch in Real Time

Source: Internet  Editor: 程序博客网  Time: 2024/05/16 13:54

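For the CDC (change data capture) tool, you can use either Oracle GoldenGate or Maxwell, which is open-sourced by Zendesk. The pipeline is MySQL -> GoldenGate or Maxwell -> Kafka -> Logstash -> Elasticsearch.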

If you go with the commercial product, install Oracle GoldenGate for MySQL on the source side and Oracle GoldenGate for Big Data on the target side.

The open-source Maxwell is simpler to configure: you only need to edit the file config.properties:

host=localhost
user=root
password=xxxxxxx
kafka_topic=test_topic

Start Zookeeper:
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
Start Kafka:
bin/kafka-server-start.sh config/server.properties
Start Maxwell:
bin/maxwell --user='root' --password='xxxxxxx' --host='localhost' --producer=kafka --kafka.bootstrap.servers=localhost:9092
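Maxwell publishes each row change to Kafka as a single JSON document. As a rough illustration of what Logstash will receive, the Python sketch below parses a hand-written sample event. The `test.employee` table and all values in it are made up for illustration; the top-level keys (`database`, `table`, `type`, `ts`, `xid`, `commit`, `data`, `old`) are the ones the Logstash filter in this post refers to.

```python
import json

# Hand-written sample of a Maxwell "update" event (all values are hypothetical).
SAMPLE_EVENT = """
{"database": "test", "table": "employee", "type": "update",
 "ts": 1463378040, "xid": 1024, "commit": true,
 "data": {"id": 1, "first_name": "John", "last_name": "Smith", "age": 26},
 "old": {"age": 25}}
"""

def describe(raw):
    """Return (operation, qualified table name, current row values) for one event."""
    event = json.loads(raw)
    qualified = "{}.{}".format(event["database"], event["table"])
    return event["type"], qualified, event["data"]

op, table, row = describe(SAMPLE_EVENT)
print(op, table, row["id"])
```

Note that the current column values live under `data`, while `old` carries only the previous values of the columns that changed. This is why the filter has to lift the columns out of `data` before indexing.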

The Logstash configuration is as follows:

input {
    kafka {
        zk_connect => "localhost:2181"
        group_id => "logstash"
        # Must match kafka_topic in Maxwell's config.properties
        topic_id => "test_topic"
        codec => json {charset => ["ISO-8859-1"]}
        reset_beginning => false
        consumer_threads => 5
        # Adds a "kafka" metadata field to each event (removed again in the filter)
        decorate_events => true
    }
}

filter {
    mutate {
        # Drop Maxwell and Kafka metadata that should not be indexed
        remove_field => ["database","table","ts","xid","commit","old","kafka"]
        # Lift the row's columns out of the nested "data" object
        rename => ["[data][id]","id"]
        rename => ["[data][first_name]","first_name"]
        rename => ["[data][last_name]","last_name"]
        rename => ["[data][age]","age"]
        rename => ["[data][about]","about"]
        rename => ["[data][interests]","interests"]
    }
    # Map Maxwell's operation type to an Elasticsearch action
    translate {
        field => "type"
        destination => "op_type"
        dictionary => [
            "insert", "index",
            "update", "update",
            "delete", "delete"
        ]
    }
}

output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "megacorp"
        # Use the MySQL row id as the document id so updates and deletes hit the same document
        document_id => "%{id}"
        document_type => "employee"
        action => "%{op_type}"
        workers => 1
        flush_size => 20000
        idle_flush_time => 10
        template_overwrite => true
    }
    stdout {}
}
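To make the filter stage above easier to follow, here is a minimal Python sketch of roughly the same transformation: drop the metadata keys, lift the listed columns out of `data`, and map Maxwell's `type` to an Elasticsearch action. It is only an approximation of what Logstash does, not its implementation; the function name `to_es_document` and the sample row are hypothetical.

```python
# Keys removed by the mutate filter ("kafka" comes from decorate_events).
DROPPED = ("database", "table", "ts", "xid", "commit", "old", "kafka")
# Columns that the mutate filter renames out of the nested "data" object.
COLUMNS = ("id", "first_name", "last_name", "age", "about", "interests")
# Maxwell operation -> Elasticsearch action, as in the translate dictionary.
ACTIONS = {"insert": "index", "update": "update", "delete": "delete"}

def to_es_document(event):
    """Approximate the filter stage for one decoded Maxwell event (a dict)."""
    # Unlike the mutate filter, this sketch also discards the leftover "data" key.
    doc = {k: v for k, v in event.items() if k not in DROPPED and k != "data"}
    data = event.get("data", {})
    for col in COLUMNS:
        if col in data:
            doc[col] = data[col]
    # Events with an unknown type get no op_type, mirroring a dictionary miss.
    if event.get("type") in ACTIONS:
        doc["op_type"] = ACTIONS[event["type"]]
    return doc

# Hypothetical insert event, shaped like Maxwell's output.
event = {
    "database": "test", "table": "employee", "type": "insert",
    "ts": 1463378040, "xid": 1024, "commit": True,
    "data": {"id": 1, "first_name": "John", "last_name": "Smith", "age": 25},
}
print(to_es_document(event))
```

In the Logstash configuration, the resulting `id` becomes the document id and `op_type` becomes the action, so an insert in MySQL turns into an `index` request against `megacorp`.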

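Start Logstash and Elasticsearch, then insert, update, or delete rows in the MySQL database. The corresponding changes to the index show up in Elasticsearch in real time.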
