ELK Learning 6: Kafka -> Logstash -> Elasticsearch Data Flow
Logstash configuration
Create the input and output settings for Logstash:
- [hadoop@Slave1 ~]$ cd /usr/local/
- [hadoop@Slave1 local]$ cd logstash/
- [hadoop@Slave1 logstash]$ ls
- bin CONTRIBUTORS Gemfile.jruby-1.9.lock LICENSE vendor
- CHANGELOG.md Gemfile lib NOTICE.TXT
- [hadoop@Slave1 logstash]$ mkdir -p conf
- [hadoop@Slave1 logstash]$ ls
- bin conf Gemfile lib NOTICE.TXT
- CHANGELOG.md CONTRIBUTORS Gemfile.jruby-1.9.lock LICENSE vendor
- [hadoop@Slave1 logstash]$ cd conf
- [hadoop@Slave1 conf]$ ls
- [hadoop@Slave1 conf]$ touch kafkaInput_esOutPut.conf
- [hadoop@Slave1 conf]$ ls
- kafkaInput_esOutPut.conf
- [hadoop@Slave1 conf]$
- [hadoop@Slave1 conf]$ vim kafkaInput_esOutPut.conf
Edit kafkaInput_esOutPut.conf; on this machine the contents are as follows:
- input {
-   kafka {
-     zk_connect => "192.168.154.158:2181,192.168.154.159:2181,192.168.154.160:2181"
-     group_id => "test-consumer-group"
-     topic_id => "logStash"
-     reset_beginning => false # boolean (optional), default: false
-     consumer_threads => 5 # number (optional), default: 1
-     decorate_events => true # boolean (optional), default: false
-   }
- }
-
- filter {
-   mutate {
-     # Split the message on commas; the result is an array.
-     # For example "abc,efg" => message[0] = "abc", message[1] = "efg"
-     split => ["message", ","]
-   }
-   # Add four fields taken from the split message array.
-   mutate {
-     add_field => {
-       "source_Ip" => "%{[message][0]}"
-       "source_Port" => "%{[message][1]}"
-       "dest_Ip" => "%{[message][2]}"
-       "dest_Port" => "%{[message][3]}"
-     }
-   }
- }
-
- output {
-   elasticsearch {
-     host => "localhost"
-   }
- }
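The effect of the filter above can be sketched in plain Python. This is an illustrative simulation of the transformation, not Logstash code: a message such as "1,1,1,1" is split on commas, and the four parts become the new fields.

```python
# Illustrative simulation of the mutate split/add_field filter above.
# Not Logstash itself; it only shows the intended transformation.

def apply_filter(event):
    # mutate { split => ["message", ","] }
    event["message"] = event["message"].split(",")
    # mutate { add_field => { "source_Ip" => "%{[message][0]}", ... } }
    parts = event["message"]
    event["source_Ip"] = parts[0]
    event["source_Port"] = parts[1]
    event["dest_Ip"] = parts[2]
    event["dest_Port"] = parts[3]
    return event

event = apply_filter({"message": "1,1,1,1"})
print(event["source_Ip"], event["dest_Port"])  # → 1 1
```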
SSH to Slave2 and Slave3 and copy kafkaInput_esOutPut.conf to both machines.
First create the conf directory on each:
- [hadoop@Slave1 conf]$ ssh Slave2
- Last login: Wed Oct 14 10:58:06 2015 from slave1
- [hadoop@Slave2 ~]$ cd /usr/local/logstash/
- [hadoop@Slave2 logstash]$ mkdir -p conf
- [hadoop@Slave2 logstash]$ ls
- bin conf Gemfile lib NOTICE.TXT
- CHANGELOG.md CONTRIBUTORS Gemfile.jruby-1.9.lock LICENSE vendor
- [hadoop@Slave2 logstash]$ exit
- logout
- Connection to Slave2 closed.
- [hadoop@Slave1 conf]$ ssh Slave3
- Last login: Wed Oct 14 10:59:01 2015 from slave2
- [hadoop@Slave3 ~]$ cd /usr/local/logstash/
- [hadoop@Slave3 logstash]$ mkdir -p conf
- [hadoop@Slave3 logstash]$ ls
- bin conf Gemfile lib NOTICE.TXT
- CHANGELOG.md CONTRIBUTORS Gemfile.jruby-1.9.lock LICENSE vendor
- [hadoop@Slave3 logstash]$ exit
- logout
- Connection to Slave3 closed.
Then copy the file:
- [hadoop@Slave1 conf]$ scp kafkaInput_esOutPut.conf Slave2:/usr/local/logstash/conf/
- kafkaInput_esOutPut.conf 100% 1063 1.0KB/s 00:00
- [hadoop@Slave1 conf]$ scp kafkaInput_esOutPut.conf Slave3:/usr/local/logstash/conf/
- kafkaInput_esOutPut.conf 100% 1063 1.0KB/s 00:00
- [hadoop@Slave1 conf]$ ssh Slave2
- Last login: Tue Oct 27 23:46:19 2015 from slave1
- [hadoop@Slave2 ~]$ cd /usr/local/logstash/conf/
- [hadoop@Slave2 conf]$ ls
- kafkaInput_esOutPut.conf
- [hadoop@Slave2 conf]$
Kafka setup
Start ZooKeeper on all three machines.
Stop the firewall first:
- [hadoop@Slave1 bin]$ su
- Password:
- [root@Slave1 bin]# service iptables stop
- iptables: Setting chains to policy ACCEPT: filter [ OK ]
- iptables: Flushing firewall rules: [ OK ]
- iptables: Unloading modules: [ OK ]
- [root@Slave1 bin]# exit
- exit
- [hadoop@Slave1 bin]$
Start ZooKeeper:
- [hadoop@Slave1 bin]$ ./zkServer.sh start
- JMX enabled by default
- Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
- Starting zookeeper ... STARTED
After repeating this on the other two machines, check the status:
- [hadoop@Slave1 bin]$ ./zkServer.sh status
- JMX enabled by default
- Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
- Mode: leader
- [hadoop@Slave1 bin]$
Start Kafka on all three machines, using Slave1 as an example:
- [hadoop@Slave1 bin]$ cd /usr/local/kafka/
- [hadoop@Slave1 kafka]$ bin/kafka-server-start.sh config/server.properties
Create a topic named logStash:
- [hadoop@Slave1 ~]$ cd /usr/local/kafka/
- [hadoop@Slave1 kafka]$ cd bin
- [hadoop@Slave1 bin]$ sh kafka-topics.sh --create --topic logStash --replication-factor 1 --partitions 1 --zookeeper Slave1:2181
- Created topic "logStash".
- [hadoop@Slave1 bin]$
Start Logstash
Start it on all three machines:
- [hadoop@Slave1 ~]$ cd /usr/local/logstash/
- [hadoop@Slave1 logstash]$ ls
- bin conf Gemfile lib NOTICE.TXT
- CHANGELOG.md CONTRIBUTORS Gemfile.jruby-1.9.lock LICENSE vendor
- [hadoop@Slave1 logstash]$ cd bin
- [hadoop@Slave1 bin]$ ls
- logstash logstash.lib.sh plugin.bat rspec.bat
- logstash.bat plugin rspec setup.bat
The startup output includes some warnings:
- [hadoop@Slave2 bin]$ ./logstash agent -f ../conf/kafkaInput_esOutPut.conf
- log4j, [2015-10-28T21:52:07.116] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-3 for topic logStash
- log4j, [2015-10-28T21:52:07.118] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-2 for topic logStash
- log4j, [2015-10-28T21:52:07.119] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-0 for topic logStash
- log4j, [2015-10-28T21:52:07.119] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-4 for topic logStash
- log4j, [2015-10-28T21:52:07.120] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-1 for topic logStash
- log4j, [2015-10-28T21:52:33.934] WARN: org.elasticsearch.bootstrap: JNA not found. native methods will be disabled.
- log4j, [2015-10-28T21:53:09.347] WARN: org.elasticsearch.discovery: [logstash-Slave2-4244-11624] waited for 30s and no initial state was set by the discovery
- log4j, [2015-10-28T21:53:35.632] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-3 for topic logStash
- log4j, [2015-10-28T21:53:35.633] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-2 for topic logStash
- log4j, [2015-10-28T21:53:35.634] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-0 for topic logStash
- log4j, [2015-10-28T21:53:35.634] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-4 for topic logStash
- log4j, [2015-10-28T21:53:35.634] WARN: kafka.consumer.RangeAssignor: No broker partitions consumed by consumer thread test-consumer-group_Slave2-1446094310356-56dfbfa7-1 for topic logStash
- Failed to install template: waited for [30s] {:level=>:error}
- Logstash startup completed
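The RangeAssignor warnings above are expected in this setup: the topic was created with a single partition, but the config asks for consumer_threads => 5, so four of the five consumer threads receive no partition. A small Python sketch of range-style partition assignment (an illustration, not Kafka's actual code) shows why:

```python
# Illustrative sketch of range partition assignment: partitions are
# divided as evenly as possible, with the first threads taking the
# remainder. With 1 partition and 5 threads, only thread 0 gets one.

def range_assign(num_partitions, num_threads):
    """Return {thread_index: [partition, ...]} in range-assignment style."""
    per_thread = num_partitions // num_threads
    extra = num_partitions % num_threads
    assignment = {}
    start = 0
    for t in range(num_threads):
        count = per_thread + (1 if t < extra else 0)
        assignment[t] = list(range(start, start + count))
        start += count
    return assignment

print(range_assign(1, 5))  # → {0: [0], 1: [], 2: [], 3: [], 4: []}
```

This is why only one of the five threads ever consumes from the logStash topic; creating the topic with more partitions would spread the work across threads.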
Sending and receiving data
Start a console producer for the topic created earlier:
- [hadoop@Slave1 ~]$ cd /usr/local/kafka/
- [hadoop@Slave1 kafka]$ ls
- bin config libs LICENSE logs NOTICE
- [hadoop@Slave1 kafka]$ bin/kafka-console-producer.sh --broker-list Slave1:9092 --topic logStash
Start Elasticsearch (the -f flag is no longer supported in this version, hence the getopt warning below, but the node still starts in the foreground):
- [hadoop@Slave1 ~]$ cd /usr/local/elasticsearch/
- [hadoop@Slave1 elasticsearch]$ bin/elasticsearch -f
- getopt: invalid option -- 'f'
- [2015-10-29 00:47:27,084][INFO ][node ] [Clown] version[1.7.3], pid[5208], build[05d4530/2015-10-15T09:14:17Z]
- [2015-10-29 00:47:27,131][INFO ][node ] [Clown] initializing ...
- [2015-10-29 00:47:27,920][INFO ][plugins ] [Clown] loaded [], sites []
- [2015-10-29 00:47:28,548][INFO ][env ] [Clown] using [1] data paths, mounts [[/ (/dev/sda2)]], net usable_space [9.7gb], net total_space [17.4gb], types [ext4]
- [2015-10-29 00:47:43,711][INFO ][node ] [Clown] initialized
- [2015-10-29 00:47:43,729][INFO ][node ] [Clown] starting ...
- [2015-10-29 00:47:46,089][INFO ][transport ] [Clown] bound_address {inet[/0:0:0:0:0:0:0:0:9301]}, publish_address {inet[/192.168.154.158:9301]}
- [2015-10-29 00:47:46,606][INFO ][discovery ] [Clown] elasticsearch/v-jkBhkxSheape14hvMAHw
- [2015-10-29 00:47:50,712][INFO ][cluster.service ] [Clown] new_master [Clown][v-jkBhkxSheape14hvMAHw][Slave1][inet[/192.168.154.158:9301]], reason: zen-disco-join (elected_as_master)
- [2015-10-29 00:47:50,985][INFO ][http ] [Clown] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.154.158:9200]}
- [2015-10-29 00:47:50,986][INFO ][node ] [Clown] started
- [2015-10-29 00:47:51,345][INFO ][gateway ] [Clown] recovered [0] indices into cluster_state
- [2015-10-29 00:47:51,346][INFO ][cluster.service ] [Clown] added {[logstash-Slave1-4083-11624][loTUXdCXRVC_WzqzhD3PWg][Slave1][inet[/192.168.154.158:9300]]{data=false, client=true},}, reason: zen-disco-receive(join from node[[logstash-Slave1-4083-11624][loTUXdCXRVC_WzqzhD3PWg][Slave1][inet[/192.168.154.158:9300]]{data=false, client=true}])
- [2015-10-29 00:47:54,185][INFO ][cluster.metadata ] [Clown] [logstash-2015.10.29] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [logs]
- [2015-10-29 00:47:56,201][INFO ][cluster.metadata ] [Clown] [logstash-2015.10.29] update_mapping [logs] (dynamic)
- [2015-10-29 00:47:57,166][INFO ][cluster.metadata ] [Clown] [logstash-2015.10.29] update_mapping [logs] (dynamic)
Check that Elasticsearch started successfully:
- [hadoop@Slave1 ~]$ curl -X GET http://localhost:9200
- {
- "status" : 200,
- "name" : "Clown",
- "cluster_name" : "elasticsearch",
- "version" : {
- "number" : "1.7.3",
- "build_hash" : "05d4530971ef0ea46d0f4fa6ee64dbc8df659682",
- "build_timestamp" : "2015-10-15T09:14:17Z",
- "build_snapshot" : false,
- "lucene_version" : "4.10.4"
- },
- "tagline" : "You Know, for Search"
- }
- [hadoop@Slave1 ~]$
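The same health check can be done programmatically. A minimal sketch that parses the response body returned by curl above (pasted here as a literal, so no running cluster is needed) and verifies the status and version:

```python
import json

# The root-endpoint response returned by curl above, pasted as a literal.
response_body = """
{
  "status" : 200,
  "name" : "Clown",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.7.3",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
"""

info = json.loads(response_body)
assert info["status"] == 200  # status code embedded in the response body
print(info["cluster_name"], info["version"]["number"])  # → elasticsearch 1.7.3
```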
Send data through the producer started earlier.
(The message format is source IP, source port, destination IP, destination port; for simplicity, send 1,1,1,1.)
- [hadoop@Slave1 kafka]$ bin/kafka-console-producer.sh --broker-list Slave1:9092 --topic logStash
- [2015-10-29 00:39:33,085] WARN Property topic is not valid (kafka.utils.VerifiableProperties)
- 1,1,1,1
Check the received data. Logstash names indices by date, so querying the wrong day's index returns IndexMissingException; also note that the second hit below has an empty message (a blank line sent from the producer), so its %{[message][N]} references could not be substituted:
- [hadoop@Slave1 ~]$ curl -XGET 'localhost:9200/logstash-2015.10.27/_search'
- {"error":"IndexMissingException[[logstash-2015.10.27] missing]","status":404}[hadoop@Slave1 ~]$ curl -XGET 'localhost:9200/logstash-2015.10.29/_search'
- {"took":260,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":1.0,"hits":[{"_index":"logstash-2015.10.29","_type":"logs","_id":"AVCykUgg6gAQTB_SuF_V","_score":1.0,"_source":{"message":["1","1","1","1"],"tags":["_jsonparsefailure"],"@version":"1","@timestamp":"2015-10-29T07:39:50.871Z","kafka":{"msg_size":7,"topic":"logStash","consumer_group":"test-consumer-group","partition":0,"key":null},"source_Ip":"1","source_Port":"1","dest_Ip":"1","dest_Port":"1"}},{"_index":"logstash-2015.10.29","_type":"logs","_id":"AVCykUGv6gAQTB_SuF_U","_score":1.0,"_source":{"message":[],"tags":["_jsonparsefailure"],"@version":"1","@timestamp":"2015-10-29T07:39:46.345Z","kafka":{"msg_size":0,"topic":"logStash","consumer_group":"test-consumer-group","partition":0,"key":null},"source_Ip":"%{[message][0]}","source_Port":"%{[message][1]}","dest_Ip":"%{[message][2]}","dest_Port":"%{[message][3]}"}}]}}[hadoop@Slave1 ~]$
- [hadoop@Slave1 ~]$ curl -XGET 'localhost:9200/logstash-2015.10.29/_search?pretty'
- {
- "took" : 26,
- "timed_out" : false,
- "_shards" : {
- "total" : 5,
- "successful" : 5,
- "failed" : 0
- },
- "hits" : {
- "total" : 2,
- "max_score" : 1.0,
- "hits" : [ {
- "_index" : "logstash-2015.10.29",
- "_type" : "logs",
- "_id" : "AVCykUgg6gAQTB_SuF_V",
- "_score" : 1.0,
- "_source":{"message":["1","1","1","1"],"tags":["_jsonparsefailure"],"@version":"1","@timestamp":"2015-10-29T07:39:50.871Z","kafka":{"msg_size":7,"topic":"logStash","consumer_group":"test-consumer-group","partition":0,"key":null},"source_Ip":"1","source_Port":"1","dest_Ip":"1","dest_Port":"1"}
- }, {
- "_index" : "logstash-2015.10.29",
- "_type" : "logs",
- "_id" : "AVCykUGv6gAQTB_SuF_U",
- "_score" : 1.0,
- "_source":{"message":[],"tags":["_jsonparsefailure"],"@version":"1","@timestamp":"2015-10-29T07:39:46.345Z","kafka":{"msg_size":0,"topic":"logStash","consumer_group":"test-consumer-group","partition":0,"key":null},"source_Ip":"%{[message][0]}","source_Port":"%{[message][1]}","dest_Ip":"%{[message][2]}","dest_Port":"%{[message][3]}"}
- } ]
- }
- }
- [hadoop@Slave1 ~]$
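The search response above can be post-processed to pull out the extracted fields. A minimal sketch, using a hit abbreviated from the response above:

```python
# One hit from the _search response above, abbreviated to the relevant fields.
search_response = {
    "hits": {
        "total": 2,
        "hits": [
            {
                "_index": "logstash-2015.10.29",
                "_source": {
                    "message": ["1", "1", "1", "1"],
                    "source_Ip": "1",
                    "source_Port": "1",
                    "dest_Ip": "1",
                    "dest_Port": "1",
                },
            }
        ],
    }
}

def extract_flows(response):
    """Collect the four extracted fields from every hit in a search response."""
    fields = ("source_Ip", "source_Port", "dest_Ip", "dest_Port")
    return [
        {f: hit["_source"].get(f) for f in fields}
        for hit in response["hits"]["hits"]
    ]

print(extract_flows(search_response))
```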
Reference: http://blog.csdn.net/xuguokun1986/article/details/49452101 (this article expands on that blog post).
Source: http://blog.csdn.net/wang_zhenwei/article/details/49493131