Flume + Spark Integration (Push Mode)


Approach 1: Integrating Spark Streaming with Flume (push mode)

Flume uses a netcat-memory-avro pipeline: a netcat source, a memory channel, and an avro sink.

Local test

1. Start the Spark Streaming application locally (listening on 0.0.0.0:10000).

2. Start the Flume agent on the server.

3. Use telnet to send data to the netcat port and watch the output in the local IDEA console.
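The receiver side of step 1 can be sketched as follows. This is a minimal outline only: the actual FlumePushSpark class is not shown in this post, so the batch interval and the word-count logic are assumptions.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Sketch of the push-mode receiver; the real FlumePushSpark is not
// shown in this post, so the processing logic here is an assumption.
object FlumePushSpark {
  def main(args: Array[String]): Unit = {
    // Locally, bind to 0.0.0.0:10000; on the server, the host/port
    // come from the command line (e.g. 192.168.145.128 10000).
    val Array(host, port) = args

    val conf = new SparkConf().setAppName("FlumePushSpark")
    val ssc = new StreamingContext(conf, Seconds(5))

    // FlumeUtils.createStream starts an Avro server that Flume's
    // avro sink pushes events into -- hence Spark must start first.
    val flumeStream = FlumeUtils.createStream(ssc, host, port.toInt)

    // Event bodies arrive as byte buffers; decode them and count words.
    flumeStream
      .map(e => new String(e.event.getBody.array()).trim)
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

When running inside the IDE for the local test, set the master explicitly (e.g. `conf.setMaster("local[2]")`); at least two threads are needed because one is occupied by the receiver.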

Server test

Build with Maven: mvn clean package -DskipTests
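For the build to compile, the project needs the Flume integration on its classpath. A plausible pom.xml fragment, with versions taken from the spark-submit command in this post (the exact scopes in the original project are unknown):

```xml
<!-- Spark Streaming core; "provided" because spark-submit supplies it -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.11</artifactId>
  <version>2.2.0</version>
  <scope>provided</scope>
</dependency>
<!-- Flume push-mode integration (FlumeUtils.createStream) -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-flume_2.11</artifactId>
  <version>2.2.0</version>
</dependency>
```

Since the spark-submit command pulls spark-streaming-flume via --packages at runtime, the integration jar does not need to be bundled into the application jar.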

Upload the jar to the server.

Start Spark first:

spark-submit \
  --class com.tuzhihai.flumespark.FlumePushSpark \
  --master local[2] \
  --packages org.apache.spark:spark-streaming-flume_2.11:2.2.0 \
  /root/soft_down/lib/sparklearn-1.0.jar \
  192.168.145.128 10000

Then start Flume:

flume-ng agent \
  --name netcat-memory-avro \
  --conf $FLUME_HOME/conf \
  --conf-file $FLUME_HOME/conf/netcat-memory-avro.conf \
  -Dflume.root.logger=INFO,console

Send data to the netcat source port:

telnet 192.168.145.128 9999

Watch the Flume console (and the Spark console for the processed output).

Why must Spark be started before Flume in push mode?

Because in flume-push mode, Flume pushes data to a receiver, and that receiver has to exist before anything can be pushed to it. So start the Spark Streaming application (the server that receives the data) first, then start Flume (the tool that collects and forwards the data).



flume-push-stream.conf

# example netcat-memory-avro
netcat-memory-avro.sources = netcat-source
netcat-memory-avro.sinks = avro-sink
netcat-memory-avro.channels = memory-channel

# Describe/configure the source
netcat-memory-avro.sources.netcat-source.type = netcat
netcat-memory-avro.sources.netcat-source.bind = 192.168.145.128
netcat-memory-avro.sources.netcat-source.port = 9999

# Describe the sink
netcat-memory-avro.sinks.avro-sink.type = avro
netcat-memory-avro.sinks.avro-sink.hostname = 192.168.145.128
netcat-memory-avro.sinks.avro-sink.port = 10000

# Use a channel which buffers events in memory
netcat-memory-avro.channels.memory-channel.type = memory

# Bind the source and sink to the channel
netcat-memory-avro.sources.netcat-source.channels = memory-channel
netcat-memory-avro.sinks.avro-sink.channel = memory-channel