Flume + Spark Integration (Push Mode)


Approach 1: Integrating Spark Streaming with Flume (push mode)

Flume uses a netcat-memory-avro pipeline: a netcat source, a memory channel, and an avro sink.

Local test

1. Start the Spark Streaming application locally (listening on 0.0.0.0:10000).

2. Start the Flume agent on the server.

3. Use telnet to send data to the netcat port and watch the output in the local IDEA console.
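The receiver side of step 1 can be sketched as follows. This is a minimal outline only: the actual FlumePushSpark class is not shown in this post, so the batch interval and the word-count logic are assumptions.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Sketch of the push-mode receiver; the real FlumePushSpark is not
// shown in this post, so the processing logic here is an assumption.
object FlumePushSpark {
  def main(args: Array[String]): Unit = {
    // Locally, bind to 0.0.0.0:10000; on the server, the host/port
    // come from the command line (e.g. 192.168.145.128 10000).
    val Array(host, port) = args

    val conf = new SparkConf().setAppName("FlumePushSpark")
    val ssc = new StreamingContext(conf, Seconds(5))

    // FlumeUtils.createStream starts an Avro server that Flume's
    // avro sink pushes events into -- hence Spark must start first.
    val flumeStream = FlumeUtils.createStream(ssc, host, port.toInt)

    // Event bodies arrive as byte buffers; decode them and count words.
    flumeStream
      .map(e => new String(e.event.getBody.array()).trim)
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

When running inside the IDE for the local test, set the master explicitly (e.g. `conf.setMaster("local[2]")`); at least two threads are needed because one is occupied by the receiver.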

Server test

Build with Maven: mvn clean package -DskipTests
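For the build to compile, the project needs the Flume integration on its classpath. A plausible pom.xml fragment, with versions taken from the spark-submit command in this post (the exact scopes in the original project are unknown):

```xml
<!-- Spark Streaming core; "provided" because spark-submit supplies it -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.11</artifactId>
  <version>2.2.0</version>
  <scope>provided</scope>
</dependency>
<!-- Flume push-mode integration (FlumeUtils.createStream) -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-flume_2.11</artifactId>
  <version>2.2.0</version>
</dependency>
```

Since the spark-submit command pulls spark-streaming-flume via --packages at runtime, the integration jar does not need to be bundled into the application jar.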

Upload the jar to the server.

Start Spark first:

spark-submit \
  --class com.tuzhihai.flumespark.FlumePushSpark \
  --master local[2] \
  --packages org.apache.spark:spark-streaming-flume_2.11:2.2.0 \
  /root/soft_down/lib/sparklearn-1.0.jar \
  192.168.145.128 10000

Then start Flume:

flume-ng agent \
  --name netcat-memory-avro \
  --conf $FLUME_HOME/conf \
  --conf-file $FLUME_HOME/conf/netcat-memory-avro.conf \
  -Dflume.root.logger=INFO,console

Send data to the netcat source port:

telnet 192.168.145.128 9999

Watch the Flume console (and the Spark console for the processed output).

Why must Spark be started before Flume in push mode?

Because in flume-push mode, Flume pushes data to a receiver, and that receiver has to exist before anything can be pushed to it. So start the Spark Streaming application (the server that receives the data) first, then start Flume (the tool that collects and forwards the data).



flume-push-stream.conf

# example netcat-memory-avro
netcat-memory-avro.sources = netcat-source
netcat-memory-avro.sinks = avro-sink
netcat-memory-avro.channels = memory-channel

# Describe/configure the source
netcat-memory-avro.sources.netcat-source.type = netcat
netcat-memory-avro.sources.netcat-source.bind = 192.168.145.128
netcat-memory-avro.sources.netcat-source.port = 9999

# Describe the sink
netcat-memory-avro.sinks.avro-sink.type = avro
netcat-memory-avro.sinks.avro-sink.hostname = 192.168.145.128
netcat-memory-avro.sinks.avro-sink.port = 10000

# Use a channel which buffers events in memory
netcat-memory-avro.channels.memory-channel.type = memory

# Bind the source and sink to the channel
netcat-memory-avro.sources.netcat-source.channels = memory-channel
netcat-memory-avro.sinks.avro-sink.channel = memory-channel