Spark Study Notes 7: Integrating Spark Streaming with Flume
Source: Internet. Editor: 程序博客网. Date: 2024/05/23 02:05
Tags (space-separated): Spark
I. Start Flume
The flume-conf.properties file:
agent002.sources = sources002
agent002.channels = channels002
agent002.sinks = sinks002

## define sources
agent002.sources.sources002.type = exec
agent002.sources.sources002.command = tail -F /opt/app/apache-flume-1.5.0-bin/monitor/log.input

## define channels
agent002.channels.channels002.type = memory
agent002.channels.channels002.capacity = 10000
agent002.channels.channels002.transactionCapacity = 10000
agent002.channels.channels002.byteCapacityBufferPercentage = 20
agent002.channels.channels002.byteCapacity = 800000

## define sinks
agent002.sinks.sinks002.type = avro
agent002.sinks.sinks002.hostname=study.com.cn
agent002.sinks.sinks002.port=9999

## relationship
agent002.sources.sources002.channels = channels002
agent002.sinks.sinks002.channel = channels002
bin/flume-ng agent --conf conf --name agent002 --conf-file conf/flume-conf.properties -Dflume.root.logger=INFO,console
II. Developing and running the Spark Streaming application
1. Add the dependency to pom.xml
groupId = org.apache.spark
artifactId = spark-streaming-flume_2.10
version = 1.3.0
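Written as a Maven dependency, the coordinates above would look roughly like the fragment below. It is a sketch that assumes the entry goes inside the existing `<dependencies>` element of the project's pom.xml; the rest of the POM is not shown in the original.

```xml
<!-- Spark Streaming receiver for Flume, built for Scala 2.10 -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-flume_2.10</artifactId>
  <version>1.3.0</version>
</dependency>
```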
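2. Prepare the required jar files
These are the three jars passed to --jars in the next step: spark-streaming-flume_2.10-1.3.0.jar, flume-avro-source-1.5.0.jar and flume-ng-sdk-1.5.0.jar.
3. Start Spark in local mode (adding the required jars)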
bin/spark-shell \
  --jars /opt/app/spark-1.3.0-bin-2.5.0/externaljars/spark-streaming-flume_2.10-1.3.0.jar,/opt/app/spark-1.3.0-bin-2.5.0/externaljars/flume-avro-source-1.5.0.jar,/opt/app/spark-1.3.0-bin-2.5.0/externaljars/flume-ng-sdk-1.5.0.jar \
  --master local[2]
4. flume001.scala
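import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.streaming.flume._

// Reuse the SparkContext (sc) provided by spark-shell; process data in 5-second batches.
val ssc = new StreamingContext(sc, Seconds(5))

// Start an Avro receiver on study.com.cn:9999, the host and port the Flume avro sink pushes to.
val stream = FlumeUtils.createStream(ssc, "study.com.cn", 9999)

// Alternative: only count the events received in each batch.
// val eventsCount = stream.count.map(cnt => "Received " + cnt + " flume events.")
// eventsCount.print()

// Decode each event body to a string, split it into words, and count the words per batch.
val wordCountStream = stream.map(x => new String(x.event.getBody.array())).flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
wordCountStream.print()

ssc.start()
ssc.awaitTermination()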
5. Run the application
scala> :load /opt/app/spark-1.3.0-bin-2.5.0/test/flume001.scala
6. Test
echo "hadoop hive spark" >>log.input
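To make clear what the streaming job should report for this test line, here is a minimal local sketch in plain Scala, with no Spark or Flume involved. The object name `WordCountSketch` and the helpers `decode` and `countWords` are hypothetical names introduced only for illustration. `groupBy` plus a sum stands in for `reduceByKey`, and the explicit UTF-8 charset is an assumption (the original script relies on the platform default).

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object WordCountSketch {
  // Turn an event body into a string, as the streaming job does,
  // but with an explicit charset (assumed UTF-8) instead of the platform default.
  def decode(body: ByteBuffer): String =
    new String(body.array(), StandardCharsets.UTF_8)

  // Local stand-in for flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))
      .map((_, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Simulate the body of the single Flume event produced by the echo command.
    val body = ByteBuffer.wrap("hadoop hive spark".getBytes(StandardCharsets.UTF_8))
    countWords(Seq(decode(body))).toSeq.sortBy(_._1).foreach(println)
  }
}
```

If the pipeline is wired correctly, the 5-second batch that receives the line should print one pair per word, each with a count of 1, though not necessarily in this order.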