Running a Scala Program Locally on Windows with Spark
Source: Internet · Editor: 程序博客网 · Published: 2024/05/16
Use case
Spark is a powerful compute engine written in Scala. It performs computation in memory and ships with convenient tooling for graph computation, stream processing, machine learning, and interactive queries, so programming Spark in Scala is a natural fit. Below is a simple example of using Spark to connect to MySQL and read data.
Procedure
Follow the earlier post on setting up a Scala development environment on Windows. That setup already deploys the Spark environment as well, so you can write Spark programs in Scala right away.
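If you manage the project with sbt rather than adding jars to the build path by hand, the dependencies can be declared as follows. This is only a sketch: the Spark version (1.6.1) is taken from the run log further down, while the Scala and MySQL connector versions are assumptions consistent with that Spark release.

```scala
// build.sbt -- sketch of the dependencies this example needs (versions are assumptions)
scalaVersion := "2.10.6"  // Spark 1.6.x is built against Scala 2.10/2.11

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1",
  "org.apache.spark" %% "spark-sql"  % "1.6.1",
  "mysql" % "mysql-connector-java" % "5.1.38"  // assumed version; any 5.1.x Connector/J works here
)
```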
package epoint.com.cn.test001

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

object SparkConnMysql {
  def main(args: Array[String]) {
    println("Hello, world!")
    val conf = new SparkConf()
      .setAppName("wow,my first spark app")
      .setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val url = "jdbc:mysql://192.168.114.67:3306/user"
    val table = "user"
    val df = sqlContext.read.format("jdbc")
      .option("url", url)
      .option("dbtable", table)
      .option("driver", "com.mysql.jdbc.Driver")
      .option("user", "root")
      .option("password", "11111")
      .load()
    df.show()
  }
}
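Once load() returns, the table is an ordinary DataFrame and can be queried with the usual DataFrame or SQL API. A small sketch, using the column names visible in the output below (the filter threshold is arbitrary):

```scala
// Query the loaded DataFrame; column names (id, name, age, phone, ...) come from the MySQL table
val young = df.filter("age < 30").select("name", "phone")
young.show()

// Equivalently via SQL (Spark 1.6 API):
df.registerTempTable("user_tbl")
sqlContext.sql("SELECT name, phone FROM user_tbl WHERE age < 30").show()
```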
Run output:
Hello, world!
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/D:/spark1.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/spark1.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/kettle7.1/inceptor-driver.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/11/21 11:43:53 INFO SparkContext: Running Spark version 1.6.1
17/11/21 11:43:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/11/21 11:43:56 INFO SecurityManager: Changing view acls to: lenovo
17/11/21 11:43:56 INFO SecurityManager: Changing modify acls to: lenovo
17/11/21 11:43:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(lenovo); users with modify permissions: Set(lenovo)
17/11/21 11:43:59 INFO Utils: Successfully started service 'sparkDriver' on port 55824.
17/11/21 11:43:59 INFO Slf4jLogger: Slf4jLogger started
17/11/21 11:43:59 INFO Remoting: Starting remoting
17/11/21 11:43:59 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.114.67:55837]
17/11/21 11:43:59 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 55837.
17/11/21 11:43:59 INFO SparkEnv: Registering MapOutputTracker
17/11/21 11:43:59 INFO SparkEnv: Registering BlockManagerMaster
17/11/21 11:43:59 INFO DiskBlockManager: Created local directory at C:\Users\lenovo\AppData\Local\Temp\blockmgr-16383e3c-7cb6-43c7-b300-ccc1a1561bb4
17/11/21 11:43:59 INFO MemoryStore: MemoryStore started with capacity 1129.9 MB
17/11/21 11:44:00 INFO SparkEnv: Registering OutputCommitCoordinator
17/11/21 11:44:00 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/11/21 11:44:00 INFO SparkUI: Started SparkUI at http://192.168.114.67:4040
17/11/21 11:44:00 INFO Executor: Starting executor ID driver on host localhost
17/11/21 11:44:00 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55844.
17/11/21 11:44:00 INFO NettyBlockTransferService: Server created on 55844
17/11/21 11:44:00 INFO BlockManagerMaster: Trying to register BlockManager
17/11/21 11:44:00 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55844 with 1129.9 MB RAM, BlockManagerId(driver, localhost, 55844)
17/11/21 11:44:00 INFO BlockManagerMaster: Registered BlockManager
17/11/21 11:44:05 INFO SparkContext: Starting job: show at SparkConnMysql.scala:25
17/11/21 11:44:05 INFO DAGScheduler: Got job 0 (show at SparkConnMysql.scala:25) with 1 output partitions
17/11/21 11:44:05 INFO DAGScheduler: Final stage: ResultStage 0 (show at SparkConnMysql.scala:25)
17/11/21 11:44:05 INFO DAGScheduler: Parents of final stage: List()
17/11/21 11:44:05 INFO DAGScheduler: Missing parents: List()
17/11/21 11:44:05 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at show at SparkConnMysql.scala:25), which has no missing parents
17/11/21 11:44:06 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 5.2 KB, free 5.2 KB)
17/11/21 11:44:06 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.5 KB, free 7.7 KB)
17/11/21 11:44:06 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55844 (size: 2.5 KB, free: 1129.9 MB)
17/11/21 11:44:06 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
17/11/21 11:44:06 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at show at SparkConnMysql.scala:25)
17/11/21 11:44:06 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
17/11/21 11:44:06 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 1922 bytes)
17/11/21 11:44:06 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/11/21 11:44:06 INFO JDBCRDD: closed connection
17/11/21 11:44:06 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 3472 bytes result sent to driver
17/11/21 11:44:06 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 224 ms on localhost (1/1)
17/11/21 11:44:06 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/11/21 11:44:06 INFO DAGScheduler: ResultStage 0 (show at SparkConnMysql.scala:25) finished in 0.261 s
17/11/21 11:44:06 INFO DAGScheduler: Job 0 finished: show at SparkConnMysql.scala:25, took 1.467252 s
+---+----+----+------------+------------------+---------+-------+
| id|name| age|       phone|             email|startdate|enddate|
+---+----+----+------------+------------------+---------+-------+
| 11| 徐心三|  24|     2423424|    2423424@qq.com|     null|   null|
| 33| 徐心七|  23|    23232323|          13131@qe|     null|   null|
| 55|  徐彬|  22| 15262301036|徐彬757661238@ww.com|     null|   null|
| 44|  徐成|3333| 23423424332|    2342423@qq.com|     null|   null|
| 66| 徐心四|  23|242342342423|   徐彬23424@qq.com|     null|   null|
| 11| 徐心三|  24|     2423424|    2423424@qq.com|     null|   null|
| 33| 徐心七|  23|    23232323|          13131@qe|     null|   null|
| 55|  徐彬|  22| 15262301036|徐彬757661238@ww.com|     null|   null|
| 44|  徐成|3333| 23423424332|    2342423@qq.com|     null|   null|
| 66| 徐心四|  23|242342342423|   徐彬23424@qq.com|     null|   null|
| 88| 徐心八| 123|   131231312|       123123@qeqe|     null|   null|
| 99| 徐心二|  23|    13131313|   1313133@qeq.com|     null|   null|
|121| 徐心五|  13|   123131231|    1231312@qq.com|     null|   null|
|143| 徐心九|  23|      234234|       徐彬234@wrwr|     null|   null|
+---+----+----+------------+------------------+---------+-------+
only showing top 14 rows
17/11/21 11:44:06 INFO SparkContext: Invoking stop() from shutdown hook
17/11/21 11:44:06 INFO SparkUI: Stopped Spark web UI at http://192.168.114.67:4040
17/11/21 11:44:06 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/11/21 11:44:06 INFO MemoryStore: MemoryStore cleared
17/11/21 11:44:06 INFO BlockManager: BlockManager stopped
17/11/21 11:44:06 INFO BlockManagerMaster: BlockManagerMaster stopped
17/11/21 11:44:06 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/11/21 11:44:06 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
17/11/21 11:44:06 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
17/11/21 11:44:06 INFO SparkContext: Successfully stopped SparkContext
17/11/21 11:44:07 INFO ShutdownHookManager: Shutdown hook called
17/11/21 11:44:07 INFO ShutdownHookManager: Deleting directory C:\Users\lenovo\AppData\Local\Temp\spark-7877d903-f8f7-4efb-9e0c-7a11ac147153
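Writing results back to MySQL goes through the same JDBC channel. A minimal sketch under the same connection settings as above; the target table name user_backup is hypothetical:

```scala
import java.util.Properties

// Connection properties mirror the reader options used earlier
val props = new Properties()
props.put("user", "root")
props.put("password", "11111")
props.put("driver", "com.mysql.jdbc.Driver")

// DataFrameWriter.jdbc (Spark 1.6 API) writes the DataFrame to the given table;
// SaveMode "append" adds rows without dropping the existing table
df.write.mode("append").jdbc(url, "user_backup", props)
```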