Spark Customization, Part 2: the DSL in start.scala
An implicit conversion on String makes the script much easier to write: "select * from testperson" hqlgo prints the query result directly, and "select * from testperson" hqlsaveto "somefile" writes it straight to a file:
def getRegisterString(rddname: String, classname: String,
                      tablename: String, tabledef: String): String = {
  // "name string,age int" -> Array(("name","String"), ("age","Int"))
  val members = tabledef.split(",")
    .map(_.trim.split(" ").filter("" != _))
    .map(x => (x(0).trim, x(1).trim.head.toString.toUpperCase + x(1).trim.tail))
  val classmemberdef = members.map(x => x._1 + ":" + x._2).mkString(",")
  val convertstr = members.map(x => x._2).zipWithIndex
    .map(x => "t(" + x._2 + ").to" + x._1).mkString(",")
  s"""
  case class ${classname}(${classmemberdef})
  val schemardd = ${rddname}.map(_.split("${FIELD_SEPERATOR}")).map(t => ${classname}(${convertstr}))
  hive.registerRDDAsTable(schemardd, "${tablename}")
  """
}

org.apache.spark.repl.Main.interp.command("""
class CommandTranslator(cmd: String) extends java.io.Serializable {

  def hqlgo()(implicit f: SchemaRDD => MySchemaRDD) = {
    lastrdd = hql(cmd)
    lastrdd.go()
  }

  def hqlsaveto(output: String)(implicit f: SchemaRDD => MySchemaRDD) = {
    lastrdd = hql(cmd)
    lastrdd.saveto(output)
  }

  def defineas(tabledef: String) = {
    if (tabledef != "") {
      org.apache.spark.repl.Main.interp.command(
        getRegisterString(cmd, cmd.toUpperCase, cmd, tabledef))
    } else {
      // note: the s-interpolator is required here, otherwise ${cmd} is passed literally
      org.apache.spark.repl.Main.interp.command(
        s"hive.registerRDDAsTable(${cmd},\"${cmd}\")")
    }
  }

  def from(filepath: String) {
    if (cmd.startsWith("create table ")) {
      val tablename = cmd.substring(13).trim().split(" ")(0)
      val leftstr = cmd.substring(13).trim().substring(tablename.length).trim()
      val tabledef = leftstr.substring(1, leftstr.length - 1).trim()
      val realfile = AutoFileUtil.regularFile(filepath)
      org.apache.spark.repl.Main.interp.command(
        "val " + tablename + " = sc.textFile(\"" + realfile + "\")")
      new CommandTranslator(tablename).defineas(tabledef)
    } else {
      println("usage:")
      println("\"create table sometablename (field1 string,field2 int...)\" from \"somefile or hdfs:somepath\"")
    }
  }
}

object CommandTranslator {
  implicit def stringToTranslator(cmd: String) = new CommandTranslator(cmd)

  def show(tabledata: Array[org.apache.spark.sql.Row]) =
    tabledata.foreach(x => println(x.mkString("\t")))
}
""")

def auto = CommandTranslator
import CommandTranslator._

def help = {
  println("""example:
    "select * from testperson" hqlgo
    "select * from testperson" hqlsaveto "hdfs://somedir"
    "select * from testperson" hqlsaveto "somelocalfile"
    "create table sometable (name string,age int,weight double)" from "hdfs:/test/testperson"
    auto show hqlresult
    "somerdddata" defineas "(name string,age int)"
  if you want to see the help of environment, please type :help""")
}
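The whole trick rests on one Scala feature: an implicit def that wraps a plain String in a class carrying the DSL methods. Here is a minimal, Spark-free sketch of that pattern; all names in it (MiniDsl, QueryWord, go, saveto) are hypothetical stand-ins for CommandTranslator, hqlgo, and hqlsaveto, and the methods just build strings instead of running HQL:

```scala
import scala.language.implicitConversions

// Minimal sketch: the implicit def plays the role of stringToTranslator,
// so a bare string literal gains DSL methods.
object MiniDsl {
  class QueryWord(cmd: String) {
    def go: String = "running: " + cmd          // stands in for hqlgo
    def saveto(output: String): String =        // stands in for hqlsaveto
      "saving [" + cmd + "] to " + output
  }

  implicit def stringToQueryWord(cmd: String): QueryWord = new QueryWord(cmd)

  def main(args: Array[String]): Unit = {
    // Reads like the shell-style commands above:
    println("select * from testperson".go)
    println("select * from testperson" saveto "somefile")
  }
}
```

Because the conversion is in scope, the compiler rewrites "…" saveto "somefile" into stringToQueryWord("…").saveto("somefile"), which is exactly how the bare string literals in the real script pick up hqlgo and hqlsaveto.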
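The table-definition parsing inside getRegisterString can also be pulled out into a standalone function to see what it produces. This sketch (TableDefParser is a hypothetical name, not part of start.scala) reproduces the same split/trim/capitalize pipeline:

```scala
// Standalone sketch of the parsing step inside getRegisterString:
// "name string,age int" -> (field, Type) pairs with the type name
// capitalized, matching the generated case-class fields and t(i).toType calls.
object TableDefParser {
  def parse(tabledef: String): List[(String, String)] =
    tabledef.split(",").toList
      .map(_.trim.split(" ").filter(_ != ""))
      .map(x => (x(0).trim, x(1).trim.head.toString.toUpperCase + x(1).trim.tail))

  def main(args: Array[String]): Unit = {
    val members = parse("name string, age int, weight double")
    // prints name:String,age:Int,weight:Double
    println(members.map { case (n, t) => n + ":" + t }.mkString(","))
  }
}
```

For "name string,age int,weight double" this yields the member list name:String,age:Int,weight:Double, which getRegisterString then splices into the generated case class and the t(0).toString, t(1).toInt, t(2).toDouble conversion calls.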