Spark SQL
Source: Internet · Editor: 程序博客网 · Date: 2024/06/17 02:29
package org.apache.spark.sql

import org.apache.spark.{SparkConf, SparkContext}

object SLA_parquetSQL {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SLA Filter"))
    // SQLContext lives in this package (org.apache.spark.sql), so no extra import is needed.
    val sqlContext = new SQLContext(sc)
    val suffix = args(0)

    // Register the three Parquet-backed event tables (exported, shipped-and-closed,
    // canceled-and-closed) as temporary tables, keyed by the suffix passed as args(0).
    sqlContext.parquetFile("/user/hive/warehouse/sla_parquet.db/e60001_shipment_exported_" + suffix)
      .registerTempTable("e60001_shipment_exported")
    sqlContext.parquetFile("/user/hive/warehouse/sla_parquet.db/e62005_shipment_shipped_and_closed_" + suffix)
      .registerTempTable("e62005_shipment_shipped_and_closed")
    sqlContext.parquetFile("/user/hive/warehouse/sla_parquet.db/e62006_shipment_canceled_and_closed_" + suffix)
      .registerTempTable("e62006_shipment_canceled_and_closed")

    // For each event type, key rows by order number and truncate the event time
    // to 19 characters ("yyyy-MM-dd HH:mm:ss"), dropping fractional seconds.
    val e60001_shipment_exported =
      sqlContext.sql("select ordernumber, type_id, event_time from e60001_shipment_exported")
        .map(line => (line(0), (line(1).toString, line(2).toString.substring(0, 19))))
    val e62005_shipment_shipped_and_closed =
      sqlContext.sql("select ordernumber, type_id, event_time from e62005_shipment_shipped_and_closed")
        .map(line => (line(0), (line(1).toString, line(2).toString.substring(0, 19))))
    val e62006_shipment_canceled_and_closed =
      sqlContext.sql("select ordernumber, type_id, event_time from e62006_shipment_canceled_and_closed")
        .map(line => (line(0), (line(1).toString, line(2).toString.substring(0, 19))))

    // Union the three event streams, group all events per order number,
    // keep only the groups that pass the SLA filter, and write the flattened
    // result to the output path given as args(1).
    val un = e60001_shipment_exported
      .union(e62005_shipment_shipped_and_closed)
      .union(e62006_shipment_canceled_and_closed)
    un.groupByKey
      .filter(kv => FilterSLA.filterSLA(kv._2.toSeq))
      .map(kv => kv._1 + "\t" + Utils.flatValues(kv._2.toSeq))
      .saveAsTextFile(args(1))
  }
}
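The core of the job is the per-event mapping — each row becomes a key/value pair of (ordernumber, (type_id, event_time truncated to seconds)) — followed by a union and a group-by-key. A minimal sketch of that transformation on plain Scala collections (the object name, sample order numbers, and sample timestamps are made up for illustration; `groupBy` stands in for the RDD's `groupByKey`):

```scala
object SlaMappingSketch {
  // Each row mirrors the SQL projection: (ordernumber, type_id, event_time).
  // event_time may carry fractional seconds that substring(0, 19) strips off.
  type Row = (String, String, String)

  // Mirror of the map step: key by order number, truncate the timestamp
  // to "yyyy-MM-dd HH:mm:ss" (the first 19 characters).
  def keyByOrder(rows: Seq[Row]): Seq[(String, (String, String))] =
    rows.map { case (order, typeId, time) =>
      (order, (typeId, time.substring(0, 19)))
    }

  def main(args: Array[String]): Unit = {
    val exported = Seq(("SO-1", "E60001", "2015-11-20 08:15:00.123"))
    val shipped  = Seq(("SO-1", "E62005", "2015-11-21 17:02:30.456"))

    // ++ plus groupBy plays the role of RDD.union followed by groupByKey:
    // all events for one order number end up in a single group.
    val grouped = (keyByOrder(exported) ++ keyByOrder(shipped))
      .groupBy(_._1)
      .mapValues(_.map(_._2))

    println(grouped("SO-1"))
    // → Seq((E60001,2015-11-20 08:15:00), (E62005,2015-11-21 17:02:30))
  }
}
```

Once grouped this way, each order's full event history is available as one sequence, which is what `FilterSLA.filterSLA` receives in the job above.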
This article originally appeared on the "点滴积累" blog; please retain this attribution: http://tianxingzhe.blog.51cto.com/3390077/1718636