sparksql实现多表关联查询
来源:互联网 发布:centos查看gcc 编辑:程序博客网 时间:2024/05/21 12:00
1.准备数据以json格式为例
student表
{"sid":"S001","sanme":"zhangsan","age":"12","gender":"female"}
{"sid":"S002","sanme":"lisi","age":"13","gender":"male"}
{"sid":"S003","sanme":"wangwu","age":"14","gender":"male"}{"sid":"S004","sanme":"zhaoliu","age":"15","gender":"female"}
course表
{"cid":"C001","cname":"football"}
{"cid":"C002","cname":"music"}
{"cid":"C003","cname":"art"}
student_course表
{"id":"1","sid":"S001","cid":"C001"}
{"id":"2","sid":"S002","cid":"C001"}
{"id":"3","sid":"S002","cid":"C002"}
{"id":"4","sid":"S003","cid":"C003"}
{"id":"5","sid":"S003","cid":"C001"}
{"id":"6","sid":"S004","cid":"C003"}
{"id":"7","sid":"S004","cid":"C002"}
2.打开spark客户端
命令:./spark-shell --master =local
这种创建的表是永久表,前后三个双引号是因为里面有字符串,这样可以实现转义,这些表是存在spark的一个内置数据库(derby数据库)当中。
sqlContext.sql("""create table student_course using org.apache.spark.sql.json options(path "hdfs://192.168.1.110:9000/data/sparksql/student_course.json")""")
sqlContext.sql("""create table course using org.apache.spark.sql.json options(path "hdfs://192.168.1.110:9000/data/sparksql/course.json")""")
sqlContext.sql("""create table student using org.apache.spark.sql.json options(path "hdfs://192.168.1.110:9000/data/sparksql/student.json")""")
查看表
sqlContext.sql("show tables").show()
实现关联查询:
sqlContext.sql("select * from course c,student s,student_course sc where c.cid=sc.cid and s.sid=sc.sid").show()
如下数据:
+----+--------+---+------+--------+----+----+---+----+
| cid| cname|age|gender| sanme| sid| cid| id| sid|
+----+--------+---+------+--------+----+----+---+----+
|C001|football| 12|female|zhangsan|S001|C001| 1|S001|
|C001|football| 13| male| lisi|S002|C001| 2|S002|
|C001|football| 14| male| wangwu|S003|C001| 5|S003|
|C002| music| 13| male| lisi|S002|C002| 3|S002|
|C003| art| 14| male| wangwu|S003|C003| 4|S003|
+----+--------+---+------+--------+----+----+---+----+
- sparksql实现多表关联查询
- Java实现SparkSQL查询Hive表数据
- 实现关联表查询
- 实现关联表查询
- Mybatis实现关联表查询(一对多关联)
- ORMlite多表关联查询实现
- 通过表关联实现查询
- MyBatis实现关联表查询
- Mybatis实现关联表查询
- MyBatis实现关联表查询
- mybatis-实现关联表查询
- mybatis实现关联表查询
- mybatis实现表关联查询
- MyBatis实现关联表查询
- MyBatis实现关联表查询
- MyBatis实现关联表查询
- 五、实现关联表查询
- 关联查询(多表查询)
- 分页JS
- cube上显示一张贴图
- 读取mongodb 显示日k 指定日期近500条 不能超过800条
- JS this 的使用 隐藏显示某段DIV
- 安卓之ANR
- sparksql实现多表关联查询
- 使用JDBC连接数据库
- mysql获取所有表的所有字段名
- 最大公约数-----欧几里德算法
- java多线程
- ECMAScript6 let 与 var
- Mac的AppStore里出现未安装app的更新提示
- 软件测试基础
- 寻路 A星