Accessing a PostgreSQL database from Spark via JDBC
Source: Internet  Editor: 程序博客网  Time: 2024/04/29 15:15
1. Make sure a usable JDBC driver is available
2. Put the downloaded jar file under $SPARK_HOME/lib
3. Start spark-shell
4. Create a DataFrame object
5. View the schema
6. A simple computation
[hadoop@db1 bin]$ locate jdbc|grep postgres
/mnt/hd01/www/html/deltasql/clients/java/dbredactor/lib/postgresql-8.2-507.jdbc4.jar
/usr/lib/ruby/gems/1.8/gems/railties-3.2.13/lib/rails/generators/rails/app/templates/config/databases/jdbcpostgresql.yml
/usr/src/postgis-2.0.0/java/jdbc/src/org/postgresql
/usr/src/postgis-2.0.0/java/jdbc/src/org/postgresql/driverconfig.properties
/usr/src/postgis-2.0.0/java/jdbc/stubs/org/postgresql
/usr/src/postgis-2.0.0/java/jdbc/stubs/org/postgresql/Connection.java
/usr/src/postgis-2.0.0/java/jdbc/stubs/org/postgresql/PGConnection.java
/usr/src/postgis-2.1.0/java/jdbc/src/org/postgresql
/usr/src/postgis-2.1.0/java/jdbc/src/org/postgresql/driverconfig.properties
/usr/src/postgis-2.1.0/java/jdbc/stubs/org/postgresql
/usr/src/postgis-2.1.0/java/jdbc/stubs/org/postgresql/Connection.java
/usr/src/postgis-2.1.0/java/jdbc/stubs/org/postgresql/PGConnection.java

If none of these is suitable, download the driver from the official website: https://jdbc.postgresql.org/download/postgresql-9.4-1205.jdbc4.jar
2. Put the downloaded jar file under $SPARK_HOME/lib
3. Start spark-shell
[hadoop@db1 bin]$ SPARK_CLASSPATH=$SPARK_HOME/lib/postgresql-9.4-1205.jdbc4.jar $SPARK_HOME/bin/spark-shell
...
Please instead use:
 - ./spark-submit with --driver-class-path to augment the driver classpath
 - spark.executor.extraClassPath to augment the executor classpath
15/11/04 17:53:05 WARN SparkConf: Setting 'spark.executor.extraClassPath' to '/usr/src/data-integration/lib/postgresql-9.3-1102-jdbc4.jar' as a work-around.
15/11/04 17:53:05 WARN SparkConf: Setting 'spark.driver.extraClassPath' to '/usr/src/data-integration/lib/postgresql-9.3-1102-jdbc4.jar' as a work-around.
15/11/04 17:53:06 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
15/11/04 17:53:07 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
Spark context available as sc.
15/11/04 17:53:09 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/04 17:53:09 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/04 17:53:25 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/04 17:53:25 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/11/04 17:53:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/04 17:53:29 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/04 17:53:29 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
SQL context available as sqlContext.
4. Create a DataFrame object
scala> val df = sqlContext.load("jdbc", Map("url" -> "jdbc:postgresql://localhost:5434/cd03?user=cd03&password=cd03", "dbtable" -> "test_trans"))
warning: there were 1 deprecation warning(s); re-run with -deprecation for details
df: org.apache.spark.sql.DataFrame = [trans_date: string, trans_prd: int, trans_cust: int]

Standard form (sqlContext.load is deprecated, which is what triggers the warning above):
val jdbcDF = sqlContext.read.format("jdbc").options( Map("url" -> "jdbc:postgresql:dbserver", "dbtable" -> "schema.tablename")).load()
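As an alternative to embedding the credentials in the URL, the same table can probably be loaded with `sqlContext.read.jdbc`, which takes the connection properties separately. The following is a minimal sketch for the same spark-shell session, assuming Spark 1.4 or later and the host, port, database, and account shown above; the name `transDF` is hypothetical and only used for illustration.

```scala
// Run inside spark-shell, where sqlContext is already defined.
// Assumption: the PostgreSQL driver jar is on the classpath, for example by
// starting the shell with --driver-class-path as the startup warning suggests.
import java.util.Properties

// Connection details copied from the session above.
val url = "jdbc:postgresql://localhost:5434/cd03"

// Keep the credentials out of the URL by passing them as properties.
val props = new Properties()
props.setProperty("user", "cd03")
props.setProperty("password", "cd03")

// transDF is a hypothetical name; it should hold the same data as df above.
val transDF = sqlContext.read.jdbc(url, "test_trans", props)
```

Passing the credentials as properties keeps them out of any place where the URL itself might be logged or displayed.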
5. View the schema
scala> df.printSchema()
root
 |-- trans_date: string (nullable = true)
 |-- trans_prd: integer (nullable = true)
 |-- trans_cust: integer (nullable = true)
6. A simple computation
scala> df.filter(df("trans_cust")>9999999).select("trans_date","trans_prd").show
+----------+---------+
|trans_date|trans_prd|
+----------+---------+
| 2015-5-20|     2007|
| 2015-7-24|     5638|
| 2015-5-19|     8182|
| 2015-2-24|    11391|
| 2015-8-13|    17341|
| 2015-2-22|    10996|
| 2015-1-17|    15284|
|  2015-1-8|    16090|
| 2015-1-25|    13528|
| 2015-1-17|     9498|
| 2015-9-25|     7235|
| 2015-8-19|     4084|
| 2015-4-24|    16637|
| 2015-5-27|    13829|
| 2015-0-13|    13956|
| 2015-3-19|    11974|
| 2015-10-5|     1185|
| 2015-3-28|     9412|
| 2015-6-13|    15203|
| 2015-2-14|    10087|
+----------+---------+
only showing top 20 rows
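The same filter can also be written as a SQL query once the DataFrame is registered as a temporary table. This is a sketch for the Spark 1.x shell used above, assuming `df` is the DataFrame created in step 4; the table name `trans` and the value name `bigCust` are hypothetical.

```scala
// Run inside spark-shell, where sqlContext and df (from step 4) already exist.
// Register the DataFrame under a temporary table name ("trans" is hypothetical).
df.registerTempTable("trans")

// Equivalent of df.filter(df("trans_cust") > 9999999).select("trans_date", "trans_prd")
val bigCust = sqlContext.sql(
  "SELECT trans_date, trans_prd FROM trans WHERE trans_cust > 9999999")

// Prints the first 20 rows, like .show in the DataFrame version.
bigCust.show()
```

Note that `registerTempTable` belongs to the Spark 1.x API; Spark 2.0 and later deprecate it in favor of `createOrReplaceTempView`.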