Connecting Spark to MongoDB
Source: Internet  Editor: 程序博客网  Time: 2024/05/29 12:35
Connector between Hadoop and MongoDB
<dependency>
<groupId>org.mongodb.mongo-hadoop</groupId>
<artifactId>mongo-hadoop-core</artifactId>
<version>1.4.2</version>
</dependency>
Java driver for connecting to MongoDB
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongo-java-driver</artifactId>
<version>2.13.0</version>
</dependency>
2. Usage example
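The example in this section maps each MongoDB document to a `text` object, but the post does not include that class. The following is a minimal sketch of what it might look like, assuming it is a plain value class whose constructor takes a title, a date, and a list of paragraphs (the field names, getters, and `Serializable` marker are assumptions, not taken from the original). Implementing `Serializable` is a common precaution because Spark serializes RDD elements when they are shuffled or collected to the driver.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

/**
 * Hypothetical stand-in for the `text` type used by ConnectMongo.
 * The lowercase name is kept only to match the listing; Java convention
 * would normally call this class `Text`.
 */
class text implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String title;
    private final Date date;
    private final List<String> paragraph;

    public text(String title, Date date, List<String> paragraph) {
        this.title = title;
        // Defensive copies so this object does not share mutable state with the caller
        this.date = (date == null) ? null : new Date(date.getTime());
        this.paragraph = (paragraph == null)
                ? new ArrayList<String>()
                : new ArrayList<String>(paragraph);
    }

    public String getTitle() {
        return title;
    }

    public Date getDate() {
        return (date == null) ? null : new Date(date.getTime());
    }

    public List<String> getParagraph() {
        return new ArrayList<String>(paragraph);
    }

    @Override
    public String toString() {
        return "text{title=" + title + ", date=" + date
                + ", paragraphs=" + paragraph.size() + "}";
    }
}
```

With a class like this in the same package, the `new text(title, date, paragraph)` call in the example compiles as written.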
import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.MongoOutputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.bson.BSONObject;
import scala.Tuple2;

import java.util.Date;
import java.util.List;

/**
 * Created by Administrator on 2015/12/8.
 */
public class ConnectMongo {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext("local", "test");
        Configuration config = new Configuration();
        // URI format: host:port/databaseName.collectionName
        config.set("mongo.input.uri", "mongodb://127.0.0.1:27017/lang.sanlu");
        config.set("mongo.output.uri", "mongodb://127.0.0.1:27017/lang.output");

        // Read the input collection as (key, document) pairs
        JavaPairRDD<Object, BSONObject> mongoRDD = sc.newAPIHadoopRDD(
                config, MongoInputFormat.class, Object.class, BSONObject.class);

        // BSONObject -> text (text is a user-defined class built from title, date, paragraph)
        JavaRDD<text> result = mongoRDD.map(
                new Function<Tuple2<Object, BSONObject>, text>() {
                    @SuppressWarnings("unchecked")
                    public text call(Tuple2<Object, BSONObject> v1) throws Exception {
                        String title = (String) v1._2().get("title");
                        Date date = (Date) v1._2().get("date");
                        List<String> paragraph = (List<String>) v1._2().get("paragraph");
                        return new text(title, date, paragraph);
                    }
                }
        );

        // Copy lang.sanlu to lang.output
        mongoRDD.saveAsNewAPIHadoopFile("file:///copy", Object.class, Object.class,
                MongoOutputFormat.class, config);
    }
}
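The comment in the example describes the URI layout as host:port/database.collection. To make that layout concrete, here is a small illustrative helper that splits such a string into its four parts. It is only a sketch: `MongoNamespaceDemo` and `split` are hypothetical names, this is not how mongo-hadoop parses its configuration, and it assumes a single host with no credentials or query options.

```java
/**
 * Illustrative helper (not part of mongo-hadoop): splits a URI of the form
 * mongodb://host:port/database.collection into its four parts.
 * Assumes one host, an explicit port, no credentials, and no query options.
 */
class MongoNamespaceDemo {

    /** Returns {host, port, database, collection}. */
    static String[] split(String uri) {
        String prefix = "mongodb://";
        if (!uri.startsWith(prefix)) {
            throw new IllegalArgumentException("expected a mongodb:// URI: " + uri);
        }
        String rest = uri.substring(prefix.length());   // e.g. 127.0.0.1:27017/lang.sanlu
        int slash = rest.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("missing /database.collection: " + uri);
        }
        String hostPort = rest.substring(0, slash);     // e.g. 127.0.0.1:27017
        String namespace = rest.substring(slash + 1);   // e.g. lang.sanlu

        int colon = hostPort.lastIndexOf(':');
        // The first dot separates the database from the collection;
        // the collection name itself may contain further dots.
        int dot = namespace.indexOf('.');
        if (colon < 0 || dot < 0) {
            throw new IllegalArgumentException("expected host:port/database.collection: " + uri);
        }
        return new String[] {
            hostPort.substring(0, colon),
            hostPort.substring(colon + 1),
            namespace.substring(0, dot),
            namespace.substring(dot + 1)
        };
    }

    public static void main(String[] args) {
        String[] parts = split("mongodb://127.0.0.1:27017/lang.sanlu");
        // prints: host=127.0.0.1 port=27017 database=lang collection=sanlu
        System.out.println("host=" + parts[0] + " port=" + parts[1]
                + " database=" + parts[2] + " collection=" + parts[3]);
    }
}
```

Applied to the two URIs in the example, this shows that the job reads from the `sanlu` collection and writes to the `output` collection, both in the `lang` database on the local MongoDB instance.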