Submitting a Spark Application Programmatically
refer: http://m.blog.chinaunix.net/uid-26733228-id-5301647.html
When working with Spark, you inevitably end up writing Spark programs for data analysis. According to the Spark documentation, applications are submitted with the spark-submit script, but in practice you often need to submit an application from program code instead.
Searching the relevant documentation turns up the following example:
import java.io.IOException;
import java.io.InputStream;

import org.apache.spark.launcher.SparkLauncher;

public class MyLanucher {
    public static void main(String[] args) throws IOException, InterruptedException {
        SparkLauncher launcher = new SparkLauncher();
        launcher.setAppResource("count.jar"); // the Spark application jar to launch
        launcher.setMainClass("JavaNetworkWordCount");
        launcher.addAppArgs(args);
        launcher.setMaster("yarn-cluster"); // launch on yarn-cluster; local also works
        launcher.setConf(SparkLauncher.DRIVER_MEMORY, "512m");
        launcher.setConf(SparkLauncher.EXECUTOR_MEMORY, "512m");
        launcher.setConf(SparkLauncher.EXECUTOR_CORES, "4");

        Process process = launcher.launch();
        InputStream stdInput = process.getInputStream();
        InputStream errInput = process.getErrorStream();

        // Read the driver process's output. Draining the streams before
        // waitFor() avoids hanging when the child fills its pipe buffer;
        // for a chatty long-running app, stdout and stderr should ideally
        // be drained on separate threads.
        System.out.println("---------------- read msg -----------------");
        dumpInput(stdInput);
        System.out.println("-------------- read err msg ---------------");
        dumpInput(errInput);
        process.waitFor();
        System.out.println("launcher over");
    }

    private static void dumpInput(InputStream input) throws IOException {
        byte[] buff = new byte[1024];
        while (true) {
            int len = input.read(buff);
            if (len < 0) {
                break;
            }
            System.out.println(new String(buff, 0, len));
        }
    }
}
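A caveat with this launch pattern: if the parent reads the child's stdout and stderr one after the other (or only after waitFor()), the child can block once a pipe buffer fills up. A minimal self-contained sketch of the safer pattern, draining each stream on its own background thread, is shown below. It uses a plain ProcessBuilder running `echo` purely for illustration, since SparkLauncher needs the Spark jars on the classpath; the class and method names here are my own, not part of any Spark API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ProcessDrain {
    // Copy an InputStream into a sink on a background thread so the
    // child process never blocks on a full pipe buffer.
    static Thread drain(InputStream in, ByteArrayOutputStream sink) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[1024];
            int len;
            try {
                while ((len = in.read(buf)) >= 0) {
                    sink.write(buf, 0, len);
                }
            } catch (IOException ignored) {
                // stream closed when the child exits
            }
        });
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for launcher.launch(): any child process works.
        Process p = new ProcessBuilder("echo", "hello").start();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayOutputStream err = new ByteArrayOutputStream();
        Thread tOut = drain(p.getInputStream(), out);
        Thread tErr = drain(p.getErrorStream(), err);
        int exit = p.waitFor(); // safe: both streams are being drained
        tOut.join();
        tErr.join();
        System.out.println("exit=" + exit + " stdout=" + out.toString().trim());
    }
}
```

The same two drain threads can wrap the Process returned by launcher.launch() without any other changes.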
To compile and run this code:
1. Set the environment variable SPARK_HOME=/home/longlong/workspace/spark/spark-1.5.0-bin-hadoop2.6
2. Set the environment variable YARN_CONF_DIR=/home/longlong/workspace/hadoop-2.7.1/etc/hadoop
3. Compile: javac -cp spark-assembly-1.5.0-hadoop2.6.0.jar MyLanucher.java
4. Package: jar -cf launcher.jar MyLanucher.class
5. Launch: java -cp spark-assembly-1.5.0-hadoop2.6.0.jar:launcher.jar MyLanucher ip port
The code of the word-count application is as follows:
import java.util.Arrays;
import java.util.regex.Pattern;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.StorageLevels;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

import scala.Tuple2;

public final class JavaNetworkWordCount {
    private static final Pattern SPACE = Pattern.compile(" ");

    @SuppressWarnings({ "serial", "resource" })
    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("Usage: JavaNetworkWordCount <hostname> <port>");
            System.exit(1);
        }

        // Create the context with a 1 second batch size
        SparkConf sparkConf = new SparkConf().setAppName("JavaNetworkWordCount");
        JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));

        // Create a JavaReceiverInputDStream on target ip:port and count the
        // words in input stream of \n delimited text (eg. generated by 'nc')
        // Note that no duplication in storage level only for running locally.
        // Replication necessary in distributed scenario for fault tolerance.
        JavaReceiverInputDStream<String> lines = ssc.socketTextStream(args[0], Integer.parseInt(args[1]),
                StorageLevels.MEMORY_AND_DISK_SER);
        JavaDStream<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            @Override
            public Iterable<String> call(String x) {
                return Arrays.asList(SPACE.split(x));
            }
        });
        JavaPairDStream<String, Integer> wordCounts = words.mapToPair(new PairFunction<String, String, Integer>() {
            @Override
            public Tuple2<String, Integer> call(String s) {
                return new Tuple2<String, Integer>(s, 1);
            }
        }).reduceByKey(new Function2<Integer, Integer, Integer>() {
            @Override
            public Integer call(Integer i1, Integer i2) {
                return i1 + i2;
            }
        });
        wordCounts.print();
        ssc.start();
        ssc.awaitTermination();
    }
}
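The transformation chain above is just the classic word count: split each line into words (flatMap), pair each word with 1 (mapToPair), and sum per key (reduceByKey). A plain-Java sketch of the same logic on a single batch of lines, using java.util.stream instead of Spark DStreams purely for illustration (WordCount and count are hypothetical names, not part of the Spark example):

```java
import java.util.Arrays;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WordCount {
    private static final Pattern SPACE = Pattern.compile(" ");

    // Same flatMap -> map-to-pair -> reduce-by-key shape as the
    // streaming job, applied to one in-memory batch of lines.
    static Map<String, Integer> count(String[] lines) {
        return Arrays.stream(lines)
                .flatMap(line -> Arrays.stream(SPACE.split(line)))        // flatMap
                .collect(Collectors.toMap(w -> w, w -> 1, Integer::sum)); // reduceByKey
    }

    public static void main(String[] args) {
        System.out.println(count(new String[] { "a b", "b c" }));
    }
}
```

In the Spark version the same computation simply runs once per one-second micro-batch, with the shuffle for reduceByKey distributed across executors.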
This example comes from the examples bundled with Spark.