Eclipse / IDEA + Maven + Scala + Spark setup: successfully running a SparkContext
It took several days of fiddling to get Eclipse into a usable state, with all kinds of errors along the way and real schedule slippage. The archetype and parts of the pom are borrowed from others; after so much debugging I no longer have the original authors' links, so apologies and thanks to them.
Environment:
Hadoop –> 2.6.4
Scala –> 2.11.8
Spark –> 2.2.0
IDEs:
Eclipse EE + Scala IDE plugin –> Oxygen: the pom reports errors, but it is usable.
Scala IDE –> 4.7-RC: currently both local and cluster Spark runs work.
IDEA –> still has some issues; it runs, but not perfectly. Notes at the end; it behaves much like Eclipse EE.
Notes:
1. Keep versions consistent. The Scala and Spark versions must match, otherwise you may see errors such as `scala.Product$class` not found, ClassNotFoundException, and many others with no obvious cause.
For example: `spark.version`, `scala.version` and `scala.binary.version` in the pom below must match each other and the version in the Scala Library Container.
2. Once the scala-maven-plugin is declared in the pom, there is no need for an explicit Scala dependency unless you have special requirements; also watch the version of the imported Scala Library Container.
3. Make good use of Maven's Update Project, Project > Clean, and the functions under Configure in the project's right-click menu.
4. This is a basic setup; some problems remain unsolved. If you use Eclipse, prefer the Scala IDE build. When IDEA works, editing Scala in it is pleasant, except that Tab-jumping over brackets and quotes feels awkward.
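When chasing the version-mismatch errors from note 1, it helps to print the Scala version the program is actually running against and compare it with `scala.version` in the pom. A quick check using only the standard library:

```scala
// Prints the runtime Scala version, e.g. "2.11.8".
// It should match <scala.version> in pom.xml and the Scala Library Container.
object VersionCheck {
  def main(args: Array[String]): Unit = {
    println(scala.util.Properties.versionNumberString)
  }
}
```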
Eclipse part:
The following applies to both Eclipse EE and Scala IDE; differences are noted where relevant.
Create a Maven Project,
Next
http://repo1.maven.org/maven2/archetype-catalog.xml
After creating the project from the remote archetype, rename
src/main/java –> src/main/scala
src/test/java –> src/test/scala
Then edit pom.xml. The remote archetype above auto-generates its own pom; it is not used here.
```xml
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <spark.version>2.2.0</spark.version>
    <scala.version>2.11.8</scala.version>
    <scala.binary.version>2.11</scala.binary.version>
    <hadoop.version>2.6.4</hadoop.version>
</properties>

<dependencies>
    <!-- ================================== spark ================================ -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <!-- ================================== hadoop ================================ -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <!-- ================================== other ================================ -->
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
    </dependency>
</dependencies>

<!-- Official Maven repos: http://repo1.maven.org/maven2/ or http://repo2.maven.org/maven2/ (lower latency) -->
<repositories>
    <repository>
        <id>central</id>
        <name>Maven Repository Switchboard</name>
        <layout>default</layout>
        <url>http://repo2.maven.org/maven2</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
</repositories>

<build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
        <plugin>
            <!-- JDK version used by the Maven compiler -->
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.7.0</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
                <encoding>UTF-8</encoding>
            </configuration>
        </plugin>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>3.3.1</version>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>testCompile</goal>
                    </goals>
                    <configuration>
                        <args>
                            <arg>-make:transitive</arg>
                            <arg>-dependencyfile</arg>
                        </args>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
```

If running fails with a "-make:transitive" error, comment out that `<arg>`. Optionally, also add to `dependencies`:

```xml
<!-- https://mvnrepository.com/artifact/org.specs2/specs2-junit_2.11 -->
<dependency>
    <groupId>org.specs2</groupId>
    <artifactId>specs2-junit_2.11</artifactId>
    <version>3.9.4</version>
    <scope>test</scope>
</dependency>
```
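Note that `spark-core` and `spark-streaming` above use `<scope>provided</scope>`, which is appropriate for jars submitted with spark-submit (the cluster supplies Spark) but can cause `NoClassDefFoundError` when the job is launched directly inside the IDE. One common workaround, sketched here as an assumption about your workflow rather than part of the original setup, is to parameterize the scope via a Maven profile:

```xml
<!-- Sketch (not in the original pom): the default build keeps `provided`
     for cluster submission; `mvn -Plocal-run ...` flips the Spark jars to
     `compile` scope so the job can run inside the IDE. -->
<properties>
    <spark.scope>provided</spark.scope>
</properties>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
    <scope>${spark.scope}</scope>
</dependency>

<profiles>
    <profile>
        <id>local-run</id>
        <properties>
            <spark.scope>compile</spark.scope>
        </properties>
    </profile>
</profiles>
```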
Notes:
1. In Eclipse EE, errors are still reported after installing the Scala IDE plugin from the Marketplace; ignore them, the setup works.
2. Make sure the dependency versions in the pom match the versions you actually have installed.
3. Use Maven > Update Project from the project's right-click menu, plus Project > Clean and Build Automatically from the Eclipse menu bar, as needed; experiment.
4. After Maven > Update Project successfully picks up the Scala plugin configuration in the pom, a Scala entry appears in the project's right-click menu. If Update Project has no effect in Eclipse EE, use Configure > "Add Scala Nature" in the right-click menu to add the Scala library container to the project.
5. After installing the Scala IDE plugin, or in the latest Scala IDE, the three (built-in) versions shown in the figure are offered by default; to add your own Scala version, go to Window > Preferences > …
6. To change the Scala version: right-click the Scala Library Container > Build Path > Configure Build Path > Libraries tab, remove the current version, then Add Library > Scala Library and select the version you just added.
Finally, create an object and test-run it. Note that in Eclipse EE I had to raise the heap memory when running a Spark job, in Run Configurations… under the corresponding Scala Application entry (e.g. `-Xmx1g` in the VM arguments).
Test code:
```scala
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object a {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    //conf.setMaster("spark://mini2:7077")
    conf.setMaster("local[4]")
    conf.setAppName("test")
    val sc = new SparkContext(conf)
    val a = sc.parallelize(List(1, 2, 3), 2)
    println(a.count()) // 3
    sc.stop()
  }
}
```
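The RDD operations in the test mirror Scala's ordinary collection API, so the job's logic can be sanity-checked without starting a SparkContext at all. A plain-Scala analogue of the job above, using only the standard library:

```scala
// Plain-Scala analogue of the Spark test: the same operations exist on
// ordinary collections, so the logic can be checked without a cluster.
object LocalCheck {
  def main(args: Array[String]): Unit = {
    val data = List(1, 2, 3)      // what sc.parallelize(...) would distribute
    println(data.size)            // 3, matches rdd.count()
    println(data.map(_ * 2).sum)  // 12, would be rdd.map(_ * 2).sum() on an RDD
  }
}
```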
IDEA part:
1. Create a Maven project, then rename
src/main/java –> src/main/scala
src/test/java –> src/test/scala
2. Edit the pom; the configuration differs somewhat from the Eclipse one above.
I tried it: the two pom fragments are not interchangeable; at least the Eclipse one does not work in this IDEA setup. If you understand what each setting means this is straightforward; otherwise just copy the one below, which at least runs.
```xml
<properties>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <scala.version>2.11.8</scala.version>
    <spark.version>2.2.0</spark.version>
    <hadoop.version>2.6.4</hadoop.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
</dependencies>

<build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>3.2.2</version>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>testCompile</goal>
                    </goals>
                    <configuration>
                        <args>
                            <!--<arg>-make:transitive</arg>-->
                            <arg>-dependencyfile</arg>
                            <arg>${project.build.directory}/.scala_dependencies</arg>
                        </args>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.1.0</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <filters>
                            <filter>
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
```
===> I hit the -make:transitive problem here too; comment out that line. It runs even without the junit dependency.
3. In Project Structure > Global Libraries, set the project's Scala version, similar to adding it in Eclipse; see the Eclipse part.
4. Make good use of the refresh button in the top-left corner of the Maven Projects side panel and of Build / Rebuild in the project's right-click menu.
5. When running, I hit the same out-of-memory error as in Eclipse EE; set the memory in Run Configurations…, as shown in the figure.
In the Eclipse + plugin setup I found no equivalent global setting, so it has to be set per run configuration;
Scala IDE has no memory problem.
6. Change the Scala compile order to JavaThenScala.
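IDEA's JavaThenScala switch only affects IDEA's internal builder; the Maven build's order is controlled by plugin phase binding. For mixed Java/Scala sources, the pattern commonly recommended in the scala-maven-plugin documentation (not part of the original pom, shown here as a sketch) binds the Scala compile goals to an early phase so scalac, which understands Java signatures, runs before javac:

```xml
<!-- Sketch: run scalac before javac for mixed Java/Scala trees. -->
<plugin>
    <groupId>net.alchim31.maven</groupId>
    <artifactId>scala-maven-plugin</artifactId>
    <version>3.2.2</version>
    <executions>
        <execution>
            <id>scala-compile-first</id>
            <phase>process-resources</phase>
            <goals>
                <goal>add-source</goal>
                <goal>compile</goal>
            </goals>
        </execution>
        <execution>
            <id>scala-test-compile</id>
            <phase>process-test-resources</phase>
            <goals>
                <goal>testCompile</goal>
            </goals>
        </execution>
    </executions>
</plugin>
```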