Getting Started with Spark (4): Scala Primer (1)
Source: Internet | Editor: 程序博客网 | Time: 2024/05/17 09:28
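Differences between Scala methods and functions
Note: a method's return type can be omitted because the compiler infers it, but a recursive method must declare its return type explicitly.
// Define a method whose parameter is a function of type (Int, Int) => Int
def m2(f: (Int, Int) => Int) = f(2, 6)
// Define a function
val f2 = (x: Int, y: Int) => x - y
val a = m2(f2)
println("the result is: " + a)
// Convert the method into a function (the trailing "_" turns the method m2 into a function value)
val f1 = m2 _
println(f1(f2))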
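The note about recursion can be illustrated with a minimal sketch. The names `MethodVsFunction`, `factorial`, `add`, and `addFn` are hypothetical and chosen only for this example; they do not appear in the original post.

```scala
object MethodVsFunction {
  // Recursive method: the return type (Int) has to be written out,
  // otherwise the compiler reports an error at the recursive call.
  def factorial(n: Int): Int =
    if (n <= 1) 1 else n * factorial(n - 1)

  // Non-recursive method: the return type is inferred as Int.
  def add(x: Int, y: Int) = x + y

  // The trailing underscore converts the method into a function value.
  val addFn: (Int, Int) => Int = add _

  def main(args: Array[String]): Unit = {
    println(factorial(5)) // 120
    println(addFn(2, 6))  // 8
  }
}
```

Removing `: Int` from `factorial` should make compilation fail, while `add` compiles without an annotation because it does not call itself.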
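Arrays, maps, tuples, and collections
import scala.collection.mutable.ArrayBuffer

// Create a fixed-length array of length 8; every element is initialized to 0
val arr1 = new Array[Int](8)
// Printing a fixed-length array directly shows its type and hash code, not its contents
println(arr1)
// toBuffer converts the array to an array buffer, which prints the actual elements
println(arr1.toBuffer)
// Note: without new, this calls the apply method of the Array companion object,
// which assigns the given values directly
// Create a fixed-length array of length 1 whose only element is 10
val arr2 = Array[Int](10)
println(arr2.toBuffer)
// Define a fixed-length array of length 3
val arr3 = Array("hadoop", "storm", "spark")
// Use () to access an element
println(arr3(2))

// Variable-length arrays (array buffers)
// An array buffer requires the scala.collection.mutable.ArrayBuffer import shown at the top
val ab = ArrayBuffer[Int]()
// += appends one element to the end
ab += 1
// Append several elements
ab += (2, 3, 4, 5)
// ++= appends an array
ab ++= Array(6, 7)
// ++= also appends another array buffer
ab ++= ArrayBuffer(8, 9)
// insert adds elements at a given position: here -1 and 0 at index 0
ab.insert(0, -1, 0)
// remove deletes elements at a given position: here 2 elements starting at index 8
ab.remove(8, 2)
// Print the array buffer
println(ab)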
Iterating over an array
// Create an array
val arr = Array(1, 2, 3, 4, 5, 6, 7, 8)
// Enhanced for loop
for (i <- arr)
  println(i)
// until produces a Range of indices; reverse reverses that Range
for (i <- (0 until arr.length).reverse)
  println(arr(i))
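Transforming arrays
The yield keyword transforms the original array into a new array; the original array is left unchanged.
// Define an array
val arr = Array(1, 2, 3, 4, 5, 6, 7, 8)
// Use yield to generate a new array: keep the even elements and double them
val res = for (e <- arr if e % 2 == 0) yield e * 2
println(res.toBuffer)
// filter and map express the same thing more concisely
val res2 = arr.filter(_ % 2 == 0).map(_ * 2)
println(res2.toBuffer)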
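To check the claim that the original array is untouched, here is a small sketch that prints both arrays after the transformation. The object name `YieldKeepsOriginal` and the value names are hypothetical, introduced only for illustration.

```scala
object YieldKeepsOriginal {
  val arr: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7, 8)

  // for/yield builds a new array from the even elements, doubled
  val doubledEvens: Array[Int] = for (e <- arr if e % 2 == 0) yield e * 2

  def main(args: Array[String]): Unit = {
    println(doubledEvens.mkString(", ")) // 4, 8, 12, 16
    println(arr.mkString(", "))          // 1, 2, 3, 4, 5, 6, 7, 8
  }
}
```

The second line of output shows the same eight elements the array started with, which is what "the original array is left unchanged" means in practice.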
Common array algorithms
val arr = Array(2, 5, 1, 4, 3)
// Sum
println(arr.sum)
// Maximum
println(arr.max)
// Sort (returns a new sorted array)
println(arr.sorted.toBuffer)
Hash tables (maps)
Note: Scala has two kinds of Map. The Map in the immutable package cannot be modified after creation; the Map in the mutable package can.
// First form: ->
val scores1 = Map("tom" -> 85, "jetty" -> 99, "kitty" -> 90)
println(scores1)
// Second form: tuples
val scores = Map(("tom", 85), ("jetty", 99), ("kitty", 90))
println(scores)
// Look up a value by key
println(scores("jetty"))
// getOrElse returns the default when the key is missing
val o1 = scores.getOrElse("tian", 0)
println(o1)
// Modify a value in a scala.collection.mutable.Map
val scores2 = scala.collection.mutable.Map("tom" -> 80, "jim" -> 40)
scores2("jim") = 50
println(scores2)
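The difference between the two kinds of Map can be sketched as follows. This is an illustrative example only; the names `MapKinds`, `fixed`, `extended`, and `scores` are assumptions made for the sketch.

```scala
import scala.collection.mutable

object MapKinds {
  // The default Map is immutable: "+" returns a new Map and leaves fixed as it was
  val fixed: Map[String, Int] = Map("tom" -> 85, "jetty" -> 99)
  val extended: Map[String, Int] = fixed + ("kitty" -> 90)

  // A mutable Map is updated in place
  val scores: mutable.Map[String, Int] = mutable.Map("tom" -> 80, "jim" -> 40)
  scores("jim") = 50        // overwrite an existing entry
  scores += ("kitty" -> 90) // add a new entry

  def main(args: Array[String]): Unit = {
    println(fixed.size)    // 2
    println(extended.size) // 3
    println(scores("jim")) // 50
  }
}
```

Writing `fixed("tom") = 90` would not compile, because the immutable Map has no update method.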
Common functions
// Create a List
val lst0 = List(1, 7, 9, 8, 0, 3, 5, 4, 6, 2)
// Multiply each element of lst0 by 10 to produce a new collection
val tem = lst0.map(_ * 10)
println(tem)
// Pick the even numbers out of lst0 into a new collection
val tem1 = lst0.filter(x => x % 2 == 0)
println(tem1)
// Sort lst0 into a new collection
val tem2 = lst0.sorted
println(tem2)
// Reverse the order
println(lst0.reverse)
// Group the elements of lst0 four at a time; the result type is Iterator[List[Int]]
val tem3 = lst0.grouped(4)
println(tem3.toList)
// Convert the Iterator to a List
val tem4 = lst0.grouped(4).toList
println(tem4)
// Flatten several Lists into one List
println(tem4.flatten)

val lines = List("hello tom hello jerry", "hello jerry", "hello kitty")
// Split on spaces and flatten, then count the occurrences of each word
val line = lines.flatMap(_.split(" ")).map((_, 1))
  .groupBy(_._1).mapValues(_.foldLeft(0)(_ + _._2))
println(line.toList.sortBy(_._2).reverse)

// Sum in parallel: par returns a parallel collection
println(lst0.par.reduce(_ + _))
// Reduction:
//   reduce()      applies a binary operation to all elements in no particular order
//   reduceLeft()  applies it in a fixed order, left to right
// Folding:
//   fold()()      takes an initial value, no particular order
//   foldLeft()()  takes an initial value, fixed order, left to right
// Aggregation
val arr = List(List(1, 2, 3), List(3, 4, 5), List(2), List(0))
val result = arr.aggregate(0)(_ + _.sum, _ + _)
println(result)

val l1 = List(5, 6, 4, 7)
val l2 = List(1, 2, 3, 4)
// Union (on a List this concatenates the two lists and keeps duplicates)
val temp1 = l1.union(l2)
// val temp1 = l1 union l2
println(temp1)
// Intersection
val temp2 = l1.intersect(l2)
println(temp2)
// Difference
val temp3 = l1.diff(l2)
println(temp3)

// ---------- Separate example: word count (run on its own, it reuses the names lines and line) ----------
val lines = List("wo shi ni hao", "wo shi shi tian jun", "ha ha ha ha")
// val line = lines.map(_.split(" ")).flatten
val line = lines.flatMap(_.split(" "))
val words = line.map((_, 1)).groupBy(_._1)
// Approach 1: count the entries in each group
val total = words.map(t => (t._1, t._2.size))
val result1 = total.toList.sortBy(_._2).reverse
println(result1)
// Approach 2: mapValues with size
val result2 = words.mapValues(_.size)
println(result2)
// Approach 3: sum the 1s in each group with foldLeft
val total3 = words.mapValues(_.foldLeft(0)(_ + _._2))
println(total3)
List((ha,4), (shi,3), (wo,2), (hao,1), (ni,1), (jun,1), (tian,1))
Map(tian -> 1, ha -> 4, jun -> 1, shi -> 3, ni -> 1, wo -> 2, hao -> 1)
Map(tian -> 1, ha -> 4, jun -> 1, shi -> 3, ni -> 1, wo -> 2, hao -> 1)
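The distinction between ordered and unordered reduction only becomes visible with an operation that is not associative. The following sketch uses subtraction to show it; the object and value names are hypothetical, and `reduceRight` is included purely for contrast (it is not used in the original post).

```scala
object ReduceAndFold {
  val nums: List[Int] = List(1, 2, 3, 4)

  // Addition is associative, so the order of application does not matter
  val sum: Int = nums.reduce(_ + _)               // 10

  // Subtraction is not associative, so the direction changes the result
  val leftToRight: Int = nums.reduceLeft(_ - _)   // ((1 - 2) - 3) - 4 = -8
  val rightToLeft: Int = nums.reduceRight(_ - _)  // 1 - (2 - (3 - 4)) = -2

  // foldLeft starts from an initial value and proceeds left to right
  val folded: Int = nums.foldLeft(100)(_ - _)     // (((100 - 1) - 2) - 3) - 4 = 90

  def main(args: Array[String]): Unit = {
    println(sum)
    println(leftToRight)
    println(rightToLeft)
    println(folded)
  }
}
```

For this reason, `reduce` and `fold` are best kept for associative operations such as `+`, especially on parallel collections, while `reduceLeft` and `foldLeft` are the safer choice when the order matters.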
pom file for mixed Scala and Java development
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>tianjun.cmcc.org</groupId>
  <artifactId>mytest</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>A Camel Scala Route</name>
  <url>http://www.myorganization.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
  </properties>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.11</version>
      </dependency>
      <dependency>
        <groupId>org.apache.camel</groupId>
        <artifactId>camel-parent</artifactId>
        <version>2.18.1</version>
        <scope>import</scope>
        <type>pom</type>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.camel</groupId>
      <artifactId>camel-core</artifactId>
    </dependency>
    <dependency>
      <groupId>org.apache.camel</groupId>
      <artifactId>camel-scala</artifactId>
    </dependency>
    <!-- scala -->
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.11.7</version>
    </dependency>
    <dependency>
      <groupId>org.scala-lang.modules</groupId>
      <artifactId>scala-xml_2.11</artifactId>
      <version>1.0.4</version>
    </dependency>
    <!-- logging -->
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-api</artifactId>
      <scope>runtime</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-core</artifactId>
      <scope>runtime</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-slf4j-impl</artifactId>
      <scope>runtime</scope>
    </dependency>
    <!-- testing -->
    <dependency>
      <groupId>org.apache.camel</groupId>
      <artifactId>camel-test</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>repository.junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/commons-collections/commons-collections -->
    <dependency>
      <groupId>commons-collections</groupId>
      <artifactId>commons-collections</artifactId>
      <version>3.2.2</version>
    </dependency>
  </dependencies>
  <build>
    <defaultGoal>install</defaultGoal>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
      <!-- the Maven compiler plugin will compile Java source files -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.5.1</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-resources-plugin</artifactId>
        <version>3.0.1</version>
        <configuration>
          <encoding>UTF-8</encoding>
        </configuration>
      </plugin>
      <!-- the Maven Scala plugin will compile Scala source files -->
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>3.2.2</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <!-- configure the eclipse plugin to generate eclipse project descriptors for a Scala project -->
      <!--<plugin>-->
      <!--<groupId>org.apache.maven.plugins</groupId>-->
      <!--<artifactId>maven-eclipse-plugin</artifactId>-->
      <!--<version>2.10</version>-->
      <!--<configuration>-->
      <!--<projectnatures>-->
      <!--<projectnature>org.scala-ide.sdt.core.scalanature</projectnature>-->
      <!--<projectnature>org.eclipse.jdt.core.javanature</projectnature>-->
      <!--</projectnatures>-->
      <!--<buildcommands>-->
      <!--<buildcommand>org.scala-ide.sdt.core.scalabuilder</buildcommand>-->
      <!--</buildcommands>-->
      <!--<classpathContainers>-->
      <!--<classpathContainer>org.scala-ide.sdt.launching.SCALA_CONTAINER</classpathContainer>-->
      <!--<classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>-->
      <!--</classpathContainers>-->
      <!--<excludes>-->
      <!--<exclude>org.scala-lang:scala-library</exclude>-->
      <!--<exclude>org.scala-lang:scala-compiler</exclude>-->
      <!--</excludes>-->
      <!--<sourceIncludes>-->
      <!--<sourceInclude>**/*.scala</sourceInclude>-->
      <!--<sourceInclude>**/*.java</sourceInclude>-->
      <!--</sourceIncludes>-->
      <!--</configuration>-->
      <!--</plugin>-->
      <!-- allows the route to be run via 'mvn exec:java' -->
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>1.5.0</version>
        <configuration>
          <mainClass>tianjun.cmcc.org.MyRouteMain</mainClass>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>