Lecture 82: Efficient Traversal of a Scala List with ListBuffer

Source: Internet  Editor: 程序博客网  Date: 2024/06/05 04:59

We compare the following four programs to bring out the strengths and weaknesses of each approach.

Group 1: Recursion

Code:

  def main(args: Array[String]): Unit = {
    val data = 1 to 20000
    val currentTime = System.currentTimeMillis()
    increase(data.toList)
    println("used time=" + (System.currentTimeMillis() - currentTime))
  }

  // Recursive version: the call is not in tail position,
  // so every element adds one stack frame.
  def increase(list: List[Int]): List[Int] = list match {
    case List() => List()
    case head2 :: tail => (head2 + 1) :: increase(tail)
  }

Output:

Exception in thread "main" java.lang.StackOverflowError
    at scala.collection.LinearSeqOptimized$class.lengthCompare(LinearSeqOptimized.scala:261)
    at scala.collection.immutable.List.lengthCompare(List.scala:84)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:19)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:20)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:20)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:20)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:20)
    at ...

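Pros: simple
Cons: the recursion is not tail-recursive, so each element adds a stack frame; with a large list the stack keeps growing until it overflows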
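
The recursive style can be kept without growing the stack if the call is made tail-recursive by carrying an accumulator. The sketch below is an illustration only and is not part of the original lecture; the names TailRecIncrease, increaseTailRec and loop are hypothetical.

```scala
import scala.annotation.tailrec

object TailRecIncrease {
  // Adds 1 to every element while using constant stack space.
  def increaseTailRec(list: List[Int]): List[Int] = {
    // @tailrec makes the compiler reject loop if the recursive call
    // is not in tail position.
    @tailrec
    def loop(rest: List[Int], acc: List[Int]): List[Int] = rest match {
      case Nil          => acc.reverse // acc was built back to front
      case head :: tail => loop(tail, (head + 1) :: acc)
    }
    loop(list, Nil)
  }
}
```

Because the accumulator is built by prepending, one reverse at the end restores the original order; the whole traversal stays linear in the list length.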

Group 2: Loop
Code:

  def main(args: Array[String]): Unit = {
    val data = 1 to 20000
    val currentTime = System.currentTimeMillis()
    increase_for(data.toList)
    println("used time=" + (System.currentTimeMillis() - currentTime))
  }

  // Loop version: appends each incremented element with :::
  def increase_for(list: List[Int]): List[Int] = {
    var result = List[Int]()
    for (element <- list) {
      // ::: copies the whole left-hand list on every iteration
      result = result ::: List(element + 1)
    }
    result
  }

Output
Data size: 20000

used time=2611
Process finished with exit code 0

Data size: 2000000

used time= NIL (ran for a very long time without producing a result; unbearably slow)
Process finished with exit code 0

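Pros: avoids recursion, so the stack does not overflow however large the input is
Cons: every ::: copies the result built so far into a new temporary List, so the total work grows quadratically and performance collapses on large inputs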
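
The cost comes from the append itself: result ::: List(x) has to copy every cell of result, so a list of n elements needs roughly n * n / 2 cell copies in total (about 2 * 10^8 for n = 20000). A common way around this while staying with an immutable List is to prepend with :: and reverse once at the end. This is a minimal sketch, not taken from the original lecture; PrependIncrease and increasePrepend are hypothetical names.

```scala
object PrependIncrease {
  // Loop version that prepends with :: (constant time per element)
  // and reverses once at the end.
  def increasePrepend(list: List[Int]): List[Int] = {
    var result = List.empty[Int]
    for (element <- list) {
      result = (element + 1) :: result // reuses the existing list, no copy
    }
    result.reverse // one linear pass restores the original order
  }
}
```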

Group 3: Loop expressed with map
Code:

  // Using List's map function
  def increase_for2(list: List[Int]): List[Int] = {
    println("list map ")
    list map (el => el + 1)
  }

Output
Data size: 2000000

list map
used time=2268
Process finished with exit code 0

Data size: 2000000

used time=48356

Process finished with exit code 0

Pros: creates no intermediate lists, so it is much faster than appending with List's ::: method
Cons:

Group 4: Using ListBuffer

Code:

  def main(args: Array[String]): Unit = {
    val data = 1 to 2000000
    val currentTime = System.currentTimeMillis()
    increase_ListBuffer(data.toList)
    println("used time=" + (System.currentTimeMillis() - currentTime))
  }

  // ListBuffer version: appends in place, then converts to an immutable List
  def increase_ListBuffer(list: List[Int]): List[Int] = {
    import scala.collection.mutable.ListBuffer
    val result = ListBuffer[Int]()
    for (element <- list) {
      result += element + 1
    }
    result.toList
  }

Output
Data size: 2000000

used time=2284
Process finished with exit code 0

Data size: 20000000

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at scala.collection.mutable.ListBuffer.$plus$eq(ListBuffer.scala:168)
    at scala.collection.mutable.ListBuffer.$plus$eq(ListBuffer.scala:45)
    at scala.collection.generic.Growable$$anonfun$$plus$plus$eq$1.apply(Growable.scala:48)
    at scala.collection.generic.Growable$$anonfun$$plus$plus$eq$1.apply(Growable.scala:48)
    at scala.collection.immutable.Range.foreach(Range.scala:141)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:176)
    at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:45)
    at scala.collection.TraversableLike$class.to(TraversableLike.scala:629)
    at scala.collection.AbstractTraversable.to(Traversable.scala:105)
    at scala.collection.TraversableOnce$class.toList(TraversableOnce.scala:257)
    at scala.collection.AbstractTraversable.toList(Traversable.scala:105)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.main(ListBuffer_Internals.scala:11)
    at com.ifly.edu.scala.list.ListBuffer_Internals.main(ListBuffer_Internals.scala)

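Pros: very efficient as long as the data stays within a manageable size
Cons: the whole list must still fit in the heap; at 20000000 elements the JVM ran out of memory, and the trace shows this happening in main while data.toList was still building the input list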

Summary

ListBuffer avoids recursion and also avoids creating intermediate results, so its performance is dependable.
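
To repeat the comparison on another machine, a small timing helper can be used. This is a rough sketch rather than a proper benchmark (no JVM warm-up, single run), and IncreaseTiming, timed and increaseWithBuffer are hypothetical names that do not appear in the original lecture. Unlike the listings above, it builds the input list before the clock starts, so the cost of data.toList is not included in the measurement.

```scala
import scala.collection.mutable.ListBuffer

object IncreaseTiming {
  // Evaluates body once and returns its result with the elapsed milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  // Same technique as Group 4: append in place, convert once at the end.
  def increaseWithBuffer(list: List[Int]): List[Int] = {
    val buffer = ListBuffer.empty[Int]
    for (element <- list) buffer += element + 1
    buffer.toList
  }

  def main(args: Array[String]): Unit = {
    val data = (1 to 200000).toList // built outside the timed region
    val (viaMap, mapMs) = timed(data.map(_ + 1))
    val (viaBuffer, bufferMs) = timed(increaseWithBuffer(data))
    println(s"map: $mapMs ms, ListBuffer: $bufferMs ms, same result: ${viaMap == viaBuffer}")
  }
}
```

The absolute numbers depend on heap size and JVM version, so they are best read as relative comparisons only.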

