Side Note 1: A Simple Way to Test the Spark Source Code

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 19:04

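Prerequisites:

Install the software and tools mentioned in article 1;

On your own Ubuntu machine, git clone the Spark source code. It is best to create a branch so that you can later modify the source and submit changes to the community. See: http://blog.csdn.net/huanggang028/article/details/38399497;

Import the Spark source code into the IntelliJ IDE;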


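Starting from an example:

In the Spark source tree, test code is kept separate from the main code. In spark core, for example, src/main holds the implementation of the core module and src/test holds its tests.

Let's walk through the testing process with one example.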

Source file path:  src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala

Test class path:   src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala

We focus mainly on how the testing is done; you can step through the individual test cases in a debugger yourself.

Excerpt:  class AsyncRDDActionsSuite extends SparkFunSuite with BeforeAndAfterAll with Timeouts

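This is a test class for the asynchronous actions of an RDD. It extends one abstract class and mixes in two traits: SparkFunSuite, BeforeAndAfterAll and Timeouts.

         SparkFunSuite: Spark's unit test suites extend SparkFunSuite. Here is its implementation: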

import java.io.File

import org.scalatest.{BeforeAndAfterAll, FunSuite, Outcome}

import org.apache.spark.internal.Logging
import org.apache.spark.util.AccumulatorContext

/**
 * Base abstract class for all unit tests in Spark for handling common functionality.
 */
abstract class SparkFunSuite
  extends FunSuite
  with BeforeAndAfterAll
  with Logging {
// scalastyle:on

  protected override def afterAll(): Unit = {
    try {
      // Avoid leaking map entries in tests that use accumulators without SparkContext
      AccumulatorContext.clear()
    } finally {
      super.afterAll()
    }
  }

  // helper function
  protected final def getTestResourceFile(file: String): File = {
    new File(getClass.getClassLoader.getResource(file).getFile)
  }

  protected final def getTestResourcePath(file: String): String = {
    getTestResourceFile(file).getCanonicalPath
  }

  /**
   * Log the suite name and the test name before and after each test.
   * Subclasses should never override this method. If they wish to run
   * custom code before and after each test, they should mix in the
   * {{org.scalatest.BeforeAndAfter}} trait instead.
   */
  final protected override def withFixture(test: NoArgTest): Outcome = {
    val testName = test.text
    val suiteName = this.getClass.getName
    val shortSuiteName = suiteName.replaceAll("org.apache.spark", "o.a.s")
    try {
      logInfo(s"\n\n===== TEST OUTPUT FOR $shortSuiteName: '$testName' =====\n")
      test()
    } finally {
      logInfo(s"\n\n===== FINISHED $shortSuiteName: '$testName' =====\n")
    }
  }

}
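As we can see, SparkFunSuite overrides two methods, afterAll() and withFixture(), and adds two helpers, getTestResourceFile() and getTestResourcePath(). Beyond that it simply extends org.scalatest.FunSuite and mixes in org.scalatest.BeforeAndAfterAll and Spark's Logging trait.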
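The wrap-the-test-in-log-lines idea behind withFixture() can be tried without ScalaTest or Spark. The following is only a minimal sketch of that pattern in plain Scala: the names FixtureSketch, withLogging and log are made up for illustration, and an in-memory buffer stands in for Spark's logInfo.

```scala
import scala.collection.mutable.ArrayBuffer

object FixtureSketch {
  // Hypothetical stand-in for logInfo: messages are collected in memory.
  val log: ArrayBuffer[String] = ArrayBuffer.empty[String]

  // Same shortening as in SparkFunSuite. Note that replaceAll takes a regex,
  // so each '.' in the pattern matches any single character.
  def shortName(suiteName: String): String =
    suiteName.replaceAll("org.apache.spark", "o.a.s")

  // Runs `body` between a start line and a finish line. The finish line is
  // written in `finally`, so it appears even if `body` throws.
  def withLogging[T](suiteName: String, testName: String)(body: => T): T = {
    val short = shortName(suiteName)
    try {
      log += s"===== TEST OUTPUT FOR $short: '$testName' ====="
      body
    } finally {
      log += s"===== FINISHED $short: '$testName' ====="
    }
  }

  def main(args: Array[String]): Unit = {
    val result = withLogging("org.apache.spark.rdd.AsyncRDDActionsSuite", "demo") {
      1 + 1
    }
    log.foreach(println)
    println(s"result = $result")
  }
}
```

Because the finish line is emitted in a finally block, the log stays paired even for failing tests, which is presumably why SparkFunSuite structures withFixture() this way.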

       BeforeAndAfterAll, Timeouts: these two traits, like FunSuite above, come from ScalaTest.


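What is ScalaTest?

Start with the official introduction: http://www.scalatest.org/

In short: ScalaTest is a testing framework for Scala, and it is what the Spark test suites are built on :).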
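To make the FunSuite and BeforeAndAfterAll pieces concrete, here is a small stand-alone suite. It is a sketch rather than Spark code: the class name CounterSuite and its data are invented, and it assumes a ScalaTest version that still provides org.scalatest.FunSuite, as the SparkFunSuite listing does (newer ScalaTest releases use org.scalatest.funsuite.AnyFunSuite instead).

```scala
import org.scalatest.{BeforeAndAfterAll, FunSuite}

// Hypothetical example suite; not part of the Spark source tree.
class CounterSuite extends FunSuite with BeforeAndAfterAll {

  // Shared fixture, created once before all tests in this suite.
  private var data: Seq[Int] = _

  override def beforeAll(): Unit = {
    super.beforeAll()
    data = 1 to 10
  }

  override def afterAll(): Unit = {
    try {
      data = null // release the fixture
    } finally {
      super.afterAll() // keep the chain intact, as SparkFunSuite does
    }
  }

  // Each call to test(...) registers one named test case.
  test("sum of 1 to 10 is 55") {
    assert(data.sum === 55)
  }
}
```

AsyncRDDActionsSuite follows the same shape, with the shared fixture being a SparkContext rather than a sequence of integers.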


How to actually run the test:

http://spark.apache.org/developer-tools.html

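From the root of the Spark source tree, start sbt with ./build/sbt, then at the sbt prompt run: project core, followed by: testOnly org.apache.spark.rdd.AsyncRDDActionsSuite

You can also modify AsyncRDDActionsSuite.scala yourself to check whether the methods of AsyncRDDActions behave as expected.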
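Before editing the suite, it can help to try one asynchronous action on its own. The program below is a sketch under a few assumptions: Spark is on the classpath, a local master is acceptable, and the object name CountAsyncCheck, the app name and the 30-second timeout are illustrative choices that do not come from the Spark test suite.

```scala
import scala.concurrent.Await
import scala.concurrent.duration._

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-alone check; not part of AsyncRDDActionsSuite.
object CountAsyncCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("count-async-check")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 1000, 4)

      // countAsync() comes from AsyncRDDActions and returns a FutureAction[Long]
      // immediately instead of blocking until the job finishes.
      val future = rdd.countAsync()

      // Block here only to inspect the result; the timeout is an arbitrary choice.
      val n = Await.result(future, 30.seconds)
      assert(n == 1000L, s"expected 1000 but got $n")
      println(s"countAsync returned $n")
    } finally {
      sc.stop() // always release the local SparkContext
    }
  }
}
```

The same body, minus the SparkContext setup, could be dropped into a new test(...) block in AsyncRDDActionsSuite and run with the testOnly command above.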
