The Spark Growth Path (7): Hypothesis testing
Source: Internet  Editor: 程序博客网  Time: 2024/05/20 23:39
Hypothesis testing
Example
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.ml.stat.ChiSquareTest
import org.apache.spark.sql.SparkSession

object HypothesisTestingExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("HypothesisTestingExample").getOrCreate()
    spark.sparkContext.setLogLevel("WARN")

    // Each row is (label, features); both the label and every feature are treated as categorical.
    val data = Seq(
      (0.0, Vectors.dense(0.5, 10.0)),
      (0.0, Vectors.dense(1.5, 20.0)),
      (1.0, Vectors.dense(1.5, 30.0)),
      (0.0, Vectors.dense(3.5, 30.0)),
      (0.0, Vectors.dense(3.5, 40.0)),
      (1.0, Vectors.dense(3.5, 40.0))
    )

    import spark.implicits._
    val df = data.toDF("label", "features")

    // Run Pearson's chi-squared independence test of each feature against the label.
    // The result is a single row: pValues, degreesOfFreedom, statistics.
    val chi = ChiSquareTest.test(df, "features", "label").head
    println("pValues = " + chi.getAs[Vector](0))
    println("degreesOfFreedom = " + chi.getSeq[Int](1).mkString("[", ",", "]"))
    println("statistics = " + chi.getAs[Vector](2))
  }
}
Result
pValues = [0.6872892787909721,0.6822703303362126]
degreesOfFreedom = [2,3]
statistics = [0.75,1.5]
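To see where the numbers in the result come from, the statistics can be recomputed by hand without Spark. The following is a minimal sketch, assuming the standard Pearson chi-squared formula (the sum over all cells of (observed - expected)^2 / expected on the feature-value by label contingency table). The object name ChiSquareByHand and the helper chiSquare are hypothetical names introduced for illustration; they are not part of the Spark API.

```scala
// Hypothetical helper, not part of Spark: recomputes Pearson's chi-squared
// statistic for one categorical feature against the label using plain Scala.
object ChiSquareByHand {
  // Same six rows as the example above, split into columns.
  val labels: Seq[Double]   = Seq(0.0, 0.0, 1.0, 0.0, 0.0, 1.0)
  val feature1: Seq[Double] = Seq(0.5, 1.5, 1.5, 3.5, 3.5, 3.5)
  val feature2: Seq[Double] = Seq(10.0, 20.0, 30.0, 30.0, 40.0, 40.0)

  /** Returns (statistic, degreesOfFreedom) for a sequence of (featureValue, label) pairs. */
  def chiSquare(pairs: Seq[(Double, Double)]): (Double, Int) = {
    val n = pairs.size.toDouble
    // Observed count of each (featureValue, label) cell in the contingency table.
    val observed = pairs.groupBy(identity).map { case (k, v) => k -> v.size.toDouble }
    // Marginal totals per feature value and per label.
    val featureTotals = pairs.groupBy(_._1).map { case (k, v) => k -> v.size.toDouble }
    val labelTotals = pairs.groupBy(_._2).map { case (k, v) => k -> v.size.toDouble }

    val terms = for {
      (f, fTotal) <- featureTotals.toSeq
      (l, lTotal) <- labelTotals.toSeq
    } yield {
      // Expected count under independence: rowTotal * columnTotal / n.
      val expected = fTotal * lTotal / n
      val obs = observed.getOrElse((f, l), 0.0)
      (obs - expected) * (obs - expected) / expected
    }
    // Degrees of freedom: (distinct feature values - 1) * (distinct labels - 1).
    (terms.sum, (featureTotals.size - 1) * (labelTotals.size - 1))
  }

  def main(args: Array[String]): Unit = {
    val (stat1, dof1) = chiSquare(feature1.zip(labels))
    val (stat2, dof2) = chiSquare(feature2.zip(labels))
    println(s"feature 1: statistic = $stat1, degreesOfFreedom = $dof1")
    println(s"feature 2: statistic = $stat2, degreesOfFreedom = $dof2")
    // With 2 degrees of freedom the chi-squared p-value has the closed form exp(-x / 2).
    println(s"feature 1: pValue = ${math.exp(-stat1 / 2)}")
  }
}
```

Up to floating-point rounding, this should reproduce the statistics 0.75 and 1.5 with 2 and 3 degrees of freedom, and the first p-value. The closed form exp(-x / 2) only applies to 2 degrees of freedom; the second p-value needs the general chi-squared distribution function, which is left out of this sketch.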