Spark MLlib Study Notes: Chi-Squared Test
Source: Internet. Editor: 程序博客网. Date: 2024/05/21 14:59
Spark: Chi-Squared Test
- Basic principle of the chi-squared test
- Basic steps of the chi-squared test
- Code implementation
- Output
Code
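import org.apache.log4j.{Level, Logger}
import org.apache.spark.mllib.linalg.{Matrices, Matrix, Vectors}
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by Administrator on 2017/2/8 0008.
 */
object ChiSqlTest {
  /*
   * Run the chi-squared test on a Vector and on a Matrix.
   * *********************************************************
   * The chi-squared test measures how far the observed values of a
   * sample deviate from the expected values. The size of that deviation
   * determines the chi-squared statistic: the larger the statistic, the
   * further the observations are from the expected values; the smaller
   * it is, the closer they are. A statistic of 0 means the observed and
   * expected values agree exactly.
   * *********************************************************
   */
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local")
      .setAppName(this.getClass.getSimpleName.filter(!_.equals('$')))
    val sc = new SparkContext(conf)
    // Set after the context is created, so Spark's startup INFO lines still appear.
    Logger.getRootLogger.setLevel(Level.WARN)

    // Goodness-of-fit test: with no expected vector supplied,
    // the observed counts are tested against a uniform distribution.
    val vd = Vectors.dense(1, 2, 3, 4, 5)
    val vResult = Statistics.chiSqTest(vd)
    println(s"向量卡方检测 :$vResult") // label: "vector chi-squared test"

    // Independence test on a 3 x 2 contingency matrix.
    // Matrices.dense takes its values in column-major order.
    val mtx = Matrices.dense(3, 2, Array(1, 3, 5, 2, 4, 6))
    val mtxResult = Statistics.chiSqTest(mtx)
    println(s"矩阵的卡方检测:$mtxResult") // label: "matrix chi-squared test"

    val mtx2 = Matrices.dense(2, 2, Array(1, 2, 3, 4))
    printChiSqTest(mtx2)

    sc.stop()
  }

  // Prints the test summary: method, degrees of freedom, statistic, p-value.
  def printChiSqTest(matrix: Matrix): Unit = {
    val mtxResult = Statistics.chiSqTest(matrix)
    println(mtxResult)
  }
}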
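To make the principle in the code comment concrete, here is a minimal sketch that computes the Pearson statistic for the vector case by hand, without Spark. It assumes the expected counts are uniform (the sum of the observations divided evenly across the categories), which is what `Statistics.chiSqTest` tests against when only one vector is passed. The names `GoodnessOfFitSketch` and `uniformChiSq` are hypothetical and not part of MLlib.

```scala
object GoodnessOfFitSketch {
  /**
   * Pearson statistic against a uniform expectation: sum of (O - E)^2 / E,
   * where E is the same for every category (assumption: uniform null hypothesis).
   * Returns the statistic and the degrees of freedom (categories - 1).
   */
  def uniformChiSq(observed: Array[Double]): (Double, Int) = {
    val expected = observed.sum / observed.length
    val statistic = observed.map(o => math.pow(o - expected, 2) / expected).sum
    (statistic, observed.length - 1)
  }

  def main(args: Array[String]): Unit = {
    // Same observations as the vector `vd` in the listing.
    val (statistic, df) = uniformChiSq(Array(1.0, 2.0, 3.0, 4.0, 5.0))
    println(s"statistic = $statistic, degrees of freedom = $df")
  }
}
```

For the observations 1 to 5 the expected count is 3 in every category, so the statistic is (4 + 1 + 0 + 1 + 4) / 3, roughly 3.33, with 4 degrees of freedom. The sketch does not compute the p-value, which requires the chi-squared distribution function.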
Output
17/04/01 18:55:46 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/04/01 18:55:46 INFO SparkUI: Started SparkUI at http://121.48.185.192:4040
17/04/01 18:55:46 INFO Executor: Starting executor ID driver on host localhost
17/04/01 18:55:46 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 56289.
17/04/01 18:55:46 INFO NettyBlockTransferService: Server created on 56289
17/04/01 18:55:46 INFO BlockManagerMaster: Trying to register BlockManager
17/04/01 18:55:46 INFO BlockManagerMasterEndpoint: Registering block manager localhost:56289 with 457.9 MB RAM, BlockManagerId(driver, localhost, 56289)
17/04/01 18:55:46 INFO BlockManagerMaster: Registered BlockManager
向量卡方检测 :Chi squared test summary:
method: pearson
degrees of freedom = 4
statistic = 3.333333333333333
pValue = 0.5036682742334986
No presumption against null hypothesis: observed follows the same distribution as expected..
矩阵的卡方检测:Chi squared test summary:
method: pearson
degrees of freedom = 2
statistic = 0.14141414141414144
pValue = 0.931734784568187
No presumption against null hypothesis: the occurrence of the outcomes is statistically independent..
Chi squared test summary:
method: pearson
degrees of freedom = 1
statistic = 0.07936507936507939
pValue = 0.7781596861761658
No presumption against null hypothesis: the occurrence of the outcomes is statistically independent..

Process finished with exit code 0
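The two matrix statistics in the output can be checked by hand. The sketch below is a plain-Scala approximation of Pearson's independence test, not MLlib's implementation: each expected cell is (row sum × column sum) / total, and the degrees of freedom are (rows − 1) × (columns − 1). The names `IndependenceSketch` and `chiSq` are hypothetical. Because `Matrices.dense` reads its values column by column, `Array(1, 2, 3, 4)` in a 2 x 2 matrix corresponds to the rows (1, 3) and (2, 4).

```scala
object IndependenceSketch {
  /**
   * Pearson independence statistic for a contingency table given row by row.
   * Expected count of cell (i, j) = rowSum(i) * colSum(j) / total.
   * Returns the statistic and the degrees of freedom.
   */
  def chiSq(table: Array[Array[Double]]): (Double, Int) = {
    val rowSums = table.map(_.sum)
    val colSums = table.transpose.map(_.sum)
    val total = rowSums.sum
    val terms = for {
      i <- table.indices
      j <- table(i).indices
    } yield {
      val expected = rowSums(i) * colSums(j) / total
      math.pow(table(i)(j) - expected, 2) / expected
    }
    (terms.sum, (rowSums.length - 1) * (colSums.length - 1))
  }

  def main(args: Array[String]): Unit = {
    // Row-wise view of Matrices.dense(2, 2, Array(1, 2, 3, 4)).
    val small = Array(Array(1.0, 3.0), Array(2.0, 4.0))
    // Row-wise view of Matrices.dense(3, 2, Array(1, 3, 5, 2, 4, 6)).
    val large = Array(Array(1.0, 2.0), Array(3.0, 4.0), Array(5.0, 6.0))
    println(chiSq(small))
    println(chiSq(large))
  }
}
```

For the 2 x 2 table the expected counts are 1.2, 2.8, 1.8 and 4.2, giving a statistic of about 0.0794 with 1 degree of freedom; the 3 x 2 table gives about 0.1414 with 2 degrees of freedom. Both agree with the statistics printed above. The p-values near 0.78 and 0.93 are far above common significance levels such as 0.05, which is why Spark reports no presumption against independence.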