The EM Algorithm and Its Extensions
Preface
The major algorithms covered here have long since been summarized thoroughly by predecessors such as JerryLead, abcjennifer, and pluskid. My purpose in opening this topic is to venture, on the basis of their work, to share a little of the research and experience from my own job, and to discuss and improve together with everyone. Your support is much appreciated.
EM part
1. The History of EM
   (1) The Story of EM
   (2) The Core of EM
2. The Derivation of EM
   (1) Deriving EM from an error lower bound
   (2) Proof that the EM iteration converges
   (3) Generalizing the distribution in the Q function
3. Extensions of EM
   (1) GEM
   (2) ECM
4. Some EM references

GMM part
1. One-dimensional GMM
   (1) Theoretical derivation
   (2) Implementation
       1) Data simulation (R)
       2) Implementing GMM (Scala)
2. Multidimensional GMM
   (1) Theoretical derivation
   (2) Implementation
       1) Data simulation (R)
       2) Implementing GMMs (Scala)
          a) Simulating initial values
          b) Computing the determinant
          c) Computing the matrix inverse
          d) Putting GMMs together
1. The History of EM
The history of EM reads like a saga of blood and tears. No wonder Murphy, the author of MLAPP, remarks: "This is an unfortunate term, since there are many ways to generalize EM…."
(1) The Story of EM
Figure 1: Dempster, Laird, and Rubin (key pioneers of the EM algorithm)
The development of the EM (Expectation Maximization) algorithm has a watershed year: 1977. That year, Dempster, Laird, and Rubin (Figure 1 above) jointly published the paper "Maximum Likelihood from Incomplete Data via the EM Algorithm". From then on, EM left its scattered, piecemeal state behind and entered public view with a rigorous mathematical formulation and proof.
In fact, even before DLR (Dempster, Laird, Rubin), EM-style ideas had already been used to solve certain MLE (maximum likelihood estimation) problems, but only in fairly specific domains; it was not until DLR surveyed and unified this work that EM truly began to shine.

So why call the history of EM a saga of blood and tears? The story continues after DLR. Once DLR had given EM its proper name, the Q function and the M step became focal points of criticism. Two issues drew most of the attention: the complexity of the Q function, and the efficiency of the optimization in the M step. A contest then unfolded around them. Some felt EM could be interpreted from another angle, which led to the GEM framework (discussed later): rather than fully maximizing in every M step, it suffices to find parameters whose likelihood is larger than before, somewhat in the spirit of gradient-descent iteration. Others liked the GEM idea but found the M-step iteration still too slow in high dimensions, which led to ECM (also discussed later), an extension of EM. Before long, someone found the ECM improvement not thorough enough and modified the optimization in its CM steps, proposing the ECME algorithm; some time later, ECME was in turn absorbed into a newer framework, the AECM algorithm, and so on. Many similar episodes followed: a viewpoint proposed today would be improved upon by someone else tomorrow. It was an algorithmic contest where blood and tears coexisted, and it is exactly this persistence that has pushed the wheel of history forward.
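To make the GEM idea above concrete, here is a minimal sketch in the standard EM notation (which this post has not formally introduced yet, so take the symbols as assumptions): the ordinary M step solves

$$\theta^{(t+1)} = \arg\max_{\theta} Q(\theta \mid \theta^{(t)}),$$

whereas GEM only asks for some $\theta^{(t+1)}$ satisfying

$$Q(\theta^{(t+1)} \mid \theta^{(t)}) \ge Q(\theta^{(t)} \mid \theta^{(t)}),$$

which already suffices to guarantee that the observed-data likelihood never decreases across iterations.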
To be organized… In the meantime, here is the implementation part from the outline, starting with the one-dimensional GMM in Scala.
```scala
/**
 * Created by yangcheng on 2016/9/10.
 * This code is about the Gaussian mixture model (GMM).
 * Hope it works well in our work.
 */
object GMM {
  def main(args: Array[String]): Unit = {
    import scala.io.Source._
    import scala.collection.mutable
    import scala.util.Random

    // one observation per line
    val data = fromFile("C:/Users/yangcheng/Desktop/rnormData.txt").getLines().toArray

    // k is the number of Gaussian components
    val k: Int = 4

    // initialize the parameters: equal weights, random means and variances
    // (+1 keeps the initial variance strictly positive)
    val alpha: Array[Double] = Array.fill(k)(1.0 / k)
    val theta = new mutable.HashMap[Int, Map[String, Double]]()
    for (i <- 0 until k) {
      theta.put(i, Map("mean" -> Random.nextInt(100).toDouble,
                       "var"  -> (Random.nextInt(100) + 1).toDouble))
    }

    // training
    var len: Double = 1.0
    var j: Int = 0
    while (len > 1e-1) {
      j += 1
      val oldTheta = theta.clone()

      // E step: responsibilities of each component for each point
      val gamma = data.map(a => {
        val rs = new Array[Double](k)
        for (i <- 0 until k) {
          rs(i) = alpha(i) * (1 / math.sqrt(2 * math.Pi * theta(i)("var"))) *
            math.exp(-math.pow(a.toDouble - theta(i)("mean"), 2) / 2 / theta(i)("var"))
        }
        rs.map(_ / rs.sum)
      })

      // M step: maximize the Q function
      for (i <- 0 until k) {
        val gi = gamma.map(_(i))
        val giSum = gi.sum
        // update the mean of component i
        val meanjk = gi.zip(data).map(a => a._1 * a._2.toDouble).sum
        theta(i) += ("mean" -> meanjk / giSum)
        // update the variance of component i (using the freshly updated mean)
        val varjk = gi.zip(data.map(a => math.pow(a.toDouble - theta(i)("mean"), 2))).map(a => a._1 * a._2).sum
        theta(i) += ("var" -> varjk / giSum)
        // update the mixing weight
        alpha(i) = giSum / data.length
      }

      // stop when the total parameter change is small enough
      len = 0.0
      for (i <- 0 until k) {
        len += (theta(i)("mean") - oldTheta(i)("mean")).abs
        len += (theta(i)("var") - oldTheta(i)("var")).abs
      }
    }

    // print the result
    for (i <- 0 until k) println("alpha" + i + " = " + alpha(i))
    theta.foreach(a => println("theta = " + a))
    println("j = " + j)
  }
}
```
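The outline above lists a "Data simulation (R)" step that is not included in the post. As a stand-in, here is a minimal Scala sketch of my own (the mixture parameters, sample size, and output path are all assumptions, not the author's) that writes one value per line, the format the GMM code above reads from rnormData.txt:

```scala
import java.io.PrintWriter
import scala.util.Random

// A stand-in sketch (not from the original post): simulate a 4-component
// one-dimensional Gaussian mixture. All parameters below are assumptions.
object SimulateRnormData {
  def main(args: Array[String]): Unit = {
    val means   = Array(0.0, 20.0, 50.0, 80.0)
    val sds     = Array(2.0, 3.0, 4.0, 5.0)
    val weights = Array(0.25, 0.25, 0.25, 0.25)
    val n = 2000
    val out = new PrintWriter("rnormData.txt")
    for (_ <- 0 until n) {
      // sample a component index according to the mixing weights
      val u = Random.nextDouble()
      var c = 0
      var acc = weights(0)
      while (u > acc && c < weights.length - 1) { c += 1; acc += weights(c) }
      // draw from the chosen Gaussian and write one value per line
      out.println(means(c) + sds(c) * Random.nextGaussian())
    }
    out.close()
  }
}
```

Running GMM.main on such a file should recover weights, means, and variances close to the simulated values, up to a permutation of the components.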
Next, the multidimensional GMM implementation (Scala):

```scala
/**
 * Created by yangcheng on 2016/9/12.
 * This code is about the multidimensional Gaussian mixture model (GMMs).
 * It is more general in the real world.
 * Hope it works well in our work.
 */
object GMMs {
  def main(args: Array[String]): Unit = {
    import scala.io.Source._
    import scala.Array._
    import scala.collection.mutable
    import scala.util.Random
    import scala.annotation.tailrec

    // load the data: one point per line, coordinates separated by spaces
    val mvdata = fromFile("C:/Users/yangcheng/Desktop/mvrnormData.txt").getLines().toArray

    // initialize the parameters
    // k is the number of Gaussian components; dim is their dimension
    val dim: Int = 2
    val k: Int = 2
    val alpha: Array[Double] = Array.fill(k)(1.0 / k)
    val mvMu = mutable.HashMap[Int, Array[Double]]()
    val mvSigma = mutable.HashMap[Int, Array[Array[Double]]]()
    val Imatrix = mutable.HashMap[Int, Array[Array[Double]]]()

    // generate a random mean vector and covariance matrix per component;
    // the large diagonal keeps the matrix diagonally dominant, hence positive definite
    for (i <- 0 until k) {
      mvMu.put(i, new Array[Double](dim).map(_ + Random.nextInt(100)))
      mvSigma.put(i, ofDim[Double](dim, dim))
      Imatrix.put(i, ofDim[Double](dim, dim)) // identity matrix (off-diagonals default to 0)
      for (j <- 0 until dim) {
        mvSigma(i)(j)(j) = (Random.nextInt(10) + 1) * 10
        Imatrix(i)(j)(j) = 1
        for (s <- j + 1 until dim) {
          mvSigma(i)(j)(s) = Random.nextGaussian()
          mvSigma(i)(s)(j) = mvSigma(i)(j)(s)
        }
      }
    }

    // invert a matrix by elementary row transformations on the
    // augmented matrix [Sigma | E], where E is the identity
    def calculateInverse(Sigma: Array[Array[Double]], E: Array[Array[Double]]): Array[Array[Double]] = {
      val l = Sigma.length
      val extensionMatrix = ofDim[Double](l, 2 * l)
      for (i <- 0 until l) {
        extensionMatrix(i) = Sigma(i) ++ E(i) // ++ copies, so Sigma itself is not mutated
      }

      @tailrec
      def elementaryTransform(eM: Array[Array[Double]], n: Int): Array[Array[Double]] = {
        if (n >= l) {
          // back substitution: clear the upper triangle
          for (j <- (1 until l).reverse) {
            for (m <- 0 until j) {
              eM(m) = eM(m).zip(eM(j)).map(a => a._1 - eM(m)(j) * a._2)
            }
          }
          eM
        } else {
          // normalize the pivot row, then eliminate the rows below it
          // (no pivoting: assumes the pivot entry is nonzero)
          eM(n) = eM(n).map(_ / eM(n)(n))
          for (i <- n + 1 until l) {
            val factor = eM(i)(n)
            if (factor != 0.0) {
              eM(i) = eM(i).zip(eM(n)).map(a => a._1 - factor * a._2)
            }
          }
          elementaryTransform(eM, n + 1)
        }
      }

      // the right half of the reduced augmented matrix is the inverse
      elementaryTransform(extensionMatrix, 0).map(_.drop(l))
    }

    // determinant: reduce M to an upper triangular matrix in place,
    // then multiply the diagonal entries (note: mutates its argument)
    @tailrec
    def calculateDeterminant(M: Array[Array[Double]], n: Int): Double = {
      if (n >= M.length - 1) {
        M.indices.map(i => M(i)(i)).product
      } else {
        for (i <- n until M.length - 1) {
          M(i + 1) = M(i + 1).zip(M(n)).map(a => a._1 - a._2 * M(i + 1)(n) / M(n)(n))
        }
        calculateDeterminant(M, n + 1)
      }
    }

    // row vector times matrix
    def matrixMultiplicationLeft(left: Array[Double], middle: Array[Array[Double]]): Array[Double] = {
      val col = middle(0).length
      val leftResult = new Array[Double](col)
      for (j <- 0 until col) {
        leftResult(j) = left.zip(middle.map(_(j))).map(a => a._1 * a._2).sum
      }
      leftResult
    }

    // training
    var len: Double = 1.0
    var j: Int = 0
    while (len > 1e-1) {
      j += 1
      val oldmvMu = mvMu.clone()
      val oldmvSigma = mvSigma.clone()

      // E step: responsibilities from the multivariate normal densities
      val gamma = mvdata.map(line => {
        val rs = new Array[Double](k)
        for (i <- 0 until k) {
          val dx = line.split(" ").zip(mvMu(i)).map(b => b._1.toDouble - b._2)
          val quad = matrixMultiplicationLeft(dx, calculateInverse(mvSigma(i), Imatrix(i)))
            .zip(dx).map(p => p._1 * p._2).sum
          // deep-copy Sigma: calculateDeterminant mutates the rows of its argument
          val sigmaCopy = mvSigma(i).map(_.clone())
          rs(i) = alpha(i) * math.exp(-0.5 * quad) /
            (math.pow(2 * math.Pi, dim / 2.0) * math.sqrt(calculateDeterminant(sigmaCopy, 0).abs))
        }
        rs.map(_ / rs.sum)
      })

      // M step: maximize the Q function
      for (i <- 0 until k) {
        val gi = gamma.map(_(i))
        val giSum = gi.sum

        // update the mean of component i
        val iMu = new Array[Double](dim)
        for (d <- 0 until dim) {
          iMu(d) = gi.zip(mvdata.map(_.split(" ")(d))).map(a => a._1 * a._2.toDouble).sum / giSum
        }
        mvMu.put(i, iMu)

        // update the covariance of component i (using the freshly updated mean)
        val iSigma = ofDim[Double](dim, dim)
        val dx2 = mvdata.map(line => line.split(" ").zip(mvMu(i)).map(b => b._1.toDouble - b._2))
        for (n <- 0 until dim) {
          for (m <- 0 until dim) {
            iSigma(n)(m) = gi.zip(dx2.map(b => b(n) * b(m))).map(b => b._1 * b._2).sum / giSum
          }
        }
        mvSigma.put(i, iSigma)

        // update the mixing weight
        alpha(i) = giSum / mvdata.length
      }

      // stop when the total parameter change is small enough
      len = 0.0
      for (i <- 0 until k) {
        len += mvMu(i).zip(oldmvMu(i)).map(a => (a._1 - a._2).abs).sum
        for (d <- 0 until dim) {
          len += mvSigma(i)(d).zip(oldmvSigma(i)(d)).map(a => (a._1 - a._2).abs).sum
        }
      }
    }

    // print the result
    alpha.foreach(println)
    mvMu.foreach(a => println("mvMu = " + a._2.mkString("\t")))
    mvSigma(0).foreach(a => println("mvSigma0 = " + a.mkString("\t")))
    mvSigma(1).foreach(a => println("mvSigma1 = " + a.mkString("\t")))
    println("j = " + j)
  }
}
```
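Likewise, the "Data simulation (R)" step for the multidimensional case is not included in the post. Below is a minimal Scala stand-in of my own (the mixture parameters, sample size, and output path are assumptions) that samples a two-component 2-D Gaussian mixture through a hand-rolled 2×2 Cholesky factor and writes space-separated coordinates, the format GMMs.main parses with split(" "):

```scala
import java.io.PrintWriter
import scala.util.Random

// A stand-in sketch (not from the original post): simulate a two-component
// 2-D Gaussian mixture. All parameters below are assumptions.
object SimulateMvrnormData {
  def main(args: Array[String]): Unit = {
    val mus    = Array(Array(10.0, 10.0), Array(60.0, 70.0))
    val sigmas = Array(
      Array(Array(30.0,  0.5), Array( 0.5, 40.0)),
      Array(Array(50.0, -0.8), Array(-0.8, 20.0)))
    val n = 1000
    val out = new PrintWriter("mvrnormData.txt")
    for (_ <- 0 until n) {
      val c = if (Random.nextDouble() < 0.5) 0 else 1 // equal mixing weights
      val s = sigmas(c)
      // 2x2 Cholesky factor L of the covariance, so that x = mu + L * z, z ~ N(0, I)
      val l11 = math.sqrt(s(0)(0))
      val l21 = s(1)(0) / l11
      val l22 = math.sqrt(s(1)(1) - l21 * l21)
      val z0 = Random.nextGaussian()
      val z1 = Random.nextGaussian()
      val x = mus(c)(0) + l11 * z0
      val y = mus(c)(1) + l21 * z0 + l22 * z1
      out.println(x + " " + y) // space-separated, as GMMs.main expects
    }
    out.close()
  }
}
```

The explicit Cholesky factor is only workable because dim is fixed at 2 here; for higher dimensions a general Cholesky decomposition (or a library) would be needed.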