Machine Learning Algorithms: HMM
Source: Internet · Editor: 程序博客网 · Time: 2024/06/05 16:03
References:
http://wenku.baidu.com/view/ff413e1cc5da50e2524d7f1c.html
http://blog.csdn.net/likelet/article/details/7056068
The second article is very thorough; if you want the complete formulas, refer to the one below:
http://wenku.baidu.com/view/e92d85a3284ac850ad0242e9.html
Most of the HMM slides floating around online are wrong. They have been copied from one another, so they all share the same mistakes, posted without ever being tested — including the widely circulated one from 我爱自然语言网. I do not know how these teachers teach: passing material to students that they never verified themselves only misleads them.
For the backward (beta) algorithm, the only correct part in those write-ups is the termination formula Pr(O|λ) = Σ_i β_1(i) · π_i · b_i(o_1); everything else is wrong — they all substitute the forward algorithm's termination formula instead.
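For reference, the full backward recursion consistent with that termination formula (for an HMM with N hidden states and observation sequence o_1 … o_T) is:

```latex
% Initialization
\beta_T(i) = 1
% Induction, for t = T-1, \dots, 1
\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)
% Termination -- this is the line most copies get wrong:
P(O \mid \lambda) = \sum_{i=1}^{N} \pi_i\, b_i(o_1)\, \beta_1(i)
```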
As before, here is the Python code along with its run output:
'''
Created on Aug 31, 2013
@author: blacklaw
@ref: http://wenku.baidu.com/view/e92d85a3284ac850ad0242e9.html
'''
import copy

# equal PI: initial state probabilities
INITIAL_STATE = {'Sunny': 0.63, 'Cloudy': 0.17, 'Rainy': 0.20}

# equal A: state transition matrix (weather yesterday -> weather today)
TRANSITION = {
    'Sunny':  {'Sunny': 0.5,  'Cloudy': 0.375, 'Rainy': 0.125},
    'Cloudy': {'Sunny': 0.25, 'Cloudy': 0.125, 'Rainy': 0.625},
    'Rainy':  {'Sunny': 0.25, 'Cloudy': 0.375, 'Rainy': 0.375},
}

# equal B: confusion (emission) matrix (hidden state -> observed state)
CONFUSION = {
    'Sunny':  {'Dry': 0.6,  'Dryish': 0.2,  'Damp': 0.15, 'Soggy': 0.05},
    'Cloudy': {'Dry': 0.25, 'Dryish': 0.25, 'Damp': 0.25, 'Soggy': 0.25},
    'Rainy':  {'Dry': 0.05, 'Dryish': 0.10, 'Damp': 0.35, 'Soggy': 0.50},
}


class HMM:
    def __init__(self, PI, A, B):
        self.PI = PI
        self.A = A
        self.B = B

    # *********** calc probability via exhaustive way (START) ***********
    def _add_one_hidden_layer(self, pairs, keys, mults):
        new_pairs = []
        for origin_pair in pairs:
            for key in keys:
                pair = copy.deepcopy(origin_pair)
                l_key = pair['keys'][-1]
                pair['keys'].append(key)
                pair['mult'] *= mults[l_key][key]
                new_pairs.append(pair)
        return new_pairs

    # exhaustive search: enumerate every possible hidden-state sequence
    def _exhaustive(self, tran, init, deep):
        pairs = []
        for k, v in init.items():
            pairs.append({'keys': [k], 'mult': v})
        for _ in range(deep - 1):
            pairs = self._add_one_hidden_layer(pairs, tran.keys(), tran)
        return pairs

    def _add_observation(self, pairs, confu, obser):
        pairs = copy.deepcopy(pairs)
        for pair in pairs:
            pair['obser'] = 1
            for i, key in enumerate(pair['keys']):
                pair['obser'] *= confu[key][obser[i]]
        return pairs

    def _calc_probabilities(self, pairs):
        mults = []
        for pair in pairs:
            p_mult = pair['obser'] * pair['mult']
            pair['p_mult'] = p_mult
            mults.append(p_mult)
        return sum(mults)

    def calc_prob_by_exhaustive(self, test):
        pairs = self._exhaustive(self.A, self.PI, len(test))
        pairs = self._add_observation(pairs, self.B, test)
        return self._calc_probabilities(pairs)

    def most_probability_hidden(self, test):
        # each pair looks like {'keys': ['Rainy', 'Rainy', 'Rainy'], 'mult': 0.02}
        pairs = self._exhaustive(self.A, self.PI, len(test))
        # then gains 'obser': {'keys': [...], 'obser': 0.008, 'mult': 0.02}
        pairs = self._add_observation(pairs, self.B, test)
        # and finally 'p_mult' = obser * mult
        self._calc_probabilities(pairs)
        pairs = sorted(pairs, key=lambda di: -di['p_mult'])
        # e.g. returns ['Sunny', 'Cloudy', 'Rainy'] for ['Dry', 'Damp', 'Soggy']
        return pairs[0]['keys']
    # *********** calc probability via exhaustive way (END) ***********

    # forward algorithm
    def alpha(self, time, state, ot=None):
        b = self.B[state][self.O[time]]
        if time == 0:
            return self.PI[state] * b
        alps = []
        for t_state in self.A.keys():
            alps.append(self.alpha(time - 1, t_state) * self.A[t_state][state])
        return sum(alps) * b

    # backward algorithm
    def beta(self, t, i, ot=None):
        if t == len(self.O) - 1:
            return 1
        betas = []
        for j in self.A.keys():
            b = self.B[j][self.O[t + 1]]
            betas.append(self.beta(t + 1, j) * self.A[i][j] * b)
        return sum(betas)

    # observation probability via the forward (alpha) variables
    def calc_prob_by_forward(self, test):
        self.O = test
        sum_a = 0
        for k in self.A.keys():
            sum_a += self.alpha(len(self.O) - 1, k)
        return sum_a

    # observation probability via the backward (beta) variables
    def calc_prob_by_backward(self, test):
        self.O = test  # fixed: the original relied on forward having set self.O
        sum_b = 0
        for k in self.A.keys():
            sum_b += self.PI[k] * self.B[k][self.O[0]] * self.beta(0, k)
        return sum_b

    # gamma_t(i): probability of being in `state` at time t
    def gamma(self, t, state, ot=None):
        div_son = 0
        mults = []
        for s in self.A.keys():
            mult = self.alpha(t, s) * self.beta(t, s)
            mults.append(mult)
            if s == state:
                div_son = mult
        return div_son / sum(mults)

    # xi_t(i, j): probability of transiting from state_a at t to state_b at t+1
    def xi(self, t, state_a, state_b):
        div_son = 0
        div_mot = 0
        for state_i in self.A.keys():
            for state_j in self.A.keys():
                son = self.alpha(t, state_i) * self.beta(t + 1, state_j) * \
                    self.A[state_i][state_j] * self.B[state_j][self.O[t + 1]]
                div_mot += son
                if state_a == state_i and state_b == state_j:
                    div_son = son
        return div_son / div_mot

    # Baum-Welch re-estimation
    def new_pi(self, state):
        return self.gamma(0, state)

    def new_a(self, state_i, state_j):
        div_son = []
        div_mot = []
        for t in range(len(self.O) - 1):
            div_son.append(self.xi(t, state_i, state_j))
            div_mot.append(self.gamma(t, state_i))
        return sum(div_son) / sum(div_mot)

    def new_b(self, state_j, ot):
        div_son = []
        div_mot = []
        for t in range(len(self.O)):
            if self.O[t] == ot:
                div_son.append(self.gamma(t, state_j))
            div_mot.append(self.gamma(t, state_j))
        return sum(div_son) / sum(div_mot)

    def new_r(self):
        ret_pi = copy.deepcopy(self.PI)
        for k in self.PI.keys():
            ret_pi[k] = self.new_pi(k)
        ret_a = copy.deepcopy(self.A)
        for i in self.A.keys():
            for j in self.A.keys():
                ret_a[i][j] = self.new_a(i, j)
        ret_b = copy.deepcopy(self.B)
        for i in self.B.keys():
            # fixed for Python 3: dict views are not indexable
            for k in next(iter(self.B.values())).keys():
                ret_b[i][k] = self.new_b(i, k)
        return ret_pi, ret_a, ret_b

    def update_r(self):
        self.PI, self.A, self.B = self.new_r()


if __name__ == "__main__":
    TEST = ['Dry', 'Damp', 'Soggy']
    hmm = HMM(INITIAL_STATE, TRANSITION, CONFUSION)
    print('Test weather(observed states): %s' % TEST)
    print('Probabilities calced by exhaustive way: %s' % hmm.calc_prob_by_exhaustive(TEST))
    print('Probabilities calced by forward algorithm: %s' % hmm.calc_prob_by_forward(TEST))
    print('Probabilities calced by backward algorithm: %s' % hmm.calc_prob_by_backward(TEST))
    print('Most probability hidden states(by exhaustive): %s' % hmm.most_probability_hidden(TEST))
    THRESHOLD = 0.001
    # train at most 100 times
    for i in range(100):
        hmm.update_r()
        prob = hmm.calc_prob_by_forward(TEST)
        if 1 - prob < THRESHOLD:
            break
        print('Train Time: %d Probability: %s' % (i + 1, prob))
Run output:
Test weather(observed states): ['Dry', 'Damp', 'Soggy']
Probabilities calced by exhaustive way: 0.02690140625
Probabilities calced by forward algorithm: 0.02690140625
Probabilities calced by backward algorithm: 0.02690140625
Most probability hidden states(by exhaustive): ['Sunny', 'Cloudy', 'Rainy']
Train Time: 1 Probability: 0.158562518529
Train Time: 2 Probability: 0.529673080199
Train Time: 3 Probability: 0.924981269129
Train Time: 4 Probability: 0.997909042525
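As an independent cross-check (not part of the original post), a minimal iterative forward pass over the same weather model reproduces the observation probability above without the listing's exponential recursion — each step keeps only the current alpha vector:

```python
# Same model as the listing above: initial, transition, and confusion matrices.
PI = {'Sunny': 0.63, 'Cloudy': 0.17, 'Rainy': 0.20}
A = {'Sunny':  {'Sunny': 0.5,  'Cloudy': 0.375, 'Rainy': 0.125},
     'Cloudy': {'Sunny': 0.25, 'Cloudy': 0.125, 'Rainy': 0.625},
     'Rainy':  {'Sunny': 0.25, 'Cloudy': 0.375, 'Rainy': 0.375}}
B = {'Sunny':  {'Dry': 0.6,  'Dryish': 0.2,  'Damp': 0.15, 'Soggy': 0.05},
     'Cloudy': {'Dry': 0.25, 'Dryish': 0.25, 'Damp': 0.25, 'Soggy': 0.25},
     'Rainy':  {'Dry': 0.05, 'Dryish': 0.10, 'Damp': 0.35, 'Soggy': 0.50}}

def forward(obs):
    # initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = {s: PI[s] * B[s][obs[0]] for s in PI}
    # induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in obs[1:]:
        alpha = {j: sum(alpha[i] * A[i][j] for i in A) * B[j][o] for j in A}
    # termination: P(O | lambda) = sum_i alpha_T(i)
    return sum(alpha.values())

print(forward(['Dry', 'Damp', 'Soggy']))  # ~= 0.02690140625, matching all three methods above
```

This runs in O(T·N²) time instead of O(N^T), which is the whole point of the forward algorithm over exhaustive enumeration.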