贝叶斯网络结构学习之K2算法（基于FullBNT-1.0.4的MATLAB实现）

来源：互联网发布：软件开发协议范本编辑：程序博客网时间：2024/04/28 10:37

题目：贝叶斯网络结构学习之K2算法（基于FullBNT-1.0.4的MATLAB实现）

有关贝叶斯网络结构学习的一基本概念可以参考：贝叶斯网络结构学习方法简介

有关函数输入输出参数的解释可以参考：贝叶斯网络结构学习若干问题解释

对于应用贝叶斯评分进行结构学习的研究，国外的学者做了很多相关的工作。其中最著名的是Cooper和Herskovits提出的K2算法，他们应用贝叶斯评分和爬山法搜索的方法来优化网络模型，其中的评分函数即为贝叶斯评分的一种简化，又被称为K2评分。（摘自【张鸿勋. 基于K2评分的贝叶斯网结构学习算法的研究[D]. 北京工业大学, 2009.】）

K2算法于【Cooper G F, Herskovits E. A Bayesian method for the inductionof probabilistic networks from data[J]. Machine Learning, 1992, 9(4):309-347.】中提出，算法流程如下：

算法第10行中的Pred(x_i)的意思是根据节点顺序(nodeordering)排在x_i前面的节点：

算法第7、10、11行出现的函数g即为评分函数：

为什么取名叫K2算法呢？原文中给出了解释：

本文基于FullBNT-1.0.4工具箱，可自行搜索下载链接，附录1给出了几个参考链接。在MATLAB中设置第三方工具箱的步骤参见附录2，若MATLAB版本不同，可自行搜索添加步骤，大同小异。

工具箱添加完成后，即可在MATLAB的命令行窗口(commandwindows)使用help命令查看函数的帮助文件，使用edit命令打开函数的源文件。

函数learn_struct_K2.m目录：\FullBNT-1.0.4\BNT\learning\learn_struct_K2.m，接下来简单解释一下输入和输出参数。

输入：（注意除特别标注外，其余均来自learn_struct_K2.m本身）

data:大小n*m，其中n为节点个数，m为数据集的大小（即流程中的databaseD）

% data(i,m) = value of node i in case m (can be a cell array).

ns:大小n*1，每个node可能的取值个数（一般要求是离散值）

以下注释来自\FullBNT-1.0.4\BNT\general\mk_bnet.m，learn_struct_K2.m中的注释太简单

% node_sizes(i) is the number of values node i can take on,%   or the length of node i if i is a continuous-valued vector.% node_sizes(i) = 1 if i is a utility node.

order:大小n*1，节点之间顺序，K2算法需要先设定各节点之间的顺序，即专家知识

% order(i) is the i'th node in the topological ordering.

max_fan_in:大小n*1，每个节点最大的父亲节点个数（即流程中的upper bound u），默认为节点个数

% max_fan_in - this the largest number of parents we allow per node [N]

scoring_fn:评分函数的类型，默认为贝叶斯评分

% scoring_fn - 'bayesian' or 'bic' [ 'bayesian' ]%              Currently, only networks with all tabular nodes support Bayesian scoring.

输出：

dag:大小n*n，若第i个节点是第j个节点的父亲，则dag(i,j)=1，否则dag(i,j)=0

函数\FullBNT-1.0.4\BNT\learning\learn_struct_K2.m的代码：

function dag = learn_struct_K2(data, ns, order, varargin)% LEARN_STRUCT_K2 Greedily learn the best structure compatible with a fixed node ordering% best_dag = learn_struct_K2(data, node_sizes, order, ...)%% data(i,m) = value of node i in case m (can be a cell array).% node_sizes(i) is the size of node i.% order(i) is the i'th node in the topological ordering.%% The following optional arguments can be specified in the form of name/value pairs:% [default value in brackets]%% max_fan_in - this the largest number of parents we allow per node [N]% scoring_fn - 'bayesian' or 'bic' [ 'bayesian' ]%              Currently, only networks with all tabular nodes support Bayesian scoring.% type       - type{i} is the type of CPD to use for node i, where the type is a string%              of the form 'tabular', 'noisy_or', 'gaussian', etc. [ all cells contain 'tabular' ]% params     - params{i} contains optional arguments passed to the CPD constructor for node i,%              or [] if none.  [ all cells contain {'prior', 1}, meaning use uniform Dirichlet priors ]% discrete   - the list of discrete nodes [ 1:N ]% clamped    - clamped(i,m) = 1 if node i is clamped in case m [ zeros(N, ncases) ]% verbose    - 'yes' means display output while running [ 'no' ]%% e.g., dag = learn_struct_K2(data, ns, order, 'scoring_fn', 'bic', 'params', [])%% To be backwards compatible with BNT2, you can also specify arguments as follows%   dag = learn_struct_K2(data, node_sizes, order, max_fan_in)    %% This algorithm is described in% - Cooper and Herskovits,  "A Bayesian method for the induction of probabilistic%      networks from data", Machine Learning Journal 9:308--347, 1992[n ncases] = size(data);% set default paramstype = cell(1,n);params = cell(1,n);for i=1:n  type{i} = 'tabular';  %params{i} = { 'prior', 1 };  params{i} = { 'prior_type', 'dirichlet', 'dirichlet_weight', 1 };endscoring_fn = 'bayesian';discrete = 1:n;clamped = zeros(n, ncases);max_fan_in = n;verbose = 0;args = varargin;nargs = length(args);if length(args) > 0   if isstr(args{1})    for i=1:2:nargs      switch args{i},       case 'verbose',    verbose = strcmp(args{i+1}, 'yes');       case 'max_fan_in', max_fan_in = args{i+1};        case 'scoring_fn', scoring_fn = args{i+1};       case 'type',       type = args{i+1};        case 'discrete',   discrete = args{i+1};        case 'clamped',    clamped = args{i+1};        case 'params',     if isempty(args{i+1}), params = cell(1,n); else params = args{i+1};  end      end    end  else    max_fan_in = args{1};  endenddag = zeros(n,n);for i=1:n  ps = [];  j = order(i);  u = find(clamped(j,:)==0);      score = score_family(j, ps, type{j}, scoring_fn, ns, discrete, data(:,u), params{j});  if verbose, fprintf('\nnode %d, empty score %6.4f\n', j, score); end  done = 0;  while ~done & (length(ps) <= max_fan_in)    pps = mysetdiff(order(1:i-1), ps); % potential parents    nps = length(pps);    pscore = zeros(1, nps);    for pi=1:nps      p = pps(pi);      pscore(pi) = score_family(j, [ps p], type{j}, scoring_fn, ns, discrete, data(:,u), params{j});      if verbose, fprintf('considering adding %d to %d, score %6.4f\n', p, j, pscore(pi)); end    end    [best_pscore, best_p] = max(pscore);    best_p = pps(best_p);    if best_pscore > score      score = best_pscore;      ps = [ps best_p];      if verbose, fprintf('* adding %d to %d, score %6.4f\n', best_p, j, best_pscore); end    else      done = 1;    end  end  if ~isempty(ps) % need this check for matlab 5.2    dag(ps, j) = 1;  endend

这个函数很简单，基本与上图中的K2流程一模一样，函数调用了评分函数score_family.m，可以通过参数设置不同类型的评分函数。

工具箱提供了函数learn_struct_K2.m的使用例子k2demo1.m，文件目录：\FullBNT-1.0.4\BNT\examples\static\StructLearn\k2demo1.m

N = 4;dag = zeros(N,N);%C = 1; S = 2; R = 3; W = 4;C = 4; S = 2; R = 3; W = 1; % arbitrary orderdag(C,[R S]) = 1;dag(R,W) = 1;dag(S,W)=1;false = 1; true = 2;ns = 2*ones(1,N); % binary nodesbnet = mk_bnet(dag, ns);bnet.CPD{C} = tabular_CPD(bnet, C, [0.5 0.5]);bnet.CPD{R} = tabular_CPD(bnet, R, [0.8 0.2 0.2 0.8]);bnet.CPD{S} = tabular_CPD(bnet, S, [0.5 0.9 0.5 0.1]);bnet.CPD{W} = tabular_CPD(bnet, W, [1 0.1 0.1 0.01 0 0.9 0.9 0.99]);seed = 0;rand('state', seed);randn('state', seed);ncases = 100;data = zeros(N, ncases);for m=1:ncases  data(:,m) = cell2num(sample_bnet(bnet));endorder = [C S R W];max_fan_in = 2;%dag2 = learn_struct_K2(data, ns, order, 'max_fan_in', max_fan_in, 'verbose', 'yes');  sz = 5:5:50;for i=1:length(sz)  dag2 = learn_struct_K2(data(:,1:sz(i)), ns, order, 'max_fan_in', max_fan_in);  correct(i) = isequal(dag, dag2);endcorrectfor i=1:length(sz)  dag3 = learn_struct_K2(data(:,1:sz(i)), ns, order, 'max_fan_in', max_fan_in, 'scoring_fn', 'bic', 'params', []);  correct(i) = isequal(dag, dag3);endcorrect

这个例子很简单：定义贝叶斯网络-->根据贝叶斯网生成数据集-->根据数据集使用K2算法学习贝叶斯网络结构，并与定义的真实网络结构作对比，以判断K2算法重构性能。

前7行代码是自己定义一个贝叶斯网络结构，第9~16行代码是给出这个贝叶斯网的条件概率表（即，参数），第18~25行是根据前16行定义的贝叶斯网来生成数据集。剩下的代码是分别采用了两种评分函数（第34行采用的是默认的贝叶斯评分，第41行采用的是BIC评分）根据数据集使用K2算法学习网络结构，并比较网络结构与真实结构的差异。值得注意的是，两处调用learn_struct_K2均未直接使用整个数据集，而是将数据集前50个每5个分成一小份，第1次使用数据集前5个，第2次使用数据集前10个，第3次使用数据集前15个，依此类推，每次将学得的网络结构与真实的网络结构作对比。通过结果可以发现，当数据集大于30时即可基本学习到正确的贝叶斯网络结构。

运行输出结果如下：

correct =     0     0     0     0     0     1     1     1     1     1correct =     0     0     0     0     0     1     0     1     1     1

有了以上例子，可进一步琢磨每一个输入参数的含义，进而可将learn_struct_K2函数用起来了，但是K2算法需要事先给定节点顺序，好处是可以充分利用专家知识，坏处是如果无法事先确定节点顺序怎么办？所以我们还需要掌握一些不需要输入节点顺序的贝叶斯网络结构学习函数(当不知道节点顺序但又想用K2算法时，有一种做法是随机生成若干个顺序如1000个分别学习多个贝叶斯网络结构，然后选择一个最好的结构)。另外，在learn_struct_K2中调用了评分函数score_family，既然评分函数都有了，那么就可以自行设计搜索策略例如模拟退火、遗传算法等进而得到新的基于评分的贝叶斯网络结构学习算法。

附录1：BNT下载链接

原来作者(http://www.cs.ubc.ca/~murphyk/)的主页的BNT工具箱链接已经失效，在网上搜了几个下载链接，我用的是FullBNT-1.0.4.zip，在CSDN下载的。

1、FullBNT-1.0.4.zip

http://download.csdn.net/download/yinzuooguigui/9014175

http://www.pudn.com/downloads464/sourcecode/math/detail1949349.html

https://pan.baidu.com/s/1o7X2Sqe 密码vted

https://github.com/bayesnet/bnt

2、FullBNT-1.0.7.zip

https://pan.baidu.com/share/link?shareid=3829960033&uk=2955152529

3、作者murphyk主页Bayes工具箱汇总

http://www.cs.ubc.ca/~murphyk/Bayes/junk/bnsoft.html

上面链接列出了很多Bayes工具箱，虽然很多链接已经失效了。