Real AdaBoost
Source: Internet · 程序博客网 · 2024/05/01 02:15
AdaBoost is a boosting algorithm that uses the exponential function as its loss function.
AdaBoost has several variants, including DAB (Discrete AdaBoost), RAB (Real AdaBoost), LB (LogitBoost), and GAB (Gentle AdaBoost), all of which are implemented in OpenCV. The best known is the classic AdaBoost, i.e. Discrete AdaBoost; it is not covered here — see the related posts.
1 Real AdaBoost
1.1 Algorithm flow
Training sample set S = {(x_1, y_1), ..., (x_N, y_N)}, with labels y_i ∈ {-1, +1}.

1) Initialize the weights: w_i = 1/N, i = 1, ..., N.

2) DO FOR t = 1, 2, ..., T:

[1] Based on the weighted sample set:

a. Partition the sample space S into disjoint regions X_1, X_2, ..., X_n.

b. For each region X_j, compute the weighted class masses
   W_+^j = Σ_{i: x_i ∈ X_j, y_i = +1} w_i  and  W_-^j = Σ_{i: x_i ∈ X_j, y_i = -1} w_i.

c. Define the real-valued output of the weak classifier on region X_j:
   h_t(x) = (1/2) ln( (W_+^j + ε) / (W_-^j + ε) )  for x ∈ X_j,
   where ε is a small smoothing constant.

d. Select the partition that minimizes the normalization factor Z_t = 2 Σ_j sqrt(W_+^j · W_-^j).

[2] Adjust the weights: w_i ← w_i · exp(-y_i h_t(x_i)) / Z_t.

3) Strong classifier: F(x) = sign( Σ_{t=1}^{T} h_t(x) ).
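The flow above can be sketched in Python/NumPy. This is an illustrative re-implementation, not the GML code: the partition here is a fixed equal-width binning of a single feature rather than a CART tree, and the names (`real_adaboost`, `n_bins`) are made up for the example.

```python
import numpy as np

def real_adaboost(X, y, T=10, n_bins=8, eps=1e-12):
    """Minimal Real AdaBoost on a single feature: each of n_bins
    equal-width bins plays the role of one region X_j."""
    n = len(y)
    w = np.full(n, 1.0 / n)                     # 1) initialize weights
    edges = np.linspace(X.min(), X.max(), n_bins + 1)
    bins = np.clip(np.digitize(X, edges[1:-1]), 0, n_bins - 1)
    learners = []                               # per-round region outputs h_t
    for _ in range(T):                          # 2) DO FOR t = 1..T
        h = np.zeros(n_bins)
        for j in range(n_bins):                 # b. weighted masses per region
            in_j = bins == j
            Wp = w[in_j & (y == 1)].sum()
            Wm = w[in_j & (y == -1)].sum()
            h[j] = 0.5 * np.log((Wp + eps) / (Wm + eps))  # c. confidence output
        learners.append(h)
        w = w * np.exp(-y * h[bins])            # [2] reweight by exp(-y*h)
        w /= w.sum()                            # normalize (Z_t)
    F = sum(h[bins] for h in learners)          # 3) strong classifier on train set
    return learners, edges, np.sign(F)
```

On cleanly separable data the ensemble reaches zero training error after a few rounds, because a pure region gets an arbitrarily large confidence output (bounded only by the ε smoothing).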
1.2 Implementation
Analysis of the GML AdaBoost Matlab Toolbox code.
RealAdaBoost.m
--- the main function of the Real AdaBoost implementation.
Region partitioning is implemented with a decision tree;
the node-splitting criterion is misclassification impurity;
each region of the partition corresponds to one leaf of the tree.
```matlab
for It = 1 : Max_Iter                                  % step 2) DO FOR t = 1,2,...,T
    % choose best learner
    nodes = train(WeakLrn, Data, Labels, distr);       % step a: partition S.
    % Implemented by growing a CART tree; each element of nodes is a leaf,
    % i.e. one region X_i of the partition.
    for i = 1:length(nodes)                            % iterate over the leaves
        curr_tr  = nodes{i};
        step_out = calc_output(curr_tr, Data);         % indicator of region i: 1 if a sample falls in it, else 0
        s1 = sum((Labels ==  1) .* step_out .* distr); % weight of positive samples in region i
        s2 = sum((Labels == -1) .* step_out .* distr); % weight of negative samples in region i
        if s1 == 0 && s2 == 0
            continue;
        end
        Alpha = 0.5 * log((s1 + eps) / (s2 + eps));    % confidence output of region i
        Weights(end+1)  = Alpha;
        Learners{end+1} = curr_tr;
        final_hyp = final_hyp + step_out .* Alpha;     % accumulate the ensemble output
    end
    distr = exp(-1 * (Labels .* final_hyp));           % update the weights and normalize
    Z     = sum(distr);
    distr = distr / Z;
end
```
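As a quick numeric check of the confidence formula `Alpha = 0.5*log((s1+eps)/(s2+eps))` used in the loop above, here is a small Python sketch (the helper name `region_confidence` and the sample masses are hypothetical):

```python
import math

def region_confidence(s1, s2, eps=2.2e-16):
    """Confidence output 0.5*log((s1+eps)/(s2+eps)) for one region,
    where s1/s2 are the weighted positive/negative masses in it."""
    return 0.5 * math.log((s1 + eps) / (s2 + eps))

# A region dominated by positives gets a large positive vote:
print(region_confidence(0.30, 0.05))   # ≈ 0.896 (= 0.5 * ln 6)
# A pure-negative region stays finite thanks to eps (no log(0)):
print(region_confidence(0.0, 0.2))     # large negative, but not -inf
# A perfectly balanced region abstains:
print(region_confidence(0.2, 0.2))     # 0.0
```

The `eps` smoothing is exactly why the GML code can skip only regions where both masses are zero: a one-sided region produces a large but finite vote.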
```matlab
function nodes = train(node, dataset, labels, weights)
max_split = node.max_split;
[left right spit_error] = do_learn_nu(node, dataset, labels, weights); % spit_error: misclassification impurity after splitting; the root itself has none
nodes = {left, right};

%% root initialization
left_pos  = sum((calc_output(left , dataset) ==  labels) .* weights); % left child: weight of correctly classified samples (calc_output returns 0/1 from the node threshold)
left_neg  = sum((calc_output(left , dataset) == -labels) .* weights); % left child: weight of misclassified samples
right_pos = sum((calc_output(right, dataset) ==  labels) .* weights); % right child: weight of correctly classified samples
right_neg = sum((calc_output(right, dataset) == -labels) .* weights); % right child: weight of misclassified samples
errors = [min(left_pos, left_neg), min(right_pos, right_neg)];        % misclassification impurity of the two children

if (right_pos == 0 && right_neg == 0)  % a child is classified perfectly
    return;
end
if (left_pos == 0 && left_neg == 0)    % a child is classified perfectly
    return;
end

[errors, IDX] = sort(errors);          % sort impurities ascending...
errors = flipdim(errors, 2);           % ...then flip to descending
IDX    = flipdim(IDX, 2);
nodes  = nodes(IDX);                   % put the worst-classifying leaf first (handle the worst leaf first)

splits = [];
split_errors = [];
deltas = [];

for i = 2 : max_split
    for j = 1 : length(errors)
        if length(deltas) >= j
            continue;
        end
        max_node = nodes{j};
        max_node_out = calc_output(max_node, dataset);  % predict; 1 means the split condition holds (not necessarily a positive sample)
        mask = find(max_node_out == 1);
        [left right spit_error] = do_learn_nu(node, dataset(:,mask), labels(mask), weights(mask), max_node); % split further; spit_error is the impurity after the split
        left_pos  = sum((calc_output(left , dataset) ==  labels) .* weights);
        left_neg  = sum((calc_output(left , dataset) == -labels) .* weights);
        right_pos = sum((calc_output(right, dataset) ==  labels) .* weights);
        right_neg = sum((calc_output(right, dataset) == -labels) .* weights);
        splits{end+1} = left;
        splits{end+1} = right;
        if (right_pos + right_neg) == 0 || (left_pos + left_neg) == 0  % branch perfectly classified: stop splitting it
            deltas(end+1) = 0;
        else
            deltas(end+1) = errors(j) - spit_error;  % impurity decrease: errors(j) before the split, spit_error after
        end
        split_errors(end+1) = min(left_pos, left_neg);
        split_errors(end+1) = min(right_pos, right_neg);
    end

    if max(deltas) == 0
        return;
    end
    best_split = find(deltas == max(deltas));  % split the node with the fastest impurity decrease
    best_split = best_split(1);
    cut_vec = [1 : (best_split-1) (best_split+1) : length(errors)]; % remove the node just split; its children are appended below
    nodes  = nodes(cut_vec);   % node handles
    errors = errors(cut_vec);  % node impurities
    deltas = deltas(cut_vec);  % impurity decreases
    nodes{end+1}  = splits{2*best_split - 1};
    nodes{end+1}  = splits{2*best_split};
    errors(end+1) = split_errors(2*best_split - 1);
    errors(end+1) = split_errors(2*best_split);
    cut_vec = [1 : 2*(best_split-1) 2*best_split+1 : length(split_errors)];
    split_errors = split_errors(cut_vec);
    splits = splits(cut_vec);
end
```
```matlab
function [tree_node_left, tree_node_right, split_error] = do_learn_nu(tree_node, dataset, labels, weights, papa)
tree_node_left  = tree_node;        % initialize the two children
tree_node_right = tree_node;
if nargin > 4
    tree_node_left.parent  = papa;  % parent node, set on recursive calls
    tree_node_right.parent = papa;
end

Distr = weights;
trainpat  = dataset;
traintarg = labels;
tr_size = size(trainpat, 2);        % number of samples
T_MIN = zeros(3, size(trainpat,1));
d_min = 1;
d_max = size(trainpat, 1);

for d = d_min : d_max               % iterate over all features
    [DS, IX] = sort(trainpat(d,:)); % sort samples by this feature's value, ascending
    TS  = traintarg(IX);            % targets in sorted order
    DiS = Distr(IX);                % weights in sorted order
    lDS = length(DS);               % number of samples
    vPos = 0 * TS;
    vNeg = vPos;
    i = 1;
    j = 1;
    while i <= lDS                  % collapse duplicate feature values
        k = 0;
        while i + k <= lDS && DS(i) == DS(i+k)
            if TS(i+k) > 0
                vPos(j) = vPos(j) + DiS(i+k);
            else
                vNeg(j) = vNeg(j) + DiS(i+k);
            end
            k = k + 1;
        end
        i = i + k;
        j = j + 1;
    end
    vNeg = vNeg(1:j-1);             % per distinct value: weight of negative samples
    vPos = vPos(1:j-1);             % per distinct value: weight of positive samples
    Error = zeros(1, j - 1);        % classification error per candidate split
    InvError = Error;
    IPos = vPos;                    % cumulative sums (prefix integrals)
    INeg = vNeg;
    for i = 2 : length(IPos)        % for each candidate split: positive/negative weight on each side
        IPos(i) = IPos(i-1) + vPos(i);
        INeg(i) = INeg(i-1) + vNeg(i);
    end
    Ntot = INeg(end);               % total negative weight
    Ptot = IPos(end);               % total positive weight
    for i = 1 : j - 1
        Error(i)    = IPos(i) + Ntot - INeg(i); % left branch negative, right branch positive
        InvError(i) = INeg(i) + Ptot - IPos(i); % left branch positive, right branch negative
    end
    idx_of_err_min = find(Error == min(Error));           % best split point
    if isempty(idx_of_err_min)
        idx_of_err_min = 1;
    end
    idx_of_err_min = idx_of_err_min(1);
    idx_of_inv_err_min = find(InvError == min(InvError)); % best split point for the inverted polarity
    if isempty(idx_of_inv_err_min)
        idx_of_inv_err_min = 1;
    end
    idx_of_inv_err_min = idx_of_inv_err_min(1);
    if Error(idx_of_err_min) < InvError(idx_of_inv_err_min) % compare the two polarities
        T_MIN(1,d) = Error(idx_of_err_min);  % error after splitting on feature d
        T_MIN(2,d) = idx_of_err_min;         % split point for feature d
        T_MIN(3,d) = -1;                     % left branch predicts negative
    else
        T_MIN(1,d) = InvError(idx_of_inv_err_min);
        T_MIN(2,d) = idx_of_inv_err_min;
        T_MIN(3,d) = 1;                      % left branch predicts positive
    end
end % for d

%% over all features, find the best split
best_dim = find(T_MIN(1,:) == min(T_MIN(1,:)));
dim = best_dim(1);
tree_node_left.dim  = dim;          % save the chosen feature
tree_node_right.dim = dim;
TDS = sort(trainpat(dim,:));        % sort samples by the best feature
lDS = length(TDS);
DS = TDS * 0;
i = 1;
j = 1;
while i <= lDS                      % distinct values of the best feature
    k = 0;
    while i + k <= lDS && TDS(i) == TDS(i+k)
        DS(j) = TDS(i);
        k = k + 1;
    end
    i = i + k;
    j = j + 1;
end
DS = DS(1:j-1);
split = (DS(T_MIN(2,dim)) + DS(min(T_MIN(2,dim) + 1, length(DS)))) / 2; % threshold: midpoint between adjacent distinct values
split_error = T_MIN(1,dim);              % error of this split
tree_node_left.right_constrain = split;  % save the threshold on the left child
tree_node_right.left_constrain = split;  % save the threshold on the right child
```
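The core of `do_learn_nu` — collapse duplicate feature values, accumulate weighted class masses with prefix sums, then score both polarities at every candidate threshold — can be sketched in Python as follows (a re-implementation for illustration; names like `best_stump_split` are not from GML):

```python
import numpy as np

def best_stump_split(x, y, w):
    """Weighted 1-D threshold search mirroring do_learn_nu's prefix-sum
    trick: evaluate 'left negative / right positive' (Error) and the
    inverted polarity (InvError) at every candidate split."""
    order = np.argsort(x)
    xs, ys, ws = x[order], y[order], w[order]
    # weighted positive/negative mass per distinct value (vPos / vNeg)
    vals, idx = np.unique(xs, return_inverse=True)
    vpos = np.bincount(idx, ws * (ys == 1), minlength=len(vals))
    vneg = np.bincount(idx, ws * (ys == -1), minlength=len(vals))
    ipos, ineg = np.cumsum(vpos), np.cumsum(vneg)   # IPos / INeg
    ptot, ntot = ipos[-1], ineg[-1]                 # Ptot / Ntot
    err = ipos + (ntot - ineg)       # left branch negative, right positive
    inv = ineg + (ptot - ipos)       # left branch positive, right negative
    i, j = err.argmin(), inv.argmin()
    if err[i] < inv[j]:
        k, polarity, e = i, -1, err[i]
    else:
        k, polarity, e = j, 1, inv[j]
    thr = (vals[k] + vals[min(k + 1, len(vals) - 1)]) / 2  # midpoint threshold
    return thr, polarity, e
```

On x = [1, 2, 3, 4] with labels [-1, -1, +1, +1] and uniform weights, the search places the threshold at 2.5 with polarity -1 (left branch negative) and zero weighted error, matching what the MATLAB version would select.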