The AdaBoost Algorithm


I. Overview of the AdaBoost Algorithm

A weak classifier is trained on the given training samples. Based on this round's classification accuracy, the algorithm determines both the classifier's voting weight and new weights for the samples, then trains another weak classifier on the reweighted sample set (samples misclassified in the previous round receive larger weights, so the next classifier focuses on them).

Training stops once the classification error meets the required level (or the maximum number of rounds is reached). The weak classifiers are then combined, according to their weights, into a single strong classifier.
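For reference, the update rules implemented by the code in the next section are the standard discrete AdaBoost formulas (the small smoothing constant eps in the code is an implementation detail, not part of the formulas):

\[
\epsilon_t = \sum_{i=1}^{m} w_i \,\mathbb{1}[h_t(x_i) \neq y_i], \qquad
\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}, \qquad
w_i \leftarrow \frac{w_i \, e^{-\alpha_t y_i h_t(x_i)}}{Z_t},
\]

where \(h_t\) is the weak classifier of round \(t\), \(Z_t\) renormalizes the weights to sum to one, and the final strong classifier is \(H(x) = \operatorname{sign}\!\big(\sum_t \alpha_t h_t(x)\big)\).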

II. Building an AdaBoost Toolbox in MATLAB

1. The weakLearner() function
Builds a weak classifier (a decision stump). For each feature, it takes the midpoint between the positive-class mean and the negative-class mean as the classification threshold, then keeps the single most discriminative feature. A usage sketch follows the listing.

function WL = weakLearner(w, TrainData, label)
% Input:  w         - sample weights (1 x m row vector)
%         TrainData - data to be classified, one sample per row
%         label     - class labels of TrainData (+1 / -1 column vector)
% Output: WL        - struct describing the weak classifier, with fields
%                     direction, iFeature, error, and thres
[m, n] = size(TrainData);
% exhaustively search for the most discriminative feature
pInd = (label == 1);   % logical index of positive examples
nInd = (label == -1);  % logical index of negative examples
for iFeature = 1:n
    % use the midpoint of the two class means on feature iFeature as the threshold
    pMean = pInd' * TrainData(:, iFeature) / sum(pInd); % mean of the positives
    nMean = nInd' * TrainData(:, iFeature) / sum(nInd); % mean of the negatives
    thres(iFeature) = (pMean + nMean) / 2; % decision threshold on feature iFeature
    nRes = TrainData(:, iFeature) >= thres(iFeature);
    pRes = TrainData(:, iFeature) < thres(iFeature);
    nRes = -1 * nRes;
    res = pRes + nRes;                    % predict +1 below the threshold, -1 above
    error(iFeature) = w * (label ~= res); % weighted classification error
end
% the stump furthest from chance (error 0.5) is the most useful, even if its
% error is above 0.5 -- in that case we simply invert its output
[val, ind] = max(abs(error - 0.5));
if error(ind) > 0.5
    error(ind) = 1 - error(ind); % invert the classifier's decision
    WL.direction = -1;
else
    WL.direction = 1;
end
WL.iFeature = ind;
WL.error = error(ind);
WL.thres = thres(ind);
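A minimal usage sketch (the toy data below is illustrative, not from the original post): six 1-D samples whose positive class sits below the class-mean midpoint, so the stump should find the threshold 0.55 with zero weighted error.

w = ones(1, 6) / 6;                          % uniform sample weights (row vector)
TrainData = [0.1; 0.2; 0.3; 0.8; 0.9; 1.0];  % six samples, one feature
label = [1; 1; 1; -1; -1; -1];               % positives lie below the threshold
WL = weakLearner(w, TrainData, label);
% expected: WL.iFeature = 1, WL.thres = 0.55, WL.error = 0, WL.direction = 1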

2. The adaBoost() function
Repeatedly calls weakLearner() to build weak classifiers, updating each classifier's voting weight and the weights of the training samples, and assembles them into a strong classifier: the sequence of hypotheses returned in CABoosted. A training sketch follows the listing.

function CABoosted = adaBoost(TrainData, label, nIter)
% Training: boost the weak classifiers
% Input:  TrainData - training data, one sample per row
%         label     - class labels (+1 / -1 column vector)
%         nIter     - number of weak classifiers to build
% Output: CABoosted - cell array; CABoosted{t}.classifier is the t-th weak
%                     classifier and CABoosted{t}.alfa its voting weight
pInd = find(label == 1);
nInd = find(label == -1);
nP = length(pInd);
nN = length(nInd);
% Initialize the weight vector so that each class receives half of the total
% weight. This keeps both classes relevant in every round: if positives vastly
% outnumbered negatives under uniform weighting, the few negatives would carry
% so little weight that their misclassification cost would go unnoticed.
w(pInd) = 1 / (2 * nP);
w(nInd) = 1 / (2 * nN);
eps = 0.001; % smoothing term that guards against division by zero
% build nIter weak classifiers that together form a strong classifier
for iIt = 1:nIter
    w = w / sum(w); % normalize w
    WL = weakLearner(w, TrainData, label);
    CABoosted{iIt}.classifier = WL;
    % recompute this stump's predictions on the training data
    nRes = TrainData(:, WL.iFeature) >= WL.thres;
    pRes = TrainData(:, WL.iFeature) < WL.thres;
    nRes = -1 * nRes;
    res = pRes + nRes;
    if WL.direction == -1
        res = -1 * res;
    end
    % classifier weight: alfa = (1/2) * ln((1 - error) / error)
    alfa(iIt) = (1/2) * log((1 - WL.error) / (WL.error + eps));
    % reweight the samples: misclassified ones (label .* res == -1) gain weight
    w = w .* exp(-alfa(iIt) * (label .* res))';
    CABoosted{iIt}.alfa = alfa(iIt);
    if WL.error < eps % this stump is already (nearly) perfect; stop boosting
        break;
    end
end
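A minimal training sketch (the toy 2-D Gaussian data is illustrative only): two clusters and up to ten boosting rounds. Note that the loop can break early once a stump's weighted error drops below eps, so CABoosted may hold fewer than nIter classifiers.

TrainData = [randn(20, 2) + 1.5; randn(20, 2) - 1.5]; % two 2-D Gaussian clusters
label = [ones(20, 1); -ones(20, 1)];
CABoosted = adaBoost(TrainData, label, 10);           % boost up to 10 stumps
% CABoosted{t}.classifier is the t-th stump, CABoosted{t}.alfa its vote weight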

3. The adaBoostClassify() function
Combines the trained weak classifiers into a strong classifier via a weighted majority vote and uses it to classify Data; the predicted labels are returned in classLabel. An end-to-end sketch follows the listing.

function [classLabel, score] = adaBoostClassify(Data, CABoosted)
% Input:  Data       - matrix of samples to classify, one per row
%         CABoosted  - cell array recording each weak classifier's info
% Output: classLabel - predicted class labels (+1 / -1) for Data
%         score      - weighted vote; its magnitude can serve as a confidence
[m, n] = size(Data);
score = zeros(m, 1);
nWL = length(CABoosted);
for iWL = 1:nWL
    WL = CABoosted{iWL}.classifier;
    alfa = CABoosted{iWL}.alfa;
    % same prediction convention as weakLearner: +1 below the threshold
    pRes = Data(:, WL.iFeature) < WL.thres;
    nRes = Data(:, WL.iFeature) >= WL.thres;
    nRes = -1 * nRes;
    res = pRes + nRes;
    if WL.direction == -1
        res = -1 * res;
    end
    score = score + alfa * res; % accumulate the weighted vote
end
classLabel = -ones(m, 1);
classLabel(score > 0) = 1; % H(x) = sign of the weighted vote
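An end-to-end sketch continuing the training example above (the toy test data and accuracy check are illustrative):

TestData = [randn(10, 2) + 1.5; randn(10, 2) - 1.5];
testLabel = [ones(10, 1); -ones(10, 1)];
[pred, score] = adaBoostClassify(TestData, CABoosted);
fprintf('test accuracy: %.2f\n', mean(pred == testLabel));
% abs(score) can be read as the confidence of each prediction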