AdaBoost
Source: Internet · Editor: 程序博客网 · Time: 2024/06/17 03:47
Scouting:
Scouting is done by testing the classifiers in the pool on a training set T of N multidimensional data points $x_i$ with labels $y_i \in \{-1, +1\}$. We test and rank all classifiers in the expert pool by charging a cost $e^{\beta}$ any time a classifier fails (a miss), and a cost $e^{-\beta}$ every time a classifier provides the right label (a success or "hit"). We require $\beta > 0$, so that misses are penalized more heavily than hits. It might seem strange to penalize a hit with a non-zero cost, but as long as the penalty for a success is smaller than the penalty for a miss, everything is fine. This kind of error function, different from the usual squared Euclidean distance to the classification target, is called an exponential loss function. AdaBoost uses the exponential loss as its error criterion.
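The asymmetric cost structure described above can be sketched in a few lines. This is a minimal illustration (the function name `exponential_loss` is mine, not from the text): with labels in {-1, +1}, a hit contributes a cost below 1 and a miss a cost above 1, so misses dominate the total.

```python
import numpy as np

def exponential_loss(y, scores):
    """Exponential loss sum_i exp(-y_i * C(x_i)) for labels y in {-1, +1}.

    When y_i and the committee score C(x_i) agree in sign (a hit), the
    exponent is negative and the cost is below 1; on a miss the exponent
    is positive and the cost is above 1.
    """
    y = np.asarray(y, dtype=float)
    scores = np.asarray(scores, dtype=float)
    return float(np.sum(np.exp(-y * scores)))
```

For example, a single confident hit costs $e^{-2} \approx 0.14$ while the same prediction against the wrong label costs $e^{2} \approx 7.4$.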
The main idea of AdaBoost is to proceed systematically by extracting one classifier from the pool in each of M iterations. The drafting process concentrates on selecting new classifiers for the committee, focusing on those which can help with the still-misclassified examples. The best team players are those which can provide new insights to the committee. Classifiers being drafted should complement each other in an optimal way.
Drafting:
In each iteration we need to rank all classifiers, so that we can select the current best out of the pool. At the m-th iteration we have already included m-1 classifiers in the committee and we want to draft the next one. The current linear combination of classifiers is

$$C_{m-1}(x_i) = \alpha_1 k_1(x_i) + \cdots + \alpha_{m-1} k_{m-1}(x_i)$$

We define the total cost, or total error, of the extended classifier $C_m(x_i) = C_{m-1}(x_i) + \alpha_m k_m(x_i)$ as the exponential loss

$$E = \sum_{i=1}^{N} e^{-y_i \left( C_{m-1}(x_i) + \alpha_m k_m(x_i) \right)}$$

where $\alpha_m$ and $k_m$ are yet to be determined in an optimal way. Since our intention is to draft $k_m$, we rewrite the above expression as

$$E = \sum_{i=1}^{N} w_i^{(m)} e^{-y_i \alpha_m k_m(x_i)}$$

with $w_i^{(m)} = e^{-y_i C_{m-1}(x_i)}$ for i = 1, ..., N. In the first iteration $w_i^{(1)} = 1$ for i = 1, ..., N. During later iterations, the vector $w^{(m)} = \left( w_1^{(m)}, \ldots, w_N^{(m)} \right)$ represents the weights assigned to each data point in the training set at iteration m. We can split the equation above into two sums

$$E = \sum_{y_i = k_m(x_i)} w_i^{(m)} e^{-\alpha_m} + \sum_{y_i \neq k_m(x_i)} w_i^{(m)} e^{\alpha_m}$$

This means that the total cost is the weighted cost of all hits plus the weighted cost of all misses. Writing the first summand as $W_c e^{-\alpha_m}$ and the second as $W_e e^{\alpha_m}$, we simplify the notation to

$$E = W_c e^{-\alpha_m} + W_e e^{\alpha_m} = (W_c + W_e) e^{-\alpha_m} + W_e \left( e^{\alpha_m} - e^{-\alpha_m} \right)$$

Now, $W_c + W_e$ is the total sum W of the weights of all data points, that is, a constant in the current iteration. The right-hand side of the equation is therefore minimized when at the m-th iteration we pick the classifier with the lowest total cost $W_e$ (that is, the lowest weighted error). Intuitively this makes sense: the next draftee, $k_m$, should be the one with the lowest penalty given the current set of weights.
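The drafting step reduces to ranking the pool by weighted error $W_e$ and taking the minimizer. A minimal sketch of this selection, assuming each pool classifier is a function mapping the data matrix to predictions in {-1, +1} (the helper name `draft_classifier` is illustrative):

```python
import numpy as np

def draft_classifier(classifiers, X, y, w):
    """Return the index of the pool classifier with the lowest weighted
    error W_e = sum of the weights w_i over the points it misclassifies."""
    weighted_errors = [np.sum(w[clf(X) != y]) for clf in classifiers]
    return int(np.argmin(weighted_errors))
```

Given uniform weights, this simply picks the classifier with the fewest misses; as the weights concentrate on hard examples over the iterations, it increasingly favors classifiers that handle exactly those examples.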
Weighting:
Once $k_m$ has been drafted, we find its optimal weight $\alpha_m$ by minimizing E. Setting the derivative of $E = W_c e^{-\alpha_m} + W_e e^{\alpha_m}$ with respect to $\alpha_m$ to zero yields

$$\alpha_m = \frac{1}{2} \ln \left( \frac{W_c}{W_e} \right) = \frac{1}{2} \ln \left( \frac{1 - e_m}{e_m} \right)$$

where $e_m = W_e / W$ is the weighted error rate of $k_m$. Note that $\alpha_m$ is positive as long as $k_m$ is better than random guessing ($e_m < 1/2$). That is, the pool of classifiers does not need to be given in advance; it only needs to ideally exist.