Learning with ensembles

The goal behind ensemble methods is to combine different classifiers into a meta-classifier that has better generalization performance than each individual classifier alone. Here we focus on the most popular ensemble methods, those that use the majority voting principle: each classifier predicts a class label, and the ensemble returns the label predicted by the majority of the classifiers.
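
As a minimal sketch of majority voting (the predictions array below is made up for illustration; a real ensemble would collect these labels from trained classifiers):

import numpy as np

# Hypothetical class-label predictions of three base classifiers for four samples
predictions = np.array([[0, 1, 1, 0],
                        [0, 1, 0, 0],
                        [1, 1, 1, 0]])

# Majority vote per sample: the most frequent label in each column
majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(),
                               axis=0, arr=predictions)
print(majority)  # [0 1 1 0]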

1. Ensemble error

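If we assume that the n base classifiers make independent errors, each with the same error rate ε, the probability that the majority vote is wrong is the probability that more than half of the classifiers err at once, which is a binomial sum:

\varepsilon_{\mathrm{ensemble}} = \sum_{k=\lceil n/2 \rceil}^{n} \binom{n}{k} \, \varepsilon^{k} (1 - \varepsilon)^{n-k}

The function below computes this sum for n = 11 classifiers with ε = 0.25.
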
from scipy.special import comb  # scipy.misc.comb was removed in newer SciPy releases
import math

def ensemble_error(n_classifier, error):
    # Smallest number of erring classifiers that makes the majority vote wrong
    k_start = int(math.ceil(n_classifier / 2.0))
    # Probability that exactly k of the n classifiers err, summed over k >= k_start
    probs = [comb(n_classifier, k) * error**k * (1 - error)**(n_classifier - k)
             for k in range(k_start, n_classifier + 1)]
    return sum(probs)

print('Ensemble error', ensemble_error(n_classifier=11, error=0.25))

Ensemble error 0.03432750701904297
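With eleven base classifiers that each err 25% of the time, the majority vote errs only about 3.4% of the time, a substantial improvement over any single member.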

2. Compute the ensemble error rate for a range of base error rates from 0.0 to 1.0, to visualize the relationship between ensemble and base errors in a line graph

import numpy as np
import matplotlib.pyplot as plt  # needed for plotting; missing from the original snippet

# Evaluate the ensemble error for base error rates between 0.0 and 1.0
error_range = np.arange(0.0, 1.01, 0.01)
ens_errors = [ensemble_error(n_classifier=11, error=error)
              for error in error_range]

plt.plot(error_range, ens_errors,
         label='Ensemble error', linewidth=2)
# The diagonal shows the error of a single base classifier for comparison
plt.plot(error_range, error_range,
         linestyle='--', label='Base error', linewidth=2)
plt.xlabel('Base error')
plt.ylabel('Base/Ensemble error')
plt.legend(loc='upper left')
plt.grid()
plt.show()

As we can see in the resulting plot, the error probability of the ensemble is always lower than the error of an individual base classifier, as long as each base classifier performs better than random guessing (ε < 0.5). Conversely, when the base error exceeds 0.5, the ensemble performs worse than a single classifier.
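A quick numeric check of the boundary cases, reusing ensemble_error from above:

print(ensemble_error(n_classifier=11, error=0.5))   # 0.5 exactly: no gain at random guessing
print(ensemble_error(n_classifier=11, error=0.75))  # ~0.966: worse than the base error of 0.75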

Reference: Sebastian Raschka, Python Machine Learning (Packt Publishing)
