Learning scikit-learn: Elastic Net

Source: Internet · Editor: 程序博客网 · Time: 2024/06/10 19:54

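Elastic net is a linear regression model that combines lasso and ridge regression: it penalizes the coefficients with both an L1 term (as in lasso) and an L2 term (as in ridge).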
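For reference, the objective that the scikit-learn documentation gives for `ElasticNet` can be written as below. The symbols are taken from the library's documentation rather than from this post: $\alpha$ corresponds to the `alpha` parameter and $\rho$ to `l1_ratio`.

```latex
\min_{w} \; \frac{1}{2\, n_{\text{samples}}} \lVert Xw - y \rVert_2^2
  + \alpha \rho \lVert w \rVert_1
  + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
```

With $\rho = 1$ only the L1 term remains, which is the lasso penalty; with $\rho = 0$ only the L2 term remains, which is a ridge-style penalty.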

According to the official documentation, elastic net is useful when the data has many features and those features are correlated with one another.
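A small sketch can make the correlated-feature case concrete. Everything in it is an assumption made for illustration and not part of the original post: the helper name `fit_correlated_pair`, the two nearly identical synthetic features, and the values of `alpha` and `l1_ratio`. Under these assumptions the lasso fit may concentrate most of the weight on one of the two features, while the elastic net fit tends to share it between them.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso


def fit_correlated_pair(seed=0, n_samples=200):
    """Fit Lasso and ElasticNet on two almost identical features (illustrative helper)."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n_samples)
    # x2 is x1 plus a little noise, so the two columns are strongly correlated.
    x2 = x1 + 0.01 * rng.standard_normal(n_samples)
    X = np.column_stack([x1, x2])
    # Both features contribute equally to the target.
    y = 3.0 * x1 + 3.0 * x2 + 0.1 * rng.standard_normal(n_samples)

    lasso = Lasso(alpha=0.1).fit(X, y)
    enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
    return lasso.coef_, enet.coef_


lasso_coef, enet_coef = fit_correlated_pair()
print("Lasso coefficients:      ", lasso_coef)
print("Elastic net coefficients:", enet_coef)
```

The L2 part of the penalty is what pulls the two elastic net coefficients toward each other. The exact numbers depend on the random seed and on the chosen penalty strength.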

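The following program computes the training and test performance (the R² score returned by `score`) over a range of regularization strengths: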

```python
import numpy as np
from sklearn import linear_model

###############################################################################
# Generate sample data
n_samples_train, n_samples_test, n_features = 75, 150, 500
np.random.seed(0)
coef = np.random.randn(n_features)
coef[50:] = 0.0  # only the first 50 features affect the target
X = np.random.randn(n_samples_train + n_samples_test, n_features)
y = np.dot(X, coef)

# Split train and test data
X_train, X_test = X[:n_samples_train], X[n_samples_train:]
y_train, y_test = y[:n_samples_train], y[n_samples_train:]

###############################################################################
# Compute train and test errors
# Note: enet.score returns R^2, so higher values are better.
alphas = np.logspace(-5, 1, 60)
enet = linear_model.ElasticNet(l1_ratio=0.7)
train_errors = list()
test_errors = list()
for alpha in alphas:
    enet.set_params(alpha=alpha)
    enet.fit(X_train, y_train)
    train_errors.append(enet.score(X_train, y_train))
    test_errors.append(enet.score(X_test, y_test))

# The best alpha is the one with the highest test score
i_alpha_optim = np.argmax(test_errors)
alpha_optim = alphas[i_alpha_optim]
print("Optimal regularization parameter : %s" % alpha_optim)

# Estimate the coef_ on full data with optimal regularization parameter
enet.set_params(alpha=alpha_optim)
coef_ = enet.fit(X, y).coef_

###############################################################################
# Plot results functions
import matplotlib.pyplot as plt

plt.subplot(2, 1, 1)
plt.semilogx(alphas, train_errors, label='Train')
plt.semilogx(alphas, test_errors, label='Test')
plt.vlines(alpha_optim, plt.ylim()[0], np.max(test_errors), color='k',
           linewidth=3, label='Optimum on test')
plt.legend(loc='lower left')
plt.ylim([0, 1.2])
plt.xlabel('Regularization parameter')
plt.ylabel('Performance')

# Show estimated coef_ vs true coef
plt.subplot(2, 1, 2)
plt.plot(coef, label='True coef')
plt.plot(coef_, label='Estimated coef')
plt.legend()
plt.subplots_adjust(0.09, 0.04, 0.94, 0.94, 0.26, 0.26)
plt.show()
```

Result:

Optimal regularization parameter : 0.000335292414925
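The program above picks `alpha` by its score on the test set, so that score is no longer an independent estimate of performance. One possible alternative, sketched here as an assumption and not as part of the original experiment, is to choose `alpha` by cross-validation on the training data with `ElasticNetCV`. The helper name `select_alpha_by_cv`, the coarser `alphas` grid, and `cv=5` are all illustrative choices; the synthetic data is regenerated the same way as above so that the block runs on its own.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV


def select_alpha_by_cv(X_train, y_train, alphas, l1_ratio=0.7, cv=5):
    """Choose alpha by k-fold cross-validation on the training data (illustrative helper)."""
    model = ElasticNetCV(l1_ratio=l1_ratio, alphas=alphas, cv=cv)
    model.fit(X_train, y_train)
    return model


# Same synthetic data as in the program above.
n_samples_train, n_samples_test, n_features = 75, 150, 500
np.random.seed(0)
coef = np.random.randn(n_features)
coef[50:] = 0.0
X = np.random.randn(n_samples_train + n_samples_test, n_features)
y = np.dot(X, coef)
X_train, X_test = X[:n_samples_train], X[n_samples_train:]
y_train, y_test = y[:n_samples_train], y[n_samples_train:]

# Assumption: a coarser grid than the original 60 values, to keep the run short.
alphas = np.logspace(-3, 1, 20)
model = select_alpha_by_cv(X_train, y_train, alphas)

print("alpha chosen by cross-validation:", model.alpha_)
print("R^2 on the held-out test data:", model.score(X_test, y_test))
```

Because the test set plays no part in the selection, the final `score` call gives a cleaner estimate of generalization. The chosen `alpha` will generally differ from the value printed above, since both the grid and the selection criterion are different.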