Lasso and Elastic Net for Sparse Signals: comparing the Lasso and Elastic Net linear models on sparse signals

Source: Internet. Published by: 澳门大学知乎. Editor: 程序博客网. Time: 2024/05/25 18:09

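Both models are linear regression models from sklearn's linear_model module. They share the same squared loss and differ in the regularization term: Lasso penalizes the L1 norm of the weights, while Elastic Net penalizes a mix of the L1 and L2 norms.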

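Background: the coefficient of determination R^2

For an ordinary least-squares fit with an intercept, evaluated on its own training data: regression sum of squares + residual sum of squares = total sum of squares

Residual sum of squares = sum_i (y_pred_i - y_obs_i)^2

Total sum of squares = sum_i (y_obs_i - mean(y_obs))^2

Regression sum of squares = sum_i (y_pred_i - mean(y_obs))^2

R^2 = 1 - residual sum of squares / total sum of squares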
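As a quick check of the definition above, here is a minimal sketch that computes R^2 by hand and compares it with sklearn's r2_score. The helper name manual_r2 and the four sample values are illustrative assumptions, not part of the original notes.

```python
import numpy as np
from sklearn.metrics import r2_score


def manual_r2(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values, for illustration only
y_obs = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

r2_manual = manual_r2(y_obs, y_hat)
r2_sklearn = r2_score(y_obs, y_hat)
print("manual: %f, sklearn: %f" % (r2_manual, r2_sklearn))
```

Because the residual term is not bounded by the total term, R^2 can be negative on held-out data when a model predicts worse than the mean of the observations.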

import numpy as np  # array library
import matplotlib.pyplot as plt  # plotting library
from sklearn.metrics import r2_score  # R^2 is used as the model metric

# Build the dataset
np.random.seed(42)  # fix the random seed
n_samples, n_features = 50, 200  # number of samples, number of features
X = np.random.randn(n_samples, n_features)  # feature matrix
coef = 3 * np.random.randn(n_features)  # true weight vector
inds = np.arange(n_features)  # feature indices in order
np.random.shuffle(inds)  # shuffle the indices
coef[inds[10:]] = 0  # sparsify coef: only 10 weights stay nonzero
y = np.dot(X, coef)  # noiseless targets

# Add noise: size= draws one value per sample
# (normal((n_samples,)) would instead draw a single value with mean 50)
y += 0.01 * np.random.normal(size=n_samples)

# Split data into train set and test set
n_samples = X.shape[0]
# Integer division (//) is required for slicing in Python 3
X_train, y_train = X[:n_samples // 2], y[:n_samples // 2]  # training X, y
X_test, y_test = X[n_samples // 2:], y[n_samples // 2:]  # test X, y

Lasso: squared loss + L1 norm
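For reference, the objective minimized by sklearn's Lasso can be written as follows, where $n$ is the number of samples, $w$ the weight vector and $\alpha$ the alpha parameter. The symbols are notation chosen here, not taken from the original notes.

```latex
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
```

The L1 term is what drives many weights to exactly zero, which is why Lasso suits sparse signals.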

from sklearn.linear_model import Lasso

alpha = 0.1
lasso = Lasso(alpha=alpha)
y_pred_lasso = lasso.fit(X_train, y_train).predict(X_test)  # fit, then predict
r2_score_lasso = r2_score(y_test, y_pred_lasso)  # R^2 from observed and predicted values
print(lasso)
print("r^2 on test data : %f" % r2_score_lasso)

Out (as recorded in the original post):
Lasso(alpha=0.1, copy_X=True, fit_intercept=True, max_iter=1000,
   normalize=False, positive=False, precompute=False, random_state=None,
   selection='cyclic', tol=0.0001, warm_start=False)
r^2 on test data : 0.384710

ElasticNet: squared loss + a mix of the L1 and L2 norms
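For reference, the objective minimized by sklearn's ElasticNet can be written as follows, where $n$ is the number of samples, $w$ the weight vector, $\alpha$ the alpha parameter and $\rho$ the l1_ratio parameter. The symbols are notation chosen here, not taken from the original notes.

```latex
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2
  + \alpha \rho \lVert w \rVert_1
  + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
```

Setting $\rho = 1$ removes the L2 term and leaves a pure L1 penalty. With l1_ratio=0.7, as used in these notes, 70% of alpha weights the L1 term and the remaining 30% (halved) weights the squared L2 term.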


from sklearn.linear_model import ElasticNet  # import the ElasticNet class

enet = ElasticNet(alpha=alpha, l1_ratio=0.7)  # create the ElasticNet model
y_pred_enet = enet.fit(X_train, y_train).predict(X_test)  # fit, then predict
r2_score_enet = r2_score(y_test, y_pred_enet)  # R^2 from observed and predicted values
print(enet)
print("r^2 on test data : %f" % r2_score_enet)

Out (as recorded in the original post):
ElasticNet(alpha=0.1, copy_X=True, fit_intercept=True, l1_ratio=0.7,
      max_iter=1000, normalize=False, positive=False, precompute=False,
      random_state=None, selection='cyclic', tol=0.0001, warm_start=False)
r^2 on test data : 0.24017

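Compare the number and magnitude of the weights that Lasso and ElasticNet learn from the same training set, and plot them against the true coefficients used to build the data.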

# Lasso weights: index i on the x-axis, weight w_i on the y-axis (200 features)
plt.plot(lasso.coef_, color='lightgreen', linewidth=2,
         label='Lasso coefficients')
# ElasticNet weights, plotted the same way
plt.plot(enet.coef_, color='gold', linewidth=2,
         label='Elastic net coefficients')
# True weights used to generate the data, plotted the same way
plt.plot(coef, '--', color='navy', linewidth=2,
         label='Original coefficients')
plt.xlabel('the index of coefficient')  # x-axis: position of the weight
plt.ylabel('the values of the index coefficient')  # y-axis: value of that weight
plt.legend(loc='best')  # legend
plt.title("Lasso R^2: %f, Elastic Net R^2: %f"
          % (r2_score_lasso, r2_score_enet))
plt.show()
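The plot shows the weights visually. To compare the number of weights directly, one option is to count the nonzero entries of each coef_ array. The sketch below is self-contained, so it rebuilds a dataset of the same shape with its own random generator. The names rng, true_coef, half and the nnz_* variables are assumptions for illustration, and the counts it prints are not guaranteed to match the run recorded in these notes.

```python
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

# Rebuild a sparse-signal dataset of the same shape as in the notes
rng = np.random.RandomState(42)
n_samples, n_features = 50, 200
X = rng.randn(n_samples, n_features)
true_coef = 3 * rng.randn(n_features)
inds = rng.permutation(n_features)
true_coef[inds[10:]] = 0  # keep only 10 nonzero true weights
y = X @ true_coef + 0.01 * rng.normal(size=n_samples)

# Train on the first half, as in the notes
half = n_samples // 2
X_train, y_train = X[:half], y[:half]

lasso = Lasso(alpha=0.1).fit(X_train, y_train)
enet = ElasticNet(alpha=0.1, l1_ratio=0.7).fit(X_train, y_train)

# Count how many weights each model leaves nonzero
nnz_true = int(np.count_nonzero(true_coef))
nnz_lasso = int(np.count_nonzero(lasso.coef_))
nnz_enet = int(np.count_nonzero(enet.coef_))
print("nonzero weights: true=%d, lasso=%d, enet=%d"
      % (nnz_true, nnz_lasso, nnz_enet))
```

Counting exact nonzeros works here because coordinate descent sets pruned weights to exactly 0.0, so no threshold is needed.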

Note: these are personal notes.

0 0