Sparse solutions with L1 regularization


1. Common ways to reduce the generalization error are listed as follows:


• Collect more training data

• Introduce a penalty for complexity via regularization (L1, L2); see the brief recap after this list

• Choose a simpler model with fewer parameters

• Reduce the dimensionality of the data
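As a brief recap of what the two penalties look like (standard definitions, not taken from this article), the regularized cost adds a norm of the weight vector w to the original loss:

    L2 penalty:  ||w||_2^2 = sum_j (w_j)^2
    L1 penalty:  ||w||_1   = sum_j |w_j|

Because the L1 constraint region is a diamond with corners on the coordinate axes, the optimum frequently lies on a corner where some weights are exactly zero, which is why L1 regularization tends to produce sparse models.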


2. Preprocessing

from sklearn.model_selection import train_test_split
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt

# Wine dataset: https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data
df_wine = pd.read_csv('./datasets/wine/wine.data', header=None)
df_wine.columns = ['Class label', 'Alcohol', 'Malic acid', 'Ash',
                   'Alcalinity of ash', 'Magnesium', 'Total phenols',
                   'Flavanoids', 'Nonflavanoid phenols', 'Proanthocyanins',
                   'Color intensity', 'Hue', 'OD280/OD315 of diluted wines',
                   'Proline']
print('Class labels', np.unique(df_wine['Class label']))
print('\nWine data excerpt:\n\n', df_wine.head())

X, y = df_wine.iloc[:, 1:].values, df_wine.iloc[:, 0].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

print('Section: Bringing features onto the same scale')

# Min-max scaling to the range [0, 1]
mms = MinMaxScaler()
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)

# Standardization: zero mean, unit variance (fitted on the training set only)
stdsc = StandardScaler()
X_train_std = stdsc.fit_transform(X_train)
X_test_std = stdsc.transform(X_test)
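As a quick sanity check (not part of the original listing), the standardized training data should now have column means close to 0 and standard deviations close to 1:

print('Means of first 3 standardized features:', X_train_std[:, :3].mean(axis=0))
print('Std devs of first 3 standardized features:', X_train_std[:, :3].std(axis=0))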

3. Reducing overfitting by regularization and dimensionality reduction via feature selection
L1 regularization: scikit-learn supports L1-penalized logistic regression directly
from sklearn.linear_model import LogisticRegression
# the default solver does not support penalty='l1' in recent scikit-learn versions;
# 'liblinear' (or 'saga') does
lr = LogisticRegression(penalty='l1', C=0.1, solver='liblinear')
lr.fit(X_train_std, y_train)
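To see the sparsity directly, we can inspect the fitted model; with an L1 penalty many entries of lr.coef_ are exactly zero (a small follow-up sketch using the lr object fitted above):

print('Training accuracy:', lr.score(X_train_std, y_train))
print('Test accuracy:', lr.score(X_test_std, y_test))
print('Intercepts:', lr.intercept_)
# lr.coef_ holds one row of weights per class; count the non-zero entries
print('Non-zero weights per class:', np.count_nonzero(lr.coef_, axis=1))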

4. Let's plot the regularization path, i.e. the weight coefficients of the different features for different regularization strengths:
fig = plt.figure()
ax = plt.subplot(111)
colors = ['blue', 'green', 'red', 'cyan',
          'magenta', 'yellow', 'black',
          'pink', 'lightgreen', 'lightblue',
          'gray', 'indigo', 'orange']

# fit one L1-regularized model per regularization strength and
# record the weight vector of the second class (lr.coef_[1])
weights, params = [], []
for c in np.arange(-4., 6.):
    lr = LogisticRegression(penalty='l1', C=10.**c,
                            solver='liblinear', random_state=0)
    lr.fit(X_train_std, y_train)
    weights.append(lr.coef_[1])
    params.append(10.**c)

weights = np.array(weights)
for column, color in zip(range(weights.shape[1]), colors):
    plt.plot(params, weights[:, column],
             label=df_wine.columns[column + 1],
             color=color)

plt.axhline(0, color='black', linestyle='--', linewidth=3)
plt.xlim([10**(-5), 10**5])
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.xscale('log')
plt.legend(loc='upper left')
ax.legend(loc='upper center',
          bbox_to_anchor=(1.38, 1.03),
          ncol=1, fancybox=True)
plt.show()

5. Analysis
    The resulting plot gives us further insight into the behavior of L1 regularization. As we can see, all feature weights are zero if we penalize the model with a strong regularization strength (C < 0.1); C is the inverse of the regularization parameter.
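A small complementary check (not from the original article) makes the same point numerically: as C shrinks, fewer and fewer weights stay non-zero.

for c in [100.0, 1.0, 0.1, 0.01]:
    lr = LogisticRegression(penalty='l1', C=c, solver='liblinear', random_state=0)
    lr.fit(X_train_std, y_train)
    print('C = %s -> non-zero weights: %d of %d'
          % (c, np.count_nonzero(lr.coef_), lr.coef_.size))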

Reference: Python Machine Learning, Sebastian Raschka

