Sparse solutions with L1 regularization


1. Common ways to reduce the generalization error are listed as follows:


• Collect more training data

• Introduce a penalty for complexity via regularization (L1, L2); see the brief recap after this list

• Choose a simpler model with fewer parameters

• Reduce the dimensionality of the data
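As a brief recap of what the two penalties look like (standard definitions, not taken from this article), the regularized cost adds a norm of the weight vector w to the original loss:

    L2 penalty:  ||w||_2^2 = sum_j (w_j)^2
    L1 penalty:  ||w||_1   = sum_j |w_j|

Because the L1 constraint region is a diamond with corners on the coordinate axes, the optimum frequently lies on a corner where some weights are exactly zero, which is why L1 regularization tends to produce sparse models.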


2. Preprocessing

from sklearn.model_selection import train_test_split
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt

# Wine dataset: https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data
df_wine = pd.read_csv('./datasets/wine/wine.data', header=None)
df_wine.columns = ['Class label', 'Alcohol', 'Malic acid', 'Ash',
                   'Alcalinity of ash', 'Magnesium', 'Total phenols',
                   'Flavanoids', 'Nonflavanoid phenols', 'Proanthocyanins',
                   'Color intensity', 'Hue', 'OD280/OD315 of diluted wines',
                   'Proline']
print('Class labels', np.unique(df_wine['Class label']))
print('\nWine data excerpt:\n\n', df_wine.head())

X, y = df_wine.iloc[:, 1:].values, df_wine.iloc[:, 0].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

print('Section: Bringing features onto the same scale')

# Min-max scaling to the range [0, 1]
mms = MinMaxScaler()
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)

# Standardization: zero mean, unit variance (fitted on the training set only)
stdsc = StandardScaler()
X_train_std = stdsc.fit_transform(X_train)
X_test_std = stdsc.transform(X_test)
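As a quick sanity check (not part of the original listing), the standardized training data should now have column means close to 0 and standard deviations close to 1:

print('Means of first 3 standardized features:', X_train_std[:, :3].mean(axis=0))
print('Std devs of first 3 standardized features:', X_train_std[:, :3].std(axis=0))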

3. Reducing overfitting by regularization and dimensionality reduction via feature selection
L1 regularization: scikit-learn supports L1-penalized logistic regression directly
from sklearn.linear_model import LogisticRegression
# the default solver does not support penalty='l1' in recent scikit-learn versions;
# 'liblinear' (or 'saga') does
lr = LogisticRegression(penalty='l1', C=0.1, solver='liblinear')
lr.fit(X_train_std, y_train)
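To see the sparsity directly, we can inspect the fitted model; with an L1 penalty many entries of lr.coef_ are exactly zero (a small follow-up sketch using the lr object fitted above):

print('Training accuracy:', lr.score(X_train_std, y_train))
print('Test accuracy:', lr.score(X_test_std, y_test))
print('Intercepts:', lr.intercept_)
# lr.coef_ holds one row of weights per class; count the non-zero entries
print('Non-zero weights per class:', np.count_nonzero(lr.coef_, axis=1))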

4. Let's plot the regularization path, i.e. the weight coefficients of the different features for different regularization strengths:
fig = plt.figure()
ax = plt.subplot(111)
colors = ['blue', 'green', 'red', 'cyan',
          'magenta', 'yellow', 'black',
          'pink', 'lightgreen', 'lightblue',
          'gray', 'indigo', 'orange']

# fit one L1-regularized model per regularization strength and
# record the weight vector of the second class (lr.coef_[1])
weights, params = [], []
for c in np.arange(-4., 6.):
    lr = LogisticRegression(penalty='l1', C=10.**c,
                            solver='liblinear', random_state=0)
    lr.fit(X_train_std, y_train)
    weights.append(lr.coef_[1])
    params.append(10.**c)

weights = np.array(weights)
for column, color in zip(range(weights.shape[1]), colors):
    plt.plot(params, weights[:, column],
             label=df_wine.columns[column + 1],
             color=color)

plt.axhline(0, color='black', linestyle='--', linewidth=3)
plt.xlim([10**(-5), 10**5])
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.xscale('log')
plt.legend(loc='upper left')
ax.legend(loc='upper center',
          bbox_to_anchor=(1.38, 1.03),
          ncol=1, fancybox=True)
plt.show()

5. Analysis
    The resulting plot gives us further insight into the behavior of L1 regularization. As we can see, all feature weights are zero if we penalize the model with a strong regularization strength (C < 0.1); C is the inverse of the regularization parameter.
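A small complementary check (not from the original article) makes the same point numerically: as C shrinks, fewer and fewer weights stay non-zero.

for c in [100.0, 1.0, 0.1, 0.01]:
    lr = LogisticRegression(penalty='l1', C=c, solver='liblinear', random_state=0)
    lr.fit(X_train_std, y_train)
    print('C = %s -> non-zero weights: %d of %d'
          % (c, np.count_nonzero(lr.coef_), lr.coef_.size))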

Reference: Python Machine Learning, Sebastian Raschka

