Embedded Selection and L1 Regularization


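As a reminder of the update rule the reference code follows, here is a sketch of one proximal gradient step. It assumes the objective is the sum of squared errors plus an L1 penalty, with $m$ samples $x_i$, labels $y_i$, weights $w$, penalty weight $\lambda$ (`lambd` in the code), and $L$ a Lipschitz constant of $\nabla f$. The index notation $w^{(j)}$ for the $j$-th component is an assumption made here for readability.

```latex
\begin{align}
  &\min_{w}\; f(w) + \lambda \lVert w \rVert_1,
  \qquad f(w) = \sum_{i=1}^{m} \left( y_i - w^{\top} x_i \right)^2 \\
  &\nabla f(w) = -2 \sum_{i=1}^{m} \left( y_i - w^{\top} x_i \right) x_i \\
  &z = w_k - \frac{1}{L} \nabla f(w_k) \\
  &w_{k+1}^{(j)} =
  \begin{cases}
    z^{(j)} - \lambda / L, & z^{(j)} > \lambda / L \\
    0,                     & \lvert z^{(j)} \rvert \le \lambda / L \\
    z^{(j)} + \lambda / L, & z^{(j)} < -\lambda / L
  \end{cases}
\end{align}
```

The last line is the soft-thresholding step: any component of $z$ whose magnitude is at most $\lambda / L$ is set exactly to zero, which is what makes the learned weights sparse.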

Reference code:
(Place SpectralClassificationTrain.mat in the same folder as the Python file below.)

# Author: Irene Long
# Date: 20170621
import numpy as np
import scipy.io as sio
import pandas as pd


def Lasso_first_derivatives(X, Y, W):
    '''
    Return the gradient of f(W) = sum((Y - W X) ** 2) with respect to W.

    Parameters
    ----------
    X : numpy.array
        One row per feature, one column per sample.
        X.shape = [ features, samples ]
    Y : numpy.array
        The labels of the samples.
        Y.shape = [ samples ]
    W : numpy.array
        The weights of the linear fit.
        W.shape = [ 1, features ]

    Returns
    -------
    The gradient of f at W, in the shape of [ 1, features ].
    '''
    # The residuals need the matrix product W X, not an elementwise product.
    Y_minus_WX = Y - np.dot(W, X)            # shape [ 1, samples ]
    # d/dW sum((Y - W X)^2) = -2 (Y - W X) X^T; note the minus sign.
    return -2 * np.dot(Y_minus_WX, X.T)      # shape [ 1, features ]


def Lasso_second_derivative(X):
    '''
    Return the sum of x_i . x_i over all samples (the squared Frobenius
    norm of X), which is used to bound the second derivative of f.

    Parameters
    ----------
    X : numpy.array
        One row per sample, one column per feature.

    Returns
    -------
    A floating-point number.
    '''
    v = []
    for i in range(X.shape[0]):
        v.append(np.dot(X[i], X[i]))
    return np.sum(np.array(v))


def get_L(X):
    '''
    Return L in the theory.

    Parameters
    ----------
    X : numpy.array
        One row per sample, one column per feature.

    Returns
    -------
    L = 2 * Lasso_second_derivative( X ), a floating-point number.
    '''
    # The Hessian of f is 2 X^T X, and its largest eigenvalue is at most
    # 2 * ||X||_F^2, so this is a conservative Lipschitz constant for the gradient.
    return 2 * Lasso_second_derivative(X)


def get_z(W, L, X, Y):
    '''
    Return z in the theory.

    Parameters
    ----------
    W : numpy.array
        The current weights, W.shape = [ 1, features ].
    L : float
        The value returned by get_L( X ).
    X : numpy.array
        One row per sample, one column per feature.
    Y : numpy.array
        The labels of the samples, Y.shape = [ samples ].

    Returns
    -------
    z = W - Lasso_first_derivatives( X.T, Y, W ) / L
    The result is a vector in the shape of [ 1, features ].
    '''
    return W - Lasso_first_derivatives(X.T, Y, W) / L


# Load datasets (cast to float so integer-typed .mat arrays cannot overflow)
sc_train = sio.loadmat('SpectralClassificationTrain.mat')
X = np.array(pd.DataFrame(sc_train['train_x']), dtype=float)
Y = sc_train['train_y'][:, 0].astype(float)
num, d = X.shape

W = np.random.random([1, d])
L = get_L(X)
lambd = 2
z = get_z(W, L, X, Y)
flag = lambd / L    # soft-threshold level

j = 0       # consecutive iterations in which the number of zeros was unchanged
f0 = 0      # number of zeros in the previous iteration
count = 0
while True:
    count += 1
    print('count=', count)
    # Soft-threshold every component of z to obtain the new weights.
    for i in range(d):
        if z[0, i] > flag:
            W[0, i] = z[0, i] - flag
        elif z[0, i] < -flag:
            W[0, i] = z[0, i] + flag
        else:
            W[0, i] = 0
    f = np.sum(W == 0)
    print('number of 0:', f)
    if f0 == f:
        j += 1
    else:
        j = 0
    # If the number of zeros in W is unchanged for 20 consecutive iterations, stop looping.
    if j == 20:
        break
    f0 = f
    z = get_z(W, L, X, Y)

print('final:', np.sum(W == 0))
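For readers without SpectralClassificationTrain.mat, the same idea can be tried on synthetic data. The following is a minimal sketch, not the author's code: the helper names `soft_threshold` and `lasso_pgd`, the synthetic data, and the choice of L as twice the squared largest singular value of X are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding: shrink each entry of z toward 0 by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_pgd(X, y, lambd, n_iter=500):
    """Proximal gradient descent for sum((y - X w)^2) + lambd * ||w||_1.

    X has shape [samples, features]; y has shape [samples].
    """
    # Lipschitz constant of the gradient: 2 * (largest singular value of X)^2.
    L = 2.0 * np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = -2.0 * X.T @ (y - X @ w)                   # gradient of the squared loss
        w = soft_threshold(w - grad / L, lambd / L)       # proximal (shrinkage) step
    return w


# Synthetic problem (hypothetical): only the first 3 of 20 features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -3.0, 1.5]
y = X @ w_true + 0.01 * rng.normal(size=100)

w_hat = lasso_pgd(X, y, lambd=5.0)
print('indices of non-zero weights:', np.flatnonzero(w_hat))
print('number of 0:', np.sum(w_hat == 0))
```

Raising `lambd` drives more weights to exactly zero, so the count of zeros acts as a rough measure of how many features the model has discarded.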