Machine Learning: Logistic Regression in Python

Source: Internet; Editor: 程序博客网; Date: 2024/05/16 10:41

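Today's topic is logistic regression. Despite the word "regression" in its name, it is a classification method. Its output is a number between 0 and 1 that can be read as a probability. The parameters are learned iteratively: the gradients are computed with the chain rule (as in backpropagation, BP) and applied with stochastic gradient descent.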

First, let's look at the main function this classifier relies on, the sigmoid function:

$$ y = \sigma(x) = \frac{1}{1 + e^{-x}} $$

A convenient property of this function is that its derivative can be written in terms of the function itself:

$$ \frac{\partial y}{\partial x} = \sigma(x)\,(1 - \sigma(x)) $$
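As a quick sanity check on this derivative identity, the sketch below compares the analytic form σ(x)(1 − σ(x)) against a central finite difference. The helper names `sigmoid` and `sigmoid_grad`, the grid of test points, and the step `h` are illustrative choices, not part of the original post.

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Analytic derivative: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Illustrative test points and finite-difference step (assumed values).
x = np.linspace(-5.0, 5.0, 11)
h = 1e-5

# Central difference approximation of d sigma / dx.
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)

# Largest gap between the numeric and analytic derivatives.
max_diff = np.max(np.abs(numeric - sigmoid_grad(x)))
print(max_diff)
```

The gap should be tiny, limited only by the finite-difference approximation and floating-point rounding.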

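Now let's see how this function is used for classification. Suppose a sample is a vector $x$. The weight vector $w$ and the bias $b$ map it to $u = w^T x + b$, and the sigmoid then turns $u$ into a predicted probability $y = \sigma(u)$. If the sample's ground-truth label is $t$, the error between the prediction and the true label can be measured with the squared error: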

$$ e = \frac{1}{2}(y - t)^2 $$
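To make the forward pass concrete, here is a minimal sketch that evaluates u, y, and e for a single sample. The numbers for `x`, `w`, `b`, and `t` are made up for illustration and were chosen so that u comes out to exactly zero.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Hypothetical sample, weights, bias and label (not from the original post).
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.25])
b = 0.0
t = 1.0

u = np.dot(w, x) + b      # 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0
y = sigmoid(u)            # sigma(0) = 0.5
e = 0.5 * (y - t) ** 2    # 0.5 * (0.5 - 1.0)^2 = 0.125
print(u, y, e)
```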

We can bring the prediction closer to the true label by repeatedly adjusting $w$ and $b$. By the chain rule:

$$ \frac{\partial e}{\partial w} = \frac{\partial e}{\partial y}\,\frac{\partial y}{\partial u}\,\frac{\partial u}{\partial w} $$

Each factor can be computed directly:

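$$ \frac{\partial e}{\partial y} = y - t $$

$$ \frac{\partial y}{\partial u} = \sigma(u)\,(1 - \sigma(u)) $$

$$ \frac{\partial u}{\partial w} = x $$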
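Multiplying the three factors gives ∂e/∂w = (y − t) · y(1 − y) · x. One way to gain confidence in this product is a numerical gradient check, sketched below. The functions `loss` and `grad_w`, the random sample, and the step `h` are assumptions made for illustration.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def loss(w, b, x, t):
    """Squared error e = 0.5 * (y - t)^2 for one sample."""
    y = sigmoid(np.dot(w, x) + b)
    return 0.5 * (y - t) ** 2

def grad_w(w, b, x, t):
    """Chain-rule gradient: (y - t) * y * (1 - y) * x."""
    y = sigmoid(np.dot(w, x) + b)
    return (y - t) * y * (1.0 - y) * x

# Hypothetical random sample and weights (seeded for reproducibility).
rng = np.random.default_rng(0)
x = rng.normal(size=3)
w = rng.normal(size=3)
b, t = 0.1, 1.0

# Central finite difference, one weight coordinate at a time.
h = 1e-6
numeric = np.zeros_like(w)
for i in range(w.size):
    step = np.zeros_like(w)
    step[i] = h
    numeric[i] = (loss(w + step, b, x, t) - loss(w - step, b, x, t)) / (2.0 * h)

grad_diff = np.max(np.abs(numeric - grad_w(w, b, x, t)))
print(grad_diff)
```

If the analytic gradient is right, `grad_diff` should be close to zero.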

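With these partial derivatives, the weights can be updated by gradient descent, stepping against the gradient with learning rate $\alpha$:

$$ w := w - \alpha \frac{\partial e}{\partial w} $$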
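The sketch below repeats the update on a single sample to show the error shrinking. Each step moves w against the gradient, that is, w ← w − α ∂e/∂w. The bias is updated the same way using ∂u/∂b = 1. The sample, the learning rate `alpha`, and the iteration count are illustrative assumptions.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Hypothetical single training sample with label t = 1 (assumed values).
x = np.array([1.0, 2.0])
t = 1.0

w = np.zeros(2)   # initial weights
b = 0.0           # initial bias
alpha = 0.5       # learning rate (assumed)

errors = []
for _ in range(100):
    y = sigmoid(np.dot(w, x) + b)
    errors.append(0.5 * (y - t) ** 2)
    delta = (y - t) * y * (1.0 - y)   # de/du = de/dy * dy/du
    w = w - alpha * delta * x         # du/dw = x
    b = b - alpha * delta             # du/db = 1

print(errors[0], errors[-1])
```

With a single sample the error decreases at every step. With many samples and mini-batches the loss usually fluctuates from batch to batch while trending downward.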

The example below classifies the iris dataset with logistic regression, using one sigmoid output per class and one-hot labels:

    import numpy as np
    from sklearn import datasets


    def Sigmoid(x):
        return 1.0 / (1 + np.exp(-x))


    def Generate_label(y, N_class):
        # one-hot encode the integer class labels
        N_sample = len(y)
        label = np.zeros((N_sample, N_class))
        for ii in range(N_sample):
            label[ii, int(y[ii])] = 1
        return label


    # load the iris data
    iris = datasets.load_iris()
    x_data = iris.data
    y_label = iris.target
    class_name = iris.target_names
    n_sample = len(x_data)
    n_class = len(set(y_label))

    # shuffle, then split 80% / 20% into training and test sets
    np.random.seed(0)
    index = np.random.permutation(n_sample)
    x_data = x_data[index]
    y_label = y_label[index].astype(float)  # np.float no longer exists in NumPy

    train_x = x_data[: int(.8 * n_sample)]
    train_y = y_label[: int(.8 * n_sample)]
    test_x = x_data[int(.8 * n_sample):]
    test_y = y_label[int(.8 * n_sample):]

    train_label = Generate_label(train_y, n_class)
    test_label = Generate_label(test_y, n_class)

    # training process
    D = train_x.shape[1]
    W = 0.01 * np.random.rand(D, n_class)
    b = np.zeros((1, n_class))

    step_size = 1e-1
    reg = 1e-3
    train_sample = train_x.shape[0]
    batch_size = 10
    num_batch = train_sample // batch_size  # integer division so range() accepts it
    train_epoch = 1000

    for ii in range(train_epoch):
        for batch_ii in range(num_batch):
            batch_x = train_x[batch_ii * batch_size:(batch_ii + 1) * batch_size, :]
            batch_y = train_label[batch_ii * batch_size:(batch_ii + 1) * batch_size, :]

            # forward pass: linear scores, then sigmoid
            scores = np.dot(batch_x, W) + b
            y_out = Sigmoid(scores)

            # squared-error data loss plus L2 regularization
            e = y_out - batch_y
            dataloss = 0.5 * np.sum(e * e) / batch_size
            regloss = 0.5 * reg * np.sum(W * W)
            L = dataloss + regloss

            # backward pass (chain rule): de/du = (y - t) * y * (1 - y)
            dscores = e * y_out * (1 - y_out) / batch_size
            dw = np.dot(batch_x.T, dscores)
            db = np.sum(dscores, axis=0, keepdims=True)
            dw += reg * W

            # gradient descent update
            W = W - step_size * dw
            b = b - step_size * db

        if ii % 10 == 0:
            # L is the loss of the last mini-batch in this epoch
            print('the training loss is: %.4f' % L)

    # test process
    scores = np.dot(test_x, W) + b
    y_out = Sigmoid(scores)
    predict_out = np.argmax(y_out, axis=1)
    print('test accuracy: %.2f' % (np.mean(predict_out == test_y)))
