Training a deep autoencoder or a classifier on MNIST digits: RBM training (Python)


1. RBM reading material
    http://en.wikipedia.org/wiki/Restricted_Boltzmann_machine
    http://deeplearning.net/tutorial/rbm.html
2. Basic principles of RBM training
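The original post leaves this section empty; the following is a brief, standard summary drawn from the reading material in section 1, using the same conventions as the code below (W of shape n_visible x n_hidden, visible bias b = vbias, hidden bias c = hbias). The two conditionals are the ones the code analysis later refers to as Eqs. (7) - (8).

    % Energy of a binary RBM
    E(v, h) = -b^\top v - c^\top h - v^\top W h

    % Factorial conditionals (Eqs. (7)-(8)); these are what propup / propdown compute
    P(h_j = 1 \mid v) = \mathrm{sigmoid}\big(c_j + (W^\top v)_j\big)
    P(v_i = 1 \mid h) = \mathrm{sigmoid}\big(b_i + (W h)_i\big)

    % Contrastive Divergence (CD-k): start a Gibbs chain at a training example,
    % run k alternating sampling steps v -> h -> v, and update the weights with
    \Delta W \propto \langle v h^\top \rangle_{\text{data}} - \langle v h^\top \rangle_{k\text{-step chain}}

Persistent CD (PCD) uses the same update but keeps the Gibbs chain alive across parameter updates instead of restarting it at the training data; both variants only need the one-step sampling functions defined in the next section.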
3. RBM code analysis
    We build an RBM class. The network parameters can either be initialized by the constructor itself or passed in as constructor arguments. Passing them in is useful when the RBM is used as a building block of a deep network: in that case the weight matrix and the hidden-layer bias are shared with the corresponding sigmoid layer of a multilayer perceptron.
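For illustration only (the names sigmoid_layer, layer_input, input_size, and hidden_size are assumed here, not defined in this post), constructing the RBM as a DBN building block might look roughly like the sketch below, reusing the MLP sigmoid layer's parameters via the constructor shown in full afterwards:

rbm_layer = RBM(input=layer_input,
                n_visible=input_size,
                n_hidden=hidden_size,
                W=sigmoid_layer.W,      # shared weight matrix
                hbias=sigmoid_layer.b)  # shared hidden-unit bias

The full class definition follows.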
import numpy

import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams


class RBM(object):
    """Restricted Boltzmann Machine (RBM)"""

    def __init__(self, input=None, n_visible=784, n_hidden=500,
                 W=None, hbias=None, vbias=None, numpy_rng=None,
                 theano_rng=None):
        """
        RBM constructor. Defines the parameters of the model along with
        basic operations for inferring hidden from visible (and vice-versa),
        as well as for performing CD updates.

        :param input: None for standalone RBMs or symbolic variable if RBM is
        part of a larger graph.

        :param n_visible: number of visible units

        :param n_hidden: number of hidden units

        :param W: None for standalone RBMs or symbolic variable pointing to a
        shared weight matrix in case RBM is part of a DBN network; in a DBN,
        the weights are shared between RBMs and layers of a MLP

        :param hbias: None for standalone RBMs or symbolic variable pointing
        to a shared hidden units bias vector in case RBM is part of a
        different network

        :param vbias: None for standalone RBMs or a symbolic variable
        pointing to a shared visible units bias
        """
        self.n_visible = n_visible
        self.n_hidden = n_hidden

        if numpy_rng is None:
            # create a number generator
            numpy_rng = numpy.random.RandomState(1234)

        if theano_rng is None:
            theano_rng = RandomStreams(numpy_rng.randint(2 ** 30))

        if W is None:
            # W is initialized with `initial_W`, sampled uniformly between
            # -4*sqrt(6./(n_visible+n_hidden)) and 4*sqrt(6./(n_hidden+n_visible)).
            # The output of uniform is converted with asarray to dtype
            # theano.config.floatX so that the code is runnable on GPU
            initial_W = numpy.asarray(numpy_rng.uniform(
                low=-4 * numpy.sqrt(6. / (n_hidden + n_visible)),
                high=4 * numpy.sqrt(6. / (n_hidden + n_visible)),
                size=(n_visible, n_hidden)),
                dtype=theano.config.floatX)
            # theano shared variables for weights and biases
            W = theano.shared(value=initial_W, name='W')

        if hbias is None:
            # create shared variable for hidden units bias
            hbias = theano.shared(value=numpy.zeros(n_hidden,
                                  dtype=theano.config.floatX), name='hbias')

        if vbias is None:
            # create shared variable for visible units bias
            vbias = theano.shared(value=numpy.zeros(n_visible,
                                  dtype=theano.config.floatX), name='vbias')

        # initialize input layer for standalone RBM or layer0 of DBN
        self.input = input if input else T.dmatrix('input')

        self.W = W
        self.hbias = hbias
        self.vbias = vbias
        self.theano_rng = theano_rng
        # **** WARNING: It is not a good idea to put things in this list
        # other than shared variables created in this function.
        self.params = [self.W, self.hbias, self.vbias]

The next step is to define the methods of the RBM class that construct the symbolic graph associated with Eqs. (7) - (8). The code is as follows:

    def propup(self, vis):
        ''' This function propagates the visible units activation upwards to
        the hidden units

        Note that we also return the pre_sigmoid_activation of the layer. As
        it will turn out later, due to how Theano deals with optimization and
        stability, this symbolic variable is needed to write down a more
        stable graph (see details in the reconstruction cost function)
        '''
        pre_sigmoid_activation = T.dot(vis, self.W) + self.hbias
        return [pre_sigmoid_activation, T.nnet.sigmoid(pre_sigmoid_activation)]

    def sample_h_given_v(self, v0_sample):
        ''' This function infers the state of hidden units given visible units '''
        # compute the activation of the hidden units given a sample of the
        # visibles
        pre_sigmoid_h1, h1_mean = self.propup(v0_sample)
        # get a sample of the hiddens given their activation
        # Note that theano_rng.binomial returns a symbolic sample of dtype
        # int64 by default. If we want to keep our computations in floatX
        # for the GPU we need to specify to return the dtype floatX
        h1_sample = self.theano_rng.binomial(size=h1_mean.shape, n=1,
                                             p=h1_mean,
                                             dtype=theano.config.floatX)
        return [pre_sigmoid_h1, h1_mean, h1_sample]

    def propdown(self, hid):
        ''' This function propagates the hidden units activation downwards to
        the visible units

        Note that we also return the pre_sigmoid_activation of the layer. As
        it will turn out later, due to how Theano deals with optimization and
        stability, this symbolic variable is needed to write down a more
        stable graph (see details in the reconstruction cost function)
        '''
        pre_sigmoid_activation = T.dot(hid, self.W.T) + self.vbias
        return [pre_sigmoid_activation, T.nnet.sigmoid(pre_sigmoid_activation)]

    def sample_v_given_h(self, h0_sample):
        ''' This function infers the state of visible units given hidden units '''
        # compute the activation of the visible units given the hidden sample
        pre_sigmoid_v1, v1_mean = self.propdown(h0_sample)
        # get a sample of the visibles given their activation
        # Note that theano_rng.binomial returns a symbolic sample of dtype
        # int64 by default. If we want to keep our computations in floatX
        # for the GPU we need to specify to return the dtype floatX
        v1_sample = self.theano_rng.binomial(size=v1_mean.shape, n=1,
                                             p=v1_mean,
                                             dtype=theano.config.floatX)
        return [pre_sigmoid_v1, v1_mean, v1_sample]
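As a quick, optional sanity check (not part of the tutorial; it assumes the RBM class defined above is available in the same session), the second output of propup can be compared against P(h = 1 | v) = sigmoid(vW + c) computed directly in NumPy:

import numpy
import theano
import theano.tensor as T

rbm = RBM(n_visible=6, n_hidden=4)

v = T.matrix('v')
pre_h, p_h = rbm.propup(v)           # symbolic pre-activation and probability
f = theano.function([v], p_h)

v0 = numpy.random.binomial(1, 0.5, size=(3, 6)).astype(theano.config.floatX)
expected = 1.0 / (1.0 + numpy.exp(-(v0.dot(rbm.W.get_value()) +
                                    rbm.hbias.get_value())))
print(numpy.allclose(f(v0), expected, atol=1e-5))   # should print True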
    We can then use these functions to define the symbolic graph for one step of Gibbs sampling. We define two functions:
gibbs_vhv: performs one step of Gibbs sampling starting from the visible units. As we shall see, this is useful for sampling from the RBM.
gibbs_hvh: performs one step of Gibbs sampling starting from the hidden units. This function is useful for performing the CD and PCD updates.
The code for these two methods is as follows:
    def gibbs_hvh(self, h0_sample):
        ''' This function implements one step of Gibbs sampling,
            starting from the hidden state'''
        pre_sigmoid_v1, v1_mean, v1_sample = self.sample_v_given_h(h0_sample)
        pre_sigmoid_h1, h1_mean, h1_sample = self.sample_h_given_v(v1_sample)
        return [pre_sigmoid_v1, v1_mean, v1_sample,
                pre_sigmoid_h1, h1_mean, h1_sample]

    def gibbs_vhv(self, v0_sample):
        ''' This function implements one step of Gibbs sampling,
            starting from the visible state'''
        pre_sigmoid_h1, h1_mean, h1_sample = self.sample_h_given_v(v0_sample)
        pre_sigmoid_v1, v1_mean, v1_sample = self.sample_v_given_h(h1_sample)
        return [pre_sigmoid_h1, h1_mean, h1_sample,
                pre_sigmoid_v1, v1_mean, v1_sample]
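To make the Gibbs step concrete, here is a minimal usage sketch (not from the tutorial; it assumes the RBM class above is available in the same session): one gibbs_vhv step is compiled into a Theano function and applied to a random binary batch. In the full tutorial these steps are instead chained with theano.scan when building the CD/PCD updates.

import numpy
import theano
import theano.tensor as T

rbm = RBM(n_visible=784, n_hidden=500)

v = T.matrix('v')
# gibbs_vhv returns [pre_sigmoid_h1, h1_mean, h1_sample,
#                    pre_sigmoid_v1, v1_mean, v1_sample]
outputs = rbm.gibbs_vhv(v)
one_step = theano.function([v], outputs[-1])  # the new visible sample

batch = numpy.random.binomial(1, 0.5, size=(20, 784)).astype(
    theano.config.floatX)
v1 = one_step(batch)  # shape (20, 784): the batch after one Gibbs step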



