TensorFlow parameter initialization: identity initialization

Source: Internet. Editor: 程序博客网. Time: 2024/05/22 01:52

When initializing the weights of a convolutional layer, the following methods are commonly used:

1. Random uniform distribution

The function is:

class RandomUniform(Initializer):
  """Initializer that generates tensors with a uniform distribution.

  Args:
    minval: A python scalar or a scalar tensor. Lower bound of the range
      of random values to generate.
    maxval: A python scalar or a scalar tensor. Upper bound of the range
      of random values to generate.  Defaults to 1 for float types.
    seed: A Python integer. Used to create random seeds. See
      @{tf.set_random_seed}
      for behavior.
    dtype: The data type.
  """

  def __init__(self, minval=0, maxval=None, seed=None, dtype=dtypes.float32):
    self.minval = minval
    self.maxval = maxval
    self.seed = seed
    self.dtype = dtype

  def __call__(self, shape, dtype=None, partition_info=None):
    if dtype is None:
      dtype = self.dtype
    return random_ops.random_uniform(shape, self.minval, self.maxval,
                                     dtype, seed=self.seed)

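This initializes the weights w with values drawn uniformly at random from the half-open range [minval, maxval): the lower bound is included and the upper bound is excluded.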

2. Random normal distribution

The function is defined as:

class RandomNormal(Initializer):
  """Initializer that generates tensors with a normal distribution.

  Args:
    mean: a python scalar or a scalar tensor. Mean of the random values
      to generate.
    stddev: a python scalar or a scalar tensor. Standard deviation of the
      random values to generate.
    seed: A Python integer. Used to create random seeds. See
      @{tf.set_random_seed}
      for behavior.
    dtype: The data type. Only floating point types are supported.
  """

  def __init__(self, mean=0.0, stddev=1.0, seed=None, dtype=dtypes.float32):
    self.mean = mean
    self.stddev = stddev
    self.seed = seed
    self.dtype = _assert_float_dtype(dtype)

  def __call__(self, shape, dtype=None, partition_info=None):
    if dtype is None:
      dtype = self.dtype
    return random_ops.random_normal(shape, self.mean, self.stddev,
                                    dtype, seed=self.seed)

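This initializes the weights w with values drawn from a Gaussian distribution with mean `mean` and standard deviation `stddev`. Note that `stddev` is the standard deviation, not the variance; the variance is stddev squared.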
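To make the difference between standard deviation and variance concrete, here is a minimal sketch using only the Python standard library instead of TensorFlow. The sample size, the seed, and the variable names are illustrative assumptions, not part of the original post.

```python
import random
import statistics

# Hypothetical settings chosen for illustration only.
rng = random.Random(0)
stddev = 0.5

# random.gauss takes (mean, standard deviation), like RandomNormal's (mean, stddev).
samples = [rng.gauss(0.0, stddev) for _ in range(20000)]

sample_std = statistics.pstdev(samples)     # should be close to stddev
sample_var = statistics.pvariance(samples)  # should be close to stddev ** 2

print("sample stddev:", round(sample_std, 3))
print("sample variance:", round(sample_var, 3))
```

The sample standard deviation lands near 0.5 while the sample variance lands near 0.25, which is why passing a variance where `stddev` is expected gives weights of the wrong scale.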

3. Truncated normal distribution

The function is:

class TruncatedNormal(Initializer):
  """Initializer that generates a truncated normal distribution.

  These values are similar to values from a `random_normal_initializer`
  except that values more than two standard deviations from the mean
  are discarded and re-drawn. This is the recommended initializer for
  neural network weights and filters.

  Args:
    mean: a python scalar or a scalar tensor. Mean of the random values
      to generate.
    stddev: a python scalar or a scalar tensor. Standard deviation of the
      random values to generate.
    seed: A Python integer. Used to create random seeds. See
      @{tf.set_random_seed}
      for behavior.
    dtype: The data type. Only floating point types are supported.
  """

  def __init__(self, mean=0.0, stddev=1.0, seed=None, dtype=dtypes.float32):
    self.mean = mean
    self.stddev = stddev
    self.seed = seed
    self.dtype = _assert_float_dtype(dtype)

  def __call__(self, shape, dtype=None, partition_info=None):
    if dtype is None:
      dtype = self.dtype
    return random_ops.truncated_normal(shape, self.mean, self.stddev,
                                       dtype, seed=self.seed)

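Like Random Normal, Truncated Normal initializes the weights from a normal distribution. The difference is that any value more than two standard deviations from the mean, on either side, is discarded and re-drawn, so no weight falls outside [mean - 2*stddev, mean + 2*stddev]. According to the docstring above, this is the recommended initializer for neural network weights and filters.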
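The "discard and re-draw" behavior described in the docstring can be sketched with plain rejection sampling. This is only an illustration in standard-library Python; it is not how TensorFlow implements `truncated_normal` internally, and the function name, seed, and parameter values below are assumptions made for the example.

```python
import random

def truncated_normal_sample(mean=0.0, stddev=1.0, rng=random):
    """Draw from N(mean, stddev^2), re-drawing until the value is within 2*stddev of the mean."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value

# Hypothetical usage: 10000 weights with stddev 0.02.
rng = random.Random(0)
samples = [truncated_normal_sample(0.0, 0.02, rng) for _ in range(10000)]

# Every sample lies inside [-0.04, 0.04] by construction.
print(all(abs(s) <= 0.04 for s in samples))
```

Because roughly 95% of normal draws already fall within two standard deviations, the loop rarely needs more than one or two iterations per sample.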

The three initializers are used in TensorFlow as follows:

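# The three calls below are alternatives; use only one of them for 'w'.
# Uniform distribution on [0, 1)
w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
                    initializer=tf.random_uniform_initializer(minval=0.0, maxval=1.0))
# Normal distribution with mean m and standard deviation stddev
w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
                    initializer=tf.random_normal_initializer(mean=m, stddev=stddev))
# Truncated normal distribution with mean m and standard deviation stddev
w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
                    initializer=tf.truncated_normal_initializer(mean=m, stddev=stddev))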

Identity initialization

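In a CNN, we sometimes want to initialize the weights so that the feature map of the previous layer is passed unchanged to the next layer. That is, for the convolution F2 = F1 * w, we want to initialize the weight tensor w such that F2 = F1. Initializing w this way is called identity initialization.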
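Why a kernel with a single identity matrix at its center reproduces the input can be written out as follows. The index notation is my own, and the derivation assumes stride 1, zero ("SAME") padding, odd kernel sizes k_h and k_w, equal input and output channel counts, and no bias, normalization, or activation after the convolution.

```latex
% Convolution as computed per output position (x, y) and output channel o,
% with kernel center c_x = \lfloor k_h / 2 \rfloor, c_y = \lfloor k_w / 2 \rfloor:
F_2[x, y, o] = \sum_{i=0}^{k_h-1} \sum_{j=0}^{k_w-1} \sum_{c}
    F_1[x + i - c_x,\; y + j - c_y,\; c] \, w[i, j, c, o]

% Identity initialization sets
w[i, j, c, o] =
\begin{cases}
  1 & \text{if } i = c_x,\; j = c_y,\; c = o \\
  0 & \text{otherwise}
\end{cases}

% so only the term i = c_x, j = c_y, c = o survives:
F_2[x, y, o] = F_1[x, y, o]
```

If the layer also applies a normalizer or a nonlinearity, the output is the input passed through those operations rather than an exact copy.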

A TensorFlow implementation of identity initialization:

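import numpy as np
import tensorflow as tf

def identity_initializer():
    def _initializer(shape, dtype=tf.float32, partition_info=None):
        if len(shape) == 1:
            # Bias vector: all zeros.
            return tf.constant(0., dtype=dtype, shape=shape)
        elif len(shape) == 2 and shape[0] == shape[1]:
            # Square weight matrix: identity matrix.
            return tf.constant(np.identity(shape[0]), dtype=dtype)
        elif len(shape) == 4 and shape[2] == shape[3]:
            # Convolution kernel [k_h, k_w, in_channels, out_channels]:
            # identity matrix at the spatial center, zeros elsewhere.
            array = np.zeros(shape, dtype=float)
            cx, cy = shape[0] // 2, shape[1] // 2
            for i in range(shape[2]):
                array[cx, cy, i, i] = 1
            return tf.constant(array, dtype=dtype)
        else:
            raise ValueError('unsupported shape for identity initialization: %s' % (shape,))
    return _initializer

A shorter version that handles only 4-D convolution kernels with equal input and output channel counts:

def identity_initializer():
    def _initializer(shape, dtype=tf.float32, partition_info=None):
        array = np.zeros(shape, dtype=float)
        cx, cy = shape[0] // 2, shape[1] // 2
        for i in range(shape[2]):
            array[cx, cy, i, i] = 1
        return tf.constant(array, dtype=dtype)
    return _initializer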

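After initialization, every entry of the weight tensor array is 0 except array[cx, cy, :, :], which is an identity matrix. For example, with shape=[3,3,8,8] we get cx = cy = 1, and array[1,1,:,:] is: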

[[ 1. 0. 0. 0. 0. 0. 0. 0.]
 [ 0. 1. 0. 0. 0. 0. 0. 0.]
 [ 0. 0. 1. 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 1. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 1. 0. 0. 0.]
 [ 0. 0. 0. 0. 0. 1. 0. 0.]
 [ 0. 0. 0. 0. 0. 0. 1. 0.]
 [ 0. 0. 0. 0. 0. 0. 0. 1.]]
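The kernel layout above, and the claim that it passes a feature map through unchanged, can be checked with a small NumPy-only sketch. It does not call TensorFlow: `identity_kernel` and `conv2d_same` are hypothetical helpers written for this example, and the naive convolution assumes stride 1, zero padding, and odd kernel sizes.

```python
import numpy as np

def identity_kernel(k_h, k_w, channels):
    """Kernel of shape [k_h, k_w, channels, channels]: identity matrix at the center, zeros elsewhere."""
    kernel = np.zeros((k_h, k_w, channels, channels), dtype=np.float32)
    cx, cy = k_h // 2, k_w // 2
    for i in range(channels):
        kernel[cx, cy, i, i] = 1.0
    return kernel

def conv2d_same(feature_map, kernel):
    """Naive stride-1 convolution with zero padding (odd kernel sizes only).

    feature_map has shape [height, width, c_in]; kernel has shape [k_h, k_w, c_in, c_out].
    """
    k_h, k_w, _, c_out = kernel.shape
    height, width, _ = feature_map.shape
    pad_h, pad_w = k_h // 2, k_w // 2
    padded = np.pad(feature_map, ((pad_h, pad_h), (pad_w, pad_w), (0, 0)), mode="constant")
    out = np.zeros((height, width, c_out), dtype=feature_map.dtype)
    for x in range(height):
        for y in range(width):
            patch = padded[x:x + k_h, y:y + k_w, :]  # shape [k_h, k_w, c_in]
            # Sum over kernel rows, kernel columns, and input channels.
            out[x, y, :] = np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2]))
    return out

kernel = identity_kernel(3, 3, 8)
f1 = np.random.RandomState(0).rand(5, 5, 8).astype(np.float32)
f2 = conv2d_same(f1, kernel)

print(np.array_equal(kernel[1, 1], np.eye(8)))  # center slice is the identity matrix
print(np.allclose(f2, f1))                      # output equals input
```

Only 8 of the 3*3*8*8 = 576 kernel entries are nonzero, so each output value is a copy of exactly one input value.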

Example usage:

import tensorflow.contrib.slim as slim

# input, gm, lrelu, and nm are assumed to be defined elsewhere.
net = slim.conv2d(input, gm, [3, 3], rate=1, activation_fn=lrelu,
                  normalizer_fn=nm, weights_initializer=identity_initializer(),
                  scope='g_conv1')