TensorFlow Study Notes (12): Normalization


Normalization

local_response_normalization

local_response_normalization appears in the paper "ImageNet Classification with Deep Convolutional Neural Networks", which reports that this kind of normalization helps generalization.

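$$b^i_{x,y} = \frac{a^i_{x,y}}{\left(k + \alpha \sum_{j=\max(0,\, i-n/2)}^{\min(N-1,\, i+n/2)} \left(a^j_{x,y}\right)^2\right)^{\beta}}$$

where N is the total number of channels (feature maps) in the layer.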

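After a conv2d or pooling op we obtain a tensor of shape [batch_size, height, width, channels]. From here on, each channel is called a layer, and batch_size is ignored.
- $i$ is the index of the i-th layer
- $a^i_{x,y}$ is the value at position (x, y) in layer i
- $n$ is the number of adjacent feature maps that take part in the normalization
- $k$, $\alpha$, $n$, $\beta$ are hyperparameters
- As the formula shows, $a^i_{x,y}$ is normalized using the values at the same position in its adjacent feature maps

In AlexNet, $k = 2$, $n = 5$, $\alpha = 10^{-4}$, $\beta = 0.75$.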

tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)

'''
Local Response Normalization.

The 4-D input tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by the weighted, squared sum of inputs within depth_radius. In detail,

    sqr_sum[a, b, c, d] = sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2)
    output = input / (bias + alpha * sqr_sum) ** beta
'''

"""
input: A Tensor. Must be one of the following types: float32, half. 4-D.
depth_radius: An optional int. Defaults to 5. 0-D. Half-width of the 1-D normalization window.
bias: An optional float. Defaults to 1. An offset (usually positive to avoid dividing by 0).
alpha: An optional float. Defaults to 1. A scale factor, usually positive.
beta: An optional float. Defaults to 0.5. An exponent.
name: A name for the operation (optional).
"""

depth_radius: n/2 in the formula
bias: k in the formula
input: the output of conv2d or pooling, shaped [batch_size, height, width, channels]
return: the normalized tensor, with the same shape [batch_size, height, width, channels]
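
To make the formula concrete, here is a minimal plain-Python sketch of the normalization at a single (x, y) position, that is, along one vector of channel values. It is not TensorFlow's implementation; the function name and the list-based input are assumptions made for illustration, and the parameter names simply mirror the ones described above.

```python
def local_response_normalization(channels, depth_radius, bias, alpha, beta):
    """Normalize the channel values found at one (x, y) position.

    channels: list of floats, one activation per channel (hypothetical input).
    Each value is divided by (bias + alpha * sum of squares of its
    neighbors within depth_radius channels) ** beta.
    """
    n_channels = len(channels)
    normalized = []
    for i, a in enumerate(channels):
        # Clamp the window to the valid channel range [0, n_channels - 1].
        lo = max(0, i - depth_radius)
        hi = min(n_channels - 1, i + depth_radius)
        sqr_sum = sum(v * v for v in channels[lo:hi + 1])
        normalized.append(a / (bias + alpha * sqr_sum) ** beta)
    return normalized


# AlexNet-style settings from the notes: k=2, n=5 (so depth_radius=2),
# alpha=1e-4, beta=0.75.
activations = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
out = local_response_normalization(
    activations, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75)
```

The output has the same length as the input: only the values change, not the shape.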

batch_normalization

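Paper link
As the name suggests, batch_normalization normalizes the data one mini-batch at a time.
- Input: mini-batch $In = \{x_1, x_2, \ldots, x_m\}$
- $\gamma$, $\beta$: parameters to be learned, both vectors
- $\epsilon$: a constant
- Output: $Out = \{y_1, y_2, \ldots, y_m\}$
The algorithm is as follows: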
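(1) mini-batch mean:

$$\mu_{In} \leftarrow \frac{1}{m} \sum_{i=1}^{m} x_i$$

(2) mini-batch variance:

$$\sigma^2_{In} \leftarrow \frac{1}{m} \sum_{i=1}^{m} \left(x_i - \mu_{In}\right)^2$$

(3) normalize:

$$\hat{x}_i \leftarrow \frac{x_i - \mu_{In}}{\sqrt{\sigma^2_{In} + \epsilon}}$$

(4) scale and shift:

$$y_i \leftarrow \gamma \hat{x}_i + \beta$$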

As can be seen, batch_normalization leaves the dimensions of the data unchanged; only the values change.

Out is then used as the input of the next layer.
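
The four steps can be sketched in plain Python for a mini-batch of scalar values. This is only an illustration under simplifying assumptions: the function name is hypothetical, the inputs are single numbers rather than tensors, and gamma and beta are therefore scalars here instead of vectors.

```python
def batch_norm(xs, gamma, beta, epsilon=1e-5):
    """Apply steps (1)-(4) to a mini-batch of scalars.

    xs: list of floats (the mini-batch In).
    gamma, beta: scale and shift; learned in practice, passed in here.
    epsilon: small constant that keeps the denominator away from zero.
    """
    m = len(xs)
    # (1) mini-batch mean
    mean = sum(xs) / m
    # (2) mini-batch variance (divides by m, not m - 1)
    variance = sum((x - mean) ** 2 for x in xs) / m
    # (3) normalize
    x_hat = [(x - mean) / (variance + epsilon) ** 0.5 for x in xs]
    # (4) scale and shift
    return [gamma * xh + beta for xh in x_hat]


ys = batch_norm([1.0, 2.0, 3.0, 4.0], gamma=1.0, beta=0.0)
```

With gamma=1 and beta=0 the result has mean close to 0 and variance close to 1; other values of gamma and beta rescale and shift it.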
Function:
tf.nn.batch_normalization()

def batch_normalization(x,
                        mean,
                        variance,
                        offset,
                        scale,
                        variance_epsilon,
                        name=None):

Args:
- x: Input Tensor of arbitrary dimensionality.
- mean: A mean Tensor.
- variance: A variance Tensor.
- offset: An offset Tensor, often denoted β in equations, or None. If present, will be added to the normalized tensor.
- scale: A scale Tensor, often denoted γ in equations, or None. If present, the scale is applied to the normalized tensor.
- variance_epsilon: A small float number to avoid dividing by 0.
- name: A name for this operation (optional).
- Returns: the normalized, scaled, offset tensor.
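For convolutions, x has shape [batch, height, width, depth].
For convolutions, $\gamma_i$ and $\beta_i$ are shared across all positions of a feature map, so $\gamma$ and $\beta$ have shape [depth].

Now we need a function that returns mean and variance; see below.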

tf.nn.moments()

def moments(x, axes, shift=None, name=None, keep_dims=False):
    # for simple batch normalization pass `axes=[0]` (batch only).

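For batch normalization after a convolution, x has shape [batch_size, height, width, depth]. Passing axes=[0,1,2] returns (mean, variance), where mean and variance are both 1-D tensors of shape [depth], one value per channel.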
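
To show what reducing over axes [0, 1, 2] means, the following plain-Python sketch computes per-channel moments for a nested list laid out as [batch][height][width][depth]. It does not call tf.nn.moments; the function name and the nested-list input are assumptions for illustration only.

```python
def per_channel_moments(x):
    """Mean and variance per channel of x, shaped [batch][height][width][depth].

    Equivalent in spirit to reducing over the batch, height and width axes:
    every value of channel c, wherever it sits, contributes to statistic c.
    """
    depth = len(x[0][0][0])
    per_channel = [[] for _ in range(depth)]
    for image in x:
        for row in image:
            for pixel in row:
                for c, value in enumerate(pixel):
                    per_channel[c].append(value)
    mean = [sum(values) / len(values) for values in per_channel]
    variance = [
        sum((v - m) ** 2 for v in values) / len(values)
        for values, m in zip(per_channel, mean)
    ]
    return mean, variance


# batch=2, height=1, width=2, depth=2
x = [[[[1.0, 10.0], [3.0, 10.0]]],
     [[[5.0, 10.0], [7.0, 10.0]]]]
mean, variance = per_channel_moments(x)
```

Both results have length depth (2 here), which matches the [depth] shape of the scale and offset parameters, so each channel is normalized with its own statistics.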

0 0