deformable convolution


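A CNN has a fixed geometric structure, which limits its ability to model geometric transformations. To strengthen this ability, the paper "Deformable Convolutional Networks" proposes two network modules: deformable convolution and deformable RoI pooling. Both learn an additional offset that shifts the locations at which the kernel samples the input feature map, so that sampling concentrates on the object region of interest. Both modules can be inserted into an existing CNN and trained end to end.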

deformable convolution

[Figure: sampling locations of 3x3 standard and deformable convolutions, panels (a) to (d)]

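The figure above compares a 3x3 standard convolution with deformable convolutions. Panel (a) is the standard convolution; the green points mark the positions of the kernel weights, i.e. the sampling locations. Panels (b), (c), and (d) are deformable convolutions, with arrows showing the offsets applied to the sampling locations. In (c) the deformable convolution has learned a translation and scale deformation; in (d), a rotation.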

The structure of a deformable convolution is as follows:
[Figure: structure of the deformable convolution module]

A convolutional layer learns the offsets from the input feature map; bilinear interpolation at the offset locations then produces the output feature map.

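Standard convolution:

Taking a 3×3 kernel as an example, first define the regular sampling grid of the kernel:
$$\mathcal{R} = \{(-1,-1),\ (-1,0),\ \ldots,\ (0,1),\ (1,1)\}$$
For each location $p_0$ on the output feature map $y$, the output is:
$$y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n) \cdot x(p_0 + p_n)$$

Here $x$ is the input feature map, $w$ holds the kernel weights, and $p_n$ enumerates the locations in $\mathcal{R}$.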
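
The standard convolution formula can be checked on a small example. The following is a minimal NumPy sketch, not code from the referenced repository: the function name `conv_at_point` and the single-channel, interior-point setup are assumptions made for illustration.

```python
import numpy as np

# Sampling grid R of a 3x3 kernel: offsets (d_row, d_col) relative to p0.
R = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def conv_at_point(x, w, p0):
    """Compute y(p0) = sum over p_n in R of w(p_n) * x(p0 + p_n).

    x  : 2-D single-channel input feature map, indexed as x[row, col]
    w  : 3x3 kernel; w[1 + dy, 1 + dx] is the weight for grid offset (dy, dx)
    p0 : (row, col) of an interior output location (no padding is handled)
    """
    r0, c0 = p0
    return sum(w[1 + dy, 1 + dx] * x[r0 + dy, c0 + dx] for dy, dx in R)


x = np.arange(25, dtype=np.float64).reshape(5, 5)  # x[r, c] = 5 * r + c
w = np.ones((3, 3)) / 9.0                          # simple averaging kernel
y_center = conv_at_point(x, w, (2, 2))             # mean of the central 3x3 patch
```

With the averaging kernel, `y_center` is the mean of the nine values around `(2, 2)`, which for this linear ramp equals `x[2, 2]`.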

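Deformable convolution:

In deformable convolution, the regular sampling grid $\mathcal{R}$ is augmented with offsets:

$$\{\Delta p_n \mid n = 1, \ldots, N\}$$

where $N = |\mathcal{R}|$.
The value of the output feature map $y$ at $p_0$ then becomes:

$$y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n) \cdot x(p_0 + p_n + \Delta p_n)$$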

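Sampling therefore takes place at the regular grid points plus the offsets. Because the offsets are generally fractional, the value of $x$ at these locations is computed by bilinear interpolation:
$$x(p) = \sum_{q} G(q, p) \cdot x(q)$$

$$p = p_0 + p_n + \Delta p_n$$
Here $p$ is an arbitrary (fractional) location, $q$ enumerates all integer spatial locations in the input feature map $x$, and $G(\cdot,\cdot)$ is the bilinear interpolation kernel. $G$ is two-dimensional and separates into two one-dimensional kernels:

$$G(q, p) = g(q_x, p_x) \cdot g(q_y, p_y), \qquad g(a, b) = \max(0,\ 1 - |a - b|)$$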
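
To make the interpolation concrete, here is a small NumPy sketch of bilinear sampling and of one deformable convolution output value. It is an illustration under stated assumptions, not the repository's implementation: the names `g`, `bilinear_sample`, and `deform_conv_at_point` are hypothetical, the input has a single channel, and locations outside the feature map are assumed to contribute zero.

```python
import numpy as np


def g(a, b):
    """1-D bilinear kernel: g(a, b) = max(0, 1 - |a - b|)."""
    return max(0.0, 1.0 - abs(a - b))


def bilinear_sample(x, p):
    """x(p) = sum_q G(q, p) * x(q), with G(q, p) = g(q_r, p_r) * g(q_c, p_c).

    G is non-zero only for the (up to) four integer neighbours of p, so the
    sum is restricted to them. Neighbours outside x are skipped (assumption).
    """
    pr, pc = p
    r_lo, c_lo = int(np.floor(pr)), int(np.floor(pc))
    total = 0.0
    for qr in (r_lo, r_lo + 1):
        for qc in (c_lo, c_lo + 1):
            if 0 <= qr < x.shape[0] and 0 <= qc < x.shape[1]:
                total += g(qr, pr) * g(qc, pc) * x[qr, qc]
    return total


# Regular 3x3 sampling grid R, as (d_row, d_col) offsets.
R = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def deform_conv_at_point(x, w, p0, offsets):
    """y(p0) = sum_n w(p_n) * x(p0 + p_n + dp_n); offsets has shape (9, 2)."""
    total = 0.0
    for (dy, dx), (oy, ox) in zip(R, offsets):
        p = (p0[0] + dy + oy, p0[1] + dx + ox)
        total += w[1 + dy, 1 + dx] * bilinear_sample(x, p)
    return total


x = np.arange(25, dtype=np.float64).reshape(5, 5)  # x[r, c] = 5 * r + c
w = np.ones((3, 3)) / 9.0

half = bilinear_sample(x, (1.5, 2.5))              # average of 7, 8, 12, 13
y_plain = deform_conv_at_point(x, w, (2, 2), np.zeros((9, 2)))
y_shift = deform_conv_at_point(x, w, (2, 2), np.tile([0.0, 0.5], (9, 1)))
```

With all offsets set to zero the result matches the standard convolution at the same point; shifting every sampling location by half a column moves the result by 0.5 on this linear ramp.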

Code walkthrough:

    import tensorflow as tf
    from keras.initializers import RandomNormal
    from keras.layers import Conv2D
    # Module path as used in the referenced repository.
    from deform_conv.deform_conv import tf_batch_map_offsets


    class ConvOffset2D(Conv2D):
        """ConvOffset2D

        Convolutional layer responsible for learning the 2D offsets and output the
        deformed feature map using bilinear interpolation

        Note that this layer does not perform convolution on the deformed feature
        map. See get_deform_cnn in cnn.py for usage
        """

        def __init__(self, filters, init_normal_stddev=0.01, **kwargs):
            """Init

            Parameters
            ----------
            filters : int
                Number of channel of the input feature map
            init_normal_stddev : float
                Normal kernel initialization
            **kwargs:
                Pass to superclass. See Conv2D layer in Keras
            """
            self.filters = filters
            # Two offset values (one per spatial axis) for every input channel.
            super(ConvOffset2D, self).__init__(
                self.filters * 2, (3, 3), padding='same', use_bias=False,
                kernel_initializer=RandomNormal(0, init_normal_stddev),
                **kwargs
            )

        def call(self, x):
            """Return the deformed featured map"""
            x_shape = x.get_shape()
            offsets = super(ConvOffset2D, self).call(x)

            # offsets: (b*c, h, w, 2)
            offsets = self._to_bc_h_w_2(offsets, x_shape)

            # x: (b*c, h, w)
            x = self._to_bc_h_w(x, x_shape)

            # x_offset: (b*c, h, w)
            x_offset = tf_batch_map_offsets(x, offsets)

            # x_offset: (b, h, w, c)
            x_offset = self._to_b_h_w_c(x_offset, x_shape)

            return x_offset

        def compute_output_shape(self, input_shape):
            """Output shape is the same as input shape

            Because this layer does only the deformation part
            """
            return input_shape

        @staticmethod
        def _to_bc_h_w_2(x, x_shape):
            """(b, h, w, 2c) -> (b*c, h, w, 2)"""
            x = tf.transpose(x, [0, 3, 1, 2])
            x = tf.reshape(x, (-1, int(x_shape[1]), int(x_shape[2]), 2))
            return x

        @staticmethod
        def _to_bc_h_w(x, x_shape):
            """(b, h, w, c) -> (b*c, h, w)"""
            x = tf.transpose(x, [0, 3, 1, 2])
            x = tf.reshape(x, (-1, int(x_shape[1]), int(x_shape[2])))
            return x

        @staticmethod
        def _to_b_h_w_c(x, x_shape):
            """(b*c, h, w) -> (b, h, w, c)"""
            x = tf.reshape(
                x, (-1, int(x_shape[3]), int(x_shape[1]), int(x_shape[2]))
            )
            x = tf.transpose(x, [0, 2, 3, 1])
            return x
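
The reshaping helpers fold the channel axis into the batch axis so that each channel can be resampled as an independent 2-D map. The NumPy sketch below mirrors `_to_bc_h_w` and `_to_b_h_w_c` to show that the two are inverses; the tensor sizes are arbitrary values chosen for illustration.

```python
import numpy as np

b, h, w, c = 2, 4, 5, 3
x = np.arange(b * h * w * c).reshape(b, h, w, c)

# (b, h, w, c) -> (b*c, h, w): move channels next to batch, then merge them.
x_bc = x.transpose(0, 3, 1, 2).reshape(-1, h, w)

# (b*c, h, w) -> (b, h, w, c): split batch and channels, then move channels last.
x_back = x_bc.reshape(-1, c, h, w).transpose(0, 2, 3, 1)

# Row i of x_bc holds channel (i % c) of batch item (i // c).
first_item_second_channel = x_bc[1]
```

Transposing before reshaping is what keeps each `(h, w)` slice a single channel of a single sample; reshaping `(b, h, w, c)` directly would interleave channels within a slice.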

The offsets are learned by a convolutional layer:

offsets = super(ConvOffset2D, self).call(x)

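The input features are then resampled at the offset locations by bilinear interpolation, giving the deformed feature map (the convolution itself is applied afterwards by a regular Conv2D layer):

    # x_offset: (b*c, h, w)
    x_offset = tf_batch_map_offsets(x, offsets)

The function that maps the offsets onto the input is: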

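    import tensorflow as tf

    # tf_repeat_2d and tf_batch_map_coordinates are helpers defined in the
    # same module of the referenced repository.


    def tf_batch_map_offsets(input, offsets, order=1):
        """Batch map offsets into input

        Parameters
        ---------
        input : tf.Tensor. shape = (b, s, s)
        offsets: tf.Tensor. shape = (b, s, s, 2)

        Returns
        -------
        tf.Tensor. shape = (b, s, s)
        """
        input_shape = tf.shape(input)
        batch_size = input_shape[0]
        input_size = input_shape[1]

        # Flatten the spatial axes: one (row, col) offset per pixel.
        offsets = tf.reshape(offsets, (batch_size, -1, 2))

        # Regular integer grid of pixel coordinates, repeated for every batch item.
        grid = tf.meshgrid(
            tf.range(input_size), tf.range(input_size), indexing='ij'
        )
        grid = tf.stack(grid, axis=-1)
        grid = tf.cast(grid, 'float32')
        grid = tf.reshape(grid, (-1, 2))
        grid = tf_repeat_2d(grid, batch_size)

        # Absolute sampling coordinates = regular grid + learned offsets.
        coords = offsets + grid

        # Bilinear sampling of the input at the fractional coordinates.
        mapped_vals = tf_batch_map_coordinates(input, coords)
        return mapped_vals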
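
The coordinate construction in this function (regular grid plus offsets) can be reproduced in NumPy to see the layout of the result. This is a sketch only: `batch_coords` is a hypothetical name, and broadcasting stands in for the repository's `tf_repeat_2d` helper.

```python
import numpy as np


def batch_coords(offsets):
    """offsets: (b, s, s, 2) -> absolute sampling coordinates (b, s*s, 2)."""
    b, s = offsets.shape[0], offsets.shape[1]
    # 'ij' indexing: first array varies along rows, second along columns.
    rows, cols = np.meshgrid(np.arange(s), np.arange(s), indexing='ij')
    grid = np.stack([rows, cols], axis=-1).astype(np.float32).reshape(-1, 2)
    # Broadcasting over the batch axis replaces an explicit repeat of the grid.
    return offsets.reshape(b, -1, 2) + grid[None, :, :]


zero = np.zeros((2, 3, 3, 2), dtype=np.float32)
coords = batch_coords(zero)          # zero offsets give back the pixel grid
shifted = batch_coords(zero + 0.5)   # every location moved by (0.5, 0.5)
```

With zero offsets, entry `k` of each batch item is the pixel coordinate `(k // s, k % s)`, so sampling there returns the input unchanged.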

Deformable convolution is used in a CNN as follows.

Standard convolution:

    # conv12
    l = Conv2D(64, (3, 3), padding='same', strides=(2, 2), name='conv12')(l)
    l = Activation('relu', name='conv12_relu')(l)
    l = BatchNormalization(name='conv12_bn')(l)

Deformable convolution:

    # conv12
    l_offset = ConvOffset2D(32, name='conv12_offset')(l)
    l = Conv2D(64, (3, 3), padding='same', strides=(2, 2), name='conv12', trainable=trainable)(l_offset)
    l = Activation('relu', name='conv12_relu')(l)
    l = BatchNormalization(name='conv12_bn')(l)

Training an MNIST model:

Full code of the plain CNN:

    from keras.layers import (
        Activation, BatchNormalization, Conv2D, Dense, GlobalAvgPool2D, Input
    )


    def get_cnn():
        inputs = l = Input((28, 28, 1), name='input')

        # conv11
        l = Conv2D(32, (3, 3), padding='same', name='conv11')(l)
        l = Activation('relu', name='conv11_relu')(l)
        l = BatchNormalization(name='conv11_bn')(l)

        # conv12
        l = Conv2D(64, (3, 3), padding='same', strides=(2, 2), name='conv12')(l)
        l = Activation('relu', name='conv12_relu')(l)
        l = BatchNormalization(name='conv12_bn')(l)

        # conv21
        l = Conv2D(128, (3, 3), padding='same', name='conv21')(l)
        l = Activation('relu', name='conv21_relu')(l)
        l = BatchNormalization(name='conv21_bn')(l)

        # conv22
        l = Conv2D(128, (3, 3), padding='same', strides=(2, 2), name='conv22')(l)
        l = Activation('relu', name='conv22_relu')(l)
        l = BatchNormalization(name='conv22_bn')(l)

        # out
        l = GlobalAvgPool2D(name='avg_pool')(l)
        l = Dense(10, name='fc1')(l)
        outputs = l = Activation('softmax', name='out')(l)

        return inputs, outputs

The CNN with deformable convolutions added:

    # ConvOffset2D is the offset layer shown earlier; the Keras layers are the
    # same ones used by get_cnn.
    def get_deform_cnn(trainable):
        inputs = l = Input((28, 28, 1), name='input')

        # conv11
        l = Conv2D(32, (3, 3), padding='same', name='conv11', trainable=trainable)(l)
        l = Activation('relu', name='conv11_relu')(l)
        l = BatchNormalization(name='conv11_bn')(l)

        # conv12
        l_offset = ConvOffset2D(32, name='conv12_offset')(l)
        l = Conv2D(64, (3, 3), padding='same', strides=(2, 2), name='conv12', trainable=trainable)(l_offset)
        l = Activation('relu', name='conv12_relu')(l)
        l = BatchNormalization(name='conv12_bn')(l)

        # conv21
        l_offset = ConvOffset2D(64, name='conv21_offset')(l)
        l = Conv2D(128, (3, 3), padding='same', name='conv21', trainable=trainable)(l_offset)
        l = Activation('relu', name='conv21_relu')(l)
        l = BatchNormalization(name='conv21_bn')(l)

        # conv22
        l_offset = ConvOffset2D(128, name='conv22_offset')(l)
        l = Conv2D(128, (3, 3), padding='same', strides=(2, 2), name='conv22', trainable=trainable)(l_offset)
        l = Activation('relu', name='conv22_relu')(l)
        l = BatchNormalization(name='conv22_bn')(l)

        # out
        l = GlobalAvgPool2D(name='avg_pool')(l)
        l = Dense(10, name='fc1', trainable=trainable)(l)
        outputs = l = Activation('softmax', name='out')(l)

        return inputs, outputs

Training command:

python scaled_mnist.py

Training results

Number of parameters per layer

[Figure: per-layer parameter counts (image not available)]
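
The weight count of each offset layer can be worked out from the `ConvOffset2D` definition: a 3x3 convolution without bias that maps `filters` input channels to `2 * filters` output channels. The short calculation below is derived from that definition only and is not a reproduction of the figure; `conv_offset_params` is a hypothetical helper name.

```python
def conv_offset_params(in_channels, kernel=(3, 3)):
    """Weights of a bias-free conv mapping in_channels -> 2 * in_channels."""
    return kernel[0] * kernel[1] * in_channels * (2 * in_channels)


# Offset layers of get_deform_cnn and the channel count each one receives.
counts = {
    name: conv_offset_params(channels)
    for name, channels in [
        ("conv12_offset", 32),
        ("conv21_offset", 64),
        ("conv22_offset", 128),
    ]
}
# counts == {'conv12_offset': 18432, 'conv21_offset': 73728, 'conv22_offset': 294912}
```

The offset layers account for all of the extra weights in the deformable network, since the other layers are the same as in `get_cnn`.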

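Standard convolution: training and test results

[Figure: training and test results with standard convolution (image not available)]

Deformable convolution: training and test results

[Figure: training and test results with deformable convolution (image not available)]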

Code adapted from: https://github.com/felixlaumon/deform-conv