Comparison of AlexNet and VGGNet



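AlexNet is a typical convolutional neural network (CNN). It consists of 5 convolutional layers, 2 fully connected layers, and a final label layer with 1000 nodes, each representing one ImageNet class. In 2012, Alex Krizhevsky, a student of Professor Geoffrey Hinton (University of Toronto, Google), a leading figure in deep learning, designed this 8-layer CNN and applied it to ImageNet image classification, cutting the error rate of the best algorithm at the time almost in half. The result drew strong attention from the computer vision community, and the paper marked a key turning point at which that community began to accept deep learning. The 8-layer CNN was subsequently named AlexNet.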

AlexNet architecture (figure):

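AlexNet convolution parameters (Input and Output are batch * channels * height * width; Kernel is output channels * input channels * height * width):

Layer  Input                  Kernel              Output                 Stride  Pad
-      256 * 3 * 227 * 227    48 * 3 * 11 * 11    256 * 48 * 55 * 55     -       -
-      256 * 48 * 27 * 27     128 * 48 * 5 * 5    256 * 128 * 27 * 27    -       -
-      256 * 128 * 13 * 13    192 * 128 * 3 * 3   256 * 192 * 13 * 13    -       -
-      256 * 192 * 13 * 13    192 * 192 * 3 * 3   256 * 192 * 13 * 13    -       -
-      256 * 192 * 13 * 13    192 * 192 * 3 * 3   256 * 192 * 13 * 13    -       -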
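The spatial sizes listed above (227, 55, 27, 13) can be reproduced with a short sketch. The stride and padding values below are assumptions taken from the standard AlexNet configuration, since they are not listed above, and the 3 x 3, stride-2 max pooling between stages is likewise assumed. `conv_out`, `trace_sizes`, and `ALEXNET_STAGES` are illustrative names, not part of any library.

```python
def conv_out(in_size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window."""
    return (in_size - kernel + 2 * pad) // stride + 1

# (name, kernel, stride, pad); values assumed from the standard AlexNet setup.
ALEXNET_STAGES = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
]

def trace_sizes(in_size, stages):
    """Return [(name, output_size), ...] for each stage, applied in order."""
    sizes = []
    for name, kernel, stride, pad in stages:
        in_size = conv_out(in_size, kernel, stride, pad)
        sizes.append((name, in_size))
    return sizes

alexnet_sizes = trace_sizes(227, ALEXNET_STAGES)
for name, size in alexnet_sizes:
    print(f"{name}: {size} x {size}")
```

Under these assumptions the trace gives 55 after conv1, 27 after pool1 and conv2, and 13 from pool2 onward, which matches the listed shapes.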

Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “ImageNet classification with deep convolutional neural networks.” Advances in Neural Information Processing Systems. 2012.

VGG-Net is also a CNN. It comes from Professor Andrew Zisserman's group at Oxford, and in ILSVRC 2014 it took first place in localization and second place in classification. VGG-Net differs from AlexNet in two ways. First, it uses more layers, typically 16 to 19, whereas AlexNet has only 8. Second, all of its convolutional layers use filters of the same size, 3 x 3.

VGGNet architecture (figure):

data_size=3*32*32

label_size=10

batch_size = 128

MomentumOptimizer(0.9)

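VGG convolution parameters (Input and Output are batch * channels * height * width; Kernel is output channels * input channels * height * width):

Layer  Input                  Kernel              Output                 Stride  Pad
-      256 * 3 * 224 * 224    64 * 3 * 3 * 3      256 * 64 * 222 * 222   -       -
-      256 * 64 * 222 * 222   64 * 64 * 3 * 3     256 * 64 * 220 * 220   -       -
-      256 * 64 * 110 * 110   128 * 64 * 3 * 3    256 * 128 * 108 * 108  -       -
-      256 * 128 * 108 * 108  128 * 128 * 3 * 3   256 * 128 * 106 * 106  -       -
-      256 * 128 * 58 * 58    256 * 128 * 3 * 3   256 * 256 * 56 * 56    -       -
-      256 * 256 * 56 * 56    256 * 256 * 3 * 3   256 * 256 * 54 * 54    -       -
-      256 * 256 * 54 * 54    256 * 256 * 3 * 3   256 * 256 * 52 * 52    -       -
-      256 * 256 * 52 * 52    256 * 256 * 3 * 3   256 * 256 * 52 * 52    -       -
-      256 * 256 * 26 * 26    512 * 256 * 3 * 3   256 * 512 * 24 * 24    -       -
-      256 * 512 * 24 * 24    512 * 512 * 3 * 3   256 * 512 * 22 * 22    -       -
-      256 * 512 * 22 * 22    512 * 512 * 3 * 3   256 * 512 * 20 * 20    -       -
-      256 * 512 * 20 * 20    512 * 512 * 3 * 3   256 * 512 * 18 * 18    -       -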
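Reading each Kernel entry as output channels * input channels * height * width, the number of weights in a layer is the product of the four factors. The following is a minimal sketch that compares a few of the kernels listed for the two networks. Bias terms are ignored, and `kernel_weights` is an illustrative helper rather than anything taken from either paper.

```python
from math import prod

def kernel_weights(shape):
    """Number of weights in a kernel given as (out_ch, in_ch, height, width)."""
    return prod(shape)

alexnet_first = (48, 3, 11, 11)  # first AlexNet kernel listed
vgg_first = (64, 3, 3, 3)        # first VGG kernel listed
vgg_last = (512, 512, 3, 3)      # last VGG kernel listed

print(kernel_weights(alexnet_first))  # 17424
print(kernel_weights(vgg_first))      # 1728
print(kernel_weights(vgg_last))       # 2359296
```

The small 3 x 3 filters keep the first VGG layer light, while the deeper layers grow because the channel counts grow, not the filter size.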

Simonyan, Karen, and Andrew Zisserman. “Very deep convolutional networks for large-scale image recognition.” arXiv preprint arXiv:1409.1556 (2014).

Relationship between Output_size and Input_size, Kernel_size, Padding, and Stride

$$\text{Out\_size} = \frac{\text{In\_size} - \text{Kernel\_size} + 2 \times \text{Pad\_size}}{\text{Stride}} + 1$$
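This relationship can be checked against the VGG shapes listed earlier with a small sketch. `output_size` is an illustrative helper. The floor division is an assumption for cases where the stride does not divide the numerator evenly. Stride 1 and padding 0 in the first two checks are inferred from the listed sizes (224 to 222 to 220), not stated in the source.

```python
def output_size(in_size, kernel_size, pad_size=0, stride=1):
    """Out_size = (In_size - Kernel_size + 2 * Pad_size) // Stride + 1."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return (in_size - kernel_size + 2 * pad_size) // stride + 1

# First two VGG rows: a 3 x 3 kernel with no padding shrinks each side by 2.
print(output_size(224, 3))  # 222
print(output_size(222, 3))  # 220

# A 3 x 3 kernel with padding 1 and stride 1 preserves the size.
print(output_size(13, 3, pad_size=1, stride=1))  # 13
```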

0 0