CS231n Assignment 2: Q4


ConvNet on CIFAR-10

The assignment code has been uploaded to my GitHub: https://github.com/jingshuangliu22/cs231n. Feedback, discussion, and corrections are welcome.

ConvolutionalNetworks.ipynb

Convolutional Networks

X_val: (1000, 3, 32, 32)
X_train: (49000, 3, 32, 32)
X_test: (1000, 3, 32, 32)
y_val: (1000,)
y_train: (49000,)
y_test: (1000,)

Convolution: Naive forward pass

Testing conv_forward_naive
difference: 2.21214764175e-08
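The tiny difference above comes from comparing the loop-based forward pass against a reference solution. A minimal sketch of what `conv_forward_naive` has to do (the function name follows the assignment's API; this implementation is my own and omits the cache that the real layer returns):

```python
import numpy as np

def conv_forward_naive(x, w, b, stride=1, pad=1):
    """Naive convolution: explicit loops over every output position.
    x: (N, C, H, W) inputs, w: (F, C, HH, WW) filters, b: (F,) biases."""
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    H_out = 1 + (H + 2 * pad - HH) // stride
    W_out = 1 + (W + 2 * pad - WW) // stride
    xp = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)))  # zero-pad H and W
    out = np.zeros((N, F, H_out, W_out))
    for n in range(N):
        for f in range(F):
            for i in range(H_out):
                for j in range(W_out):
                    hs, ws = i * stride, j * stride
                    win = xp[n, :, hs:hs + HH, ws:ws + WW]
                    out[n, f, i, j] = np.sum(win * w[f]) + b[f]
    return out
```

Four nested loops make the cost obvious, which is exactly what the "fast layers" section below measures against.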

Aside: Image processing via convolutions

[Figure: outputs of the example image-processing convolutions]

Convolution: Naive backward pass

(4, 3, 5, 5)
Testing conv_backward_naive function
dx error: 2.86996555609e-09
dw error: 8.89199816094e-11
db error: 1.28608966847e-11
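The backward pass mirrors the forward loops: each output element was a dot product of one window with one filter, so its upstream gradient accumulates into the same window slice (for `dx`) and the same filter (for `dw`). A sketch under the same assumptions as above (my own implementation, simplified to take `x` and `w` directly instead of a cache):

```python
import numpy as np

def conv_backward_naive(dout, x, w, stride=1, pad=1):
    """Gradients of the naive convolution. dout: (N, F, H_out, W_out)."""
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    _, _, H_out, W_out = dout.shape
    xp = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    dxp = np.zeros_like(xp)
    dw = np.zeros_like(w)
    db = dout.sum(axis=(0, 2, 3))  # bias touches every output position
    for n in range(N):
        for f in range(F):
            for i in range(H_out):
                for j in range(W_out):
                    hs, ws = i * stride, j * stride
                    g = dout[n, f, i, j]
                    dw[f] += g * xp[n, :, hs:hs + HH, ws:ws + WW]
                    dxp[n, :, hs:hs + HH, ws:ws + WW] += g * w[f]
    dx = dxp[:, :, pad:H + pad, pad:W + pad]  # strip the padding rows/cols
    return dx, dw, db
```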

Max pooling: Naive forward

Testing max_pool_forward_naive function:
difference: 4.16666651573e-08
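Max pooling needs no parameters, so the naive forward pass is just a sliding maximum. A sketch (again my own version of the function the assignment names):

```python
import numpy as np

def max_pool_forward_naive(x, pool_h=2, pool_w=2, stride=2):
    """Max over each pool_h x pool_w window, per sample and channel."""
    N, C, H, W = x.shape
    H_out = 1 + (H - pool_h) // stride
    W_out = 1 + (W - pool_w) // stride
    out = np.zeros((N, C, H_out, W_out))
    for i in range(H_out):
        for j in range(W_out):
            hs, ws = i * stride, j * stride
            win = x[:, :, hs:hs + pool_h, ws:ws + pool_w]
            out[:, :, i, j] = win.max(axis=(2, 3))  # reduce over the window only
    return out
```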

Max pooling: Naive backward

Testing max_pool_backward_naive function:
dx error: 3.27562240181e-12
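Backward through max pooling routes each upstream gradient entirely to the position that won the max; everything else in the window gets zero. A sketch consistent with the forward pass above:

```python
import numpy as np

def max_pool_backward_naive(dout, x, pool_h=2, pool_w=2, stride=2):
    """Route dout[n, c, i, j] to the argmax of its pooling window."""
    N, C, H, W = x.shape
    _, _, H_out, W_out = dout.shape
    dx = np.zeros_like(x, dtype=float)
    for n in range(N):
        for c in range(C):
            for i in range(H_out):
                for j in range(W_out):
                    hs, ws = i * stride, j * stride
                    win = x[n, c, hs:hs + pool_h, ws:ws + pool_w]
                    mask = (win == win.max())  # 1 at the max, 0 elsewhere
                    dx[n, c, hs:hs + pool_h, ws:ws + pool_w] += mask * dout[n, c, i, j]
    return dx
```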

Fast layers

Testing conv_forward_fast:
Naive: 5.081940s
Fast: 0.018507s
Speedup: 274.595512x
Difference: 1.16929567919e-11

Testing conv_backward_fast:
Naive: 5.766440s
Fast: 0.016649s
Speedup: 346.353382x
dx difference: 5.50346117301e-11
dw difference: 1.43192497573e-12
db difference: 6.05276059469e-15

Testing pool_forward_fast:
Naive: 0.346517s
fast: 0.003890s
speedup: 89.078022x
difference: 0.0

Testing pool_backward_fast:
Naive: 0.985550s
speedup: 63.318669x
dx difference: 0.0
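The 100-300x speedups come from replacing the per-position loops with a single matrix multiply. The standard trick (im2col) copies every receptive field into a column, so the convolution becomes one GEMM. A sketch of the idea, with a loop-built column matrix for clarity (the assignment's actual fast layers use a Cython im2col, so this only illustrates the reshaping, not the full speedup):

```python
import numpy as np

def conv_forward_im2col(x, w, b, stride=1, pad=1):
    """Convolution as a matrix multiply over im2col columns."""
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    H_out = 1 + (H + 2 * pad - HH) // stride
    W_out = 1 + (W + 2 * pad - WW) // stride
    xp = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    cols = np.zeros((C * HH * WW, N * H_out * W_out))
    idx = 0
    for n in range(N):
        for i in range(H_out):
            for j in range(W_out):
                hs, ws = i * stride, j * stride
                cols[:, idx] = xp[n, :, hs:hs + HH, ws:ws + WW].ravel()
                idx += 1
    # One matrix multiply replaces all the per-window dot products.
    out = w.reshape(F, -1) @ cols + b[:, None]       # (F, N*H_out*W_out)
    return out.reshape(F, N, H_out, W_out).transpose(1, 0, 2, 3)
```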

Convolutional “sandwich” layers

Testing conv_relu_pool
dx error: 6.70230258127e-09
dw error: 1.09056690493e-08
db error: 3.01773123551e-11

Testing conv_relu:
dx error: 3.81650148345e-09
dw error: 4.99662214526e-10
db error: 9.06861817176e-12
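A "sandwich" layer is just function composition: chain the forward passes, keep every cache, and unwind the backward passes in reverse order. A toy sketch of the pattern, where a hypothetical elementwise "scale" layer stands in for the convolution so the example stays self-contained:

```python
import numpy as np

def scale_forward(x, w):
    return x * w, (x, w)                 # out, cache

def scale_backward(dout, cache):
    x, w = cache
    return dout * w, np.sum(dout * x)    # dx, dw

def relu_forward(x):
    return np.maximum(0, x), x           # cache the pre-activation

def relu_backward(dout, x):
    return dout * (x > 0)                # gradient flows only where x > 0

def scale_relu_forward(x, w):
    a, scale_cache = scale_forward(x, w)
    out, relu_cache = relu_forward(a)
    return out, (scale_cache, relu_cache)

def scale_relu_backward(dout, cache):
    scale_cache, relu_cache = cache
    da = relu_backward(dout, relu_cache)
    return scale_backward(da, scale_cache)
```

`conv_relu_pool` works exactly the same way with three layers instead of two; the composed gradient errors above show the chaining introduces no extra error.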

Three-layer ConvNet

Sanity check loss

Initial loss (no regularization): 2.3025852649
Initial loss (with regularization): 2.50908390269
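The no-regularization value is exactly what it should be: with small random weights the softmax spreads probability roughly uniformly over the 10 CIFAR-10 classes, so the expected initial loss is -log(1/10), and adding L2 regularization should bump it slightly upward, as seen above. A quick check:

```python
import numpy as np

# Expected softmax loss for 10 classes under a uniform prediction.
expected = -np.log(1.0 / 10)
print(round(expected, 10))  # 2.302585093, matching the printed 2.3025852649
```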

Gradient check

W1 max relative error: 1.569717e-02
W2 max relative error: 3.066828e-02
W3 max relative error: 1.419102e-05
b1 max relative error: 8.317148e-05
b2 max relative error: 2.353250e-05
b3 max relative error: 1.440009e-09
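These relative errors come from comparing analytic gradients against a centered-difference numeric gradient. A sketch of the standard checker (the cs231n utilities provide the real one; this is my own minimal version):

```python
import numpy as np

def eval_numerical_gradient(f, x, h=1e-5):
    """Centered-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        ix = it.multi_index
        old = x[ix]
        x[ix] = old + h
        fxph = f(x)                       # f(x + h) in this coordinate
        x[ix] = old - h
        fxmh = f(x)                       # f(x - h)
        x[ix] = old                       # restore
        grad[ix] = (fxph - fxmh) / (2 * h)
        it.iternext()
    return grad

def rel_error(a, b):
    """Max relative error, guarded against division by zero."""
    return np.max(np.abs(a - b) / np.maximum(1e-8, np.abs(a) + np.abs(b)))
```

Errors around 1e-2 for W1 and W2 are on the high side but typical for conv layers checked in float64 with kinked ReLUs; the 1e-5 to 1e-9 values elsewhere are comfortably passing.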

Overfit small data

(Iteration 1 / 40) loss: 2.300992
(Epoch 0 / 20) train acc: 0.150000; val_acc: 0.140000
(Epoch 1 / 20) train acc: 0.250000; val_acc: 0.135000
(Epoch 2 / 20) train acc: 0.430000; val_acc: 0.133000
(Epoch 3 / 20) train acc: 0.600000; val_acc: 0.179000
(Epoch 4 / 20) train acc: 0.720000; val_acc: 0.208000
(Epoch 5 / 20) train acc: 0.700000; val_acc: 0.185000
(Epoch 6 / 20) train acc: 0.740000; val_acc: 0.199000
(Epoch 7 / 20) train acc: 0.850000; val_acc: 0.217000
(Epoch 8 / 20) train acc: 0.900000; val_acc: 0.234000
(Epoch 9 / 20) train acc: 0.930000; val_acc: 0.218000
(Epoch 10 / 20) train acc: 0.990000; val_acc: 0.237000
(Epoch 11 / 20) train acc: 0.980000; val_acc: 0.223000
(Epoch 12 / 20) train acc: 1.000000; val_acc: 0.212000
(Epoch 13 / 20) train acc: 1.000000; val_acc: 0.207000
(Epoch 14 / 20) train acc: 1.000000; val_acc: 0.218000
(Epoch 15 / 20) train acc: 1.000000; val_acc: 0.221000
(Epoch 16 / 20) train acc: 1.000000; val_acc: 0.216000
(Epoch 17 / 20) train acc: 1.000000; val_acc: 0.219000
(Epoch 18 / 20) train acc: 1.000000; val_acc: 0.220000
(Epoch 19 / 20) train acc: 1.000000; val_acc: 0.223000
(Epoch 20 / 20) train acc: 1.000000; val_acc: 0.221000

[Figure: loss and train/val accuracy curves for the small-data overfitting run]

Train the net

(Iteration 1 / 980) loss: 2.304626
(Epoch 0 / 1) train acc: 0.106000; val_acc: 0.112000
(Iteration 21 / 980) loss: 1.971010
(Iteration 41 / 980) loss: 2.064785
(Iteration 61 / 980) loss: 1.587921
(Iteration 81 / 980) loss: 1.989868
(Iteration 101 / 980) loss: 1.636903
(Iteration 121 / 980) loss: 1.483881
(Iteration 141 / 980) loss: 1.399884
(Iteration 161 / 980) loss: 1.288087
(Iteration 181 / 980) loss: 1.363215
(Iteration 201 / 980) loss: 1.224695
(Iteration 221 / 980) loss: 1.323481
(Iteration 241 / 980) loss: 1.800588
(Iteration 261 / 980) loss: 1.549284
(Iteration 281 / 980) loss: 1.376083
(Iteration 301 / 980) loss: 1.667215
(Iteration 321 / 980) loss: 1.431907
(Iteration 341 / 980) loss: 1.411308
(Iteration 361 / 980) loss: 1.442389
(Iteration 381 / 980) loss: 1.453209
(Iteration 401 / 980) loss: 1.348286
(Iteration 421 / 980) loss: 1.335179
(Iteration 441 / 980) loss: 1.409447
(Iteration 461 / 980) loss: 1.529341
(Iteration 481 / 980) loss: 1.310048
(Iteration 501 / 980) loss: 1.196834
(Iteration 521 / 980) loss: 1.229232
(Iteration 541 / 980) loss: 1.306282
(Iteration 561 / 980) loss: 1.322753
(Iteration 581 / 980) loss: 1.654557
(Iteration 601 / 980) loss: 1.192295
(Iteration 621 / 980) loss: 1.456525
(Iteration 641 / 980) loss: 1.350476
(Iteration 661 / 980) loss: 1.540975
(Iteration 681 / 980) loss: 1.200224
(Iteration 701 / 980) loss: 1.048157
(Iteration 721 / 980) loss: 1.253005
(Iteration 741 / 980) loss: 1.484741
(Iteration 761 / 980) loss: 0.905320
(Iteration 781 / 980) loss: 1.295154
(Iteration 801 / 980) loss: 1.316080
(Iteration 821 / 980) loss: 1.060246
(Iteration 841 / 980) loss: 1.157002
(Iteration 861 / 980) loss: 1.161670
(Iteration 881 / 980) loss: 1.318632
(Iteration 901 / 980) loss: 1.032028
(Iteration 921 / 980) loss: 1.356438
(Iteration 941 / 980) loss: 1.039454
(Iteration 961 / 980) loss: 1.228898
(Epoch 1 / 1) train acc: 0.579000; val_acc: 0.582000

Visualize Filters

[Figure: learned first-layer convolutional filters]

Spatial Batch Normalization

Spatial batch normalization: forward

Before spatial batch normalization:
Shape: (2, 3, 4, 5)
Means: [ 10.60650047 9.8014902 9.96205646]
Stds: [ 4.57393753 4.65675074 3.46200416]
After spatial batch normalization:
Shape: (2, 3, 4, 5)
Means: [ 4.88498131e-16 -1.78503046e-16 1.48839274e-16]
Stds: [ 0.99999976 0.99999977 0.99999958]
After spatial batch normalization (nontrivial gamma, beta):
Shape: (2, 3, 4, 5)
Means: [ 6. 7. 8.]
Stds: [ 2.99999928 3.99999908 4.99999791]

After spatial batch normalization (test-time):
means: [ 0.07765532 0.04435318 0.04660607 0.03627944]
stds: [ 1.01960419 1.00324775 1.00426909 1.03452781]

Spatial batch normalization: backward

dx error: 1.06390341758e-07
dgamma error: 8.05854582064e-12
dbeta error: 3.27560239896e-12
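Spatial batch normalization is vanilla batchnorm applied per channel, treating every (n, h, w) position as an independent sample of that channel. The whole trick is a reshape to (N*H*W, C) before normalizing. A sketch of the train-time forward pass (my own simplified version, without the running-mean bookkeeping the real layer needs for test time):

```python
import numpy as np

def spatial_batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each channel over the batch and all spatial positions."""
    N, C, H, W = x.shape
    x2 = x.transpose(0, 2, 3, 1).reshape(-1, C)   # (N*H*W, C): one row per position
    mu = x2.mean(axis=0)
    var = x2.var(axis=0)
    xhat = (x2 - mu) / np.sqrt(var + eps)
    out = gamma * xhat + beta                     # per-channel scale and shift
    return out.reshape(N, H, W, C).transpose(0, 3, 1, 2)
```

This reproduces the behavior in the logs above: unit gamma and zero beta give per-channel means near 0 and stds near 1, and nontrivial gamma/beta shift them to exactly those values.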
