Keras on CIFAR-10: network structure and code reaching 90% test accuracy

Source: Internet | Editor: 程序博客网 | Date: 2024/06/05 15:04

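I recently started learning ML and have been experimenting with CIFAR-10. Most of the code you find online only reaches 60%-80% accuracy. I made some fairly haphazard modifications on top of it: overfitting starts to appear once training accuracy reaches about 93%, and the final test accuracy ends up a little above 90%. I will keep trying to push the accuracy higher; anyone who wants this network is welcome to take it and play with it.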

Keras download and installation: see the Keras installation documentation.

Results first:

Epoch 768/2000

250/250 [==============================] - 543s - loss: 0.1814 - acc: 0.9418 - val_loss: 0.3165 - val_acc: 0.9045
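To read the numbers in that line more easily, here is a minimal sketch that pulls the metrics out of a Keras progress-bar line and computes the gap between training and validation accuracy. The helper name parse_metrics is hypothetical and not part of Keras; the log line is copied from the output above. Note that the script passes the test set as validation_data, so val_acc here is the test accuracy.

```python
import re

# Log line copied verbatim from the training output above.
log_line = ("250/250 [==============================] - 543s - loss: 0.1814 "
            "- acc: 0.9418 - val_loss: 0.3165 - val_acc: 0.9045")


def parse_metrics(line):
    """Extract 'name: value' pairs from a Keras progress-bar line.

    Hypothetical helper for illustration; it is not a Keras function.
    """
    return {name: float(value)
            for name, value in re.findall(r"(\w+): ([0-9.]+)", line)}


metrics = parse_metrics(log_line)

# Difference between training accuracy and validation (test) accuracy.
gap = metrics["acc"] - metrics["val_acc"]

print(metrics)
print("train/validation accuracy gap: %.4f" % gap)
```

A gap of a few percentage points with val_loss well above loss is consistent with the mild overfitting described in the introduction.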

Please ignore my training time; this was trained on a CPU, after all... At 500+ s per epoch, guess how long 2000 epochs would take? T.T
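As a rough answer to that question, the arithmetic can be sketched as follows. It assumes every epoch takes the same 543 s shown in the log line, which is only an approximation; the 50,000 training images are the standard CIFAR-10 training split.

```python
# Figures from the log above; a constant per-epoch time is an assumption.
seconds_per_epoch = 543
planned_epochs = 2000
reached_epoch = 768

train_samples = 50000   # size of the CIFAR-10 training set
batch_size = 200        # value used in the script

# Matches the "250/250" step counter in the progress bar.
steps_per_epoch = train_samples // batch_size


def to_days(seconds):
    """Convert a duration in seconds to days."""
    return seconds / 86400.0


total_days = to_days(seconds_per_epoch * planned_epochs)
elapsed_days = to_days(seconds_per_epoch * reached_epoch)

print("steps per epoch: %d" % steps_per_epoch)
print("all 2000 epochs: about %.1f days" % total_days)
print("up to epoch 768: about %.1f days" % elapsed_days)
```

Under that assumption the full 2000-epoch run would take roughly twelve and a half days on this CPU, and the result shown above was reached after a little under five days.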


Without further ado, here is the code.

Overwrite the original cifar10_cnn.py in the Keras examples directory (keras/examples/cifar10_cnn.py) with the following code.

'''Train a simple deep CNN on the CIFAR10 small images dataset.

GPU run command with Theano backend (with TensorFlow, the GPU is automatically used):
    THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatx=float32 python cifar10_cnn.py

It gets down to 0.65 test logloss in 25 epochs, and down to 0.55 after 50 epochs.
(it's still underfitting at that point, though).
'''

from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D

batch_size = 200
num_classes = 10
epochs = 2000
data_augmentation = True

# The data, shuffled and split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()

model.add(Conv2D(32, (3, 3), padding='same',
                 input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(128, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(128, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(256, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(256, (1, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

# initiate RMSprop optimizer
opt = keras.optimizers.rmsprop(lr=0.0001, decay=1e-6)

# Let's train the model using RMSprop
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test),
              shuffle=True)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False)  # randomly flip images

    # Compute quantities required for feature-wise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(datagen.flow(x_train, y_train,
                                     batch_size=batch_size),
                        steps_per_epoch=x_train.shape[0] // batch_size,
                        epochs=epochs,
                        validation_data=(x_test, y_test))
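
To see what the stacked layers do to the image size, here is a small sketch in plain Python (no Keras needed) that traces the feature-map shape through the convolution and pooling layers of the script. It assumes stride-1 convolutions and the Keras defaults: 'valid' padding when padding is not given, and 2x2 non-overlapping max pooling. The helpers conv_out, pool_out and trace are hypothetical names used only for illustration.

```python
def conv_out(size, kernel, padding):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool


# (type, filters, kernel size, padding) in the order used by the script.
# Activation and Dropout layers are omitted because they keep the shape.
LAYERS = [
    ("conv", 32, 3, "same"), ("conv", 32, 3, "valid"),
    ("conv", 64, 3, "same"), ("conv", 64, 3, "valid"), ("pool",),
    ("conv", 128, 3, "same"), ("conv", 128, 3, "valid"), ("pool",),
    ("conv", 256, 3, "same"), ("conv", 256, 1, "valid"), ("pool",),
]


def trace(size=32, channels=3):
    """Return the (height, width, channels) shape after each layer."""
    shapes = []
    for layer in LAYERS:
        if layer[0] == "conv":
            _, channels, kernel, padding = layer
            size = conv_out(size, kernel, padding)
        else:
            size = pool_out(size)
        shapes.append((size, size, channels))
    return shapes


shapes = trace()
for layer, shape in zip(LAYERS, shapes):
    print(layer[0], shape)

height, width, depth = shapes[-1]
flat = height * width * depth            # input size of the first Dense layer
dense_params = flat * 512 + 512          # weights plus biases of Dense(512)
print("Flatten output:", flat)
print("Dense(512) parameters:", dense_params)
```

The 32x32 input shrinks to 3x3x256 before Flatten, so the first fully connected layer sees 2304 features.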