Problems encountered while learning Keras

Source: Internet | Editor: 程序博客网 | Time: 2024/05/24 06:29

1. How should I cite Keras?

If Keras has been useful in your research, please cite it in your publications using the following BibTeX entry:

@misc{chollet2015keras,
  author = {Chollet, François},
  title = {Keras},
  year = {2015},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/fchollet/keras}}
}

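2. How do I run Keras on a GPU?

If you are running on the TensorFlow backend, your code will automatically run on the GPU whenever an available GPU is detected. If you are running on the Theano backend, you can use one of the following methods:

Method 1: use Theano flags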

THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py

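The name 'gpu' above may need to be changed to match your device identifier (e.g. 'gpu0', 'gpu1').

Method 2: set up your .theanorc: http://deeplearning.net/software/theano/library/config.html

Method 3: manually set theano.config.device and theano.config.floatX at the beginning of your code: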

import theano
theano.config.device = 'gpu'
theano.config.floatX = 'float32'
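
As a variant of Method 1, the same flags can be set from inside the script, provided this happens before Theano is first imported, since Theano reads THEANO_FLAGS at import time. The helper name `set_theano_flags` below is hypothetical, and the sketch only sets the environment variable; it does not import Theano itself.

```python
import os


def set_theano_flags(device="gpu", floatX="float32"):
    """Set THEANO_FLAGS for this process (hypothetical helper).

    Must be called before the first `import theano`.
    """
    flags = "device={},floatX={}".format(device, floatX)
    os.environ["THEANO_FLAGS"] = flags
    return flags


flags = set_theano_flags(device="gpu0")
print(flags)
# import theano  # a later import would pick up the flags set above
```

This keeps the device choice next to the code that depends on it, at the cost of hard-coding it in the script.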

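3. How do I save a Keras model?

Using pickle or cPickle (both are Python modules) to save a Keras model is not recommended.

If you only need to save the architecture of a model, and not its weights, you can do: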

# save as JSON
json_string = model.to_json()

# save as YAML
yaml_string = model.to_yaml()

You can then build a fresh model from this data:

# model reconstruction from JSON:
from keras.models import model_from_json
model = model_from_json(json_string)

# model reconstruction from YAML:
from keras.models import model_from_yaml
model = model_from_yaml(yaml_string)

If you need to save the weights of a model, you can store them in HDF5:

model.save_weights('my_model_weights.h5')

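Assuming you have code that instantiates a model with the same architecture, you can then load the saved weights into it (load_weights restores the weights only, not the architecture):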

model.load_weights('my_model_weights.h5')

Putting the two together, you can serialize both the architecture and the weights, then rebuild the model elsewhere:

# save the architecture as JSON and the weights as HDF5
json_string = model.to_json()
with open('my_model_architecture.json', 'w') as f:
    f.write(json_string)
model.save_weights('my_model_weights.h5')

# elsewhere...
from keras.models import model_from_json
with open('my_model_architecture.json') as f:
    model = model_from_json(f.read())
model.load_weights('my_model_weights.h5')

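4. Why is the training loss much higher than the testing loss?

A Keras model has two modes: training and testing. Regularization mechanisms such as dropout and L1/L2 weight regularization are turned off at testing time.

In addition, the training loss is the average of the losses over each batch of training data. Because the model keeps changing during the epoch, the loss over the first batches is generally higher than over the last batches. The testing loss for an epoch, by contrast, is computed with the model as it stands at the very end of the epoch, which results in a lower loss.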
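
A toy illustration of the second point, using made-up per-batch losses rather than output from a real model: the reported training loss is the mean over all batches in the epoch, while a loss evaluated at the end of the epoch reflects only the final weights.

```python
# Hypothetical per-batch losses within one epoch; the model improves as it trains.
batch_losses = [1.00, 0.80, 0.60, 0.40]

# What is reported as the epoch's training loss: the mean over all batches.
reported_train_loss = sum(batch_losses) / len(batch_losses)

# A loss evaluated at the end of the epoch uses only the final weights,
# so it is closer to the last batch's loss than to the epoch mean.
end_of_epoch_loss = batch_losses[-1]

print(reported_train_loss > end_of_epoch_loss)
```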

5. How can I see the output of an intermediate layer?

You can build a Keras function that returns the output of a given layer for a given input:

from keras import backend as K

# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input],
                                  [model.layers[3].get_output(train=False)])
layer_output = get_3rd_layer_output([X])[0]

# with a Graph model
get_conv_layer_output = K.function([model.inputs[i].input for i in model.input_order],
                                   [model.nodes['conv'].get_output(train=False)])
conv_output = get_conv_layer_output([input_data_dict[i] for i in model.input_order])[0]

Similarly, you can build a Theano or TensorFlow function directly.

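6. How can I handle data that does not fit in memory?

You can train batch by batch with model.train_on_batch(X, y) and model.test_on_batch(X, y).

See the models documentation (link not yet available).

Alternatively, you can write a generator that yields batches of training data and pass it to model.fit_generator(data_generator, samples_per_epoch, nb_epoch). For an example, see: https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py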
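
A minimal sketch of such a generator in plain Python. fit_generator expects a generator that keeps yielding (X_batch, y_batch) pairs; here the hypothetical `load_chunk` function stands in for whatever reads one piece of your dataset from disk, so that only one chunk is held in memory at a time.

```python
import itertools


def load_chunk(chunk_id):
    """Hypothetical loader: stands in for reading one chunk from disk."""
    start = chunk_id * 4
    X = [[float(i)] for i in range(start, start + 4)]
    y = [i % 2 for i in range(start, start + 4)]
    return X, y


def batch_generator(num_chunks, batch_size):
    """Yield (X_batch, y_batch) tuples forever, one chunk in memory at a time."""
    while True:  # loop indefinitely; the caller decides when an epoch ends
        for chunk_id in range(num_chunks):
            X, y = load_chunk(chunk_id)
            for i in range(0, len(X), batch_size):
                yield X[i:i + batch_size], y[i:i + batch_size]


# Peek at the first three batches without exhausting the generator.
first = list(itertools.islice(batch_generator(num_chunks=2, batch_size=2), 3))
print(first[0])
```

A generator like this would be passed as the first argument to model.fit_generator; real code would yield NumPy arrays rather than Python lists.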

7. How can I stop training when the loss is no longer decreasing?

You can use the EarlyStopping callback:

from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss', patience=2)
model.fit(X, y, validation_split=0.2, callbacks=[early_stopping])

For more information, see the callbacks documentation (link not yet available).
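
To make the role of patience concrete, here is a simplified sketch of the idea in plain Python. It is not Keras's actual implementation, whose exact counting may differ between versions; the function name `stopping_epoch` and the loss values are made up for illustration.

```python
def stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified patience rule: stop once the monitored value has failed to
    improve on its best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0  # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None  # never ran out of patience


losses = [0.9, 0.7, 0.6, 0.65, 0.61, 0.5]
print(stopping_epoch(losses, patience=2))
```

With these numbers a patience of 2 stops before the late improvement to 0.5 is seen, while a patience of 3 would let training continue; a larger patience tolerates noisier validation curves at the cost of extra epochs.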

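8. How is the validation split computed?

If you set the validation_split argument of model.fit to, for example, 0.1, the validation data will be the last 10% of the data. If you set it to 0.25, it will be the last 25% of the data.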
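
The slicing described above can be sketched in a few lines of plain Python. This mirrors the described behaviour under the assumption that the split point is rounded down; `split_for_validation` is a hypothetical name, not a Keras function.

```python
def split_for_validation(samples, validation_split):
    """Hold out the last `validation_split` fraction of `samples`."""
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


data = list(range(20))
train, val = split_for_validation(data, 0.25)
print(len(train), val)
```

Because the held-out samples always come from the tail, data that is sorted (for instance by class) should be shuffled before calling fit, otherwise the validation set will not be representative.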

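9. Is the data shuffled during training?

Yes, if the shuffle argument of model.fit is set to True, the training data is randomly shuffled at each epoch. Validation data is not shuffled.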
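
As a rough picture of what per-epoch shuffling means, the sketch below draws a fresh permutation of the sample indices for every epoch. It illustrates the idea only and is not taken from Keras; `epoch_orders` and its `seed` parameter are hypothetical.

```python
import random


def epoch_orders(num_samples, num_epochs, shuffle=True, seed=0):
    """Return the order in which training samples are visited in each epoch."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    orders = []
    for _ in range(num_epochs):
        indices = list(range(num_samples))
        if shuffle:
            rng.shuffle(indices)  # a fresh permutation every epoch
        orders.append(indices)
    return orders


for order in epoch_orders(num_samples=6, num_epochs=2):
    print(order)
```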

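10. How can I record the training loss, validation loss, and accuracy at each epoch?

The model.fit method returns a History callback, whose history attribute holds the lists of successive losses and accuracies.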

hist = model.fit(X, y, validation_split=0.2)
print(hist.history)
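
Once training has finished, the history dictionary can be inspected like any other dict of lists. The sketch below uses a stand-in dictionary with made-up numbers in place of a real hist.history, and a hypothetical helper `best_epoch` to find the epoch with the lowest value of a metric.

```python
# Stand-in for hist.history: metric name -> list with one value per epoch.
# The numbers are made up for illustration.
history = {
    "loss": [0.90, 0.60, 0.45, 0.40],
    "val_loss": [0.95, 0.70, 0.72, 0.75],
}


def best_epoch(history, metric="val_loss"):
    """Return (epoch_index, value) for the lowest value of `metric`."""
    values = history[metric]
    epoch = min(range(len(values)), key=values.__getitem__)
    return epoch, values[epoch]


print(best_epoch(history))
print(best_epoch(history, metric="loss"))
```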

11. How do I use stateful RNNs? (Not yet written; to be added after further reading.)
