TensorFlow cifar10_multi_gpu problem: Variable conv1/weights/ExponentialMovingAverage/ does not exist


[root@dl3 cifar10]# python cifar10_multi_gpu_train.py --num_gpus=2
Traceback (most recent call last):
  File "cifar10_multi_gpu_train.py", line 274, in <module>
    tf.app.run()
  File "/usr/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "cifar10_multi_gpu_train.py", line 270, in main
    train()
  File "cifar10_multi_gpu_train.py", line 211, in train
    variables_averages_op = variable_averages.apply(tf.trainable_variables())
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/moving_averages.py", line 367, in apply
    colocate_with_primary=True)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 113, in create_slot
    return _create_slot_var(primary, val, "", validate_shape, None, None)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 66, in _create_slot_var
    validate_shape=validate_shape)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1065, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 962, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 367, in get_variable
    validate_shape=validate_shape, use_resource=use_resource)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 352, in _true_getter
    use_resource=use_resource)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 682, in _get_single_variable
    "VarScope?" % name)
ValueError: Variable conv1/weights/ExponentialMovingAverage/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?


See the following solution:

http://stackoverflow.com/questions/41986583/tensorflow-multi-gpu-example-error-variable-conv1-weights-exponentialmovingaver



You can find the answer to your problem here: Issue 6220

You need to put:

    with tf.variable_scope(tf.get_variable_scope()):

in front of the loop that runs over your devices ...

so, do that:

    with tf.variable_scope(tf.get_variable_scope()):
        for i in xrange(FLAGS.num_gpus):
            with tf.device('/gpu:%d' % i):

The explanation is given in the link. Here is the quote:

When you do tf.get_variable_scope().reuse_variables() you set the current scope to reuse variables. If you call the optimizer in such scope, it's trying to reuse slot variables, which it cannot find, so it throws an error. If you put a scope around, the tf.get_variable_scope().reuse_variables() only affects that scope, so when you exit it, you're back in the non-reusing mode, the one you want.
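The mechanism described in the quote can be illustrated without TensorFlow itself. Below is a minimal toy model of the variable-scope reuse flag (the class names `VarScope` and `ScopeGuard` are inventions for this sketch, not real TensorFlow APIs): `get_variable` creates a variable when reuse is off but only looks up existing ones when reuse is on, and entering a scope saves the reuse flag so that `reuse_variables()` called inside does not leak out.

```python
# Toy model of TF1's variable-scope reuse mechanic.
# Simplified illustration only -- not the real TensorFlow implementation.

class VarScope:
    def __init__(self):
        self._vars = {}      # name -> value
        self._reuse = False  # like tf.get_variable_scope().reuse

    def reuse_variables(self):
        # Once set, get_variable may only look up existing names.
        self._reuse = True

    def get_variable(self, name, initial=0.0):
        if self._reuse:
            if name not in self._vars:
                # The failure mode from the traceback: the moving-average
                # slot variable was never created, and reuse mode forbids
                # creating it now.
                raise ValueError("Variable %s does not exist" % name)
            return self._vars[name]
        self._vars[name] = initial
        return self._vars[name]


class ScopeGuard:
    # Mimics `with tf.variable_scope(...)`: restores the outer reuse
    # flag on exit, so reuse_variables() inside the block stays local.
    def __init__(self, scope):
        self.scope = scope

    def __enter__(self):
        self._saved_reuse = self.scope._reuse
        return self.scope

    def __exit__(self, *exc):
        self.scope._reuse = self._saved_reuse


scope = VarScope()
scope.get_variable("conv1/weights")         # tower 0 creates the variable
with ScopeGuard(scope):
    scope.reuse_variables()
    scope.get_variable("conv1/weights")     # tower 1 reuses it: fine
# Outside the guard, reuse is back off, so the optimizer's slot
# variable can still be created:
scope.get_variable("conv1/weights/ExponentialMovingAverage")
```

Without the `ScopeGuard`, the reuse flag would still be set when the moving-average slot variable is requested, and the lookup would raise exactly the `ValueError` shown in the traceback.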

Hope that helps, let me know if I should clarify more.