Running Google's im2txt (Show and Tell, Inception v3) on TensorFlow

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 21:02

My setup: Ubuntu 14.04 + GPU

TensorFlow 1.0.1


Related paper: "Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge"

https://arxiv.org/abs/1609.06647

The code was open-sourced only last September.

GitHub: https://github.com/tensorflow/models/tree/master/im2txt#generating-captions


Following the GitHub README, install the prerequisites first.

Bazel
Following the official Bazel instructions:

$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
$ sudo apt-get update && sudo apt-get install bazel
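Error:
Some packages could not be installed. If you are using the unstable distribution, this may be because the system cannot reach the state you requested. Some packages you need in this release may not have been created yet, or may have been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 bazel : Depends: google-jdk but it is not installable
                  java8-jdk but it is not installable
                  java8-sdk but it is not installable
                  oracle-java8-installer but it is not installable
E: Unable to correct problems, you have held broken packages.

I tried countless fixes from the web and switched package mirrors several times, all to no effect, until I noticed this line on the official site: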
If you want to use the JDK 7, please replace jdk1.8 with jdk1.7 and if you want to install the testing version of Bazel, replace stable with testing.
 
Presumably this is because my system is Ubuntu 14.04, which uses JDK 7:
$ update-java-alternatives -l
# java-1.7.0-openjdk-amd64 1071 /usr/lib/jvm/java-1.7.0-openjdk-amd64

Continuing with the official instructions, this time with jdk1.7:
$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.7" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
$ sudo apt-get update && sudo apt-get install bazel
$ sudo apt-get upgrade bazel

Check that the installation succeeded:
$ /usr/bin/bazel version
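The JDK mismatch behind the apt error can be checked before writing the source line. The following is a minimal sketch, assuming the `update-java-alternatives -l` output has the shape shown in the listing above; the helper names `jdk_suffix` and `bazel_apt_line` are hypothetical and not part of Bazel.

```python
import re


def jdk_suffix(alternatives_output):
    """Return 'jdk1.7' or 'jdk1.8' for the first JDK listed, or None."""
    for line in alternatives_output.splitlines():
        match = re.search(r"java-1\.([78])", line)
        if match:
            return "jdk1." + match.group(1)
    return None


def bazel_apt_line(suffix, channel="stable"):
    """Build the apt source line quoted in the Bazel instructions."""
    return ("deb [arch=amd64] http://storage.googleapis.com/bazel-apt "
            "{} {}".format(channel, suffix))


# Sample line in the format printed by `update-java-alternatives -l`.
sample = "java-1.7.0-openjdk-amd64 1071 /usr/lib/jvm/java-1.7.0-openjdk-amd64"
print(bazel_apt_line(jdk_suffix(sample)))
```

The printed line is what would go into /etc/apt/sources.list.d/bazel.list; the sketch does not write the file itself.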


NumPy
Following the official documentation:
https://www.scipy.org/install.html
$ python -m pip install --upgrade pip
$ pip install --user numpy scipy matplotlib ipython jupyter pandas sympy nose
Test:
$ python
>>> import scipy
>>> import numpy
>>> scipy.test()
>>> numpy.test()
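Some sources online say these can also be installed as follows; I do not know how this differs from the method linked from GitHub:
$ sudo apt-get install python-scipy
$ sudo apt-get install python-numpy
$ sudo apt-get install python-matplotlib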
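Running the full test suites takes a while. A quicker check, sketched below, only confirms that each package imports and reports its version; the helper name `check_modules` is hypothetical, and the check says nothing about whether the packages work correctly.

```python
import importlib


def check_modules(names):
    """Map each module name to its version string, or None if import fails."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module defines __version__; fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    report = check_modules(["numpy", "scipy", "matplotlib"])
    for name, version in sorted(report.items()):
        print("{:12s} {}".format(name, version or "NOT INSTALLED"))
```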

Natural Language Toolkit (NLTK):
First install NLTK:
http://www.nltk.org/install.html
$ sudo pip install -U nltk
$ sudo pip install -U numpy
$ python
>>> import nltk

Then install the NLTK data:
http://www.nltk.org/data.html
$ sudo python
>>> import nltk
>>> nltk.download()
Set the download directory to /usr/share/nltk_data.

Test that the data is installed:
>>> from nltk.corpus import brown
>>> brown.words()
['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', ...]

Once this step is done, finally run:
>>> import nltk
>>> nltk.download('punkt')
This avoids the following error during download and preprocessing:

LookupError:
**********************************************************************
  Resource u'tokenizers/punkt/english.pickle' not found.  Please
  use the NLTK Downloader to obtain the resource: >>>
  nltk.download()
  Searched in:
    - '/home/ubuntu/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - u''
**********************************************************************
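The punkt check can also be scripted so that the download happens only when the tokenizer is missing. This is a sketch rather than anything from the im2txt README: `ensure_resource` and `ensure_punkt` are hypothetical helpers, and they rely on `nltk.data.find` raising `LookupError` for a missing resource, as in the traceback above.

```python
def ensure_resource(resource_path, package, find, download):
    """Download `package` only if `resource_path` cannot be found.

    `find` must raise LookupError when the resource is missing (as
    nltk.data.find does); `download` fetches a package by name.
    Returns True if a download was triggered, False otherwise.
    """
    try:
        find(resource_path)
        return False
    except LookupError:
        download(package)
        return True


def ensure_punkt():
    """Make sure the punkt tokenizer data is available to NLTK."""
    # Imported lazily so ensure_resource can be used without NLTK installed.
    import nltk
    return ensure_resource("tokenizers/punkt", "punkt",
                           nltk.data.find, nltk.download)
```

Calling `ensure_punkt()` once before running the preprocessing script should be enough; where NLTK stores the data depends on its own search path, which this sketch does not change.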


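Preprocessing
# Location to save the MSCOCO data.
MSCOCO_DIR="${HOME}/im2txt/data/mscoco"
# Build the preprocessing script.
bazel build im2txt/download_and_preprocess_mscoco
# Run the preprocessing script.
bazel-bin/im2txt/download_and_preprocess_mscoco "${MSCOCO_DIR}"
This step is simple, but on a poor network connection the download often restarts for no apparent reason. I had to download several times, and each attempt took a very long time, which was thoroughly frustrating.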


Once you see the completion output (the screenshot is not preserved here), preprocessing has succeeded.

Training

$ MSCOCO_DIR="/path/to/MSCOCO"
$ INCEPTION_CHECKPOINT="/path/to/inception_v3.ckpt"
$ MODEL_DIR="/path/to/models/im2txt/model"
$ bazel build -c opt im2txt/...
$ bazel-bin/im2txt/train \
> --input_file_pattern="${MSCOCO_DIR}/train-?????-of-00256" \
> --inception_checkpoint_file="${INCEPTION_CHECKPOINT}" \
> --train_dir="${MODEL_DIR}/train" \
> --train_inception=false \
> --number_of_steps=1000000
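The `--input_file_pattern` value is a glob in which each `?` matches exactly one character. Before starting a long training run it can be worth confirming that the pattern actually matches the preprocessed shards. A minimal sketch, assuming the shards sit directly in the MSCOCO directory; `matching_shards` is a hypothetical helper:

```python
import glob
import os


def matching_shards(data_dir, pattern="train-?????-of-00256"):
    """Return the sorted files in data_dir that match the glob pattern."""
    return sorted(glob.glob(os.path.join(data_dir, pattern)))


if __name__ == "__main__":
    # Directory used in the preprocessing step; adjust to your MSCOCO_DIR.
    data_dir = os.path.expanduser("~/im2txt/data/mscoco")
    print("{} training shards found".format(len(matching_shards(data_dir))))
```

A count of zero usually means that MSCOCO_DIR points at the wrong directory or that preprocessing did not finish.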

Error:
AttributeError: 'module' object has no attribute '_base'

Fix:
$ pip install --upgrade html5lib==1.0b8

After the fix, the model trains normally.


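Now it is a matter of waiting. People online say training takes one to two weeks;
mine finished in about three or four days.

The next stage is fine-tuning; before that, test how well the trained model performs.

If you want to save effort, you can skip the training step and use my trained model directly.


Using an existing model
My trained model is available here: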

https://github.com/withyou1771/im2txt

$ CHECKPOINT_PATH="/path/to/model.ckpt-1000000"
$ VOCAB_FILE="/path/to/word_counts.txt"
$ IMAGE_FILE="/path/to/models/im2txt/1.jpg"
$ bazel build -c opt im2txt/run_inference
$ bazel-bin/im2txt/run_inference \
> --checkpoint_path=${CHECKPOINT_PATH} \
> --vocab_file=${VOCAB_FILE} \
> --input_files=${IMAGE_FILE}


$ CHECKPOINT_PATH="/path/to/model.ckpt-1000000"
$ IMAGE_FILE="/path/to/1.jpg"
$ VOCAB_FILE="/path/to/word_counts.txt"
$ bazel build -c opt im2txt/run_inference
INFO: Found 1 target...
Target //im2txt:run_inference up-to-date:
bazel-bin/im2txt/run_inference
INFO: Elapsed time: 0.138s, Critical Path: 0.00s
(tensorflow)ubuntu@ubuntu-All-Series:/home/data1/tf/models/im2txt$ bazel-bin/im2txt/run_inference --checkpoint_path=${CHECKPOINT_PATH} --vocab_file=${VOCAB_FILE} --input_files=${IMAGE_FILE}
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
INFO:tensorflow:Building model.
INFO:tensorflow:Initializing vocabulary from file: /word_counts.txt
INFO:tensorflow:Created vocabulary with 11520 words
INFO:tensorflow:Running caption generation on 1 files matching/1.jpg
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:0a:00.0
Total memory: 7.92GiB
Free memory: 7.81GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x541a970
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 1 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:09:00.0
Total memory: 7.92GiB
Free memory: 7.81GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x541e2f0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 2 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:06:00.0
Total memory: 7.92GiB
Free memory: 7.81GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x5421c70
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 3 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:05:00.0
Total memory: 7.92GiB
Free memory: 7.57GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 1 2 3
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 1: Y Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 2: Y Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 3: Y Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:0a:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080, pci bus id: 0000:09:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:2) -> (device: 2, name: GeForce GTX 1080, pci bus id: 0000:06:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:3) -> (device: 3, name: GeForce GTX 1080, pci bus id: 0000:05:00.0)
INFO:tensorflow:Loading model from checkpoint: /home/data1/tf/model4/model.ckpt-1000000
INFO:tensorflow:Successfully loaded checkpoint: model.ckpt-1000000

Captions for image 1.jpg:
  0) a cat laying on top of a grass covered field . (p=0.002806)
  1) a black and white cat laying on top of a grass covered field . (p=0.000498)
  2) a black and white cat laying on top of a green field . (p=0.000412)



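Processing one image takes about 8 seconds, most of which is spent loading CUDA.
Several images can be processed in one run by separating their paths with commas. Processing 5 images together took 26 seconds, about 5 seconds per image on average.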
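The comma-separated form and the per-image arithmetic can be sketched as follows. The image paths are hypothetical placeholders, and the helper names are for illustration only.

```python
def input_files_flag(paths):
    """Join image paths into the comma-separated --input_files argument."""
    return "--input_files=" + ",".join(paths)


def seconds_per_image(total_seconds, image_count):
    """Average wall-clock time per image for one batch run."""
    return total_seconds / float(image_count)


# Hypothetical image paths, for illustration only.
images = ["/path/to/1.jpg", "/path/to/2.jpg", "/path/to/3.jpg"]
print(input_files_flag(images))

# Timing reported in this post: 26 seconds for 5 images.
print(seconds_per_image(26, 5))
```

The average drops below the single-image time because the startup cost is paid once per run rather than once per image.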

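Because Python and TensorFlow versions differ, you may run into many errors even when using someone else's model.
Error 1:
ValueError: No checkpoint file found in: None
The format of the word_counts.txt file is wrong.
Replace line 49 of vocabulary.py with: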
reverse_vocab = [eval(line.split()[0]) for line in reverse_vocab]
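Calling `eval` on file contents executes whatever expression the file holds. A more defensive variant is sketched below, on the assumption (not confirmed by the source) that the problematic file stores tokens as Python literals such as `b'cat'`; `parse_word` is a hypothetical helper and is not part of im2txt.

```python
import ast


def parse_word(token):
    """Return the plain word for one vocabulary token.

    Tokens written as string literals (b'cat', 'cat') are decoded with
    ast.literal_eval, which accepts literals only. Anything else,
    including bare words and numbers, is returned unchanged.
    """
    try:
        value = ast.literal_eval(token)
    except (ValueError, SyntaxError, TypeError):
        return token
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value if isinstance(value, str) else token


# Hypothetical lines in the assumed format: token, then count.
lines = ["b'a' 969108", "<S> 586368", "cat 10"]
print([parse_word(line.split()[0]) for line in lines])
```

Unlike the `eval` version, this returns text rather than bytes on Python 3; whether the rest of vocabulary.py expects one or the other has not been checked here.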

Error 2:
NotFoundError (see above for traceback): Tensor name "lstm/basic_lstm
ts" not found in checkpoint files /home/data1/tf/model2/model.ckpt-30
[[Node: save/RestoreV2_381 = RestoreV2[dtypes=[DT_FLOAT], _d
:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV
r_names, save/RestoreV2_381/shape_and_slices)]]
[[Node: save/RestoreV2_295/_177 = _Recv[client_terminated=fa
evice="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:loca
ca:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_1333
reV2_295", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/ta
]()]]

In TensorFlow 1.0, BasicLSTMCell changed its default variable names, so they no longer match the names stored in the checkpoint.
Related material:
https://www.tensorflow.org/install/migration
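A first diagnostic step for this kind of mismatch is to compare the variable names the graph expects with the names the checkpoint contains. The sketch below works on plain lists of names. In TensorFlow 1.x the checkpoint side can, as far as I know, be listed with `tf.train.NewCheckpointReader(path).get_variable_to_shape_map()`. All names in the example are made up for illustration and are not the real im2txt variable names.

```python
def missing_from_checkpoint(graph_names, checkpoint_names):
    """Return the graph variable names that the checkpoint does not contain."""
    return sorted(set(graph_names) - set(checkpoint_names))


# Hypothetical names, for illustration only; read the real ones from
# your own graph and checkpoint.
graph_names = ["lstm/new_cell/weights", "lstm/new_cell/biases",
               "seq_embedding/map"]
checkpoint_names = ["lstm/OldCell/Matrix", "lstm/OldCell/Bias",
                    "seq_embedding/map"]

for name in missing_from_checkpoint(graph_names, checkpoint_names):
    print("not in checkpoint:", name)
```

Each name reported here corresponds to a "Tensor name ... not found in checkpoint files" error at restore time, so the list shows which variables would need to be mapped to their old names.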
