Installing TensorFlow GPU with Docker: A Hands-On Guide


References:
http://blog.csdn.net/u011291159/article/details/66970202
https://github.com/NVIDIA/nvidia-docker?utm_source=tuicool&utm_medium=referral
https://docs.docker.com/engine/installation/linux/ubuntu/#install-using-the-repository

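Background

AI is springing up everywhere, and DevOps practice keeps maturing. Nearly every serious open-source project now supports two environments: Docker and Linux. This article explains how to install the TensorFlow Docker image on a CentOS 7 machine equipped with a GPU. In China, only engineers at a few large companies have access to this kind of environment; if you have one, give it a try, since AI is one more skill worth having.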

Installing nvidia-docker

NVIDIA provides a wrapper around Docker, nvidia-docker, that lets containers use NVIDIA GPUs.
For the installation procedure, see:
https://github.com/NVIDIA/nvidia-docker?utm_source=tuicool&utm_medium=referral

After installation, the following nvidia commands are available (listed here with tab completion):

[root@~]# nvidia-
nvidia-bug-report.sh     nvidia-debugdump         nvidia-installer         nvidia-settings          nvidia-xconfig
nvidia-cuda-mps-control  nvidia-docker            nvidia-modprobe          nvidia-smi
nvidia-cuda-mps-server   nvidia-docker-plugin     nvidia-persistenced      nvidia-uninstall

If you see the error below, the nvidia-docker service has not been started:

[root@ourui]# nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:latest-gpu
docker: Error response from daemon: create nvidia_driver_367.48: create nvidia_driver_367.48: Error looking up volume plugin nvidia-docker: legacy plugin: plugin not found.
See 'docker run --help'.

Use the following commands to check whether nvidia-docker is running, and start it if it is not:

[root@ourui]# systemctl status nvidia-docker
● nvidia-docker.service - NVIDIA Docker plugin
   Loaded: loaded (/usr/lib/systemd/system/nvidia-docker.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: https://github.com/NVIDIA/nvidia-docker/wiki
[root@ourui]# systemctl start nvidia-docker
[root@ourui]# systemctl status nvidia-docker
● nvidia-docker.service - NVIDIA Docker plugin
   Loaded: loaded (/usr/lib/systemd/system/nvidia-docker.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-03-27 10:39:16 CST; 2s ago
     Docs: https://github.com/NVIDIA/nvidia-docker/wiki
  Process: 51649 ExecStartPost=/bin/sh -c /bin/echo unix://$SOCK_DIR/nvidia-docker.sock > $SPEC_FILE (code=exited, status=0/SUCCESS)
  Process: 51644 ExecStartPost=/bin/sh -c /bin/mkdir -p $( dirname $SPEC_FILE ) (code=exited, status=0/SUCCESS)
 Main PID: 51643 (nvidia-docker-p)
   Memory: 13.9M
   CGroup: /system.slice/nvidia-docker.service
           └─51643 /usr/bin/nvidia-docker-plugin -s /var/lib/nvidia-docker
Mar 27 10:39:16 ctum2e1302005.idc.wanda-group.net systemd[1]: Starting NVIDIA Docker plugin...
Mar 27 10:39:16 ctum2e1302005.idc.wanda-group.net systemd[1]: Started NVIDIA Docker plugin.
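The status output above reports the unit as "disabled", which means it will not start again after a reboot. A minimal sketch of a helper that starts the service when needed and enables it at boot follows. The function name and the SYSTEMCTL override variable are hypothetical, introduced only so the function can be dry-run; `is-active --quiet`, `start`, and `enable` are standard systemctl subcommands.

```shell
# Start the nvidia-docker plugin service if needed and enable it at boot.
# SYSTEMCTL is a hypothetical override hook for dry runs; it defaults to
# the real systemctl binary.
ensure_nvidia_docker_service() {
    ctl="${SYSTEMCTL:-systemctl}"
    if "$ctl" is-active --quiet nvidia-docker; then
        echo "nvidia-docker is already running"
    else
        "$ctl" start nvidia-docker || return 1
        echo "nvidia-docker started"
    fi
    # The unit is listed as "disabled" in the status output, so enable it
    # to have it come back after a reboot.
    "$ctl" enable nvidia-docker
}

# Example (run as root):
#   ensure_nvidia_docker_service
```

Enabling the unit is optional; it only saves repeating `systemctl start nvidia-docker` after each reboot.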

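That completes the basic nvidia-docker setup. Note that NVIDIA does not ship a build for the very latest Docker release; if you need to test the newest Docker release, you will have to use another approach.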

Downloading Docker images

The TensorFlow community publishes a set of images on Docker Hub:
https://hub.docker.com/r/tensorflow/tensorflow/

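For reasons we all know, pulling images from Docker Hub can be a problem in China. It reminds me of the line: it was the best of times, it was the worst of times. There is a mortgage to pay, so let's find a workaround!

There are many Docker registries hosted in China, and you can use one of them directly. Some providers also offer accelerators (registry mirrors) that speed up pulls from Docker Hub. Here we use the Aliyun accelerator: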

https://yq.aliyun.com/articles/29941
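As a rough sketch of what configuring an accelerator amounts to: on Docker versions that read /etc/docker/daemon.json, a mirror is set through the `registry-mirrors` key. The function name below is hypothetical, and the mirror URL is a placeholder; use the address issued for your own accelerator account. Note that this sketch overwrites the target file, so merge by hand if you already have a daemon.json.

```shell
# Write a minimal daemon.json containing only a registry mirror.
# $1: mirror URL (placeholder; take it from your own accelerator account)
# $2: output path (defaults to /etc/docker/daemon.json)
# Caution: this overwrites the target file.
write_mirror_config() {
    mirror_url="$1"
    out_path="${2:-/etc/docker/daemon.json}"
    cat > "$out_path" <<EOF
{
  "registry-mirrors": ["$mirror_url"]
}
EOF
}

# Example (run as root; the URL is a placeholder, not a real endpoint):
#   write_mirror_config "https://<your-id>.mirror.aliyuncs.com"
#   systemctl restart docker
```

Docker must be restarted for the new mirror to take effect.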

Once the accelerator is configured, you can pull the Docker image directly:

nvidia-docker pull tensorflow/tensorflow:latest-gpu

Starting the container

[root@ourui]# nvidia-docker run -it -d -p 8888:8888 tensorflow/tensorflow:latest-gpu
69fede4460082f3e4aa847fc34ac0f58e797dc44b10d65643a70d2a1e7e4ba03
[root@ourui]#
[root@ourui]# nvidia-docker logs 69fede4460082f3e4aa847fc34ac0f58e797dc44b10d65643a70d2a1e7e4ba03
[I 02:45:08.016 NotebookApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret
[W 02:45:08.031 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 02:45:08.037 NotebookApp] Serving notebooks from local directory: /notebooks
[I 02:45:08.037 NotebookApp] 0 active kernels
[I 02:45:08.037 NotebookApp] The Jupyter Notebook is running at: http://[all ip addresses on your system]:8888/?token=f1d1717e2fdbf8c1807f5017315396be05a6b95310d87cb9
[I 02:45:08.038 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 02:45:08.038 NotebookApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=f1d1717e2fdbf8c1807f5017315396be05a6b95310d87cb9
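The login token is buried in the log output above. A small sketch of a filter that pulls it out is given here; the function name is hypothetical, and it assumes the token is a hexadecimal string following `?token=`, as in the log shown.

```shell
# Read notebook log text on stdin and print the first login token found.
# Assumes the token is a hex string after "?token=", as in the log above.
extract_jupyter_token() {
    sed -n 's/.*[?]token=\([0-9a-f][0-9a-f]*\).*/\1/p' | head -n 1
}

# Example (the container ID is whatever `nvidia-docker run -d` printed;
# 2>&1 is used because the notebook server logs to stderr):
#   nvidia-docker logs <container-id> 2>&1 | extract_jupyter_token
```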

Testing

Open the notebook in a browser, replacing ip with the address of the Docker host:
http://ip:8888/?token=f1d1717e2fdbf8c1807f5017315396be05a6b95310d87cb9
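Reaching the notebook shows that Jupyter is up, but not that TensorFlow can see the GPU. One way to check, sketched here under assumptions, is to ask TensorFlow inside a throwaway container to list its devices. The function name and the NVIDIA_DOCKER override variable are hypothetical; the sketch also assumes the image provides a `python` executable and that `tensorflow.python.client.device_lib` is available in the installed TensorFlow version.

```shell
# Ask TensorFlow inside a throwaway container which devices it can see.
# NVIDIA_DOCKER is a hypothetical override hook for dry runs; it defaults
# to the nvidia-docker wrapper used throughout this article.
list_tf_devices() {
    "${NVIDIA_DOCKER:-nvidia-docker}" run --rm tensorflow/tensorflow:latest-gpu \
        python -c 'from tensorflow.python.client import device_lib
for d in device_lib.list_local_devices():
    print(d.name)'
}

# Example:
#   list_tf_devices
```

If the setup works, the list should contain a GPU entry alongside the CPU one; the exact device name format differs between TensorFlow versions.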
