Ubuntu 16.04 + GTX 1050 Ti + CUDA 8.0 (fixing the desktop login loop)

Source: Internet  Edited by: 程序博客网  Time: 2024/06/05 14:50

Preface

To build and run TensorFlow with GPU support, you first need to install NVIDIA's CUDA Toolkit and cuDNN.

The failed installation attempt

The installation guide from the TensorFlow Chinese community calls for CUDA Toolkit 7.0 and cuDNN 6.5 V2. According to the release notes on the CUDA site (http://docs.nvidia.com/cuda/#axzz4g719e0em), the CUDA Toolkit mainly contains the following:

Compiler
    The CUDA-C and CUDA-C++ compiler
Tools
    The following development tools are available in the bin/ directory (except for Nsight Visual Studio Edition (VSE) which is installed as a plug-in to Microsoft Visual Studio)
    IDEs: nsight (Linux, Mac), Nsight VSE (Windows)
    Debuggers: cuda-memcheck, cuda-gdb (Linux, Mac), Nsight VSE (Windows)
    Profilers: nvprof, nvvp, Nsight VSE (Windows)
    Utilities: cuobjdump, nvdisasm, gwiz

cuDNN is NVIDIA's GPU-accelerated library for neural networks.

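I chose to install the latest version. GPU acceleration on NVIDIA cards obviously comes with system requirements, and the installation guide on the official site lists them:

To use CUDA on your system, you will need the following installed:
    CUDA-capable GPU
    A supported version of Linux with a gcc compiler and toolchain
    NVIDIA CUDA Toolkit (available at http://developer.nvidia.com/cuda-downloads)

GPU requirement: TensorFlow's GPU features only support cards with NVIDIA Compute Capability >= 3.5. Next come the operating system and GCC version requirements, and only after those the toolkit itself.

I chose the runfile installer. The official site says to run it directly with sudo sh cudaxxxx, which failed with:

It appears that an X server is running. Please exit X before installation. If you're sure that X is not running, but are getting this error, please delete any X lock files in /tmp.

The official guide says:

    Disable the Nouveau drivers.
    Reboot into text mode (runlevel 3).
    Verify that the Nouveau drivers are not loaded. If the Nouveau drivers are still loaded, consult your distribution's documentation to see if further steps are needed to disable Nouveau.

Disabling Nouveau means adding the module to a blacklist and then regenerating the initramfs so that the kernel picks up the change. That step uses mkinitramfs, which failed with:

E: Problem with MergeList /var/lib/apt/lists/ppa.launchpad.net_vincent-c_nevernote_ubuntu_dists_xenial_main_binary-amd64_Packages

Fix:

sudo rm /var/lib/apt/lists/* -vf

After that, lsmod | grep nouveau showed that the module was no longer loaded, but the installer still reported:

please delete any x lock file in /tmp

I deleted the .X0-lock file under /tmp. The next error was: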

ERROR: The kernel module failed to load, because it was not signed by a key that is trusted by the kernel. Please try installing the driver again, and sign the kernel module when prompted to do so.

I read the error carefully and then went back to the installation guide.

There was no kernel source under /usr/src/.

I installed it with apt-get install linux-source 4.8.0.
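
The two conditions the installer complained about above (a loaded nouveau module and leftover X lock files in /tmp) can be checked before re-running it. The following is only a minimal sketch: the function names nouveau_loaded, x_lock_files and preflight are hypothetical helpers made up for illustration and are not part of any NVIDIA tool.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers for the runfile installer (illustration only).

# Succeeds if the given `lsmod` output contains a line for the nouveau module.
# Only lines that START with "nouveau" count, so modules that merely list
# nouveau in their "Used by" column are ignored.
nouveau_loaded() {
    printf '%s\n' "$1" | grep -q '^nouveau[[:space:]]'
}

# Prints any X lock files (such as .X0-lock) found directly in the given
# directory; defaults to /tmp.
x_lock_files() {
    local dir="${1:-/tmp}"
    find "$dir" -maxdepth 1 -name '.X*-lock' 2>/dev/null
}

# Runs both checks against the live system and prints a short report.
preflight() {
    if nouveau_loaded "$(lsmod)"; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
    local locks
    locks="$(x_lock_files /tmp)"
    if [ -n "$locks" ]; then
        echo "X lock files present:"
        echo "$locks"
    else
        echo "no X lock files in /tmp"
    fi
}
```

Calling preflight from a text console reports both conditions. Note that a lock file for a running X server is expected, so it should only be deleted once X has really been stopped.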

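I gave up on the default 361 driver and downloaded version 375 from the official site, which matches my hardware. After installing it, the desktop got stuck in a login loop. I suspect the NVIDIA driver caused this, but that is only a guess.

Since I could no longer get into the desktop, the first thing I wanted to try was removing the NVIDIA driver and rebooting.

Chinese text was garbled on the console, so I simply switched the language back to English.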

Open /etc/default/locale with vi (or nano, or any other text editor)
and change its contents to
LANG="en_US.UTF-8"
LANGUAGE="en_US:en"

Use the following command to uninstall a Driver runfile installation:

$ sudo /usr/bin/nvidia-uninstall

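After rebooting, the error persisted and nothing in the session was saved. At this point startx did get me into the desktop.

Is .xsession-errors really full of nothing but sogoupy errors?

After installing gdm I could not even get to the command line, so I gave up.

Reinstalling the system; to be continued.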


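Installation steps, reorganized

After all of the above, my Ubuntu system was broken. This is how I recovered.
1. Reinstall Ubuntu 16.04.
2. Install the NVIDIA 375 driver. The desktop still would not load.
3. After uninstalling with sudo ./NVIDIAxxxxxxx.run --uninstall, the desktop loaded again.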
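Following the instructions in another blog post, the fix was:
1. lsmod | grep nouveau showed that the stock driver was still loaded.
2. Disable the stock nouveau driver for NVIDIA cards (important!)
Create a file with sudo vim /etc/modprobe.d/blacklist-nouveau.conf
and add the following:
blacklist nouveau
options nouveau modeset=0
Then regenerate the initramfs:
sudo update-initramfs -u
Reboot afterwards. Confirm that Nouveau is really gone with: lsmod | grep nouveau
3. Stop the lightdm display manager and install the driver (secure boot disabled)
sudo /etc/init.d/lightdm stop
sudo ./NVIDIA-Linux-x86_64-375.20.run --no-opengl-files (I do not know what this flag does; others said it helps, so I added it)
sudo /etc/init.d/lightdm start
4. Done, problem solved.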

PS: I chose the "sign the kernel module" option during installation, because the installation failed without it.
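
The blacklist step can also be done without an interactive editor. This is a sketch under assumptions: write_nouveau_blacklist is a hypothetical helper name, and the file is generated at a path you pass in, so that it can be inspected before being copied into /etc/modprobe.d.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes the two blacklist lines from step 2 to the
# file given as the first argument, overwriting any previous content.
write_nouveau_blacklist() {
    local conf="$1"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$conf"
}

# Possible usage (left commented out because it needs root and a reboot):
#   tmp="$(mktemp)"
#   write_nouveau_blacklist "$tmp"
#   sudo install -m 644 "$tmp" /etc/modprobe.d/blacklist-nouveau.conf
#   sudo update-initramfs -u
#   sudo reboot
```

Generating the file in a temporary location first keeps the helper free of sudo and lets you check its contents with cat before installing it.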

Installing the CUDA Toolkit

sudo sh cuda_xxxx_linux.run
The prompts and my answers were as follows:

Description
This package includes over 100+ CUDA examples that demonstrate
various CUDA programming principles, and efficient CUDA
implementation of algorithms in specific application domains.
The NVIDIA CUDA Samples License Agreement is available in
Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 367.48?
(y)es/(n)o/(q)uit: n

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-8.0 ]:

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
 [ default is /home/c302 ]:

Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...
Installing the CUDA Samples in /home/c302 ...
Copying samples to /home/c302/NVIDIA_CUDA-8.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-8.0
Samples:  Installed in /home/c302

Please make sure that
 -   PATH includes /usr/local/cuda-8.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_9045.log
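
The warning at the end says CUDA 8.0 needs a driver of at least version 361.00. Because the driver was installed separately here, it may be worth confirming that the installed version meets this minimum. A minimal sketch, assuming nvidia-smi is available from the driver installation and that sort supports -V (GNU coreutils does); version_ge is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if version $1 is greater than or equal to
# version $2, using version-aware sorting (sort -V, GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Possible usage on a machine with the driver installed (commented out
# because it needs a working NVIDIA driver):
#   installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
#   if version_ge "$installed" 361.00; then
#       echo "driver $installed is new enough for CUDA 8.0"
#   else
#       echo "driver $installed is older than 361.00"
#   fi
```

With the 375.20 driver used in this post the check passes, since 375.20 sorts after 361.00.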

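Add the environment variables

export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH

To make them permanent, add them to a system-wide file

sudo vi /etc/profile
reboot (the change takes effect after a restart)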
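
Editing /etc/profile by hand makes it easy to add the same lines twice. The sketch below appends the two export lines only if they are not already present. add_cuda_env is a hypothetical helper name, and the profile path is passed as an argument so the function can be tried on a scratch file first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: appends the CUDA export lines to the profile file
# given as $1, skipping any line that is already there.
# $2 is the toolkit location and defaults to /usr/local/cuda-8.0.
add_cuda_env() {
    local profile="$1"
    local cuda_home="${2:-/usr/local/cuda-8.0}"
    local path_line="export PATH=${cuda_home}/bin:\$PATH"
    local lib_line="export LD_LIBRARY_PATH=${cuda_home}/lib64:\$LD_LIBRARY_PATH"
    local line
    for line in "$path_line" "$lib_line"; do
        # -x matches whole lines only, -F treats the line as a fixed string.
        grep -qxF -- "$line" "$profile" 2>/dev/null \
            || printf '%s\n' "$line" >> "$profile"
    done
}

# Possible usage on a scratch copy before touching the real file:
#   cp /etc/profile /tmp/profile.test
#   add_cuda_env /tmp/profile.test
#   diff /etc/profile /tmp/profile.test
```

Running the function a second time on the same file leaves it unchanged, which is the point of the grep guard.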

Verify CUDA

c302@c302-dl:~/Downloads$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44
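
If you want to check the toolkit version from a script rather than by eye, the release number can be pulled out of the nvcc -V output. This is a sketch that assumes the output keeps the "release X.Y," wording shown above; nvcc_release is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `nvcc -V` output on stdin and prints the
# release number from the "Cuda compilation tools, release X.Y, ..." line.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Possible usage once nvcc is on PATH (commented out because it needs the toolkit):
#   release="$(nvcc -V | nvcc_release)"
#   echo "CUDA toolkit release: $release"
```

Lines without the word "release" produce no output, so an empty result means nvcc is missing or its output format differs from the one assumed here.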

Test

cd '/home/xxxx/NVIDIA_CUDA-8.0_Samples'
make  # this takes a while
cd 0_Simple/matrixMul
./matrixMul


Other issues:
Login loop, black screen, or driver not working: see this blog post: http://www.cnblogs.com/matthewli/p/6715553.html?utm_source=tuicool&utm_medium=referral
Signing kernel modules when secure boot is enabled: see http://www.linuxdiyf.com/linux/20154.html