[Caffe] Problems encountered during installation

Source: Internet. Editor: 程序博客网. Time: 2024/05/16 18:57

1. identifier "cudnnTensor4dDescriptor_t" is undefined

Solution_1:

CMakeLists.txt:

Change

caffe_option(USE_CUDNN "Build Caffe with cuDNN libary support" ON IF NOT CPU_ONLY)

to

caffe_option(USE_CUDNN "Build Caffe with cuDNN libary support" OFF IF NOT CPU_ONLY)

Solution_2:

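[
It turned out that cuDNN v3 was already installed on TSUBAME... I found the path...
http://tsubame.gsic.titech.ac.jp/docs/guides/tsubame2/html_en/programming.html#cuda-compiler
]

I don't remember exactly how this was actually solved. Maybe I only did (1), or maybe I did both (1) and (2).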
(1)

cd /usr/apps.sp3/cuda
ls
5.0 5.5 6.0 6.5 7.0
cd 7.0
. ./cuda.sh

(2)
After running

cmake -DCMAKE_INSTALL_PREFIX=$LOCAL -DBOOST_ROOT:PATHNAME=$LOCAL -DCMAKE_CXX_FLAGS=-L$LOCAL/lib -DBUILD_matlab=ON ..

edit CMakeCache.txt:

//Path to cuDNN include directory.
CUDNN_INCLUDE:PATH=/usr/apps.sp3/cuda/7.0/include

//Path to cuDNN library.
CUDNN_LIBRARY:FILEPATH=/usr/apps.sp3/cuda/7.0/lib64/libcudnn.so

//CUDNN root folder
CUDNN_ROOT:PATH=

2. cmake succeeded, but make failed

[ 98%] Building Matlab interface: /work1/lisa-caffe-public/matlab/caffe/caffe.mexa64
Building with 'g++'.
/usr/bin/ld: cannot find -lpython2

However, reading the cmake output, I found that it had located the Python paths successfully:

-- Python:
--   Interpreter       :   /home/usr9/local/bin/python2.7 (ver. 2.7.7)
--   Libraries         :   /home0/usr9/local/lib/libpython2.7.so (ver 2.7.7)
--   NumPy             :   /home/usr9/local/lib/python2.7/site-packages/numpy/core/include (ver 1.10.1)

There are several ways to solve this problem. However, because I don't have permission to create symbolic links on the university server, I simply ran cmake again, this time without the MATLAB part (the first command below is the original one, the second is the one I reran):

cmake -DCMAKE_INSTALL_PREFIX=$LOCAL -DBOOST_ROOT:PATHNAME=$LOCAL -DCMAKE_CXX_FLAGS=-L$LOCAL/lib -DBUILD_matlab=ON ..
cmake -DCMAKE_INSTALL_PREFIX=$LOCAL -DBOOST_ROOT:PATHNAME=$LOCAL -DCMAKE_CXX_FLAGS=-L$LOCAL/lib ..
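
If the flag is exactly as printed, -lpython2 makes ld search for a file named libpython2.so (or libpython2.a), while cmake reported libpython2.7.so. One way to see which library name the interpreter itself expects is to ask the standard sysconfig module. This is only a diagnostic sketch, not the fix used here; the helper name libpython_info is made up for illustration, and both config variables may be empty on some platforms.

```python
import os
import sysconfig


def libpython_info():
    """Report where this interpreter expects its own libpython to live."""
    libdir = sysconfig.get_config_var("LIBDIR")        # may be None on some platforms
    ldlibrary = sysconfig.get_config_var("LDLIBRARY")  # e.g. libpython2.7.so
    path = os.path.join(libdir, ldlibrary) if libdir and ldlibrary else None
    return {
        "libdir": libdir,
        "ldlibrary": ldlibrary,
        "path": path,
        "exists": bool(path) and os.path.exists(path),
    }


info = libpython_info()
for key in ("libdir", "ldlibrary", "path", "exists"):
    print(key, "=", info[key])
```

Run it with the same interpreter that cmake reported, so the values describe the Python that the build is linking against.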

3. Something about the group ID

When I ran cmake, it said "cannot found numpy" (and some other things), even though I had installed them.

Some software is pre-installed on the server, so maybe the group ID is the problem?

I changed my group ID from users to t2g-xxxx2011, and this time I got a "cmake.file" error (I had installed cmake while the group ID was "users"). I ran echo $LD_LIBRARY_PATH and found that it had changed!

If I set groupID=t2g-xxxx2011 -> error A
If I set groupID=users -> no error A but error B

My tutor found that the directory "caffe/" had permissions drwxr-xr-x, so he ran chmod +s on caffe/ and all its subdirectories (directories only, not files). [I forgot the exact command... maybe "g+s"? I only remember that he also used | xargs and d to select only the directories to add s to.]

Then "ls -l" looks like this:

-rw-r--r-- 1  15Mxxxxx t2g-xxxx2011  2102 Jan  8 15:02 README.md
drwxr-sr-x 13 15Mxxxxx t2g-xxxx2011  4096 Jan  8 21:01 build

It works! And I don't have to change the group ID from "users" to "t2g-xxxx2011".

[I downloaded Caffe to my local computer and then uploaded it to the server, so next time I have to pay attention to that.]

Later I found a command that does this:
find . -type d -print0 | xargs -0 chmod g+s
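
The "s" in drwxr-sr-x is the setgid bit (mode 2755). On Linux, files created inside a setgid directory inherit the directory's group, which is presumably why the build stopped depending on the current group ID. A small read-only check for directories that still lack the bit could look like the sketch below; the function names are made up for illustration, and nothing here changes any permissions.

```python
import os
import stat


def has_setgid(mode):
    """True if the setgid bit is set in a st_mode value."""
    return bool(mode & stat.S_ISGID)


def dirs_missing_setgid(root):
    """Walk root and return every directory (never a file) without setgid."""
    missing = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        if not has_setgid(os.stat(dirpath).st_mode):
            missing.append(dirpath)
    return missing


# drwxr-sr-x from the listing above is octal 2755.
print(has_setgid(0o2755), has_setgid(0o0755))
```

Calling dirs_missing_setgid("caffe") after the find command should return an empty list if every directory was updated.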

4. mnist example

1)

When I ran ./examples/mnist/create_mnist.sh, I got this error:

Check failed: mdb_env_open(env, filename_->c_str(), 0, 0664) == 0 (12 vs. 0) mdb_env_open failed

I found the solution here: https://groups.google.com/forum/#!msg/caffe-users/m4iCrEK2Qy8/2DRRy2VMZUEJ

“errno 12 means no enough memory, so I change the 1TB value to 1GB in the above line, like that: mdb_env_set_mapsize(mdb_env_, 1099511627776/1024), MDB_SUCCESS). My platform is suse SLES 11, I’m not sure why it doesnot work with 1TB size.”

So I made that change in caffe/examples/mnist/convert_mnist_data.cpp,

then removed the "build" directory and ran cmake and make again.

2)

When I ran ./examples/mnist/train_lenet.sh, I got:

Check failed: mdb_status == 0 (12 vs. 0) Cannot allocate memory

Solution from: https://github.com/BVLC/caffe/issues/2709

“I solved it by changing the following:
in convert_mnist_data.cpp

CHECK_EQ(mdb_env_set_mapsize(mdb_env, 1099511627776), MDB_SUCCESS) // 1TB
to:

CHECK_EQ(mdb_env_set_mapsize(mdb_env, 1073741824), MDB_SUCCESS) // 1GB

and compile again

and also in :
/src/caffe/util/db_lmdb.cpp
change
LMDB_MAP_SIZE
to
4294967296”

So I changed LMDB_MAP_SIZE to 1073741824 (not 4294967296), and ran cmake and make again.
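
The three map sizes that appear above are all powers of two, which is easy to lose track of in decimal. The following sketch only does the arithmetic; the helper name as_gib is made up for illustration and is not part of Caffe or LMDB.

```python
def as_gib(num_bytes):
    """Convert a byte count to GiB (units of 2**30 bytes)."""
    return num_bytes / float(2 ** 30)


sizes = [
    ("original map size (1TB comment)", 1099511627776),
    ("value used in these notes", 1073741824),
    ("value suggested in the GitHub issue", 4294967296),
]

for label, value in sizes:
    print("%-36s %14d bytes = %7.1f GiB" % (label, value, as_gib(value)))

# The forum post's expression 1099511627776/1024 is the same 1 GiB value.
print(1099511627776 // 1024 == 1073741824)
```

Because the map size is an upper bound on how large the database can grow, 1 GiB is comfortable for MNIST but would have to be raised again for larger datasets.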

5. Using Python with Caffe

.bashrc.txt

umask 022
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
export WORK=/work1/t2g-shinoda2011/15M54105
export LOCAL=$WORK/local
export INTALL_RN=$WORK/install_rn
export PATH=~/local/bin:~/.gem/ruby/2.0.0/bin:$LOCAL/bin:$INTALL_RN/yasm/bin:$WORK/lisa-caffe-public/examples/LRCN_activity_recognition:$PATH
export MANPATH=~/local/share/man:$LOCAL/share/man:$MANPATH
export PYTHONPATH=$WORK/parallel0/caffe/python:$WORK/newTest1/lisa-caffe-public/python:~/local/lib/python2.7/site-packages:$PYTHONPATH
export GEM_HOME=~/.gem/ruby/gems/2.0.0:$GEM_HOME
export RUBYLIB=~/.gem/ruby/gems/2.0.0:$RUBYLIB
export INCLUDE_PATH=~/local/include:$LOCAL/include:$INCLUDE_PATH
export C_INCLUDE_PATH=~/local/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=~/local/include:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=$LOCAL/lib:~/local/lib:$LIBRARY_PATH
export LD_LIBRARY_PATH=$LOCAL/lib:~/local/lib:$LD_LIBRARY_PATH
export INCLUDE_PATH=~/local/include/ncursesw:~/local/include:$INCLUDE_PATH
export C_INCLUDE_PATH=~/local/include/ncursesw:~/local/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=~/local/include/ncursesw:~/local/include:$CPLUS_INCLUDE_PATH
export http_proxy="http://localhost:3128"
export https_proxy="http://localhost:3128"
export ftp_proxy="http://localhost:3128"

1. To import the caffe Python module after completing the installation, add the module directory to your $PYTHONPATH with export PYTHONPATH=/path/to/caffe/python:$PYTHONPATH or the like. You should not import the module from inside the caffe/python/caffe directory!

2. To compile the Python and MATLAB wrappers, run make pycaffe and make matcaffe respectively. Be sure to set your MATLAB and Python paths in Makefile.config first!

3. Distribution: run make distribute to create a distribute directory with all the Caffe headers, compiled libraries, binaries, etc. needed for distribution to other machines.

4. Speed: for a faster build, compile in parallel with make all -j8, where 8 is the number of parallel threads for compilation (a good choice for the number of threads is the number of cores in your machine).
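
Item 1 above can also be done from inside a script instead of through $PYTHONPATH. The sketch below is an assumption-laden illustration: the helper names and the /path/to/caffe placeholder are made up, and it deliberately stops short of running import caffe so that it works even where pycaffe has not been built.

```python
import os
import sys


def add_caffe_to_path(caffe_root):
    """Put <caffe_root>/python at the front of sys.path and return it."""
    python_dir = os.path.abspath(os.path.join(caffe_root, "python"))
    if python_dir not in sys.path:
        sys.path.insert(0, python_dir)
    return python_dir


def inside_caffe_package(caffe_root, cwd=None):
    """True if cwd is the caffe/python/caffe package directory itself."""
    cwd = os.path.abspath(cwd or os.getcwd())
    package_dir = os.path.abspath(os.path.join(caffe_root, "python", "caffe"))
    return cwd == package_dir


CAFFE_ROOT = "/path/to/caffe"  # placeholder, replace with the real checkout
print(add_caffe_to_path(CAFFE_ROOT))
print(inside_caffe_package(CAFFE_ROOT))
# After this, "import caffe" should resolve, provided make pycaffe succeeded.
```

The second helper mirrors the warning in item 1: if it returns True, change to another directory before importing.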

6. HDF5 error: the HDF5 header version does not match the HDF5 library

http://www.cnblogs.com/platero/p/4077934.html

I0323 14:01:28.488822 10457 layer_factory.hpp:77] Creating layer data
Warning! ***HDF5 library version mismatched error***
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
This can happen when an application was compiled by one version of HDF5 but
linked with a different version of static or shared HDF5 library.
You should recompile the application or check your shared library related
settings such as 'LD_LIBRARY_PATH'.
'HDF5_DISABLE_VERSION_CHECK' environment variable is set to 1, application will
continue at your own risk.
Headers are 1.8.16, library is 1.8.13
        SUMMARY OF THE HDF5 CONFIGURATION
        =================================

General Information:
-------------------
  HDF5 Version: 1.8.13
  Configured on: Fri May 16 14:09:09 EDT 2014
  Configured by: djmarsha@annlnxdjmarsha11
  Configure mode: production
  Host system: x86_64-unknown-linux-gnu
  Uname information: Linux annlnxdjmarsha11 2.6.18-371.1.2.el5 #1 SMP Tue Oct 22 12:51:53 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
  Byte sex: little-endian
  Libraries: static, shared
  Installation point: /home/compilations/hdf5/1.8.13/Linux64/install

Compiling Options:
------------------
  Compilation Mode: production
  C Compiler: /opt/anss/bin/gcc ( gcc (GCC) 4.6.1)
  CFLAGS:
  H5_CFLAGS: -std=c99 -pedantic -Wall -Wextra -Wundef -Wshadow -Wpointer-arith -Wbad-function-cast -Wcast-qual -Wcast-align -Wwrite-strings -Wconversion -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wredundant-decls -Wnested-externs -Winline -Wno-long-long -Wfloat-equal -Wmissing-format-attribute -Wmissing-noreturn -Wpacked -Wdisabled-optimization -Wformat=2 -Wendif-labels -Wdeclaration-after-statement -Wold-style-definition -Winvalid-pch -Wvariadic-macros -Wnonnull -Winit-self -Wmissing-include-dirs -Wswitch-default -Wswitch-enum -Wunused-macros -Wunsafe-loop-optimizations -Wc++-compat -Wstrict-overflow -Wlogical-op -Wlarger-than=2048 -Wvla -Wsync-nand -Wframe-larger-than=16384 -Wpacked-bitfield-compat -Wstrict-aliasing -Wstrict-overflow=5 -Wjump-misses-init -Wunsuffixed-float-constants -Wdouble-promotion -Wsuggest-attribute=const -Wtrampolines -O3 -fomit-frame-pointer -finline-functions
  AM_CFLAGS:
  CPPFLAGS:
  H5_CPPFLAGS: -D_POSIX_C_SOURCE=199506L   -DNDEBUG -UH5_DEBUG_API
  AM_CPPFLAGS: -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_BSD_SOURCE
  Shared C Library: yes
  Static C Library: yes
  Statically Linked Executables: no
  LDFLAGS:
  H5_LDFLAGS:
  AM_LDFLAGS:
  Extra libraries:  -lz -lrt -ldl -lm
  Archiver: ar
  Ranlib: ranlib
  Debugged Packages:
  API Tracing: no

Languages:
----------
  Fortran: no
  C++: no

Features:
---------
  Parallel HDF5: no
  High Level library: yes
  Threadsafety: no
  Default API Mapping: v18
  With Deprecated Public Symbols: yes
  I/O filters (external): deflate(zlib)
  I/O filters (internal): shuffle,fletcher32,nbit,scaleoffset
  MPE: no
  Direct VFD: no
  dmalloc: no
  Clear file buffers before write: yes
  Using memory checker: no
  Function Stack Tracing: no
  Strict File Format Checks: no
  Optimization Instrumentation: no
  Large File Support (LFS): yes
Traceback (most recent call last):
  File "/work1/t2g-shinoda2011/15M54105/parallel0/caffe_final/examples/LRCN_activity_recognition/sequence_input_layer.py", line 18, in <module>
    import h5py
  File "/home/usr9/15M54105/local/lib/python2.7/site-packages/h5py/__init__.py", line 31, in <module>
    from .highlevel import *
  File "/home/usr9/15M54105/local/lib/python2.7/site-packages/h5py/highlevel.py", line 13, in <module>
    from ._hl.base import is_hdf5, HLObject
  File "/home/usr9/15M54105/local/lib/python2.7/site-packages/h5py/_hl/base.py", line 78, in <module>
    dlapl = default_lapl()
  File "/home/usr9/15M54105/local/lib/python2.7/site-packages/h5py/_hl/base.py", line 65, in default_lapl
    lapl = h5p.create(h5p.LINK_ACCESS)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/Scratch/scr_host/tmp/pip-build-3fEnyY/h5py/h5py/_objects.c:2574)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/Scratch/scr_host/tmp/pip-build-3fEnyY/h5py/h5py/_objects.c:2533)
  File "h5py/h5p.pyx", line 130, in h5py.h5p.create (/Scratch/scr_host/tmp/pip-build-3fEnyY/h5py/h5py/h5p.c:2596)
ValueError: Not a property list class (Not a property list class)
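
The one line in this log that matters is "Headers are 1.8.16, library is 1.8.13". When the same check has to be repeated after every rebuild, a tiny parser can pull the two versions out of a captured log. This is a sketch only; the function name hdf5_versions is made up, and the pattern assumes the message keeps the wording shown above.

```python
import re

# Matches the summary line printed by the HDF5 version check.
MISMATCH_RE = re.compile(
    r"Headers are (\d+(?:\.\d+)*), library is (\d+(?:\.\d+)*)"
)


def hdf5_versions(log_text):
    """Return (header_version, library_version), or None if no such line."""
    match = MISMATCH_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = "continue at your own risk.\nHeaders are 1.8.16, library is 1.8.13\n"
versions = hdf5_versions(sample)
print(versions)
print("mismatch" if versions and versions[0] != versions[1] else "ok")
```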

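TSUBAME should only have 1.8.13 installed system-wide, so where did 1.8.16 come from?
In the end I took the crude approach and simply installed 1.8.16 on TSUBAME myself.
The first installation failed because my install path was a folder that already contained a lot of other software. I created a new, empty folder and installed again, and this time it succeeded.
Then I modified the .bashrc file: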

export HDF5=$WORK/software/hdf5
export PATH=$WORK/software/hdf5/bin:~/local/bin:~/.gem/ruby/2.0.0/bin:$LOCAL/bin:$INTALL_RN/yasm/bin:$WORK/lisa-caffe-public/examples/LRCN_activity_recognition:$PATH
export MANPATH=~/local/share/man:$LOCAL/share/man:$MANPATH
export PYTHONPATH=$WORK/parallel0/caffe_final/python:$WORK/newTest1/lisa-caffe-public/python:~/local/lib/python2.7/site-packages:$PYTHONPATH
export GEM_HOME=~/.gem/ruby/gems/2.0.0:$GEM_HOME
export RUBYLIB=~/.gem/ruby/gems/2.0.0:$RUBYLIB
export INCLUDE_PATH=$WORK/software/hdf5/include:~/local/include:$LOCAL/include:$INCLUDE_PATH
export C_INCLUDE_PATH=$WORK/software/hdf5/include:~/local/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=$WORK/software/hdf5/include:~/local/include:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=$WORK/software/hdf5/lib:$LOCAL/lib:~/local/lib:$LIBRARY_PATH
export LD_LIBRARY_PATH=$WORK/software/hdf5/lib:$LOCAL/lib:~/local/lib:$LD_LIBRARY_PATH
export INCLUDE_PATH=~/local/include/ncursesw:~/local/include:$INCLUDE_PATH
export C_INCLUDE_PATH=~/local/include/ncursesw:~/local/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=~/local/include/ncursesw:~/local/include:$CPLUS_INCLUDE_PATH

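But the same error was still reported.
So I looked at Caffe's CMakeCache.txt and found that the libraries were still linked to the path of the system-wide 1.8.13 on TSUBAME. I manually changed every HDF5 lib path in CMakeCache.txt to the new installation (the path below), then ran cmake and make again.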

/work1/t2g-shinoda2011/15M54105/software/hdf5/lib
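
Finding every HDF5-related entry in CMakeCache.txt by eye is error-prone. The sketch below lists all cache entries whose name or value mentions a given string. It assumes the usual NAME:TYPE=VALUE layout with // and # comment lines; the function names are made up, and the HDF5_EXAMPLE_LIBRARY entry in the sample is hypothetical, since the real entry names in this cache are not shown in these notes.

```python
import re

# One cache entry per line: NAME:TYPE=VALUE
ENTRY_RE = re.compile(r"^([^:=]+):([A-Z]+)=(.*)$")


def cache_entries(text):
    """Parse CMakeCache.txt text into {name: (type, value)}."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//") or line.startswith("#"):
            continue  # skip blank lines and comments
        match = ENTRY_RE.match(line)
        if match:
            entries[match.group(1)] = (match.group(2), match.group(3))
    return entries


def entries_mentioning(text, needle):
    """Entries whose name or value contains needle, ignoring case."""
    needle = needle.lower()
    return {
        name: pair
        for name, pair in cache_entries(text).items()
        if needle in name.lower() or needle in pair[1].lower()
    }


sample = """//Path to cuDNN library.
CUDNN_LIBRARY:FILEPATH=/usr/apps.sp3/cuda/7.0/lib64/libcudnn.so
//CUDNN root folder
CUDNN_ROOT:PATH=
//Hypothetical entry, for illustration only
HDF5_EXAMPLE_LIBRARY:FILEPATH=/old/system/lib/libhdf5.so
"""
for name, (kind, value) in sorted(entries_mentioning(sample, "hdf5").items()):
    print(name, kind, value)
```

On a real build, read the file with open("CMakeCache.txt").read() and pass the text in; the output shows which entries still point at the old library.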

7. After adding a new function to one of Caffe's .cpp files, the build fails with an undefined OpenCV reference

image_data_layer.cpp

#include <opencv2/contrib/contrib.hpp>

void returnImageList(string ImagePath, vector<string>& fileNames)
{
    cv::Directory dir;
    fileNames = dir.GetListFiles(ImagePath, "*", false);
}

During make, at around ninety-something percent, this error appeared:

/usr/lib64/gcc/x86_64-suse-linux/4.3/../../../../x86_64-suse-linux/bin/ld: skipping incompatible /work1/t2g-shinoda2011/15M54105/local/lib/libstdc++.so when searching for -lstdc++
/usr/lib64/gcc/x86_64-suse-linux/4.3/../../../../x86_64-suse-linux/bin/ld: skipping incompatible /work1/t2g-shinoda2011/15M54105/local/lib/libstdc++.a when searching for -lstdc++
/usr/lib64/gcc/x86_64-suse-linux/4.3/../../../../x86_64-suse-linux/bin/ld: skipping incompatible /work1/t2g-shinoda2011/15M54105/local/lib/libgcc_s.so when searching for -lgcc_s
/usr/lib64/gcc/x86_64-suse-linux/4.3/../../../../x86_64-suse-linux/bin/ld: skipping incompatible /work1/t2g-shinoda2011/15M54105/local/lib/libgcc_s.so when searching for -lgcc_s
../lib/libcaffe.so: undefined reference to `cv::Directory::GetListFiles(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)'
collect2: ld returned 1 exit status
make[2]: *** [tools/device_query] Error 1
make[1]: *** [tools/CMakeFiles/device_query.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

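Because I added #include <opencv2/contrib/contrib.hpp>,
the corresponding library also has to be linked in.

Looking at CMakeCache.txt, it only listed opencv_core;opencv_highgui;opencv_imgproc, so I added opencv_contrib;general; there (I don't know what "general" means either).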

//Dependencies for the target
caffe_LIB_DEPENDS:STATIC=general;proto;general;proto;general;/work1/t2g-shinoda2011/15M54105/local/lib/libboost_system.so;general;/work1/t2g-shinoda2011/15M54105/local/lib/libboost_thread.so;general;/work1/t2g-shinoda2011/15M54105/local/lib/libboost_filesystem.so;general;-lpthread;general;/work1/t2g-shinoda2011/15M54105/local/lib/libglog.so;general;/work1/t2g-shinoda2011/15M54105/local/lib/libgflags.so;optimized;/work1/t2g-shinoda2011/15M54105/local/lib/libprotobuf.so;debug;/work1/t2g-shinoda2011/15M54105/local/lib/libprotobuf.so;general;-lpthread;general;/usr/apps.sp3/isv/ansys_inc/16.2/v162/Framework/bin/Linux64/libhdf5_hl.so;general;/usr/apps.sp3/isv/ansys_inc/16.2/v162/Framework/bin/Linux64/libhdf5.so;general;/work1/t2g-shinoda2011/15M54105/local/lib/liblmdb.so;general;/work1/t2g-shinoda2011/15M54105/local/lib/libleveldb.so;general;/work1/t2g-shinoda2011/15M54105/local/lib/libsnappy.so;general;/usr/apps.sp3/cuda/7.0/lib64/libcudart.so;general;/usr/apps.sp3/cuda/7.0/lib64/libcurand.so;general;/usr/apps.sp3/cuda/7.0/lib64/libcublas.so;general;/usr/apps.sp3/cuda/7.0/lib64/libcudnn.so;general;opencv_core;general;opencv_highgui;general;opencv_imgproc;general;opencv_contrib;general;/work1/t2g-shinoda2011/15M54105/local/lib/liblapack.a;general;/work1/t2g-shinoda2011/15M54105/local/lib/libptcblas.a;general;/work1/t2g-shinoda2011/15M54105/local/lib/libatlas.a;general;/home0/usr9/15M54105/local/lib/libpython2.7.so;general;/work1/t2g-shinoda2011/15M54105/local/lib/libboost_python.so;

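But after making this change I ran cmake again, and my edits in CMakeCache.txt were gone.
I gave up on this approach for now.

In the Caffe root directory I ran
grep -r opencv
grep -r OPENCV
grep -r OpenCV
and finally found cmake/Dependencies.cmake: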

# ---[ OpenCV
if(USE_OPENCV)
  find_package(OpenCV QUIET COMPONENTS core highgui imgproc contrib imgcodecs)
  if(NOT OpenCV_FOUND) # if not OpenCV 3.x, then imgcodecs are not found
    find_package(OpenCV REQUIRED COMPONENTS core highgui imgproc contrib)
  endif()
  include_directories(SYSTEM ${OpenCV_INCLUDE_DIRS})
  list(APPEND Caffe_LINKER_LIBS ${OpenCV_LIBS})
  message(STATUS "OpenCV found (${OpenCV_CONFIG_PATH})")
  add_definitions(-DUSE_OPENCV)
endif()
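
The three grep runs used to locate this file differ only in letter case, so they can be collapsed into one case-insensitive search (the same idea as grep -ri). Here is a sketch for Python 3; the function name grep_recursive is made up for illustration, and unreadable files are simply skipped.

```python
import os


def grep_recursive(root, needle):
    """Return (path, line_number, line) for every line containing needle,
    ignoring case, in any file under root."""
    needle = needle.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" keeps binary files from raising decode errors
                with open(path, "r", errors="ignore") as handle:
                    for lineno, line in enumerate(handle, 1):
                        if needle in line.lower():
                            hits.append((path, lineno, line.rstrip("\n")))
            except OSError:
                continue  # unreadable file, skip it
    return hits


for path, lineno, line in grep_recursive(".", "opencv")[:10]:
    print("%s:%d: %s" % (path, lineno, line))
```

Run from the Caffe root, the hits in cmake/Dependencies.cmake show where the OpenCV components are chosen. Note that a case-insensitive search can match more spellings than the three literal grep patterns.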