Compiling py-faster-rcnn: a summary of problems and their solutions

Source: Internet. Published by: 全球通用顶级域名. Editor: 程序博客网. Time: 2024/05/21 18:30

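I installed Faster R-CNN by following the official instructions and ran into a number of problems. This post collects them along with the solutions I found.
My setup is: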

python : anaconda2
linux: centos 6.9

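Before installing Faster R-CNN, please read 《cuda8+cudnn4 Faster R-CNN安装塈运行demo》 (installing Faster R-CNN with CUDA 8 + cuDNN 4 and running the demo)
and 《使用cuDNN5编译py-faster-rcnn错误：cudnn.hpp(126): error: argument of type “int” is incompatible …》 (an error when compiling py-faster-rcnn with cuDNN 5). Make sure your versions are consistent first.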

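Problems and solutions

1. Step 3, "Build the Cython modules", fails with the following error:

Solution:
This problem troubled me for a long time, and I solved it only after a lot of searching. It is mainly caused by Anaconda's distutils.extension when compiling nms.gpu_nms.
My solution:
Go to $FRCN_ROOT/lib, open setup.py, and comment out the nms.gpu_nms module: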

#    Extension('nms.gpu_nms',
#        ['nms/nms_kernel.cu', 'nms/gpu_nms.pyx'],
#        library_dirs=[CUDA['lib64']],
#        libraries=['cudart'],
#        language='c++',
#        runtime_library_dirs=[CUDA['lib64']],
#         # this syntax is specific to this build system
#         # we're only going to use certain compiler args with nvcc and not with
#         # gcc the implementation of this trick is in customize_compiler() below
#        extra_compile_args={'gcc': ["-Wno-unused-function"],
#                            'nvcc': ['-arch=sm_35',
#                                     '--ptxas-options=-v',
#                                     '-c',
#                                     '--compiler-options',
#                                     "'-fPIC'"]},
#        include_dirs = [numpy_include, CUDA['include']]
#    ),

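Then compile the other three modules first: bbox, nms.cpu_nms, and pycocotools._mask.
Once that build finishes, remove the comment markers added above and compile again. The same error reappears. This time, copy the failing command, replace "-R" in it with "-Wl,-rpath=" (verified) or "-Wl,-R" (verified), and run the modified command directly. The command I ran was: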

g++ -pthread -shared -B /data1/caiyong.wang/bin/anaconda2/compiler_compat -L/data1/caiyong.wang/bin/anaconda2/lib -Wl,-rpath=/data1/caiyong.wang/bin/anaconda2/lib,--no-as-needed build/temp.linux-x86_64-2.7/nms/nms_kernel.o build/temp.linux-x86_64-2.7/nms/gpu_nms.o -L/usr/local/cuda/lib64 -L/data1/caiyong.wang/bin/anaconda2/lib -Wl,-rpath=/usr/local/cuda/lib64 -lcudart -lpython2.7 -o /data1/caiyong.wang/program/faster_rcnn/py-faster-rcnn/lib/nms/gpu_nms.so
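The flag substitution can also be done mechanically. The following is a minimal sketch, not part of the original workflow: the function name fix_rpath_flags and the shortened sample command are illustrative assumptions, and the sketch only handles the form in which the directory is attached directly to -R.

```python
import shlex


def fix_rpath_flags(command):
    """Rewrite bare -R<dir> linker flags as -Wl,-rpath=<dir>.

    Only tokens that start with "-R" and carry the directory in the same
    token are rewritten; flags such as -Wl,-R<dir> are left untouched.
    """
    fixed = []
    for token in shlex.split(command):
        if token.startswith("-R") and len(token) > 2:
            fixed.append("-Wl,-rpath=" + token[2:])
        else:
            fixed.append(token)
    return " ".join(shlex.quote(token) for token in fixed)


# Hypothetical, shortened version of a failing link command.
failed = ("g++ -shared nms/gpu_nms.o -L/usr/local/cuda/lib64 "
          "-R/usr/local/cuda/lib64 -lcudart -o nms/gpu_nms.so")
print(fix_rpath_flags(failed))
```

Paste the real failing command in place of the sample string and run the printed command from $FRCN_ROOT/lib.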

Finally, run make once more (with setup.py restored to its original state). All modules are now reported as built.
Result:

[caiyong.wang@localhost lib]$ make
python setup.py build_ext --inplace
running build_ext
skipping 'utils/bbox.c' Cython extension (up-to-date)
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
skipping 'nms/gpu_nms.cpp' Cython extension (up-to-date)
skipping 'pycocotools/_mask.c' Cython extension (up-to-date)
rm -rf build

Some people also suggest switching Anaconda to Anaconda3-4.4.0-Linux-x86_64.sh, which is said to avoid this error.
References:
1. https://github.com/cupy/cupy/issues/599
2. https://stackoverflow.com/questions/12629042/g-4-6-real-error-unrecognized-option-r
3. http://www.cnblogs.com/jianyingzhou/p/7722570.html
4. https://github.com/rbgirshick/py-faster-rcnn/issues/706

2. src/caffe/test/test_smooth_L1_loss_layer.cpp:11:35: fatal error: caffe/vision_layers.hpp: No such file or directory

Solution: simply delete that #include line from test_smooth_L1_loss_layer.cpp.
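If you prefer to script the edit rather than open the file by hand, here is a minimal sketch. The helper remove_include and the sample source text are illustrative assumptions; the real file contains other lines that are not reproduced here.

```python
def remove_include(source, header):
    """Drop every #include line that names `header`; keep all other lines."""
    kept = []
    for line in source.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith("#include") and header in stripped:
            continue  # skip the include that no longer exists
        kept.append(line)
    return "".join(kept)


# Illustrative stand-in for the top of the test file.
sample = (
    '#include "caffe/common.hpp"\n'
    '#include "caffe/vision_layers.hpp"\n'
    '#include "caffe/test/test_caffe_main.hpp"\n'
)
print(remove_include(sample, "caffe/vision_layers.hpp"))
```

To apply it to the real file, read src/caffe/test/test_smooth_L1_loss_layer.cpp, pass its contents through the function, and write the result back.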

3. Running make runtest -j8 fails because shared objects (.so files) cannot be found. The individual cases are:

1). libcudart.so.8.0: cannot open shared object file: No such file or directory
Solution:
Open .bashrc in your home directory and add:

export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-8.0/bin:$PATH

Then run source ~/.bashrc
2). error while loading shared libraries: libglog.so.0: cannot open shared object file: No such file or directory
Solution:
I found that libglog.so.0 is located in /usr/local/lib/, so it only needs to be added to the environment variable.
Open .bashrc in your home directory and add:

export LD_LIBRARY_PATH=/usr/local/lib/:$LD_LIBRARY_PATH

Then run source ~/.bashrc
3). error while loading shared libraries: libhdf5_hl.so.100
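Solution:

If HDF5 is already installed, add its lib directory to LD_LIBRARY_PATH in the same way as above. Otherwise, install HDF5.

With sudo privileges you can install HDF5 directly (by default it goes under /usr/local/hdf5**):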

sudo yum install  hdf5-devel

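Without sudo privileges, you can install from source:
download the source code from the official site and follow its instructions.
One thing to note: at the following step,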

 ./configure --prefix=/**/hdf5-X.Y.Z/ <more configure_flags>

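replace /**/hdf5-X.Y.Z/ with a directory you can read and write. If you then see Syntax error near unexpected token `newline', this usually means the <more configure_flags> placeholder was typed literally; see 《Syntax error near unexpected token `newline’ while installing Predictionio》 for a fix. Alternatively, copy the HDF5 sources into the location where you want them installed and use HDF5's default prefix, which is normally under the directory you copied them to. You can check the default by running:

 ./configure   --enable-cxx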
The complete installation commands are:

$ gunzip < hdf5-X.Y.Z.tar.gz | tar xf -
$ cd hdf5-X.Y.Z
$ ./configure --prefix=/**/hdf5-X.Y.Z/ <more configure_flags>
$ make
$ make check                # run test suite.
$ make install
$ make check-install        # verify installation.

Finally, add the HDF5 paths to Caffe's Makefile.config:

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include  /**/hdf5-X.Y.Z/hdf5/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /**/hdf5-X.Y.Z/hdf5/lib

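Note:

If Anaconda is installed, you do not need to install libhdf5-serial-dev. If the build reports that the hdf5 library cannot be found, add anaconda/lib to ld.so.conf:
$ sudo vim /etc/ld.so.conf
Add the following line, replacing the username with your own:
/home/your_username/anaconda/lib
Save and close the file, then run:
$ sudo ldconfig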

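References:

在Matlab中使用Caffe出现HDF5 library version mismatched error的解决办法 (fixing the "HDF5 library version mismatched" error when using Caffe from Matlab)
在centos7上配置caffe所遇到的一些问题 (problems encountered while setting up Caffe on CentOS 7)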
http://coldmooon.github.io/2015/08/03/caffe_install/

4). error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory
Solution:
As before, find the location of the Python lib directory (Anaconda's, in my case) and add it to LD_LIBRARY_PATH.
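All four cases in this section come down to the same check: which directory holds the missing .so, and is that directory on LD_LIBRARY_PATH? The following is a minimal sketch of that check. The helper names and the candidate directory list are assumptions for illustration; adjust the directories to your machine.

```python
import os


def ld_library_path_dirs(env=None):
    """Return the directories listed in LD_LIBRARY_PATH, normalized."""
    env = os.environ if env is None else env
    raw = env.get("LD_LIBRARY_PATH", "")
    return [os.path.normpath(d) for d in raw.split(":") if d]


def find_library(filename, search_dirs):
    """Return the directories in search_dirs that contain filename."""
    return [d for d in search_dirs
            if os.path.isfile(os.path.join(d, filename))]


# Candidate directories are assumptions; add your Anaconda and HDF5 lib dirs.
candidates = ["/usr/local/cuda-8.0/lib64", "/usr/local/lib"]
on_path = ld_library_path_dirs()
for lib in ["libcudart.so.8.0", "libglog.so.0",
            "libhdf5_hl.so.100", "libpython2.7.so.1.0"]:
    for directory in find_library(lib, candidates):
        if os.path.normpath(directory) in on_path:
            status = "already on"
        else:
            status = "missing from"
        print("%s: found in %s, %s LD_LIBRARY_PATH" % (lib, directory, status))
```

A library that is found but reported as missing from LD_LIBRARY_PATH needs an export line in ~/.bashrc like the ones above. A library that is not found in any candidate directory prints nothing.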

4. libprotobuf error in “make runtest”

Running make runtest produces the following error:

[ RUN      ] SGDSolverTest/1.TestLeastSquaresUpdateLROneHundredth
libprotobuf ERROR google/protobuf/text_format.cc:169] Error parsing text-format caffe.SolverParameter: 1:23: Expected identifier.
F1018 01:39:16.016651  8291 test_gradient_based_solver.cpp:56] Check failed: google::protobuf::TextFormat::ParseFromString(proto, &param)
*** Check failure stack trace: ***
    @     0x7f66fa250a5d  google::LogMessage::Fail()
    @     0x7f66fa254ef7  google::LogMessage::SendToLog()
    @     0x7f66fa252d59  google::LogMessage::Flush()
    @     0x7f66fa25305d  google::LogMessageFatal::~LogMessageFatal()
    @           0x8fb1c1  caffe::GradientBasedSolverTest<>::InitSolverFromProtoString()
    @           0x8e6a93  caffe::GradientBasedSolverTest<>::RunLeastSquaresSolver()
    @           0x8f16db  caffe::GradientBasedSolverTest<>::TestLeastSquaresUpdate()
    @           0x966cc3  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x95eac7  testing::Test::Run()
    @           0x95eb6e  testing::TestInfo::Run()
    @           0x95ec75  testing::TestCase::Run()
    @           0x960f08  testing::internal::UnitTestImpl::RunAllTests()
    @           0x961197  testing::UnitTest::Run()
    @           0x5257af  main
    @       0x3d21a1ed1d  (unknown)
    @           0x525355  (unknown)
make: *** [runtest] Aborted (core dumped)

The cause is that the version installed by yum install protobuf-devel is too old. Running protoc --version shows my version:

$ protoc --version
libprotoc 2.3.0

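caffe-master needs protobuf 2.5.0 or later, so updating protobuf fixes the problem.
In practice, however, pip under Anaconda installs the Python protobuf package, which in my case was version 3.4.0, while Caffe linked against the yum-installed protobuf at build time, which is too old. A protobuf matching Anaconda's 3.4.0 would therefore have to be installed in a local directory.

Caffe also seems to work better with 2.6.1 and, more importantly, protobuf 2.6.1 causes few errors when installed manually. So first run pip uninstall protobuf, then install the specific version with pip install protobuf==2.6.1. This is the Python package.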
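The mismatch described here can be checked before rebuilding. The following is a minimal sketch under stated assumptions: the helper names are illustrative, and the version strings are the ones quoted in this post. On your own machine, take them from protoc --version and from python -c "import google.protobuf; print(google.protobuf.__version__)".

```python
import re


def parse_version(text):
    """Extract the first X.Y.Z version found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number in %r" % text)
    return tuple(int(part) for part in match.groups())


def same_release(protoc_output, python_package_version):
    """True when the compiler and the Python package report the same version."""
    return parse_version(protoc_output) == parse_version(python_package_version)


MINIMUM = (2, 5, 0)  # minimum version stated in this post for caffe-master

print(parse_version("libprotoc 2.3.0") >= MINIMUM)  # False: yum version too old
print(same_release("libprotoc 2.3.0", "3.4.0"))     # False: compiler and pip differ
print(same_release("libprotoc 2.6.1", "2.6.1"))     # True: the target state
```

Comparing tuples of integers avoids the trap of comparing version strings character by character.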

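Next, install protobuf manually.
Solution:

Download protobuf-2.6.1 from http://download.csdn.net/download/liangyihuai/9534593, or get protobuf-2.6.1.tar.gz from another source.
Read the installation instructions on GitHub carefully.

$ ./autogen.sh    (can be skipped for the CSDN download)
$ ./configure --prefix=/home/**/protobuf    (a directory of your own)
$ make
$ make check
$ make install

After a successful build, add export PATH=/home/**/protobuf/bin:$PATH to your environment variables. Finally, run protoc --version; if it reports the newly installed version (libprotoc 2.6.1), the installation succeeded.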

Result of make check:
[Figure: output of make check]

protobuf is now installed. Go to the caffe directory, run make clean, and rebuild.

5. cannot find -lopencv_dep_cudart
See: https://github.com/opencv/opencv/issues/6542
