[Beginner's Notes] Guide to Running the IDT Source Code (Linux + ffmpeg-0.11.1 + opencv-2.4.2)


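1) Configuring ffmpeg-0.11.1

OpenCV needs ffmpeg to decode video, so install ffmpeg first. If OpenCV is already installed, uninstall it first or rebuild it afterwards.

Download: http://lear.inrialpes.fr/people/wang/improved_trajectories (OpenCV can be downloaded from the same page).

Extract the archive into a folder named ffmpeg-0.11.1.

Open a terminal; all of the commands below are entered there.

1. Remove any installed ffmpeg to avoid conflicts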

sudo apt-get remove ffmpeg x264

sudo apt-get autoremove

2. Install the required dependencies

sudo apt-get install make automake g++ bzip2 python unzip patch subversion ruby build-essential git-core checkinstall yasm texi2html libfaac-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libsdl1.2-dev libtheora-dev libvdpau-dev libvorbis-dev libvpx-dev libx11-dev libxfixes-dev libxvidcore-dev zlib1g-dev

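In the command above, everything after install is a package to install, separated by spaces. The packages can also be installed one at a time.

3. Configure the build inside the ffmpeg-0.11.1 folder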

./configure --enable-gpl --enable-nonfree --enable-version3 --enable-shared --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libfaac --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-libvpx

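Here --enable-gpl, --enable-nonfree and --enable-version3 are licensing switches: they allow GPL code, non-free code, and the upgrade of the (L)GPL to version 3, which some of the external libraries require. --enable-shared builds shared (dynamic) libraries, and the remaining options enable the external codec libraries. Each option starts with two hyphens.

If configure finishes without an error, this step succeeded. (In my experience it is best to enable the libraries one at a time, which makes it easy to see which one fails.)

If this step fails with an error such as xxx not found, the corresponding codec library has to be installed. As an example, here is how I handled x264, which failed during my own setup:

Download the x264 source: https://github.com/qupai/x264

Enter that folder, for example: cd x264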

./configure --enable-static --enable-shared

make

sudo make install

sudo ldconfig

If there are no errors, go back to step 3 and run configure again.

4. Build and install

make

sudo make install

sudo ldconfig

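5. Test

Run the ffmpeg command. If it reports an error, it may be conflicting with the shared libraries of gstreamer, which would then have to be removed. Removing gstreamer breaks the multimedia applications that depend on it, however, so an alternative is to uninstall ffmpeg and rebuild it as static libraries: in step 3, change --enable-shared to --enable-static (static libraries are larger, so this is not recommended).

If the error persists, try adding --prefix=/usr/local/ffmpeg after ./configure in step 3.

If it works, ffmpeg prints its version and build configuration.

2) Configuring OpenCV-2.4.2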

1. Install the required dependencies

sudo apt-get install pkg-config

export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig

sudo apt-get install cmake

2. Install OpenCV

Download and extract the archive, then enter the OpenCV-2.4.2 folder

mkdir release

cd release

cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_PYTHON_SUPPORT=ON ..

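After this step, check the configuration summary that cmake prints.

My code previously failed to run because the FFMPEG, codec, format, util and swscale entries here showed NO or 0. If they show YES or 1, OpenCV has been configured with ffmpeg support.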
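The check described above can also be scripted. The following is only a rough sketch: it assumes the cmake output was saved as text and that each entry appears as a "name: value" line, which is a guess about the layout rather than something confirmed here. The function names are hypothetical.

```python
import re

# Entry names taken from the text above; the "name: value" layout is an assumption.
FFMPEG_KEYS = ("FFMPEG", "codec", "format", "util", "swscale")


def ffmpeg_status(cmake_output):
    """Map each entry to True (YES or 1), False (anything else), or None (not found)."""
    status = {key: None for key in FFMPEG_KEYS}
    for line in cmake_output.splitlines():
        # Skip leading dashes and spaces, then split the line into "name: value".
        match = re.match(r"^[-\s]*(\w+)\s*:\s*(\S+)", line)
        if not match:
            continue
        name, value = match.groups()
        if name in status:
            status[name] = value.upper() in ("YES", "1")
    return status


def missing_ffmpeg_parts(cmake_output):
    """Return the entries that are absent or not reported as YES/1."""
    return [key for key, ok in ffmpeg_status(cmake_output).items() if not ok]
```

If missing_ffmpeg_parts returns an empty list for the saved output, every entry named above was reported as YES or 1.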

Finally, build and install:

make

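sudo make install

sudo ldconfig

3. Configure the library search path

sudo vim /etc/ld.so.conf.d/opencv.conf

Add /usr/local/lib as the last line of the file.

(This requires vim's editing commands; since this was my first time using Linux, it took me a while to get right.)

Finally, run sudo ldconfig

All done!

3) Running the IDT source code

Download: http://lear.inrialpes.fr/people/wang/improved_trajectories

Extract the archive, enter the folder, and run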

make

A release folder now appears, containing the executables Video and DenseTrackStab. Next, run:

./release/DenseTrackStab ./test_sequences/person01_boxing_d1_uncomp.avi

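A stream of numbers scrolls through the terminal. For a better visualization, first edit this line in DenseTrackStab.cpp

int show_track = 0;

changing 0 to 1, then run make again so the change takes effect.

To save the features to a file, run: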

./release/DenseTrackStab ./test_sequences/person01_boxing_d1_uncomp.avi | gzip > out.features.gz
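To read the compressed file back, something like the following Python sketch may help. It assumes each feature is one line of whitespace-separated numbers; the function names are hypothetical and not part of the IDT code.

```python
import gzip


def read_features(path):
    """Yield one list of floats per non-empty line of a gzip-compressed feature file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()  # splits on tabs and spaces alike
            if fields:
                yield [float(value) for value in fields]


def count_features(path):
    """Count the features in the file without holding them all in memory."""
    return sum(1 for _ in read_features(path))
```

For example, count_features("out.features.gz") would give the number of features extracted from the test video.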

The program accepts the following options:

Usage: DenseTrack video_file [options]

Options:

-h Display this message and exit

-S [start frame] The start frame to compute feature (default: S=0 frame)

-E [end frame] The end frame for feature computing (default: E=last frame)

-L [trajectory length] The length of the trajectory (default: L=15 frames)

-W [sampling stride] The stride for dense sampling feature points (default: W=5 pixels)

-N [neighborhood size] The neighborhood size for computing the descriptor (default: N=32 pixels)

-s [spatial cells] The number of cells in the nxy axis (default: nxy=2 cells)

-t [temporal cells] The number of cells in the nt axis (default: nt=3 cells)
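When running the extractor from a script, the options above can be assembled programmatically. This is a minimal sketch: build_command and run_tracker are hypothetical helpers, and the default binary path simply reuses the one from the commands earlier in this post.

```python
import subprocess

# Single-letter flags copied from the usage text above.
ALLOWED_FLAGS = {"S", "E", "L", "W", "N", "s", "t"}


def build_command(video, binary="./release/DenseTrackStab", **options):
    """Return the argument list for one run, e.g. build_command("a.avi", L=20)."""
    command = [binary, video]
    for flag, value in options.items():
        if flag not in ALLOWED_FLAGS:
            raise ValueError("unknown option: -%s" % flag)
        command += ["-" + flag, str(value)]
    return command


def run_tracker(video, **options):
    """Run the extractor and return its standard output as text."""
    result = subprocess.run(
        build_command(video, **options),
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout
```

For instance, build_command("video.avi", L=20, W=10) produces the same arguments as typing ./release/DenseTrackStab video.avi -L 20 -W 10 in the terminal.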

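The structure of the features is as follows.

The features are computed one by one, and each one is written on its own line in the following format:

frameNum mean_x mean_y var_x var_y length scale x_pos y_pos t_pos Trajectory HOG HOF MBHx MBHy

The first ten elements describe the trajectory: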

frameNum: The trajectory ends on which frame

mean_x: The mean value of the x coordinates of the trajectory

mean_y: The mean value of the y coordinates of the trajectory

var_x: The variance of the x coordinates of the trajectory

var_y: The variance of the y coordinates of the trajectory

length: The length of the trajectory

scale: The trajectory is computed on which scale

x_pos: The normalized x position w.r.t. the video (0~0.999), for spatio-temporal pyramid

y_pos: The normalized y position w.r.t. the video (0~0.999), for spatio-temporal pyramid

t_pos: The normalized t position w.r.t. the video (0~0.999), for spatio-temporal pyramid

The following five elements are descriptors, concatenated one after another:

Trajectory: 2x[trajectory length] (default 30 dimension)

HOG: 8x[spatial cells]x[spatial cells]x[temporal cells] (default 96 dimension)

HOF: 9x[spatial cells]x[spatial cells]x[temporal cells] (default 108 dimension)

MBHx: 8x[spatial cells]x[spatial cells]x[temporal cells] (default 96 dimension)

MBHy: 8x[spatial cells]x[spatial cells]x[temporal cells] (default 96 dimension)
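The layout above can be turned into a small parser. The sketch below computes the descriptor sizes from the formulas just listed and slices one feature (already converted to a list of numbers) into named parts. The function names are hypothetical; with the default parameters a feature has 10 + 30 + 96 + 108 + 96 + 96 = 436 values.

```python
# The ten trajectory fields, in the order given above.
INFO_FIELDS = [
    "frameNum", "mean_x", "mean_y", "var_x", "var_y",
    "length", "scale", "x_pos", "y_pos", "t_pos",
]


def descriptor_sizes(traj_len=15, nxy=2, nt=3):
    """Return (name, dimension) pairs following the formulas listed above."""
    cells = nxy * nxy * nt
    return [
        ("Trajectory", 2 * traj_len),
        ("HOG", 8 * cells),
        ("HOF", 9 * cells),
        ("MBHx", 8 * cells),
        ("MBHy", 8 * cells),
    ]


def split_feature(values, traj_len=15, nxy=2, nt=3):
    """Split one feature into its ten info fields and five descriptor vectors."""
    sizes = descriptor_sizes(traj_len, nxy, nt)
    expected = len(INFO_FIELDS) + sum(size for _, size in sizes)
    if len(values) != expected:
        raise ValueError("expected %d values, got %d" % (expected, len(values)))
    feature = dict(zip(INFO_FIELDS, values))
    offset = len(INFO_FIELDS)
    for name, size in sizes:
        feature[name] = values[offset:offset + size]
        offset += size
    return feature
```

If the extractor was run with non-default -L, -s or -t values, the same values have to be passed to split_feature, otherwise the length check fails.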

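Each time the configured length in frames is reached, the computation starts again from zero.

P.S. I am new to Linux programming, so please forgive any rough edges :)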

1 0