Using the BTK Toolkit

Source: Internet | Published by: 元数据的作用 | Editor: 程序博客网 | Time: 2024/06/02 02:31

1. Install the following tools first

· Python - scripting language interpreter

· NumPy - MATLAB-like numerical extension for Python

· GSL - GNU Scientific Library

· SWIG - simplified wrapper and interface generator

· Autoconf - automatic project configuration

· pkg-config - reports compiler and linker flags for installed libraries

· libsndfile - C library for sound file I/O
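Before building, it can help to confirm that the command-line prerequisites are on the PATH. The following is only a minimal sketch and is not part of BTK. It assumes Python 3 (for `shutil.which`), and the executable names in `REQUIRED_COMMANDS` are the usual ones but are assumptions that may differ between distributions. It does not check NumPy or the libsndfile headers.

```python
import shutil

# Executable names are assumptions; distributions may name them differently.
REQUIRED_COMMANDS = ["python", "swig", "autoconf", "pkg-config", "gsl-config"]


def find_missing(commands):
    """Return the commands that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_COMMANDS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed commands were found.")
```

Anything reported as missing can usually be installed through the system package manager before running the build steps.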

2. Download the BTK source code

svn checkout svn://svn.code.sf.net/p/distantspeechrecognition/code/ ./

cd trunk/btk

3. Build the BTK library

./autogen.sh --force

./configure --prefix=${HOME} CFLAGS='-O4' CXXFLAGS='-O4' CPPFLAGS='-O4' SWIGFLAGS='-O'

make

make install

4. Configure environment variables

Open your ~/.bashrc file and add the following two lines:

export LD_LIBRARY_PATH=${HOME}/lib:/usr/local/lib:/usr/lib

export PYTHONPATH=${HOME}/lib/python2.6/site-packages:/usr/local/lib/python2.6/site-packages:/usr/lib/python2.6/site-packages

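Here ${HOME} is the prefix under which the compiled library was installed (the value passed to --prefix). Replace python2.6 with the version of Python you built against.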
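To see which site-packages directory matches the running interpreter, and whether it is already on the search path, a small helper like the one below may be useful. This is a sketch under assumptions: the function names are hypothetical, and the `<prefix>/lib/pythonX.Y/site-packages` layout simply mirrors the PYTHONPATH entries above. Some systems use `lib64` or `dist-packages` instead.

```python
import os
import sys


def site_packages_dir(prefix, version=None):
    """Build <prefix>/lib/pythonX.Y/site-packages for a (major, minor) version."""
    major, minor = version if version else sys.version_info[:2]
    return os.path.join(prefix, "lib", "python%d.%d" % (major, minor), "site-packages")


def is_on_sys_path(directory):
    """Report whether the directory is already on the interpreter's search path."""
    target = os.path.normpath(directory)
    return any(os.path.normpath(p) == target for p in sys.path if p)


if __name__ == "__main__":
    # Assumes the library was installed with --prefix=${HOME}, as in step 3.
    expected = site_packages_dir(os.path.expanduser("~"))
    print(expected, "on sys.path:", is_on_sys_path(expected))
```

If the directory is not on `sys.path` after opening a new shell, the PYTHONPATH line in ~/.bashrc probably names a different Python version or prefix.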

5. Check that the installation succeeded

Start Python and enter:

import btk.modulated

If no error is reported, the installation succeeded.
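The same check can be scripted so that a failure prints its reason instead of a traceback. The sketch below uses only the standard library. `can_import` is a hypothetical helper name, and `btk.modulated` is the module named in the step above.

```python
import importlib


def can_import(module_name):
    """Try to import a module; return (ok, error_message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    ok, message = can_import("btk.modulated")
    print("btk.modulated import OK" if ok else "Import failed: " + message)
```

An error message that mentions a shared object file usually points at LD_LIBRARY_PATH, while a message saying the module cannot be found usually points at PYTHONPATH.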

6. Test the BTK library

A sample script for processing 16 kHz Kinect data is available at http://distantspeechrecognition.sourceforge.net/samples/SingleSpeakerTrackingSample.py and an audio file recorded with a Kinect is available at http://distantspeechrecognition.sourceforge.net/samples/KINECT_M1001_U1050_2M.wav for download. Place "SingleSpeakerTrackingSample.py" and "KINECT_M1001_U1050_2M.wav" in the same directory, then run the Python script:

>> python SingleSpeakerTrackingSample.py KINECT_M1001_U1050_2M.wav ./result/

It prints the estimated direction of arrival (DOA) for each frame.

The output is:

Sample Feature Block Length 2048
Sample Feature Shift Length 2048
Hamming Feature Input Size = 2048
Hamming Feature Output Size = 2048
FFT Feature Input Size = 2048
FFT Feature Output Size = 4096
Sample Feature Block Length 2048
Sample Feature Shift Length 2048
Hamming Feature Input Size = 2048
Hamming Feature Output Size = 2048
FFT Feature Input Size = 2048
FFT Feature Output Size = 4096
Sample Feature Block Length 2048
Sample Feature Shift Length 2048
Hamming Feature Input Size = 2048
Hamming Feature Output Size = 2048
FFT Feature Input Size = 2048
FFT Feature Output Size = 4096
Sample Feature Block Length 2048
Sample Feature Shift Length 2048
Hamming Feature Input Size = 2048
Hamming Feature Output Size = 2048
FFT Feature Input Size = 2048
FFT Feature Output Size = 4096
Reading file KINECT_M1001_U1050_2M.wav from 0.00 to -1.00 : Ch 0
Reading file KINECT_M1001_U1050_2M.wav from 0.00 to -1.00 : Ch 1
Reading file KINECT_M1001_U1050_2M.wav from 0.00 to -1.00 : Ch 2
Reading file KINECT_M1001_U1050_2M.wav from 0.00 to -1.00 : Ch 3
Time 1.28 : theta = 86.63
Gate probability = 0.9500
Time 1.28 : theta = 84.12
Time 1.41 : theta = 84.90
Time 1.54 : theta = 85.17
Time 1.66 : theta = 84.87
Time 1.79 : theta = 84.70
Time 1.92 : theta = 84.59
Time 2.05 : theta = 84.52
Time 2.18 : theta = 84.46
Time 2.30 : theta = 84.65
Time 2.43 : theta = 84.59
Time 2.56 : theta = 84.54
Time 2.69 : theta = 84.50
Time 2.82 : theta = 84.47
Time 2.94 : theta = 84.60
Time 3.07 : theta = 84.56
Time 3.20 : theta = 84.59
Time 3.33 : theta = 84.56
Time 3.46 : theta = 84.53
Time 3.58 : theta = 84.51
Time 3.71 : theta = 84.48
Time 3.84 : theta = 84.46
Time 3.97 : theta = 84.45
Time 4.10 : theta = 84.43
Time 4.22 : theta = 84.41
Time 4.35 : theta = 84.49
Time 4.48 : theta = 84.47
Time 4.61 : theta = 84.49
Time 4.74 : theta = 84.55
Time 4.86 : theta = 84.62
Time 4.99 : theta = 84.68
Time 5.12 : theta = 84.68
Time 5.25 : theta = 84.68
Time 5.38 : theta = 84.74
Time 5.50 : theta = 84.79
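The time stamps in this log advance by about 0.13 s per frame, which appears consistent with the reported shift length of 2048 samples at 16 kHz (2048 / 16000 = 0.128 s). To summarise the DOA estimates from a saved log, something like the following sketch could be used. The regular expression and the `parse_doa` helper are assumptions written for this log format. They are not part of BTK.

```python
import re
import statistics

# Matches lines such as "Time 1.41 : theta = 84.90".
DOA_LINE = re.compile(r"Time\s+([\d.]+)\s*:\s*theta\s*=\s*(-?[\d.]+)")


def parse_doa(lines):
    """Extract (time_seconds, theta) pairs, skipping unrelated lines."""
    pairs = []
    for line in lines:
        match = DOA_LINE.search(line)
        if match:
            pairs.append((float(match.group(1)), float(match.group(2))))
    return pairs


if __name__ == "__main__":
    sample = [
        "Time 1.28 : theta = 84.12",
        "Gate probability = 0.9500",
        "Time 1.41 : theta = 84.90",
        "Time 1.54 : theta = 85.17",
    ]
    estimates = parse_doa(sample)
    thetas = [theta for _, theta in estimates]
    print("frames:", len(estimates))
    print("mean theta: %.2f" % statistics.mean(thetas))
    # Frame shift implied by the log: 2048 samples at 16 kHz.
    print("frame shift: %.3f s" % (2048 / 16000.0))
```

Redirecting the script's output to a file and passing its lines to `parse_doa` gives the per-frame estimates as numbers for plotting or further checks.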

While the script runs, it may report that some libraries are missing; install them as prompted.

0 0