Using the BTK Toolkit


1. Install the following prerequisites first

·         Python - scripting language interpreter

·         NumPy - MATLAB-like numerical extension for Python

·         GSL - GNU Scientific Library

·         SWIG - Simplified Wrapper and Interface Generator

·         Autoconf - automatic project configuration

·         pkg-config - helper that reports the compile and link flags of installed libraries

·         libsndfile - C library for sound file I/O
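
Before building, it can help to check which of these tools are already on your PATH. The following is a minimal sketch, not part of BTK: the executable names in DEFAULT_TOOLS are assumptions (for example, GSL development packages usually ship a gsl-config command) and may differ on your distribution. It uses shutil.which, so run it with Python 3.3 or newer. NumPy and libsndfile are libraries rather than commands, so this check does not cover them.

```python
import shutil  # shutil.which needs Python 3.3 or newer

# Assumed executable names; they may differ on your distribution.
DEFAULT_TOOLS = ["python", "swig", "autoconf", "pkg-config", "gsl-config"]


def missing_tools(names):
    """Return the entries of `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(DEFAULT_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

Anything reported as missing can then be installed with your system's package manager before moving on to the build.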

2. Download the BTK source code

svn checkout svn://svn.code.sf.net/p/distantspeechrecognition/code/ ./

cd  trunk/btk

3. Build the BTK library

./autogen.sh --force

./configure --prefix=${HOME} CFLAGS='-O4' CXXFLAGS='-O4' CPPFLAGS='-O4' SWIGFLAGS='-O'

make

make install

4. Set the environment variables. Open your ~/.bashrc file and add the following two lines:

export LD_LIBRARY_PATH=${HOME}/lib:/usr/local/lib:/usr/lib

export PYTHONPATH=${HOME}/lib/python2.6/site-packages:/usr/local/lib/python2.6/site-packages:/usr/lib/python2.6/site-packages

Here ${HOME} is the directory where the compiled library was installed, that is, the --prefix passed to ./configure.
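
The PYTHONPATH line above hard-codes python2.6. If your interpreter is a different version, the site-packages directories will most likely be named differently. The sketch below rebuilds the same colon-separated value for a given prefix and version. The function name btk_pythonpath is hypothetical, and the lib/pythonX.Y/site-packages layout is assumed to match the one used in this post.

```python
import posixpath
import sys


def btk_pythonpath(prefix, version=None):
    """Build a PYTHONPATH value shaped like the export line in this post.

    `prefix` is the --prefix given to ./configure; `version` is a string
    such as "2.6" and defaults to the version of the running interpreter.
    """
    if version is None:
        version = "%d.%d" % (sys.version_info[0], sys.version_info[1])
    roots = [prefix, "/usr/local", "/usr"]
    dirs = [posixpath.join(root, "lib", "python" + version, "site-packages")
            for root in roots]
    return ":".join(dirs)


if __name__ == "__main__":
    # Print a line that can be pasted into ~/.bashrc.
    print("export PYTHONPATH=" + btk_pythonpath(posixpath.expanduser("~")))
```

Run it with the same interpreter you plan to use with BTK so that the default version matches.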

5. Check that the installation succeeded

Start the Python interpreter and enter:

import btk.modulated

If the import raises no error, the installation succeeded.
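
The same check can be scripted so that a failure prints the reason instead of a bare traceback. This is only a sketch: the helper name can_import is hypothetical, and it uses nothing beyond the built-in __import__, so it should run under both Python 2.6 and Python 3.

```python
def can_import(module_name):
    """Try to import `module_name`; return (True, None) or (False, reason)."""
    try:
        __import__(module_name)
        return True, None
    except ImportError as exc:
        return False, str(exc)


if __name__ == "__main__":
    ok, reason = can_import("btk.modulated")
    if ok:
        print("btk.modulated imported successfully")
    else:
        # A failure here usually points at PYTHONPATH or LD_LIBRARY_PATH.
        print("import failed: " + reason)
```

If the reported reason mentions a shared object that cannot be opened, LD_LIBRARY_PATH is the first thing to check. If it says the module does not exist, check PYTHONPATH.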

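6. Test the BTK library

A sample script for processing 16 kHz Kinect data can be downloaded from http://distantspeechrecognition.sourceforge.net/samples/SingleSpeakerTrackingSample.py and an audio file recorded with the Kinect is available at http://distantspeechrecognition.sourceforge.net/samples/KINECT_M1001_U1050_2M.wav (both links are plain file URLs). Put "SingleSpeakerTrackingSample.py" and "KINECT_M1001_U1050_2M.wav" in the same directory, then run the Python script: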

python SingleSpeakerTrackingSample.py KINECT_M1001_U1050_2M.wav ./result/

The script prints the estimated direction of arrival (DOA) for each frame.

The output looks like this:

Sample Feature Block Length 2048
Sample Feature Shift Length 2048
Hamming Feature Input Size  = 2048
Hamming Feature Output Size = 2048
FFT Feature Input Size  = 2048
FFT Feature Output Size = 4096
Sample Feature Block Length 2048
Sample Feature Shift Length 2048
Hamming Feature Input Size  = 2048
Hamming Feature Output Size = 2048
FFT Feature Input Size  = 2048
FFT Feature Output Size = 4096
Sample Feature Block Length 2048
Sample Feature Shift Length 2048
Hamming Feature Input Size  = 2048
Hamming Feature Output Size = 2048
FFT Feature Input Size  = 2048
FFT Feature Output Size = 4096
Sample Feature Block Length 2048
Sample Feature Shift Length 2048
Hamming Feature Input Size  = 2048
Hamming Feature Output Size = 2048
FFT Feature Input Size  = 2048
FFT Feature Output Size = 4096
Reading file KINECT_M1001_U1050_2M.wav from 0.00 to -1.00 : Ch 0
Reading file KINECT_M1001_U1050_2M.wav from 0.00 to -1.00 : Ch 1
Reading file KINECT_M1001_U1050_2M.wav from 0.00 to -1.00 : Ch 2
Reading file KINECT_M1001_U1050_2M.wav from 0.00 to -1.00 : Ch 3
Time   1.28 : theta =      86.63
Gate probability =     0.9500
Time   1.28 : theta =      84.12
Time   1.41 : theta =      84.90
Time   1.54 : theta =      85.17
Time   1.66 : theta =      84.87
Time   1.79 : theta =      84.70
Time   1.92 : theta =      84.59
Time   2.05 : theta =      84.52
Time   2.18 : theta =      84.46
Time   2.30 : theta =      84.65
Time   2.43 : theta =      84.59
Time   2.56 : theta =      84.54
Time   2.69 : theta =      84.50
Time   2.82 : theta =      84.47
Time   2.94 : theta =      84.60
Time   3.07 : theta =      84.56
Time   3.20 : theta =      84.59
Time   3.33 : theta =      84.56
Time   3.46 : theta =      84.53
Time   3.58 : theta =      84.51
Time   3.71 : theta =      84.48
Time   3.84 : theta =      84.46
Time   3.97 : theta =      84.45
Time   4.10 : theta =      84.43
Time   4.22 : theta =      84.41
Time   4.35 : theta =      84.49
Time   4.48 : theta =      84.47
Time   4.61 : theta =      84.49
Time   4.74 : theta =      84.55
Time   4.86 : theta =      84.62
Time   4.99 : theta =      84.68
Time   5.12 : theta =      84.68
Time   5.25 : theta =      84.68
Time   5.38 : theta =      84.74
Time   5.50 : theta =      84.79
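
To work with these estimates rather than just read them, the "Time ... : theta = ..." lines can be parsed into numbers. The sketch below is not part of BTK: the regular expression is an assumption based only on the output shown above, and the sample text is a few lines copied from it. It also computes the frame shift implied by a shift length of 2048 samples at 16 kHz, which is consistent with the 0.128 s steps between the printed times.

```python
import re

# Matches lines such as "Time   1.41 : theta =      84.90".
THETA_LINE = re.compile(r"^Time\s+([0-9.]+)\s*:\s*theta\s*=\s*(-?[0-9.]+)$")


def parse_doa(lines):
    """Return a list of (time, theta) pairs from the script's output lines."""
    estimates = []
    for line in lines:
        match = THETA_LINE.match(line.strip())
        if match:
            estimates.append((float(match.group(1)), float(match.group(2))))
    return estimates


# A few lines copied from the output above; other lines are ignored.
sample = """Time   1.28 : theta =      86.63
Gate probability =     0.9500
Time   1.28 : theta =      84.12
Time   1.41 : theta =      84.90
""".splitlines()

estimates = parse_doa(sample)
thetas = [theta for _, theta in estimates]
mean_theta = sum(thetas) / len(thetas)

# Shift length of 2048 samples at a 16 kHz sampling rate.
frame_shift = 2048 / 16000.0

print("parsed %d estimates, mean theta %.2f" % (len(estimates), mean_theta))
print("frame shift: %.3f s" % frame_shift)
```

To analyze a full run, redirect the script's output to a file and pass the file's lines to parse_doa.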

While the script runs, it may report that some libraries are missing. If so, install them as the messages indicate.
