Speex Front-End Processing

Source: Internet. Editor: 程序博客网. Date: 2024/06/18 10:40

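1. Introduction

During capture and transmission, speech signals vary widely in amplitude because of differences between speech sources, channel attenuation, noise interference, and the near-far effect. The audio therefore needs front-end processing before any further speech processing: preprocessing (AGC, VAD, echo cancellation), resampling, and noise suppression.

All of the code here is based on the open-source Speex library; see http://speex.org/ for details.

2. API reference

The preprocessing module covers automatic gain control (AGC), voice activity detection (VAD, i.e. silence detection), and echo suppression. Its interface functions are listed below; see speex/speex_preprocess.h for the declarations.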

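Function name                        Description
speex_preprocess_state_init          Create a preprocessor
speex_preprocess_state_destroy       Destroy a preprocessor
speex_preprocess_run                 Process one frame
speex_preprocess                     Process one frame (deprecated)
speex_preprocess_estimate_update     Update the preprocessor state
speex_preprocess_ctl                 Set or read preprocessor parameters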

2.1.1 speex_preprocess_state_init

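Prototype

SpeexPreprocessState *speex_preprocess_state_init(int frame_size, int sampling_rate);

Purpose

Creates a preprocessor.

Parameters

frame_size [in] Number of samples per frame (a 20 ms frame is recommended)

sampling_rate [in] Sampling rate in Hz (8 kHz, 16 kHz, and 44 kHz are supported)

Return value

Pointer to the preprocessor on success, NULL on failure

Remarks

For example, with 16 kHz audio a 20 ms frame is 320 samples.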
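The 320-sample figure follows from multiplying the sampling rate by the frame duration. The sketch below shows that arithmetic in C; the helper names frame_samples and print_frame_sizes are hypothetical and are not part of Speex.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of Speex): number of samples in one frame
   for a sampling rate in Hz and a frame duration in milliseconds. */
int frame_samples(int sampling_rate, int frame_ms)
{
    assert(sampling_rate > 0 && frame_ms > 0);
    return sampling_rate * frame_ms / 1000;
}

/* Prints the 20 ms frame sizes for two of the rates mentioned above. */
void print_frame_sizes(void)
{
    printf("8 kHz, 20 ms: %d samples\n", frame_samples(8000, 20));
    printf("16 kHz, 20 ms: %d samples\n", frame_samples(16000, 20));
}
```

The result of frame_samples(16000, 20) is the value that would be passed as frame_size, alongside 16000 as sampling_rate.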

2.1.2 speex_preprocess_state_destroy

Prototype

void speex_preprocess_state_destroy(SpeexPreprocessState *st);

Purpose

Destroys a preprocessor.

Parameters

st [in] Preprocessor pointer

Return value

None (void)

Remarks

None

2.1.3 speex_preprocess_run

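Prototype

int speex_preprocess_run(SpeexPreprocessState *st, spx_int16_t *x);

Purpose

Processes one frame of speech data.

Parameters

st [in] Preprocessor pointer

x [in|out] Sample buffer; the processed samples are written back to the same buffer

Return value

When VAD is enabled, 1 means speech and 0 means silence or noise

Remarks

None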

2.1.4 speex_preprocess

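Prototype

int speex_preprocess(SpeexPreprocessState *st, spx_int16_t *x, spx_int32_t *echo);

Purpose

Processes one frame of speech data (deprecated; it indirectly calls speex_preprocess_run).

Parameters

st [in] Preprocessor pointer

x [in|out] Sample buffer; the processed samples are written back to the same buffer

echo [in] Residual-echo buffer from the old API; the current implementation ignores it, so NULL can be passed

Return value

Same as speex_preprocess_run, because the call is forwarded to it

Remarks

None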

2.1.5 speex_preprocess_estimate_update

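Prototype

void speex_preprocess_estimate_update(SpeexPreprocessState *st, spx_int16_t *x);

Purpose

Updates the preprocessor state without computing any output speech.

Parameters

st [in] Preprocessor pointer

x [in] Sample buffer

Return value

None (void)

Remarks

None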

2.1.6 speex_preprocess_ctl

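Prototype

int speex_preprocess_ctl(SpeexPreprocessState *st, int request, void *ptr);

Purpose

Sets or reads a preprocessor parameter.

Parameters

st [in] Preprocessor pointer

request [in] Parameter type (each parameter is identified by a macro)

ptr [in|out] Parameter value (an input when setting a parameter, an output when reading one; the macro determines which)

Return value

0 on success, -1 on failure (meaning the request is unknown)

Remarks

The following macros identify the request types: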

/** Set preprocessor denoiser state (noise reduction) */
#define SPEEX_PREPROCESS_SET_DENOISE 0
/** Get preprocessor denoiser state */
#define SPEEX_PREPROCESS_GET_DENOISE 1

/** Set preprocessor Automatic Gain Control state (AGC) */
#define SPEEX_PREPROCESS_SET_AGC 2
/** Get preprocessor Automatic Gain Control state */
#define SPEEX_PREPROCESS_GET_AGC 3

/** Set preprocessor Voice Activity Detection state (silence detection) */
#define SPEEX_PREPROCESS_SET_VAD 4
/** Get preprocessor Voice Activity Detection state */
#define SPEEX_PREPROCESS_GET_VAD 5

/** Set preprocessor Automatic Gain Control level (float) */
#define SPEEX_PREPROCESS_SET_AGC_LEVEL 6
/** Get preprocessor Automatic Gain Control level (float) */
#define SPEEX_PREPROCESS_GET_AGC_LEVEL 7

/** Set preprocessor dereverb state */
#define SPEEX_PREPROCESS_SET_DEREVERB 8
/** Get preprocessor dereverb state */
#define SPEEX_PREPROCESS_GET_DEREVERB 9

/** Set preprocessor dereverb level */
#define SPEEX_PREPROCESS_SET_DEREVERB_LEVEL 10
/** Get preprocessor dereverb level */
#define SPEEX_PREPROCESS_GET_DEREVERB_LEVEL 11

/** Set preprocessor dereverb decay */
#define SPEEX_PREPROCESS_SET_DEREVERB_DECAY 12
/** Get preprocessor dereverb decay */
#define SPEEX_PREPROCESS_GET_DEREVERB_DECAY 13

/** Set probability required for the VAD to go from silence to voice */
#define SPEEX_PREPROCESS_SET_PROB_START 14
/** Get probability required for the VAD to go from silence to voice */
#define SPEEX_PREPROCESS_GET_PROB_START 15

/** Set probability required for the VAD to stay in the voice state (integer percent) */
#define SPEEX_PREPROCESS_SET_PROB_CONTINUE 16
/** Get probability required for the VAD to stay in the voice state (integer percent) */
#define SPEEX_PREPROCESS_GET_PROB_CONTINUE 17

/** Set maximum attenuation of the noise in dB (negative number) */
#define SPEEX_PREPROCESS_SET_NOISE_SUPPRESS 18
/** Get maximum attenuation of the noise in dB (negative number) */
#define SPEEX_PREPROCESS_GET_NOISE_SUPPRESS 19

/** Set maximum attenuation of the residual echo in dB (negative number) */
#define SPEEX_PREPROCESS_SET_ECHO_SUPPRESS 20
/** Get maximum attenuation of the residual echo in dB (negative number) */
#define SPEEX_PREPROCESS_GET_ECHO_SUPPRESS 21

/** Set maximum attenuation of the residual echo in dB when near end is active (negative number) */
#define SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE 22
/** Get maximum attenuation of the residual echo in dB when near end is active (negative number) */
#define SPEEX_PREPROCESS_GET_ECHO_SUPPRESS_ACTIVE 23

/** Set the corresponding echo canceller state so that residual echo suppression can be performed (NULL for no residual echo suppression); used for echo removal */
#define SPEEX_PREPROCESS_SET_ECHO_STATE 24
/** Get the corresponding echo canceller state */
#define SPEEX_PREPROCESS_GET_ECHO_STATE 25

/** Set maximal gain increase in dB/second (int32) */
#define SPEEX_PREPROCESS_SET_AGC_INCREMENT 26
/** Get maximal gain increase in dB/second (int32) */
#define SPEEX_PREPROCESS_GET_AGC_INCREMENT 27

/** Set maximal gain decrease in dB/second (int32) */
#define SPEEX_PREPROCESS_SET_AGC_DECREMENT 28
/** Get maximal gain decrease in dB/second (int32) */
#define SPEEX_PREPROCESS_GET_AGC_DECREMENT 29

/** Set maximal gain in dB (int32) */
#define SPEEX_PREPROCESS_SET_AGC_MAX_GAIN 30
/** Get maximal gain in dB (int32) */
#define SPEEX_PREPROCESS_GET_AGC_MAX_GAIN 31

/* Can't set loudness */
/** Get loudness */
#define SPEEX_PREPROCESS_GET_AGC_LOUDNESS 33

/* Can't set gain */
/** Get current gain (int32 percent) */
#define SPEEX_PREPROCESS_GET_AGC_GAIN 35

/* Can't set spectrum size */
/** Get spectrum size for power spectrum (int32) */
#define SPEEX_PREPROCESS_GET_PSD_SIZE 37

/* Can't set power spectrum */
/** Get power spectrum (int32[] of squared values) */
#define SPEEX_PREPROCESS_GET_PSD 39

/* Can't set noise size */
/** Get spectrum size for noise estimate (int32) */
#define SPEEX_PREPROCESS_GET_NOISE_PSD_SIZE 41

/* Can't set noise estimate */
/** Get noise estimate (int32[] of squared values) */
#define SPEEX_PREPROCESS_GET_NOISE_PSD 43

/* Can't set speech probability */
/** Get speech probability in last frame (int32). */
#define SPEEX_PREPROCESS_GET_PROB 45

/** Set preprocessor Automatic Gain Control level (int32) */
#define SPEEX_PREPROCESS_SET_AGC_TARGET 46
/** Get preprocessor Automatic Gain Control level (int32) */
#define SPEEX_PREPROCESS_GET_AGC_TARGET 47
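In the list above every SET request has an even code and every GET request an odd one, and the same ptr argument is read for a SET and written for a GET. The sketch below imitates that calling convention with a toy state so it can be tried without the library. toy_state, toy_ctl, and the TOY_ macros are hypothetical stand-ins written for illustration, not Speex code; only the two request values are taken from the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the preprocessor state (not the Speex struct). */
typedef struct {
    int denoise_enabled;
} toy_state;

#define TOY_SET_DENOISE 0   /* same value as SPEEX_PREPROCESS_SET_DENOISE */
#define TOY_GET_DENOISE 1   /* same value as SPEEX_PREPROCESS_GET_DENOISE */

/* Imitates the convention described for speex_preprocess_ctl:
   ptr is an input for SET requests and an output for GET requests;
   the function returns 0 on success and -1 for an unknown request. */
int toy_ctl(toy_state *st, int request, void *ptr)
{
    assert(st != NULL && ptr != NULL);
    switch (request) {
    case TOY_SET_DENOISE:
        st->denoise_enabled = *(int *)ptr;   /* ptr is read */
        return 0;
    case TOY_GET_DENOISE:
        *(int *)ptr = st->denoise_enabled;   /* ptr is written */
        return 0;
    default:
        return -1;                           /* unknown request */
    }
}
```

A caller would store 1 in an int, pass its address with TOY_SET_DENOISE, and later pass the address of another int with TOY_GET_DENOISE to read the value back; checking the return value catches a mistyped request code.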

3. Example code

3.1 AGC

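#include <stdio.h>
#include <speex/speex_preprocess.h>

/* Input is mono, 16-bit PCM at 16 kHz; 320 samples is one 20 ms frame. */
#define NN 320

int main(void)
{
    spx_int16_t in[NN];
    int i;
    float f;
    int count = 0;                 /* number of frames processed */
    SpeexPreprocessState *st;

    st = speex_preprocess_state_init(NN, 16000);
    if (st == NULL)
        return 1;

    /* Enable AGC and set its level (a float, see SET_AGC_LEVEL). */
    i = 1;
    speex_preprocess_ctl(st, SPEEX_PREPROCESS_SET_AGC, &i);
    f = 16000;
    speex_preprocess_ctl(st, SPEEX_PREPROCESS_SET_AGC_LEVEL, &f);

    /* Raw PCM is read from stdin and written to stdout. On Windows both
       streams must be in binary mode. A short final read is dropped. */
    while (fread(in, sizeof(spx_int16_t), NN, stdin) == NN)
    {
        /* The frame is processed in place. The return value only carries
           a speech/silence decision when VAD is enabled, so it is ignored. */
        speex_preprocess_run(st, in);
        fwrite(in, sizeof(spx_int16_t), NN, stdout);
        count++;
    }

    fprintf(stderr, "processed %d frames\n", count);
    speex_preprocess_state_destroy(st);
    return 0;
}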

0 0