Reading notes: "Voice conversion using deep bidirectional long short-term memory based recurrent neural networks"


Two approaches to voice conversion (VC):
- Rule-based: modify the speech signal
  - "Frequency warping based on mapping formant parameters"
  - "Weighted frequency warping for voice conversion"
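- Statistical: estimate a mapping function from the source signal to the target signal
  - GMM: for the underlying principle, see "基于高斯混合模型的语音转换技术研究" (a study of voice conversion based on Gaussian mixture models)
    - "Continuous probabilistic transform for voice conversion"
    - "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory", which uses dynamic features and global variance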
  - DBN:
    - "A fast learning algorithm for Deep Belief Nets"
    - "Acoustic modeling using Deep Belief Networks"
    - "Voice conversion in high-order eigen space using Deep Belief Nets"
  - RBM:
    - "Joint spectral distribution modeling using Restricted Boltzmann Machines for voice conversion"
    - "Voice conversion using Deep Neural Networks with layer-wise generative training"
  - RNN:
    - "High-order sequence modeling using speaker-dependent recurrent temporal Restricted Boltzmann Machines for voice conversion"
    - Drawbacks: it can use only the previous context, not the future context, and because of vanishing and exploding gradients (see "Learning long-term dependencies with gradient descent is difficult") it also cannot handle long sequences
  - BLSTM-RNN: Bidirectional Long Short-Term Memory
    - "Framewise phoneme classification with bidirectional LSTM and other neural network architectures"
    - "Long Short-Term Memory"
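
The vanishing and exploding gradient problem listed among the RNN drawbacks can be illustrated numerically. The sketch below is not taken from the paper: it assumes a purely linear recurrence h_t = W h_{t-1}, with W chosen as a scaled orthogonal matrix so that the effect is exact, and the function name `gradient_norm_through_time` is hypothetical.

```python
import numpy as np

def gradient_norm_through_time(scale, steps=50, size=8, seed=0):
    """Spectral norm of d h_T / d h_0 for the linear recurrence h_t = W h_{t-1}.

    W is an orthogonal matrix multiplied by `scale` (an illustrative choice),
    so every singular value of W equals `scale` and the Jacobian W**steps
    has spectral norm scale**steps.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((size, size)))  # random orthogonal matrix
    w = scale * q
    jacobian = np.eye(size)
    for _ in range(steps):
        jacobian = w @ jacobian  # chain rule: one more time step
    return np.linalg.norm(jacobian, 2)  # largest singular value

vanishing = gradient_norm_through_time(0.9)  # shrinks like 0.9**50
exploding = gradient_norm_through_time(1.1)  # grows like 1.1**50
print(vanishing, exploding)
```

With a nonlinearity such as tanh the picture is less clean, but the same geometric shrinkage or growth over time steps is what makes long sequences hard for a plain RNN.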
Conventional RNN model
Bidirectional RNN model: uses the sequence in both the forward and backward directions
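
As a rough illustration of the bidirectional idea, the sketch below runs one tanh RNN forward in time and a second one over the time-reversed input, then concatenates the two hidden-state sequences frame by frame. It is a minimal NumPy sketch, not the paper's implementation; the helper names (`rnn_pass`, `bidirectional_rnn`, `init_params`), the dimensions, and the random initialisation are all assumptions.

```python
import numpy as np

def rnn_pass(x, w_in, w_rec, b):
    """Run a tanh RNN over x (shape [T, D]) and return hidden states [T, H]."""
    h = np.zeros(w_rec.shape[0])
    states = []
    for x_t in x:
        h = np.tanh(w_in @ x_t + w_rec @ h + b)
        states.append(h)
    return np.stack(states)

def bidirectional_rnn(x, fwd_params, bwd_params):
    """Concatenate a forward pass and a time-reversed backward pass."""
    h_fwd = rnn_pass(x, *fwd_params)              # state at t has seen frames 0..t
    h_bwd = rnn_pass(x[::-1], *bwd_params)[::-1]  # state at t has seen frames t..T-1
    return np.concatenate([h_fwd, h_bwd], axis=1)

def init_params(rng, input_dim, hidden_dim):
    """Small random weights (illustrative initialisation only)."""
    return (0.1 * rng.standard_normal((hidden_dim, input_dim)),
            0.1 * rng.standard_normal((hidden_dim, hidden_dim)),
            np.zeros(hidden_dim))

rng = np.random.default_rng(0)
T, D, H = 6, 4, 3  # frames, input features, hidden units per direction
x = rng.standard_normal((T, D))
fwd, bwd = init_params(rng, D, H), init_params(rng, D, H)
out = bidirectional_rnn(x, fwd, bwd)
print(out.shape)  # (6, 6)
```

Each output frame therefore depends on both past and future input frames, which a unidirectional RNN cannot provide.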
LSTM model
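
A single LSTM time step can be sketched as follows. This assumes the common formulation without peephole connections, which may differ from the exact variant used in the paper; the gate ordering inside the stacked weight matrices and the name `lstm_step` are choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, w, u, b):
    """One LSTM time step (no peephole connections).

    w: [4H, D] input weights, u: [4H, H] recurrent weights, b: [4H] biases,
    stacked in the order input gate, forget gate, output gate, candidate.
    """
    hidden = h_prev.shape[0]
    z = w @ x_t + u @ h_prev + b
    i = sigmoid(z[0 * hidden:1 * hidden])  # input gate
    f = sigmoid(z[1 * hidden:2 * hidden])  # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate
    g = np.tanh(z[3 * hidden:4 * hidden])  # candidate cell update
    c = f * c_prev + i * g                 # additive memory-cell update
    h = o * np.tanh(c)                     # exposed hidden state
    return h, c

rng = np.random.default_rng(0)
D, H = 4, 3  # input features, hidden units
w = 0.1 * rng.standard_normal((4 * H, D))
u = 0.1 * rng.standard_normal((4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.standard_normal((5, D)):
    h, c = lstm_step(x_t, h, c, w, u, b)
print(h.shape, c.shape)
```

The additive cell update `c = f * c_prev + i * g` is what lets information and gradients persist across many time steps when the forget gate stays close to 1.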
Combining BRNN with LSTM

0 0