CMUSphinx Wiki--Open Source Toolkit For Speech Recognition


http://cmusphinx.sourceforge.net/wiki/


CMUSphinx Wiki

This page contains collaboratively developed documentation for the CMU Sphinx speech recognition engines.

User Documentation

This section contains links to documents that describe how to use Sphinx to recognize speech. Currently, we have very little in the way of end-user tools, so it may be a bit sparse for the foreseeable future.

  • CMUSphinx Tutorial For Developers: Getting started with CMUSphinx for developers
    • Basic concepts of speech
    • Overview of the CMUSphinx toolkit
    • Before you start
    • Building an application using pocketsphinx
    • Building an application using sphinx4
    • Building language models
    • Adapting an existing acoustic model
    • Building the acoustic model
    • Building a dictionary
    • Using pocketsphinx on Android

If you are in trouble, read the Frequently Asked Questions (FAQ).

See also some more docs:

  • Decoder Versions: Description of the software packages
  • Download Details: How to obtain CMUSphinx packages
  • How to get help and discuss things: How to get help and discuss things
  • http://cmusphinx.sourceforge.net/doc/speech.ppt Cool presentation done by Heather Dewey-Hagborg

To find out where CMUSphinx is used, see:

  • Projects that use Sphinx: These projects, both commercial and free, use Sphinx in one form or another.

Advanced User Documentation

These documents either describe some particular aspect of the Sphinx codebase in detail, or they serve as a developer's guide to accomplishing some particular task.

  • Building on iPhone: Building Pocketsphinx on various platforms
  • Integrating CMUSphinx with Telephony Servers - Asterisk and Freeswitch: How to use pocketsphinx in Asterisk.
  • The Incomplete Guide to Sphinx-3 Performance Tuning: How to tune the decoder to be fast (or rather, not horribly slow)
  • Pocketsphinx optimizations for embedded devices.
  • Phoneme Recognition (caveat emptor): How to use Sphinx3 for phoneme recognition.
  • Segmentation and Diarization using LIUM tools: Using LIUM tools for speech segmentation and speaker diarization
  • Training an acoustic model with LDA and MLLT feature transforms: How to train acoustic models with LDA and MLLT feature transforms
  • Using PocketSphinx with GStreamer and Python (or Vala): How to use PocketSphinx with GStreamer and Python
  • InstallingPythonStuff: How to install Python and necessary modules for SphinxTrain development
  • MMIE Training in SphinxTrain: How to perform MMIE training.
  • http://www.speech.cs.cmu.edu/sphinx/tutorial.html Robust Group Tutorial (classic tutorial from CMU Speech Group website)

Decoder Space

  • Sphinx4 Space : Information about sphinx4, design, code, performance, history.

Reference

These documents describe the APIs in excruciating detail, or provide other useful background information for CMUSphinx developers.

  • Doxygen documentation for PocketSphinx
  • Doxygen documentation for SphinxBase
  • ePyDoc documentation for SphinxTrain Python Modules
  • JavaDocs for Sphinx4

Developer Documentation

This section contains various internal information for CMUSphinx developers. We hope it will still be useful to you.

  • Sphinx-4 Regression Tests: How to run regression tests
  • Layout of SphinxTrain code: An overview of the SphinxTrain source code for researchers and developers
  • CMUCLMTK development: Development guide for the CMU-Cambridge Language Modeling Toolkit.
  • Language Features for SphinxBase, SphinxThree, and SphinxTrain
  • Upcoming CMU Sphinx Software Releases: Plans for upcoming releases of Sphinx
  • Release Check List: How to make a release
  • Web Site Layout: How to organize information

File formats

  • Acoustic Model Format
  • MFC files

Data sources:

  • Data Sources

Materials for GSOC

  • Information for Students: Information for students
  • Tasks for Summer Of Code Projects: Ideas for students

GSoC Previous Years

  • Google Summer of Code 2012 Projects: Google Summer of Code 2012 Projects

Speech Recognition Theory

This section collects research ideas for specific problems in speech recognition.

  • Lattices
  • WFST
  • Search Algorithms
  • Language Models
  • Features
  • Noise Robustness
  • Adaptation
