CMUSphinx Wiki--Open Source Toolkit For Speech Recognition


http://cmusphinx.sourceforge.net/wiki/

CMUSphinx Wiki

This page contains collaboratively developed documentation for the CMU Sphinx speech recognition engines.

User Documentation

This section contains links to documents that describe how to use Sphinx to recognize speech. Currently, we have very little in the way of end-user tools, so it may be a bit sparse for the foreseeable future.

CMUSphinx Tutorial For Developers: Getting started with CMUSphinx for developers
- Basic concepts of speech
- Overview of the CMUSphinx toolkit
- Before you start
- Building an application using pocketsphinx
- Building an application using sphinx4
- Building language models
- Adapting an existing acoustic model
- Building the acoustic model
- Building a dictionary
- Using pocketsphinx on Android

If you run into trouble, read the Frequently Asked Questions (FAQ)

Advanced User Documentation

These documents either describe some particular aspect of the Sphinx codebase in detail, or they serve as a developer's guide to accomplishing some particular task.

Building on iPhone: Building Pocketsphinx on various platforms
Integrating CMUSphinx with Telephony Servers - Asterisk and Freeswitch: How to use pocketsphinx in Asterisk.
The Incomplete Guide to Sphinx-3 Performance Tuning: How to tune the decoder to be fast (or rather, not horribly slow)
Pocketsphinx optimizations for embedded devices.
Phoneme Recognition (caveat emptor): How to use Sphinx3 for phoneme recognition.
Segmentation and Diarization using LIUM tools: Using LIUM tools for speech segmentation and speaker diarization
Training an acoustic model with LDA and MLLT feature transforms: How to train acoustic models with LDA and MLLT feature transforms
Using PocketSphinx with GStreamer and Python (or Vala): How to use PocketSphinx with GStreamer and Python
InstallingPythonStuff: How to install Python and necessary modules for SphinxTrain development
MMIE Training in SphinxTrain: How to perform MMIE training.
Robust Group Tutorial: Classic tutorial from the CMU Speech Group website (http://www.speech.cs.cmu.edu/sphinx/tutorial.html)

Decoder Space

Sphinx4 Space : Information about sphinx4, design, code, performance, history.

Reference

These documents describe the APIs in excruciating detail, or provide other useful background information for CMUSphinx developers.

Doxygen documentation for PocketSphinx
Doxygen documentation for SphinxBase
ePyDoc documentation for SphinxTrain Python Modules
JavaDocs for Sphinx4

Developer Documentation

This section contains internal information for CMUSphinx developers, but we hope it will still be useful to you.

Sphinx-4 Regression Tests: How to run regression tests
Layout of SphinxTrain code: An overview of the SphinxTrain source code for researchers and developers
CMUCLMTK development: Development guide for the CMU-Cambridge Language Modeling Toolkit.
Language Features for SphinxBase, SphinxThree, and SphinxTrain
Upcoming CMU Sphinx Software Releases: Plans for upcoming releases of Sphinx
Release Check List: How to make a release
Web Site Layout: How to organize information

File formats

Acoustic Model Format
MFC files

Data sources:

Data Sources

Materials for GSOC

Information for Students: Students information
Tasks for Summer Of Code Projects: Ideas for students

GSoC Previous Years

Google Summer of Code 2012 Projects: Google Summer of Code 2012 Projects

Speech Recognition Theory

This section tries to collect research ideas for specific problems in speech recognition.

Lattices
WFST
Search Algorithms
Language Models
Features
Noise Robustness
Adaptation

0 0