CMUSphinx Wiki--Open Source Toolkit For Speech Recognition
http://cmusphinx.sourceforge.net/wiki/
CMUSphinx Wiki
This page contains collaboratively developed documentation for the CMU Sphinx speech recognition engines.
User Documentation
This section contains links to documents which describe how to use Sphinx to recognize speech. Currently, we have very little in the way of end-user tools, so it may be a bit sparse for the foreseeable future.
- CMUSphinx Tutorial For Developers: Getting started with CMUSphinx for developers
- Basic concepts of speech
- Overview of the CMUSphinx toolkit
- Before you start
- Building an application using pocketsphinx
- Building an application using sphinx4
- Building language models
- Adapting an existing acoustic model
- Building the acoustic model
- Building a dictionary
- Using pocketsphinx on Android
If you run into trouble, read the Frequently Asked Questions (FAQ).
See also some more docs:
- Decoder Versions: Description of the software packages
- Download Details: How to obtain CMUSphinx packages
- How to get help and discuss things: How to get help and discuss things
- http://cmusphinx.sourceforge.net/doc/speech.ppt Cool presentation by Heather Dewey-Hagborg
To find out where CMUSphinx is used, see:
- Projects that use Sphinx: These projects, both commercial and free, use Sphinx in one form or another.
Advanced User Documentation
These documents either describe some particular aspect of the Sphinx codebase in detail, or they serve as a developer's guide to accomplishing some particular task.
- Building on iPhone: Building Pocketsphinx on various platforms
- Integrating CMUSphinx with Telephony Servers - Asterisk and Freeswitch: How to use pocketsphinx in Asterisk.
- The Incomplete Guide to Sphinx-3 Performance Tuning: How to tune the decoder to be fast (or rather, not horribly slow)
- Pocketsphinx optimizations for embedded devices.
- Phoneme Recognition (caveat emptor): How to use Sphinx3 for phoneme recognition.
- Segmentation and Diarization using LIUM tools: Using LIUM tools for speech segmentation and speaker diarization
- Training an acoustic model with LDA and MLLT feature transforms: How to train acoustic models with LDA and MLLT feature transforms
- Using PocketSphinx with GStreamer and Python (or Vala): How to use PocketSphinx with GStreamer and Python
- InstallingPythonStuff: How to install Python and necessary modules for SphinxTrain development
- MMIE Training in SphinxTrain: How to perform MMIE training.
- http://www.speech.cs.cmu.edu/sphinx/tutorial.html Robust Group Tutorial (classic tutorial from CMU Speech Group website)
Decoder Space
- Sphinx4 Space : Information about sphinx4, design, code, performance, history.
Reference
These documents describe the APIs in excruciating detail, or provide other useful background information for CMUSphinx developers.
- Doxygen documentation for PocketSphinx
- Doxygen documentation for SphinxBase
- ePyDoc documentation for SphinxTrain Python Modules
- JavaDocs for Sphinx4
Developer Documentation
This section contains various internal information for CMUSphinx developers, but we hope it will still be useful to you.
- Sphinx-4 Regression Tests: How to run regression tests
- Layout of SphinxTrain code: An overview of the SphinxTrain source code for researchers and developers
- CMUCLMTK development: Development guide for the CMU-Cambridge Language Modeling Toolkit.
- Language Features for SphinxBase, SphinxThree, and SphinxTrain
- Upcoming CMU Sphinx Software Releases: Plans for upcoming releases of Sphinx
- Release Check List: How to make a release
- Web Site Layout: How to organize information
File formats
- Acoustic Model Format
- MFC files
Data sources:
- Data Sources
Materials for GSOC
- Information for Students: Information for students
- Tasks for Summer Of Code Projects: Ideas for students
GSoC previous years
- Google Summer of Code 2012 Projects: Google Summer of Code 2012 Projects
Speech Recognition Theory
This section collects research ideas for specific problems in speech recognition.
- Lattices
- WFST
- Search Algorithms
- Language Models
- Features
- Noise Robustness
- Adaptation