Speech debugging under ROS, part 2

Source: Internet  Editor: 程序博客网  Time: 2024/06/10 09:53

1. Install pocketsphinx


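  Sphinx is a large-vocabulary, speaker-independent, continuous English speech recognition system developed at Carnegie Mellon University. A continuous speech recognition system can be roughly divided into four parts: feature extraction, acoustic model training, language model training, and the decoder.
  PocketSphinx is an embedded speech recognition engine with a small computational cost and a small footprint. It was modified and optimized from Sphinx-2 to meet the needs of embedded systems, and it is the first open-source, embedded-oriented, medium-vocabulary continuous speech recognition project. Its recognition accuracy is about the same as that of Sphinx-2.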


  The CMU Pocket Sphinx speech recognizer uses gstreamer to automatically split the incoming audio into utterances to be recognized, and offers services to start and stop recognition. Currently, the recognizer requires a language model and a dictionary file. These can be built automatically from a corpus of sentences using the Online Sphinx Knowledge Base Tool.
  sound_play provides a ROS node that translates commands on a ROS topic (robotsound) into sounds. The node supports built-in sounds, playing OGG/WAV files, and doing speech synthesis via festival. C++ and Python bindings allow this node to be used without understanding the details of the message format, allowing faster development and resilience to message format changes.
  The sound_play package uses the CMU Festival TTS library to generate synthetic speech.
  We use a ready-made language model and dictionary file.
  Speech recognition basics: http://blog.csdn.net/zouxy09/article/details/7941585
  CMU Sphinx official site: http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx
  Building and installing pocketsphinx: http://blog.csdn.net/zouxy09/article/details/7942784/
  1) Download sphinxbase-5prealpha and pocketsphinx-5prealpha and extract them into the same directory: http://cmusphinx.sourceforge.net/wiki/tutorialoverview
  2) The following dependencies are required:
     gcc, automake, autoconf, libtool, bison, swig (at least version 2.0), the Python development package, the PulseAudio development package
  3) Once the missing dependencies are installed and no more errors are reported, enter the extracted sphinxbase-5prealpha folder.
  4) Build and install sphinxbase:
     $ ./autogen.sh
     $ ./configure
     $ make
     $ sudo make install
  5) Set the environment variables:
     $ export LD_LIBRARY_PATH=/usr/local/lib
     $ export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
  6) Enter the pocketsphinx-5prealpha folder and build it:
     $ ./configure
     $ make
     $ sudo make install
  7) Test whether recognition works:
     $ pocketsphinx_continuous -inmic yes
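
  Before running the build, it can help to check which of the build tools listed in step 2 are actually on PATH. The following is a minimal Python sketch and not part of the original procedure. The executable names are assumptions (libtool is probed as libtoolize, which is how some distributions ship it), and the Python and PulseAudio development packages cannot be detected this way because they provide no executable.

```python
import shutil

# Build tools from step 2. The executable names are assumptions: the Python
# and PulseAudio development packages have no executable to probe for, so
# they are not covered by this check.
REQUIRED_TOOLS = ["gcc", "automake", "autoconf", "libtoolize", "bison", "swig"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing build tools: " + ", ".join(missing))
    else:
        print("All probed build tools were found on PATH.")
```

  An empty result only means the probed executables exist; it does not verify the swig version requirement from step 2.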

2. Debs installation: install Turtlebot and control it remotely (ros-hydro-desktop-full already installed)

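 http://www.cnblogs.com/cv-pr/p/5015657.html
 The Turtlebot carries a computer A, which acts as the Master. It has its own power supply and 3D sensor, and roscore is started on this machine. A PC connects to A remotely and communicates with it. The PC does not need to start roscore, and the Turtlebot can be controlled from the remote PC.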
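
 In a setup like this, the remote PC usually finds the roscore on A through the standard ROS environment variables ROS_MASTER_URI and ROS_IP. The sketch below only generates the export lines; the function name and the IP addresses are hypothetical, and 11311 is the default roscore port.

```python
def remote_ros_env(master_ip, local_ip, port=11311):
    """Return shell export lines for a machine that talks to a roscore on master_ip.

    master_ip: address of host A, where roscore runs (hypothetical value below).
    local_ip:  address of the machine the lines are meant for.
    """
    return [
        "export ROS_MASTER_URI=http://{}:{}".format(master_ip, port),
        "export ROS_IP={}".format(local_ip),
    ]


if __name__ == "__main__":
    # Hypothetical addresses: A is 192.168.1.10, the remote PC is 192.168.1.20.
    for line in remote_ros_env("192.168.1.10", "192.168.1.20"):
        print(line)
```

 On host A itself, the same function would be called with A's address for both arguments, since A is both the master and the local machine.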


 $ sudo apt-get install ros-hydro-turtlebot ros-hydro-turtlebot-apps ros-hydro-turtlebot-viz ros-hydro-turtlebot-simulator ros-hydro-kobuki-ftdi
 Add the udev rules for the Kobuki base:
 $ . /opt/ros/hydro/setup.bash
 $ rosrun kobuki_ftdi create_udev_rules
 Configure the environment variables:
 $ echo "source /opt/ros/hydro/setup.bash" >> ~/.bashrc
 Time synchronization

3. Create the speech library

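 In any folder, create a *.txt text file and write the sentences to be recognized into it, one per line, for example:
 turn around
 go forward
 stop
 Note: the file must not contain any punctuation. For example, write don't as do not or dont, and write 54 as fifty four. Save and exit.
 Use the online tool LMTool to build the language model and speech library:
 Go to http://www.speech.cs.cmu.edu/tools/lmtool-new.html, load the .txt file, and click 'Compile knowledge Base'.
 Download the archive labeled 'COMPRESSED TARBALL' and extract it.
 Enter the extracted folder and rename the files, for example:
 $ rename -f 's/3026/nav_commands/' *
 Test (the pocketsphinx_continuous decoder takes -lm to specify the language model to load and -dict to specify the dictionary to load):
 Open a terminal and run
 $ pocketsphinx_continuous -inmic yes -dict /../****.dic -lm /../****.lm
 Here /.. stands for the path of the folder extracted in the previous step, and **** for the four-digit number obtained there.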
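
 The no-punctuation rule for the corpus file can be checked before uploading it. The following is a small Python sketch under stated assumptions: the function name is hypothetical, apostrophes are dropped so that don't becomes dont (one of the two spellings suggested above), other punctuation is replaced by spaces, and numbers are only flagged, not spelled out.

```python
import re


def normalize_corpus(text):
    """Return (sentences, warnings) for a corpus given as one sentence per line."""
    sentences, warnings = [], []
    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop ASCII and curly apostrophes so "don't" becomes "dont".
        line = raw.replace("'", "").replace("\u2019", "")
        # Replace every remaining non-alphanumeric character with a space.
        line = re.sub(r"[^A-Za-z0-9\s]", " ", line)
        # Collapse runs of whitespace and trim the ends.
        line = " ".join(line.split())
        if not line:
            continue  # skip blank lines
        if re.search(r"\d", line):
            warnings.append(
                "line {}: spell out the number in '{}'".format(number, line)
            )
        sentences.append(line)
    return sentences, warnings


if __name__ == "__main__":
    sample = "turn around\ngo forward!\n\ndon't stop\nmove 54 steps"
    cleaned, notes = normalize_corpus(sample)
    print("\n".join(cleaned))
    for note in notes:
        print("WARNING " + note)
```

 Lines that still contain digits are kept in the output and reported, so they can be rewritten by hand (for example 54 as fifty four) before the file is uploaded.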

4. Running GPSR

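 $ roscore
 In a new terminal:
 $ cd catkin_ws/src/pi_speech_tutorial_master/launch
 $ roslaunch talkback_gpsr.launch
 In a new terminal:
 $ pocketsphinx_continuous -inmic yes -dict /home/../Speech_test.dic -lm /home/../Speech_test.lm
 In a new terminal, list the active topics:
 $ rostopic list
 View the messages on the topic:
 $ rostopic echo /talkback
 The socket is used to create the topic, because the speech recognizer has no topic of its own to publish on.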
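
 The role of the socket can be illustrated with a loopback example: one side sends each recognized sentence as a datagram, the other side receives it and hands the text to whatever publishes on the topic. This is only a sketch of the idea, not the code behind talkback_gpsr.launch. The use of UDP, the function names, and the publish callback are all assumptions; in a ROS node the callback would be the publish method of a publisher for /talkback.

```python
import socket


def open_receiver(host="127.0.0.1", port=0):
    """Bind a UDP socket; port 0 lets the operating system pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def send_text(text, address):
    """Send one recognized sentence as a UTF-8 datagram to `address`."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(text.encode("utf-8"), address)


def receive_once(sock, publish, timeout=2.0):
    """Wait for one datagram, pass its text to `publish`, and return the text."""
    sock.settimeout(timeout)
    data, _sender = sock.recvfrom(4096)
    text = data.decode("utf-8").strip()
    publish(text)  # stand-in for publishing on the ROS topic
    return text


if __name__ == "__main__":
    receiver = open_receiver()
    send_text("go forward", receiver.getsockname())
    receive_once(receiver, print)  # print stands in for the topic publisher
    receiver.close()
```

 Keeping the publish step as a plain callback means the socket side can be tried without a running roscore.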