A Brief Introduction to Speech-to-Text and Text-to-Speech on the iFlytek (科大讯飞) Platform
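First, a word about the iFlytek Open Platform. Its website is http://www.xfyun.cn/
The iFlytek Open Platform offers many free services.
This post uses the Java platform SDK to introduce iFlytek's online speech features: speech synthesis (text to speech) and dictation (speech to text).
Start by downloading the SDK. Before downloading you must register an iFlytek platform account and create a matching Java application in the console. After that, download the online speech synthesis SDK.
After unzipping, you get the following files.
The doc folder contains the feature introduction and the API documentation. The lib folder contains the jar packages and native dynamic libraries the feature needs. The sample folder contains some examples that can be imported and used directly, with no modification needed.
The lib folder contains the jar packages and the native libraries.
Copy all of the files in the SDK's lib folder into the lib folder of your own Java project (create that folder if it does not exist).
Right-click the project and open its Properties.
Select Java Build Path → Libraries → Add External JARs and add the *.jar files from the project's lib folder. Then open the Source tab → Native library location → click Edit on the right → choose the project's lib folder. That completes the library setup.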
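Before wiring up the build path it can help to confirm what actually ended up in the project's lib folder. The following is a minimal sketch using only the JDK; the class name LibFolderCheck and the suffix list (.dll, .so, .dylib) are assumptions made for illustration and are not part of the iFlytek SDK.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: lists the jars and native libraries found in a lib folder. */
class LibFolderCheck {

    /** Returns the sorted names of files in dir whose name ends with one of the suffixes. */
    static List<String> filesWithSuffix(File dir, String... suffixes) {
        List<String> found = new ArrayList<>();
        String[] names = dir.list(); // null when dir is missing or is not a directory
        if (names == null) {
            return found;
        }
        Arrays.sort(names);
        for (String name : names) {
            String lower = name.toLowerCase();
            for (String suffix : suffixes) {
                if (lower.endsWith(suffix)) {
                    found.add(name);
                    break;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Defaults to ./lib, the folder the steps above copy the SDK files into.
        File lib = new File(args.length > 0 ? args[0] : "lib");
        System.out.println("jars:    " + filesWithSuffix(lib, ".jar"));
        System.out.println("natives: " + filesWithSuffix(lib, ".dll", ".so", ".dylib"));
        // The JVM searches this path when a jar loads its native library.
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    }
}
```

If the native list is empty, or the lib folder is not reachable through java.library.path, the native library location step is the first thing to recheck.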
Now for the main topic.
Java platform documentation: http://www.xfyun.cn/doccenter/java
1. Key code for text to speech:
// 1. Create the SpeechSynthesizer object
SpeechSynthesizer mTts = SpeechSynthesizer.createSynthesizer();
// 2. Set the synthesis parameters; see the SpeechSynthesizer class in the "iFlytek MSC Reference Manual"
mTts.setParameter(SpeechConstant.VOICE_NAME, "xiaoyan"); // voice (speaker)
mTts.setParameter(SpeechConstant.SPEED, "50");           // speaking rate
mTts.setParameter(SpeechConstant.VOLUME, "80");          // volume, range 0~100
// Where to save the synthesized audio (any path may be used); here "./iflytek.pcm".
// Comment this line out if the synthesized audio does not need to be saved.
mTts.setParameter(SpeechConstant.TTS_AUDIO_PATH, "./iflytek.pcm");
// 3. Start synthesis
mTts.startSpeaking("科大讯飞,让世界聆听我们的声音", mSynListener);

// Synthesis listener (declared as a field of the enclosing class)
private SynthesizerListener mSynListener = new SynthesizerListener() {
    // Called when the session ends; error is null when nothing went wrong
    public void onCompleted(SpeechError error) {}

    // Buffering progress callback.
    // percent is the buffering progress 0~100; beginPos and endPos are the start and end
    // positions in the text of the buffered audio; info carries additional information.
    public void onBufferProgress(int percent, int beginPos, int endPos, String info) {}

    // Playback started
    public void onSpeakBegin() {}

    // Playback paused
    public void onSpeakPaused() {}

    // Playback progress callback.
    // percent is the playback progress 0~100; beginPos and endPos are the start and end
    // positions in the text of the audio being played.
    public void onSpeakProgress(int percent, int beginPos, int endPos) {}

    // Playback resumed
    public void onSpeakResumed() {}
};
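The file written through TTS_AUDIO_PATH is raw PCM, which most players will not open directly. Below is a minimal sketch that wraps such a file in a WAV container using the JDK's javax.sound.sampled API. The class name PcmToWav is hypothetical, and the audio layout (16 kHz, 16-bit, mono, signed, little-endian) is an assumption to verify against the SDK settings you actually use.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

/** Hypothetical helper: wraps raw 16-bit mono PCM in a WAV container. */
class PcmToWav {

    /** Reads raw PCM from pcm and writes it to wav with a WAV header. */
    static void convert(File pcm, File wav, float sampleRate) throws IOException {
        byte[] data = Files.readAllBytes(pcm.toPath());
        // Assumed layout: 16-bit samples, 1 channel, signed, little-endian.
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
        long frames = data.length / format.getFrameSize();
        try (AudioInputStream in =
                 new AudioInputStream(new ByteArrayInputStream(data), format, frames)) {
            AudioSystem.write(in, AudioFileFormat.Type.WAVE, wav);
        }
    }

    public static void main(String[] args) throws IOException {
        File pcm = new File("iflytek.pcm"); // the path used in the snippet above
        if (!pcm.isFile()) {
            System.out.println("No iflytek.pcm found; run the synthesis first.");
            return;
        }
        convert(pcm, new File("iflytek.wav"), 16000f); // 16 kHz is an assumption
        System.out.println("Wrote iflytek.wav");
    }
}
```

If the resulting file plays too fast or too slow, the assumed sample rate does not match what the synthesizer produced.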
2. Key code for speech to text:
// 1. Create the SpeechRecognizer object
SpeechRecognizer mIat = SpeechRecognizer.createRecognizer();
// 2. Set the dictation parameters; see the SpeechConstant class in the "iFlytek MSC Reference Manual"
mIat.setParameter(SpeechConstant.DOMAIN, "iat");
mIat.setParameter(SpeechConstant.LANGUAGE, "zh_cn");
mIat.setParameter(SpeechConstant.ACCENT, "mandarin"); // the original had a stray trailing space here
// 3. Start dictation
mIat.startListening(mRecoListener);

// Dictation listener (declared as a field of the enclosing class)
private RecognizerListener mRecoListener = new RecognizerListener() {
    // Dictation result callback (the result is JSON; see the appendix of the manual).
    // Results normally arrive through several onResult calls, and the complete
    // recognized text is the concatenation of all of them.
    // For JSON parsing code, see the JsonParser class in MscDemo.
    // The session ends when isLast is true.
    public void onResult(RecognizerResult results, boolean isLast) {
        System.out.println("Result:" + results.getResultString());
    }

    // Called when the session hits an error
    public void onError(SpeechError error) {
        error.getPlainDescription(true); // returns the description of the error code
    }

    // Recording started
    public void onBeginOfSpeech() {}

    // Volume value, 0~30
    public void onVolumeChanged(int volume) {}

    // Recording ended
    public void onEndOfSpeech() {}

    // Extension hook
    public void onEvent(int eventType, int arg1, int arg2, String msg) {}
};
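Because the full text is the concatenation of several onResult calls and the session ends when isLast is true, the accumulation logic can be kept separate from the SDK. The sketch below shows one way to do that with plain Java; DictationBuffer is a hypothetical name, and the strings passed in stand for text already extracted from each JSON result.

```java
/** Hypothetical helper: accumulates partial dictation results into one string. */
class DictationBuffer {
    private final StringBuilder text = new StringBuilder();
    private boolean finished;

    /**
     * Appends one partial result. Returns true once the piece marked isLast
     * has arrived. A call made after a finished session starts a fresh one.
     */
    synchronized boolean accept(String partial, boolean isLast) {
        if (finished) {
            text.setLength(0);
            finished = false;
        }
        text.append(partial);
        finished = isLast;
        return finished;
    }

    /** Returns everything accumulated in the current session so far. */
    synchronized String text() {
        return text.toString();
    }

    public static void main(String[] args) {
        DictationBuffer buffer = new DictationBuffer();
        buffer.accept("hello", false);          // first partial result
        boolean done = buffer.accept(" world", true); // final partial result
        System.out.println(done + ": " + buffer.text());
    }
}
```

Inside a real listener, onResult would call accept with the parsed text and the isLast flag, and read text() once accept returns true.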
With the key code covered, here is the complete code of the project. It is mainly a small GUI: the functional code lives in the WordVoice class, while MainJFrame holds the UI logic and the click handlers. The code is commented, so it should be easy to follow.
MainJFrame.java:
import com.iflytek.cloud.speech.SpeechRecognizer;

/**
 * Main window: UI logic and button click handlers.
 *
 * @author peiyuwang
 */
public class MainJFrame extends javax.swing.JFrame {

    SpeechRecognizer mIat = null;

    /**
     * Creates new form MainJFrame
     */
    public MainJFrame() {
        initComponents();
        BtnStop.setEnabled(false);
        BtnEnd.setEnabled(false);
        TxtVoiceToWord.setEditable(false);
    }

    /**
     * This method is called from within the constructor to initialize the form.
     * WARNING: Do NOT modify this code. The content of this method is always
     * regenerated by the Form Editor.
     */
    @SuppressWarnings("unchecked")
    // <editor-fold defaultstate="collapsed" desc="Generated Code">
    private void initComponents() {
        jLabel1 = new javax.swing.JLabel();
        jLabel2 = new javax.swing.JLabel();
        jScrollPane1 = new javax.swing.JScrollPane();
        TxtWordToVoice = new javax.swing.JTextArea();
        BtnPlay = new javax.swing.JButton();
        jLabel3 = new javax.swing.JLabel();
        BtnStart = new javax.swing.JButton();
        BtnStop = new javax.swing.JButton();
        BtnEnd = new javax.swing.JButton();
        jScrollPane2 = new javax.swing.JScrollPane();
        TxtVoiceToWord = new javax.swing.JTextArea();

        setDefaultCloseOperation(javax.swing.WindowConstants.EXIT_ON_CLOSE);

        jLabel1.setText("多媒体实验2"); // NOI18N ("Multimedia Lab 2")

        jLabel2.setText("文字转语音:"); // "Text to speech:"

        TxtWordToVoice.setColumns(20);
        TxtWordToVoice.setLineWrap(true);
        TxtWordToVoice.setRows(5);
        jScrollPane1.setViewportView(TxtWordToVoice);

        BtnPlay.setText("播放"); // "Play"
        BtnPlay.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                BtnPlayActionPerformed(evt);
            }
        });

        jLabel3.setText("语音转文字:"); // "Speech to text:"

        BtnStart.setText("开始录制"); // "Start recording"
        BtnStart.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                BtnStartActionPerformed(evt);
            }
        });

        BtnStop.setText("暂停录制"); // "Pause recording"
        BtnStop.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                BtnStopActionPerformed(evt);
            }
        });

        BtnEnd.setText("结束录制"); // "End recording"
        BtnEnd.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                BtnEndActionPerformed(evt);
            }
        });

        TxtVoiceToWord.setColumns(20);
        TxtVoiceToWord.setRows(5);
        jScrollPane2.setViewportView(TxtVoiceToWord);

        javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());
        getContentPane().setLayout(layout);
        layout.setHorizontalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addGroup(layout.createSequentialGroup()
                .addGap(45, 45, 45)
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
                    .addComponent(jLabel3)
                    .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING)
                        .addGroup(layout.createSequentialGroup()
                            .addComponent(jLabel2)
                            .addGap(297, 297, 297)
                            .addComponent(BtnPlay, javax.swing.GroupLayout.DEFAULT_SIZE, 76, Short.MAX_VALUE))
                        .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING, false)
                            .addComponent(jScrollPane2, javax.swing.GroupLayout.Alignment.LEADING)
                            .addGroup(javax.swing.GroupLayout.Alignment.LEADING, layout.createSequentialGroup()
                                .addComponent(BtnStart, javax.swing.GroupLayout.PREFERRED_SIZE, 100, javax.swing.GroupLayout.PREFERRED_SIZE)
                                .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED, 80, Short.MAX_VALUE)
                                .addComponent(BtnStop, javax.swing.GroupLayout.PREFERRED_SIZE, 100, javax.swing.GroupLayout.PREFERRED_SIZE)
                                .addGap(72, 72, 72)
                                .addComponent(BtnEnd, javax.swing.GroupLayout.PREFERRED_SIZE, 100, javax.swing.GroupLayout.PREFERRED_SIZE))
                            .addComponent(jScrollPane1, javax.swing.GroupLayout.Alignment.LEADING))))
                .addContainerGap(22, Short.MAX_VALUE))
            .addGroup(layout.createSequentialGroup()
                .addGap(209, 209, 209)
                .addComponent(jLabel1)
                .addContainerGap(javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE))
        );
        layout.setVerticalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addGroup(layout.createSequentialGroup()
                .addContainerGap()
                .addComponent(jLabel1, javax.swing.GroupLayout.PREFERRED_SIZE, 27, javax.swing.GroupLayout.PREFERRED_SIZE)
                .addGap(9, 9, 9)
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING)
                    .addComponent(jLabel2)
                    .addComponent(BtnPlay))
                .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                .addComponent(jScrollPane1, javax.swing.GroupLayout.PREFERRED_SIZE, 103, javax.swing.GroupLayout.PREFERRED_SIZE)
                .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
                .addComponent(jLabel3)
                .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
                    .addComponent(BtnStart)
                    .addComponent(BtnStop)
                    .addComponent(BtnEnd))
                .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                .addComponent(jScrollPane2, javax.swing.GroupLayout.DEFAULT_SIZE, 113, Short.MAX_VALUE)
                .addContainerGap())
        );

        pack();
    }// </editor-fold>

    // Click handler of the Play button (text to speech)
    private void BtnPlayActionPerformed(java.awt.event.ActionEvent evt) {
        String strWord = TxtWordToVoice.getText().toString();
        if (!strWord.isEmpty()) {
            BtnPlay.setEnabled(false);
            BtnPlay.setText("正在缓冲..."); // "Buffering..."
            WordVoice.WordToVoice(strWord, BtnPlay);
        }
    }

    // Click handler of the Start recording button (speech to text)
    private void BtnStartActionPerformed(java.awt.event.ActionEvent evt) {
        BtnStart.setEnabled(false);
        BtnStart.setText("正在录音..."); // "Recording..."
        BtnStop.setEnabled(true);
        BtnEnd.setEnabled(true);
        // mIat is null either before the first recording or after End was pressed.
        // In both cases a new recording starts, so clear the previous text first.
        if (mIat == null) {
            TxtVoiceToWord.setText("");
        }
        mIat = WordVoice.VoiceToWord(TxtVoiceToWord);
    }

    // Click handler of the Pause recording button (speech to text)
    private void BtnStopActionPerformed(java.awt.event.ActionEvent evt) {
        BtnStart.setEnabled(true);
        BtnStart.setText("继续录音"); // "Continue recording"
        BtnStop.setEnabled(false);
        if (mIat != null) {
            mIat.stopListening();
        }
    }

    // Click handler of the End recording button (speech to text)
    private void BtnEndActionPerformed(java.awt.event.ActionEvent evt) {
        BtnStart.setEnabled(true);
        BtnStart.setText("开始录音"); // "Start recording"
        BtnStop.setEnabled(false);
        BtnEnd.setEnabled(false);
        if (mIat != null) {
            mIat.stopListening();
            mIat = null;
        }
    }

    /**
     * @param args the command line arguments
     */
    public static void main(String args[]) {
        /* Set the Nimbus look and feel */
        // <editor-fold defaultstate="collapsed" desc=" Look and feel setting code (optional) ">
        /*
         * If Nimbus (introduced in Java SE 6) is not available, stay with the
         * default look and feel. For details see
         * http://download.oracle.com/javase/tutorial/uiswing/lookandfeel/plaf.html
         */
        try {
            for (javax.swing.UIManager.LookAndFeelInfo info : javax.swing.UIManager.getInstalledLookAndFeels()) {
                if ("Nimbus".equals(info.getName())) {
                    javax.swing.UIManager.setLookAndFeel(info.getClassName());
                    break;
                }
            }
        } catch (ClassNotFoundException ex) {
            java.util.logging.Logger.getLogger(MainJFrame.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
        } catch (InstantiationException ex) {
            java.util.logging.Logger.getLogger(MainJFrame.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
        } catch (IllegalAccessException ex) {
            java.util.logging.Logger.getLogger(MainJFrame.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
        } catch (javax.swing.UnsupportedLookAndFeelException ex) {
            java.util.logging.Logger.getLogger(MainJFrame.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
        }
        // </editor-fold>

        /* Create and display the form */
        java.awt.EventQueue.invokeLater(new Runnable() {
            public void run() {
                new MainJFrame().setVisible(true);
            }
        });
    }

    // Variables declaration - do not modify
    private javax.swing.JButton BtnEnd;
    private javax.swing.JButton BtnPlay;
    private javax.swing.JButton BtnStart;
    private javax.swing.JButton BtnStop;
    private javax.swing.JTextArea TxtVoiceToWord;
    private javax.swing.JTextArea TxtWordToVoice;
    private javax.swing.JLabel jLabel1;
    private javax.swing.JLabel jLabel2;
    private javax.swing.JLabel jLabel3;
    private javax.swing.JScrollPane jScrollPane1;
    private javax.swing.JScrollPane jScrollPane2;
    // End of variables declaration
}
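The three recording handlers in MainJFrame effectively implement a small state machine (idle, recording, paused) by enabling and disabling buttons one at a time. The sketch below restates that logic as an enum, which is one possible way to keep the enabled flags consistent; RecordState and its method names are hypothetical and do not appear in the original project.

```java
/** Hypothetical restatement of the Start / Pause / End button logic as a state machine. */
enum RecordState {
    IDLE, RECORDING, PAUSED;

    /** Start (or continue) recording. */
    RecordState onStart() { return RECORDING; }

    /** Pause only has an effect while recording. */
    RecordState onPause() { return this == RECORDING ? PAUSED : this; }

    /** End always returns to the idle state. */
    RecordState onEnd() { return IDLE; }

    // Which buttons should be enabled in each state, matching the handlers above.
    boolean startEnabled() { return this != RECORDING; }
    boolean pauseEnabled() { return this == RECORDING; }
    boolean endEnabled()   { return this != IDLE; }

    public static void main(String[] args) {
        RecordState s = IDLE;
        for (String action : new String[] {"start", "pause", "start", "end"}) {
            if (action.equals("start")) s = s.onStart();
            else if (action.equals("pause")) s = s.onPause();
            else s = s.onEnd();
            System.out.println(action + " -> " + s
                + " start=" + s.startEnabled()
                + " pause=" + s.pauseEnabled()
                + " end=" + s.endEnabled());
        }
    }
}
```

With such an enum, each handler would compute the next state and a single method would apply the three enabled flags to BtnStart, BtnStop and BtnEnd.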
WordVoice.java:
import java.util.Iterator;

import javax.swing.JButton;
import javax.swing.JTextArea;

import com.alibaba.fastjson.JSON;
import com.iflytek.cloud.speech.RecognizerListener;
import com.iflytek.cloud.speech.RecognizerResult;
import com.iflytek.cloud.speech.SpeechConstant;
import com.iflytek.cloud.speech.SpeechError;
import com.iflytek.cloud.speech.SpeechRecognizer;
import com.iflytek.cloud.speech.SpeechSynthesizer;
import com.iflytek.cloud.speech.SpeechUtility;
import com.iflytek.cloud.speech.SynthesizerListener;

public class WordVoice {

    public static void WordToVoice(String strContent, JButton button) {
        // Replace the APPID below with the one you applied for; apply at http://open.voicecloud.cn
        SpeechUtility.createUtility(SpeechConstant.APPID + "=58493c56");
        // 1. Create the SpeechSynthesizer object
        SpeechSynthesizer mTts = SpeechSynthesizer.createSynthesizer();
        // 2. Set the synthesis parameters; see the SpeechSynthesizer class in the "iFlytek MSC Reference Manual"
        mTts.setParameter(SpeechConstant.VOICE_NAME, "xiaoyan"); // voice (speaker)
        mTts.setParameter(SpeechConstant.SPEED, "50");           // speaking rate
        mTts.setParameter(SpeechConstant.VOLUME, "80");          // volume, range 0~100
        // Where to save the synthesized audio (any path may be used), e.g. "./iflytek.pcm".
        // Uncomment the next line to save the synthesized audio.
        // mTts.setParameter(SpeechConstant.TTS_AUDIO_PATH, "./iflytek.pcm");

        // Synthesis listener
        SynthesizerListener mSynListener = new SynthesizerListener() {
            // Called when the session ends; error is null when nothing went wrong
            public void onCompleted(SpeechError error) {
                button.setText("播放"); // "Play"
                button.setEnabled(true);
            }

            // Buffering progress callback.
            // percent is the buffering progress 0~100; beginPos and endPos are the start and end
            // positions in the text of the buffered audio; info carries additional information.
            public void onBufferProgress(int percent, int beginPos, int endPos, String info) {
            }

            // Playback started
            public void onSpeakBegin() {
                button.setText("正在播放"); // "Playing"
            }

            // Playback paused
            public void onSpeakPaused() {
            }

            // Playback progress callback.
            // percent is the playback progress 0~100; beginPos and endPos are the start and end
            // positions in the text of the audio being played.
            public void onSpeakProgress(int percent, int beginPos, int endPos) {
            }

            // Playback resumed
            public void onSpeakResumed() {
            }
        };
        // 3. Start synthesis
        mTts.startSpeaking(strContent, mSynListener);
    }

    public static SpeechRecognizer VoiceToWord(JTextArea text) {
        // Replace the APPID below with the one you applied for; apply at http://open.voicecloud.cn
        SpeechUtility.createUtility(SpeechConstant.APPID + "=58493c56");
        // 1. Create the SpeechRecognizer object
        SpeechRecognizer mIat = SpeechRecognizer.createRecognizer();
        // 2. Set the dictation parameters; see the SpeechConstant class in the "iFlytek MSC Reference Manual"
        mIat.setParameter(SpeechConstant.DOMAIN, "iat");
        mIat.setParameter(SpeechConstant.LANGUAGE, "zh_cn");
        mIat.setParameter(SpeechConstant.ACCENT, "mandarin"); // the original had a stray trailing space here

        // Dictation listener
        RecognizerListener mRecoListener = new RecognizerListener() {
            public void onResult(RecognizerResult results, boolean isLast) {
                // Alibaba's fastjson library parses the returned JSON string.
                // Some familiarity with fastjson helps in following this part.
                Root root = JSON.parseObject(results.getResultString(), Root.class);
                Iterator<Ws> list = root.getWs().iterator();
                while (list.hasNext()) {
                    Iterator<Cw> listCw = list.next().getCw().iterator();
                    while (listCw.hasNext()) {
                        text.append(listCw.next().getW());
                    }
                }
            }

            // Called when the session hits an error
            public void onError(SpeechError error) {
                System.out.println("错误" + error.getErrorCode() + " " + error.getErrorDesc()); // "Error"
            }

            // Recording started
            public void onBeginOfSpeech() {
            }

            // Volume value, 0~30
            public void onVolumeChanged(int volume) {
            }

            // Recording ended
            public void onEndOfSpeech() {
            }

            // Extension hook
            public void onEvent(int eventType, int arg1, int arg2, String msg) {
            }
        };
        // 3. Start dictation
        mIat.startListening(mRecoListener);
        return mIat;
    }
}
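The listeners in WordVoice change Swing components (button.setText, text.append) directly from SDK callbacks. If the SDK delivers those callbacks on a thread of its own, which is an assumption this post does not verify, Swing's threading rules suggest handing the update to the event dispatch thread. A minimal sketch follows; UiUpdates and appendOnEdt are hypothetical names.

```java
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;

/** Hypothetical helper: routes text area updates to the Swing event dispatch thread. */
class UiUpdates {

    /** Appends s to area on the event dispatch thread, whichever thread calls this. */
    static void appendOnEdt(JTextArea area, String s) {
        if (SwingUtilities.isEventDispatchThread()) {
            area.append(s); // already on the right thread
        } else {
            SwingUtilities.invokeLater(() -> area.append(s)); // queued for the EDT
        }
    }

    public static void main(String[] args) throws Exception {
        JTextArea area = new JTextArea();
        appendOnEdt(area, "recognized text"); // called from the main thread here
        // Wait until the queued update has run before reading the text back.
        SwingUtilities.invokeAndWait(() -> System.out.println(area.getText()));
    }
}
```

In onResult, text.append(...) could then become UiUpdates.appendOnEdt(text, ...), and the button updates in the synthesis listener could be wrapped in SwingUtilities.invokeLater the same way.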
The following three Java files exist to make it easier to parse the JSON string returned by speech to text. They are needed because Alibaba's fastjson library maps a JSON string onto classes, with each JSON field corresponding to a member variable. See the fastjson documentation for more details.
Root.java:
import java.util.List;

public class Root {
    private int sn;
    private boolean ls;
    private int bg;
    private int ed;
    private List<Ws> ws;

    public void setSn(int sn) { this.sn = sn; }
    public int getSn() { return this.sn; }

    public void setLs(boolean ls) { this.ls = ls; }
    public boolean getLs() { return this.ls; }

    public void setBg(int bg) { this.bg = bg; }
    public int getBg() { return this.bg; }

    public void setEd(int ed) { this.ed = ed; }
    public int getEd() { return this.ed; }

    public void setWs(List<Ws> ws) { this.ws = ws; }
    public List<Ws> getWs() { return this.ws; }
}
Ws.java:
import java.util.List;

public class Ws {
    private int bg;
    private List<Cw> cw;

    public void setBg(int bg) { this.bg = bg; }
    public int getBg() { return this.bg; }

    public void setCw(List<Cw> cw) { this.cw = cw; }
    public List<Cw> getCw() { return this.cw; }
}
Cw.java:
public class Cw {
    private double sc;
    private String w;

    public void setSc(double sc) { this.sc = sc; }
    public double getSc() { return this.sc; }

    public void setW(String w) { this.w = w; }
    public String getW() { return this.w; }
}
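To make the mapping concrete, the sketch below shows a JSON string with the shape these three classes imply (a ws array whose items hold a cw array whose items hold a w string) and pulls out the w values. The sample JSON is illustrative and inferred from the class fields, not captured from the service. To stay runnable with the JDK alone, the sketch swaps fastjson for a simple regular expression, which is not a general JSON parser and does not decode escape sequences; DictationJsonSketch is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: joins every "w" value in a dictation-style JSON string. */
class DictationJsonSketch {

    // Matches "w":"..." pairs; deliberately simple, escapes are not handled.
    private static final Pattern W_FIELD = Pattern.compile("\"w\"\\s*:\\s*\"([^\"]*)\"");

    /** Concatenates all "w" values in document order. */
    static String joinWords(String json) {
        StringBuilder words = new StringBuilder();
        Matcher m = W_FIELD.matcher(json);
        while (m.find()) {
            words.append(m.group(1));
        }
        return words.toString();
    }

    public static void main(String[] args) {
        // Illustrative sample only: field names follow Root, Ws and Cw above.
        String json = "{\"sn\":1,\"ls\":false,\"bg\":0,\"ed\":0,\"ws\":["
            + "{\"bg\":0,\"cw\":[{\"sc\":0.0,\"w\":\"hello\"}]},"
            + "{\"bg\":0,\"cw\":[{\"sc\":0.0,\"w\":\" world\"}]}]}";
        System.out.println(joinWords(json));
    }
}
```

The fastjson version in WordVoice does the same walk through typed objects: root.getWs(), then each getCw(), then each getW().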
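When run, the program opens a window with a text area and a 播放 (Play) button for text to speech, and three recording buttons (start, pause, end) above a read-only text area for speech to text.
Project download link: http://dl.download.csdn.net/down11/20161210/b507c99c8261d0c5b6a63fc1dfe4c55a.zip?response-content-disposition=attachment%3Bfilename%2A%3D%22utf8%27%27WordVoice.zip%22&OSSAccessKeyId=9q6nvzoJGowBj4q1&Expires=1481381643&Signature=UzKuy0IdVXL8C5gHtr8IN2YWnlA%3D
Project download page: http://download.csdn.net/detail/peiyuwang_2015/9707706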