Notes on a speech recognition framework (SpeechKit.framework)


This framework is a paid product with a 3-month trial period; during the trial it can process up to 500 valid voice requests per day.

Official download page (hosted overseas): download link

My own mirror uploaded to CSDN (iOS version only): download link

The framework's main features are recognizing speech and displaying the recognized content as text, and reading text content aloud (text-to-speech playback).

iOS Development Guide


Recognizing Speech


1. Before you use speech recognition, ensure that you have set up the core Speech Kit framework with the setupWithID:host:port:useSSL:delegate: method.
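As a reminder, a minimal sketch of that setup call might look like the following. The app ID and host are placeholders: use the credentials issued with your own NDEV Mobile account.

```objc
// Typically done once, e.g. in application:didFinishLaunchingWithOptions:.
// <YOUR APP ID> and <YOUR HOST> are placeholders for your NDEV credentials.
[SpeechKit setupWithID:@"<YOUR APP ID>"
                  host:@"<YOUR HOST>.nmdp.nuancemobility.net"
                  port:443
                useSSL:NO
              delegate:self];
```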

2. Then create and initialize an SKRecognizer object:

recognizer = [[SKRecognizer alloc] initWithType:SKSearchRecognizerType
                                      detection:SKShortEndOfSpeechDetection
                                       language:@"eng-USA"
                                       delegate:self];

3. The initWithType:detection:language:delegate: method initializes a recognizer and starts the speech recognition process.

  • The type parameter is an NSString *, generally one of the recognition type constants defined in the Speech Kit framework and available in the header SKRecognizer.h. Nuance may provide you with a different value for your unique recognition needs, in which case you will enter the raw NSString.

  • The detection parameter determines the end-of-speech detection model and must be one of the SKEndOfSpeechDetection types.

  • The language parameter defines the speech language as a string in the format of the ISO 639-2 three-letter language code, followed by a dash “-”, followed by the ISO 3166-1 (alpha 3) three-letter country code.


    4. The delegate receives the recognition results or error messages, as described below.

    Receiving Recognition Results

    To retrieve the recognition results, implement the recognizer:didFinishWithResults: delegate method.
    - (void)recognizer:(SKRecognizer *)recognizer didFinishWithResults:(SKRecognition *)results {
        [recognizer autorelease];
        // Perform some action on the results
    }

    This delegate method will be called only on successful completion, and the results list will have zero or more results. The first result can always be retrieved with the firstResult method. Even in the absence of an error, there may be a suggestion, present in the recognition results object, from the speech server. This suggestion should be presented to the user.
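For example, a minimal sketch of consuming the results inside that delegate method could look like this. The textView outlet is a hypothetical UI element, and the suggestion property name is assumed from the SKRecognition header.

```objc
- (void)recognizer:(SKRecognizer *)recognizer didFinishWithResults:(SKRecognition *)results {
    [recognizer autorelease];
    NSString *topResult = [results firstResult];  // best hypothesis, or nil if the list is empty
    if (topResult != nil) {
        self.textView.text = topResult;           // textView is a hypothetical outlet
    }
    // If the server attached a suggestion, surface it to the user.
    if ([results.suggestion length] > 0) {
        NSLog(@"Server suggestion: %@", results.suggestion);
    }
}
```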

    Handling Errors

    To be informed of any recognition errors, implement the recognizer:didFinishWithError:suggestion: delegate method. In the case of errors, only this method will be called; conversely, on success this method will not be called. In addition to the error, a suggestion, as described in the previous section, may or may not be present.

    - (void)recognizer:(SKRecognizer *)recognizer didFinishWithError:(NSError *)error suggestion:(NSString *)suggestion {
        [recognizer autorelease];
        // Inform the user of the error and suggestion
    }

    Managing Recording State Changes

    Optionally, to be informed when the recognizer starts or stops recording audio, implement the recognizerDidBeginRecording: and recognizerDidFinishRecording: delegate methods. There may be a delay between initialization of the recognizer and the actual start of recording, so the recognizerDidBeginRecording: message can be used to signal to the user when the system is listening.

    - (void)recognizerDidBeginRecording:(SKRecognizer *)recognizer {
        // Update the UI to indicate the system is now recording
    }

    The recognizerDidFinishRecording: message is sent before the speech server has finished receiving and processing the audio, and therefore before the result is available.

    - (void)recognizerDidFinishRecording:(SKRecognizer *)recognizer {
        // Update the UI to indicate that recording has stopped and the speech is still being processed
    }

    This message is sent regardless of whether an end-of-speech detection model is in place, and regardless of whether recording was stopped by calling the stopRecording method or by end-of-speech detection.
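When relying on manual control rather than end-of-speech detection, you stop recording yourself with stopRecording. A minimal sketch, where stopButtonPressed: is a hypothetical button action:

```objc
- (IBAction)stopButtonPressed:(id)sender {
    // Manually end recording; recognizerDidFinishRecording: will be sent,
    // and the result arrives later via recognizer:didFinishWithResults:.
    [recognizer stopRecording];
}
```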

    Setting Earcons (Audio Cues)

    Optionally, to play audio cues before and after recording and after cancelling a recognition session, you can use earcons. You need to create an SKEarcon object and set it using the setEarcon:forType: method of the Speech Kit framework. The following example shows how to set earcons in the recognizer sample app.

    - (void)setEarcons {
        // Set earcons to play
        SKEarcon* earconStart  = [SKEarcon earconWithName:@"earcon_listening.wav"];
        SKEarcon* earconStop   = [SKEarcon earconWithName:@"earcon_done_listening.wav"];
        SKEarcon* earconCancel = [SKEarcon earconWithName:@"earcon_cancel.wav"];

        [SpeechKit setEarcon:earconStart forType:SKStartRecordingEarconType];
        [SpeechKit setEarcon:earconStop forType:SKStopRecordingEarconType];
        [SpeechKit setEarcon:earconCancel forType:SKCancelRecordingEarconType];
    }

    When the code block above is called, after you have set up the core Speech Kit framework using the setupWithID:host:port:useSSL:delegate: method, it plays the earcon_listening.wav audio file before recording starts and plays the earcon_done_listening.wav audio file when the recording is completed. In the case of cancellation, the earcon_cancel.wav file is played to the user.

    The earconWithName: method works only with audio files that are supported by the device.


    Converting Text to Speech


    Initiating Text-To-Speech

    1. Before you use speech synthesis, ensure that you have set up the core Speech Kit framework with the setupWithID:host:port:useSSL:delegate: method.

    2. Then create and initialize an SKVocalizer object to perform text-to-speech conversion:

      vocalizer = [[SKVocalizer alloc] initWithLanguage:@"en_US"
                                               delegate:self];


      3.The initWithLanguage:delegate: method initializes a text-to-speech synthesizer with a default language:

  • The language parameter is an NSString * that defines the spoken language in the format of the ISO 639-1 two-letter language code, followed by an underscore “_”, followed by the ISO 3166-1 (alpha 2) two-letter country code. For example, for the English language as spoken in the United States, you would use the four-character underscore code “en_US”. Each supported language has one or more uniquely defined voices, either male or female.

    Note

    For text-to-speech supported languages, refer to the list of supported languages on the NDEV Mobile website. This list of supported languages will be updated when new language support is added. The new languages will not necessarily require updating an existing Dragon Mobile SDK.

  • The delegate parameter defines the object to receive status and error messages from the speech synthesizer.

    4.The initWithLanguage:delegate: method uses a default voice chosen by Nuance. To select a different voice, use the initWithVoice:delegate: method instead.

    The voice parameter is an NSString * that defines the voice model. For example, the female U.S. English voice is Samantha.

    Note

    The up-to-date list of supported voices is provided with the supported languages. Look for the Supported Languages list on the NDEV Mobile website.
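For example, to request the Samantha voice explicitly, a sketch (the voice name must match one from the Supported Languages list):

```objc
vocalizer = [[SKVocalizer alloc] initWithVoice:@"Samantha"
                                      delegate:self];
```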

    5.To begin converting text to speech, use either the speakString: or the speakMarkupString: method. These methods send the requested string to the speech server and start streaming and playing audio on the device.

    [vocalizer speakString:@"Hello world."];

    Note

      The speakMarkupString: method is used in exactly the same manner as speakString: except that it takes an NSString * filled with SSML, a markup language tailored for use in describing synthesized speech. An advanced discussion of SSML is beyond the scope of this document; however, you can find more information from the W3C at http://www.w3.org/TR/speech-synthesis/. For exceptions to the SSML 1.0 Recommendation, see SSML Compliance.
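As a hedged sketch, a minimal SSML request might look like the following; which SSML features are honored is governed by the SDK's SSML Compliance notes.

```objc
// A minimal SSML document with a 300 ms pause inserted mid-sentence.
NSString *ssml = @"<?xml version=\"1.0\"?>"
                 @"<speak version=\"1.0\" xml:lang=\"en-US\">"
                 @"Hello <break time=\"300ms\"/> world."
                 @"</speak>";
[vocalizer speakMarkupString:ssml];
```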

    As speech synthesis is a network-based service, these methods are all asynchronous, and in general an error condition is not immediately reported. Any errors are reported as messages to the delegate.

    Managing Text-To-Speech Feedback

    The synthesized speech will not immediately start playback. Instead, there will be a brief delay as the request is sent to the speech server and speech is streamed back. For UI coordination, to indicate when audio playback begins, the optional delegate method vocalizer:willBeginSpeakingString: is provided.

    - (void)vocalizer:(SKVocalizer *)vocalizer willBeginSpeakingString:(NSString *)text {
        // Update UI to indicate that text is being spoken
    }

    The NSString * in the message is a reference to the original string passed to one of the speakString: or speakMarkupString: methods and may be used to track sequences of playback when sequential text-to-speech requests are made.

    On completion of the speech playback, the vocalizer:didFinishSpeakingString:withError: message is sent. This message is always sent, on successful completion and on error alike. In the success case, error is nil.

    - (void)vocalizer:(SKVocalizer *)vocalizer didFinishSpeakingString:(NSString *)text withError:(NSError *)error {
        if (error) {
            // Present error dialog to user
        } else {
            // Update UI to indicate speech is complete
        }
    }

