speech_recognition
Pythonic module for speech recognition using the Google Speech Recognition API.
SpeechRecognition 2.2.0 (Python Package Index): library for performing speech recognition with the Google Speech Recognition API.
I have wrist pain when I type, and I would like to start writing SQL statements, stored procedures, and views using speech recognition.
Source: (StackOverflow)
I tried a lot but can't figure it out, so I hope you can help me.
I am trying to build my own voice recognition app that doesn't show the recognition dialog.
I already wrote some code and it works quite well, but my problem is that the recognizer seems to stop without any errors or other messages in LogCat.
A strange fact is that the "onRmsChanged" from the "RecognitionListener" interface is still called all the time, but no "onBeginningOfSpeech" is called anymore.
If I speak just after the speech recognition has started it works.
But it doesn't if I wait a few seconds.
The app is built against the Android 4.0.3 API, and I installed it on my Nexus 7 running version 4.2.1.
I would really appreciate any good ideas.
Some code snippets:
My class:
class SpeechListener implements RecognitionListener
{
    public void onBeginningOfSpeech()
    {
        Log.d(TAG, "onBeginningOfSpeech()");
    }

    public void onBufferReceived(byte[] buffer)
    {
        Log.d(TAG, "onBufferReceived()");
    }

    public void onEndOfSpeech()
    {
        Log.d(TAG, "onEndOfSpeech()");
    }

    public void onError(int error)
    {
        Log.d(TAG, "onError(): " + error);
        if(error == SpeechRecognizer.ERROR_NO_MATCH)
        {
            // Nothing was recognized; intentionally ignored here.
        }
        else if(error == SpeechRecognizer.ERROR_SPEECH_TIMEOUT)
        {
            // No speech input was heard; intentionally ignored here.
        }
        else
        {
            tvOutput.setText("Error: " + error);
        }
    }

    public void onEvent(int eventType, Bundle params)
    {
        Log.d(TAG, "onEvent()");
    }

    public void onPartialResults(Bundle partialResults)
    {
        Log.d(TAG, "onPartialResults()");
    }

    public void onReadyForSpeech(Bundle params)
    {
        Log.d(TAG, "onReadyForSpeech()");
    }

    public void onResults(Bundle results)
    {
        Log.d(TAG, "onResults(): " + results);
        ArrayList<String> data = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
        if(data == null)
        {
            // getStringArrayList() returns null when the key is missing.
            return;
        }
        // Collect all candidates, one per line.
        StringBuilder str = new StringBuilder();
        for(int i = 0; i < data.size(); i++)
        {
            str.append(data.get(i)).append("\n");
        }
        tvOutput.setText(tvOutput.getText().toString() + "\n\n" + "Results: " + str);
    }

    public void onRmsChanged(float rmsdB)
    {
        Log.d(TAG, "onRmsChanged()");
    }
}
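Since onError() above only logs a bare number, a small lookup helper can make the LogCat output easier to read. This is only a sketch: the class name RecognizerErrors and both method names are made up for illustration, and treating ERROR_SPEECH_TIMEOUT and ERROR_NO_MATCH as "retryable" is an assumption, not documented platform behaviour. The numeric values mirror the public constants in android.speech.SpeechRecognizer, written as literals so the helper compiles without the Android SDK.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: turns SpeechRecognizer error codes into readable names. */
final class RecognizerErrors {
    private static final Map<Integer, String> NAMES = new HashMap<>();

    static {
        // Values mirror the public constants in android.speech.SpeechRecognizer.
        NAMES.put(1, "ERROR_NETWORK_TIMEOUT");
        NAMES.put(2, "ERROR_NETWORK");
        NAMES.put(3, "ERROR_AUDIO");
        NAMES.put(4, "ERROR_SERVER");
        NAMES.put(5, "ERROR_CLIENT");
        NAMES.put(6, "ERROR_SPEECH_TIMEOUT");
        NAMES.put(7, "ERROR_NO_MATCH");
        NAMES.put(8, "ERROR_RECOGNIZER_BUSY");
        NAMES.put(9, "ERROR_INSUFFICIENT_PERMISSIONS");
    }

    private RecognizerErrors() {}

    /** Returns the constant name for a code, or a placeholder for unknown codes. */
    static String describe(int error) {
        String name = NAMES.get(error);
        return name != null ? name : "UNKNOWN_ERROR(" + error + ")";
    }

    /** Assumption: after "no speech" or "no match", listening again is a reasonable reaction. */
    static boolean isRetryable(int error) {
        return error == 6 || error == 7;
    }
}
```

Inside the listener this could be used as Log.d(TAG, "onError(): " + RecognizerErrors.describe(error)).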
And my implementation in the MainActivity:
this.srSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
this.srSpeechRecognizer.setRecognitionListener(new SpeechListener());
this.iSpeechIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
this.iSpeechIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
this.iSpeechIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, "voice.recognition.test");
this.iSpeechIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 10);
And this is how it's started:
srSpeechRecognizer.startListening(iSpeechIntent);
Logs with speaking:
12-16 13:50:53.576: D/DreamManagerService(485): Dream finished: android.os.Binder@415bbf38
12-16 13:50:53.576: I/DreamManagerService(485): Leaving dreamland.
12-16 13:50:53.576: I/DreamController(485): Stopping dream: name=ComponentInfo{com.google.android.deskclock/com.android.deskclock.Screensaver}, isTest=false, userId=0
12-16 13:50:53.586: I/PowerManagerService(485): Waking up from dream...
12-16 13:50:53.616: I/ActivityManager(485): No longer want com.google.android.gsf.login (pid 13171): empty #17
12-16 13:50:56.796: I/GoogleRecognitionServiceImpl(1461): #startListening [de-DE]
12-16 13:50:56.806: I/ActivityManager(485): Start proc com.google.android.gsf.login for service com.google.android.gsf.login/com.google.android.gsf.loginservice.GoogleLoginService: pid=13343 uid=10019 gids={50019, 3003, 1007, 1028, 1015, 2001, 3006}
12-16 13:50:56.866: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:56.886: D/dalvikvm(1461): GC_FOR_ALLOC freed 516K, 12% free 8706K/9892K, paused 18ms, total 18ms
12-16 13:50:56.906: D/dalvikvm(1461): GC_CONCURRENT freed 160K, 9% free 9015K/9892K, paused 3ms+2ms, total 21ms
12-16 13:50:56.906: I/AudioService(485): AudioFocus requestAudioFocus() from android.media.AudioManager@4135e960com.google.android.speech.audio.AudioController$1@41261910
12-16 13:50:56.916: I/VS.G3EngineManager(1461): create_rm: m=ENDPOINTER_VOICESEARCH,l=en-US
12-16 13:50:56.916: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:56.916: I/VS.G3EngineManager(1461): Brought up new g3 instance :/system/usr/srec/en-US/endpointer_voicesearch.config for: en-USin: 3 ms
12-16 13:50:56.926: I/ConnectionFactoryImpl(1461): Opening SSL connection: vs.google.com:14259
12-16 13:50:56.966: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.016: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.066: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.116: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.166: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.216: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.266: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.316: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.366: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.416: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.466: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.516: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.566: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.616: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.666: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.716: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.766: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.816: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.866: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.916: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:57.966: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.016: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.066: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.116: I/MainActivity/SpeechListener(13268): onBeginningOfSpeech()
12-16 13:50:58.126: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.176: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.226: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.276: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.326: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.376: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.426: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.476: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.526: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.576: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.626: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.676: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.726: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.776: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.826: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.876: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.926: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:58.976: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.026: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.076: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.126: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.176: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.236: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.286: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.336: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.386: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.436: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.486: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.536: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.586: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.636: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.646: I/MicrophoneInputStream(1461): mic_close
12-16 13:50:59.666: I/AudioService(485): AudioFocus abandonAudioFocus() from android.media.AudioManager@4135e960com.google.android.speech.audio.AudioController$1@41261910
12-16 13:50:59.666: D/dalvikvm(1461): threadid=37: thread exiting, not yet detached (count=0)
12-16 13:50:59.666: I/MainActivity/SpeechListener(13268): onEndOfSpeech()
12-16 13:50:59.676: I/decoder(1461): INFO: recognition time wall: 2.732 sec user: 0.54 sec sys: 0.08 sec
12-16 13:50:59.686: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.736: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.786: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.826: I/MainActivity/SpeechListener(13268): onResults(): Bundle[mParcelledData.dataSize=292]
12-16 13:50:59.836: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.886: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.936: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:50:59.986: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:51:00.046: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:51:00.096: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:51:00.146: D/MainActivity/SpeechListener(13268): onRmsChanged()
12-16 13:51:00.196: D/MainActivity/SpeechListener(13268): onRmsChanged()
Logs without speaking:
12-16 13:53:39.246: I/GoogleRecognitionServiceImpl(1461): #startListening [de-DE]
12-16 13:53:39.296: D/dalvikvm(1461): GC_FOR_ALLOC freed 567K, 12% free 8708K/9892K, paused 21ms, total 21ms
12-16 13:53:39.316: D/dalvikvm(1461): GC_CONCURRENT freed 164K, 9% free 9017K/9892K, paused 3ms+2ms, total 21ms
12-16 13:53:39.316: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.316: I/AudioService(485): AudioFocus requestAudioFocus() from android.media.AudioManager@4135e960com.google.android.speech.audio.AudioController$1@41261910
12-16 13:53:39.326: I/VS.G3EngineManager(1461): create_rm: m=ENDPOINTER_VOICESEARCH,l=en-US
12-16 13:53:39.326: I/ConnectionFactoryImpl(1461): Opening SSL connection: vs.google.com:14259
12-16 13:53:39.326: I/VS.G3EngineManager(1461): Brought up new g3 instance :/system/usr/srec/en-US/endpointer_voicesearch.config for: en-USin: 5 ms
12-16 13:53:39.366: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.416: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.466: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.516: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.576: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.626: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.676: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.726: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.776: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.826: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.876: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.926: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:39.976: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.026: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.076: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.136: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.176: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.226: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.286: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.336: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.386: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.436: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.486: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.536: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.586: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.636: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.686: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.736: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.786: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.836: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.886: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.936: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:40.986: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.046: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.096: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.146: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.196: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.246: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.296: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.346: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.396: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.446: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.496: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.546: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.596: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.646: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.696: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.746: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.796: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.846: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.896: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.946: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:41.996: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.046: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.096: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.146: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.196: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.246: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.296: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.356: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.406: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.456: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.506: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.556: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.606: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.656: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.706: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.756: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.806: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.856: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.906: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:42.956: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.006: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.056: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.116: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.156: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.216: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.266: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.316: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.366: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.416: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.466: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.516: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.566: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.616: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.666: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.716: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.766: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.816: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.866: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.916: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:43.966: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.016: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.066: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.116: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.166: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.226: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.276: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.326: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.376: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.426: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.476: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.526: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.576: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.626: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.676: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.726: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.776: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.826: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.876: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.926: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:44.976: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.026: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.076: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.126: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.176: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.226: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.276: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.326: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.376: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.426: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.476: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.526: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.576: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.636: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.676: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.736: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.786: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.836: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.886: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.936: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:45.986: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.036: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.086: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.136: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.186: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.236: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.286: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.336: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.386: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.436: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.486: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.536: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.596: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.636: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.696: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.746: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.796: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.846: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.896: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.946: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:46.996: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.046: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.096: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.146: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.196: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.246: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.296: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.346: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.396: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.446: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.496: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.556: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.596: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.656: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.696: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.746: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.796: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.856: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.906: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:47.956: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.006: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.056: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.106: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.156: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.206: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.256: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.306: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.356: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.406: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.456: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.506: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.556: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.616: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.656: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.706: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.766: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.816: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.866: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.916: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:48.966: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.016: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.066: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.116: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.166: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.216: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.266: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.316: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.366: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.416: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.466: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.516: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.566: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.616: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.666: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.716: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.776: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.816: D/dalvikvm(1461): GC_FOR_ALLOC freed 106K, 9% free 9025K/9892K, paused 32ms, total 32ms
12-16 13:53:49.816: I/dalvikvm-heap(1461): Grow heap (frag case) to 9.282MB for 320656-byte allocation
12-16 13:53:49.836: D/dalvikvm(1461): GC_FOR_ALLOC freed 156K, 11% free 9182K/10208K, paused 19ms, total 19ms
12-16 13:53:49.836: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.886: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.936: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:49.986: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:50.036: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:50.086: D/MainActivity/SpeechListener(13441): onRmsChanged()
12-16 13:53:50.136: D/MainActivity/SpeechListener(13441): onRmsChanged()
Source: (StackOverflow)
Has anyone had success with Dragon Naturally Speaking voice recognition software when it comes to programming?
I am wondering because I think it would be a lot faster than typing by hand, and easier on my carpal tunnel.
I program from day to day in the Visual Basic 6 IDE and the Visual Studio 2008 IDE with Team Explorer, and I also write emails and chat over Windows Live IM.
I have a need for a command-based interface where I can bind voice commands to keystrokes, switch between spelling / saying words / saying words without spaces, etc.
Any comments are much appreciated.
Source: (StackOverflow)
Is this possible without modifying the Android APIs?
I've found an article about this.
One comment there says that I should make modifications to the Android APIs.
But it didn't say how to make those modifications.
Can anybody give me some suggestions on how to do that?
Thanks!
I've found this article:
SpeechRecognizer
His needs are almost the same as mine.
It is a good reference for me!
I've got this problem completely solved.
I googled and found usable sample code on this Chinese website.
Here's my source code:
package voice.recognition.test;

import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.util.Log;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.Button;
import android.widget.TextView;

import java.util.ArrayList;

// Note: the app needs the android.permission.RECORD_AUDIO permission in its manifest.
public class voiceRecognitionTest extends Activity implements OnClickListener
{
    private TextView mText;
    private SpeechRecognizer sr;
    private static final String TAG = "MyStt3Activity";

    @Override
    public void onCreate(Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        Button speakButton = (Button) findViewById(R.id.btn_speak);
        mText = (TextView) findViewById(R.id.textView1);
        speakButton.setOnClickListener(this);
        // Create the recognizer directly so that no recognition dialog is shown.
        sr = SpeechRecognizer.createSpeechRecognizer(this);
        sr.setRecognitionListener(new listener());
    }

    class listener implements RecognitionListener
    {
        public void onReadyForSpeech(Bundle params)
        {
            Log.d(TAG, "onReadyForSpeech");
        }

        public void onBeginningOfSpeech()
        {
            Log.d(TAG, "onBeginningOfSpeech");
        }

        public void onRmsChanged(float rmsdB)
        {
            Log.d(TAG, "onRmsChanged");
        }

        public void onBufferReceived(byte[] buffer)
        {
            Log.d(TAG, "onBufferReceived");
        }

        public void onEndOfSpeech()
        {
            Log.d(TAG, "onEndofSpeech");
        }

        public void onError(int error)
        {
            Log.d(TAG, "error " + error);
            mText.setText("error " + error);
        }

        public void onResults(Bundle results)
        {
            Log.d(TAG, "onResults " + results);
            ArrayList<String> data = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            if (data == null)
            {
                // getStringArrayList() returns null when the key is missing.
                mText.setText("results: 0");
                return;
            }
            StringBuilder str = new StringBuilder();
            for (int i = 0; i < data.size(); i++)
            {
                Log.d(TAG, "result " + data.get(i));
                str.append(data.get(i)).append("\n");
            }
            // Show the number of candidates followed by the candidates themselves.
            mText.setText("results: " + data.size() + "\n" + str);
        }

        public void onPartialResults(Bundle partialResults)
        {
            Log.d(TAG, "onPartialResults");
        }

        public void onEvent(int eventType, Bundle params)
        {
            Log.d(TAG, "onEvent " + eventType);
        }
    }

    public void onClick(View v)
    {
        if (v.getId() == R.id.btn_speak)
        {
            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, "voice.recognition.test");
            intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
            sr.startListening(intent);
            Log.i(TAG, "startListening");
        }
    }
}
Be sure to delete the annoying Logs after debugging!
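The result handling in onResults() above can also be pulled out into a small helper that is easy to test off-device. This is just a sketch: ResultFormatter and format() are hypothetical names, and the numbered-list layout is only one possible way to display the candidates.

```java
import java.util.List;

/** Hypothetical helper: formats recognition candidates for display in a TextView. */
final class ResultFormatter {
    private ResultFormatter() {}

    /**
     * Builds a numbered list of candidates. A null list is handled because
     * Bundle.getStringArrayList() returns null when the key is absent.
     */
    static String format(List<String> candidates) {
        if (candidates == null || candidates.isEmpty()) {
            return "results: none";
        }
        StringBuilder sb = new StringBuilder("results (" + candidates.size() + "):");
        for (int i = 0; i < candidates.size(); i++) {
            sb.append('\n').append(i + 1).append(". ").append(candidates.get(i));
        }
        return sb.toString();
    }
}
```

In the activity this would be called as mText.setText(ResultFormatter.format(data)).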
Source: (StackOverflow)
It looks as though Google has made offline speech recognition available from Google Now for third-party apps. It is being used by the app named Utter.
Has anyone seen any implementations of how to do simple voice commands with this offline speech recognition? Do you just use the regular SpeechRecognizer API and it works automatically?
Source: (StackOverflow)
I want to develop a speech recognizer for Android that works offline. Because Android's built-in speech recognizer uses Google's servers, which requires an internet connection, I want an alternative that works without internet access.
Please suggest a way to achieve this.
Source: (StackOverflow)
There are two similar namespaces and assemblies for speech recognition in .NET. I’m trying to understand the differences and when it is appropriate to use one or the other.
There is System.Speech.Recognition from the assembly System.Speech (in System.Speech.dll). System.Speech.dll is a core DLL in the .NET Framework class library 3.0 and later.
There is also Microsoft.Speech.Recognition from the assembly Microsoft.Speech (in microsoft.speech.dll). Microsoft.Speech.dll is part of the UCMA 2.0 SDK.
I find the docs confusing and I have the following questions:
System.Speech.Recognition says it is for "The Windows Desktop Speech Technology", does this mean it cannot be used on a server OS or cannot be used for high scale applications?
The UCMA 2.0 Speech SDK ( http://msdn.microsoft.com/en-us/library/dd266409%28v=office.13%29.aspx ) says that it requires Microsoft Office Communications Server 2007 R2 as a prerequisite. However, I’ve been told at conferences and meetings that if I do not require OCS features like presence and workflow I can use the UCMA 2.0 Speech API without OCS. Is this true?
If I’m building a simple recognition app for a server application (say I wanted to automatically transcribe voice mails) and I don’t need features of OCS, what are the differences between the two APIs?
Source: (StackOverflow)
I need to write an application which uses a speech recognition engine -- either the one built into Vista, or a third-party one -- that can display a word or phrase and recognise when the user reads it (or an approximation of it). I also need to be able to switch quickly between languages without changing the language of the operating system.
The users will be using the system for very short periods. The application needs to work without the requirement of first training the recognition engine to the users' voices.
It would also be fantastic if this could work on Windows XP or lesser versions of Windows Vista.
Optionally, the system needs to be able to read information on the screen back to the user, in the user's selected language. I can work around this specification using pre-recorded voice-overs, but the preferred method would be to use a text-to-speech engine.
Can anyone recommend something for me?
Source: (StackOverflow)
I have managed to get continuous speech recognition working (using the SpeechRecognizer class) as a service on all Android versions up to 4.1. My question concerns getting it working on versions 4.1 and 4.2, where there is a known problem: the API does not behave as documented, and if no voice input is detected within a few seconds of starting recognition, the speech recogniser appears to die silently. (http://code.google.com/p/android/issues/detail?id=37883)
I have found a question that proposes a workaround for this problem (Voice Recognition stops listening after a few seconds), but I am unsure how to implement the Handler required for this solution. I am aware that the workaround causes a 'beep' every few seconds, but getting continuous voice recognition is more important to me.
If anyone has alternative workarounds, I'd like to hear those too.
Source: (StackOverflow)
I am trying to save to a file the audio data captured by Android's speech recognition service.
I implement RecognitionListener as explained here:
Speech to Text on Android
save the data into a buffer as illustrated here:
Capturing audio sent to Google's speech recognition server
and write the buffer to a WAV file, as in here:
Android Record raw bytes into WAVE file for Http Streaming
My problem is how to get the appropriate audio settings to write into the WAV file's header.
When I play the WAV file I hear only strange noise with these parameters:
short nChannels = 2; // audio channels
int sRate = 44100; // sample rate in Hz
short bSamples = 16; // bits per sample
or nothing at all with these:
short nChannels = 1; // audio channels
int sRate = 8000; // sample rate in Hz
short bSamples = 16; // bits per sample
What is confusing is that, looking at the speech recognition task's parameters in logcat, I first find the PLAYBACK sample rate set to 44100 Hz:
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Set PLAYBACK PCM format to S16_LE (Signed 16 bit Little Endian)
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Using 2 channels for PLAYBACK.
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Set PLAYBACK sample rate to 44100 HZ
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Buffer size: 2048
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Latency: 46439
and then aInfo.SampleRate = 8000 when it plays the file to send to Google's server:
12-20 14:41:36.152: DEBUG/(2364): PV_Wav_Parser::InitWavParser
12-20 14:41:36.152: DEBUG/(2364): File open Succes
12-20 14:41:36.152: DEBUG/(2364): File SEEK End Succes
...
12-20 14:41:36.152: DEBUG/(2364): PV_Wav_Parser::ReadData
12-20 14:41:36.152: DEBUG/(2364): Data Read buff = RIFF?
12-20 14:41:36.152: DEBUG/(2364): Data Read = RIFF?
12-20 14:41:36.152: DEBUG/(2364): PV_Wav_Parser::ReadData
12-20 14:41:36.152: DEBUG/(2364): Data Read buff = fmt
...
12-20 14:41:36.152: DEBUG/(2364): PVWAVPARSER_OK
12-20 14:41:36.156: DEBUG/(2364): aInfo.AudioFormat = 1
12-20 14:41:36.156: DEBUG/(2364): aInfo.NumChannels = 1
12-20 14:41:36.156: DEBUG/(2364): aInfo.SampleRate = 8000
12-20 14:41:36.156: DEBUG/(2364): aInfo.ByteRate = 16000
12-20 14:41:36.156: DEBUG/(2364): aInfo.BlockAlign = 2
12-20 14:41:36.156: DEBUG/(2364): aInfo.BitsPerSample = 16
12-20 14:41:36.156: DEBUG/(2364): aInfo.BytesPerSample = 2
12-20 14:41:36.156: DEBUG/(2364): aInfo.NumSamples = 2258
So, how can I find out the right parameters to save the audio buffer in a good wav audio file?
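For reference, the header fields of a PCM WAV file are tied together by simple arithmetic: block align = channels × bits per sample / 8, and byte rate = sample rate × block align. The sketch below builds a standard 44-byte PCM header from those values. It assumes the captured buffer really is mono, 8000 Hz, 16-bit little-endian PCM, as the aInfo lines in the log suggest; the question does not confirm that. The class and method names (WavHeaderSketch, pcmHeader) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class WavHeaderSketch {
    /** Builds a canonical 44-byte RIFF/WAVE header for uncompressed PCM data. */
    static byte[] pcmHeader(int sampleRate, int channels, int bitsPerSample, int dataLength) {
        int blockAlign = channels * bitsPerSample / 8;   // bytes per sample frame
        int byteRate = sampleRate * blockAlign;          // bytes per second of audio
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt(36 + dataLength);                       // size of everything after this field
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                                    // fmt chunk size for plain PCM
        b.putShort((short) 1);                           // audio format 1 = PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(byteRate);
        b.putShort((short) blockAlign);
        b.putShort((short) bitsPerSample);
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt(dataLength);                            // number of PCM bytes that follow
        return b.array();
    }
}
```

With the values from the log (1 channel, 8000 Hz, 16 bits), this gives a block align of 2 and a byte rate of 16000, matching the aInfo.BlockAlign and aInfo.ByteRate lines above.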
Source: (StackOverflow)
I have a particular situation: a service started by a broadcast receiver starts an activity, and I want this activity to be able to communicate back to the service. I have chosen AIDL for this. Everything seems to work well except for the bindService() method called in the activity's onCreate(). bindService(), in fact, throws a null pointer exception because onServiceConnected() is never called, even though the service's onBind() method is. Nevertheless, bindService() returns true.
The service is obviously active because it starts the activity.
I know that calling an activity from a service could sound strange, but unfortunately this is the only way to have speech recognition in a service.
Thanks in advance
Source: (StackOverflow)
I have 15 audio tapes, one of which I believe contains an old recording of my grandmother and myself talking. A quick attempt to find the right place didn't turn it up. I don't want to listen to 20 hours of tape to find it. The location may not be at the start of one of the tapes. Most of the content seems to fall into three categories -- in order of total length, longest first: silence, speech radio, and music.
I plan to convert all of the tapes to digital format, and then look again for the recording. The obvious way is to play them all in the background while I'm doing other things. That's far too straightforward for me, so: Are there any open source libraries, or other code, that would allow me to find, in order of increasing sophistication and usefulness:
- Non-silent regions
- Regions containing human speech
- Regions containing my own speech (and that of my grandmother)
My preference is for Python, Java, or C.
Failing answers, hints about search terms would be appreciated since I know nothing about the field.
I understand that I could easily spend more than 20 hours on this.
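As a starting point for the first item (non-silent regions), one common approach is to split the audio into short frames and compare each frame's RMS energy against a threshold. Below is a minimal sketch in Java, assuming the tapes have already been decoded into 16-bit PCM samples held in a short[]. The class name, frame size, and threshold are illustrative assumptions, not tuned values. Telling speech from music, or recognising a particular speaker, needs more than this.

```java
import java.util.ArrayList;
import java.util.List;

final class SilenceSketch {
    /** Root-mean-square amplitude of samples[from, to). */
    static double rms(short[] samples, int from, int to) {
        double sum = 0;
        for (int i = from; i < to; i++) {
            sum += (double) samples[i] * samples[i];
        }
        return Math.sqrt(sum / (to - from));
    }

    /**
     * Returns {startSample, endSample} pairs (end exclusive) covering runs of
     * consecutive frames whose RMS is at or above the threshold.
     */
    static List<int[]> nonSilentRegions(short[] samples, int frameSize, double threshold) {
        List<int[]> regions = new ArrayList<>();
        int regionStart = -1; // -1 means we are currently inside silence
        for (int start = 0; start < samples.length; start += frameSize) {
            int end = Math.min(start + frameSize, samples.length);
            boolean loud = rms(samples, start, end) >= threshold;
            if (loud && regionStart < 0) {
                regionStart = start;                          // a loud run begins
            } else if (!loud && regionStart >= 0) {
                regions.add(new int[] {regionStart, start});  // the loud run ended
                regionStart = -1;
            }
        }
        if (regionStart >= 0) {
            regions.add(new int[] {regionStart, samples.length}); // audio ended mid-run
        }
        return regions;
    }
}
```

Dividing the sample offsets by the sample rate gives timestamps, which can then be used to jump straight to the non-silent parts of each tape.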
Source: (StackOverflow)
Is there a well-established framework for C, Java, or PHP for building speech recognition applications? It should take microphone audio input and recognize English words, as in this pseudocode:
Speech s = new Speech();
s.input(micStream);
result = s.recognise("Hello");
if (result) { printf("Matched hello"); } else { printf("No match found"); }
Follow up:
Download this: sphinx4/1.0%20beta6/
Add the libraries
Copy & paste code:
a) Put the XML file somewhere it can be loaded from the code:
https://gist.github.com/2551321
b) use this:
package edu.cmu.sphinx.demo.hellowrld;

import edu.cmu.sphinx.frontend.util.Microphone;
import edu.cmu.sphinx.recognizer.Recognizer;
import edu.cmu.sphinx.result.Result;
import edu.cmu.sphinx.util.props.ConfigurationManager;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import models.Tts; // not a Sphinx4 class; presumably the poster's own text-to-speech helper

public class Speech {
    public static void main(String[] args) {
        // Load the Sphinx4 configuration from the first argument,
        // or fall back to the XML resource next to this class.
        ConfigurationManager cm;
        if (args.length > 0) {
            cm = new ConfigurationManager(args[0]);
        } else {
            // e.g. /tmp/helloworld.config.xml
            cm = new ConfigurationManager(Speech.class.getResource("speech.config.xml"));
        }

        // The component names must match those declared in the XML file.
        Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
        recognizer.allocate();
        Microphone microphone = (Microphone) cm.lookup("microphone");
        if (!microphone.startRecording()) {
            System.out.println("Cannot start microphone.");
            recognizer.deallocate();
            System.exit(1);
        }

        System.out.println("Say: (Hello | call) ( Naam | Baam | Caam | Some )");
        while (true) {
            System.out.println("Start speaking. Press Ctrl-C to quit.\n");
            Result result = recognizer.recognize();
            if (result != null) {
                String resultText = result.getBestFinalResultNoFiller();
                System.out.println("You said: " + resultText + '\n');
                // Read the recognized text back to the user.
                Tts ts = new Tts();
                try {
                    ts.load();
                    ts.say("Did you say: " + resultText);
                } catch (IOException ex) {
                    // Log the failure instead of silently swallowing it.
                    Logger.getLogger(Speech.class.getName()).log(Level.SEVERE, null, ex);
                }
            } else {
                System.out.println("I can't hear what you said.\n");
            }
        }
    }
}
Source: (StackOverflow)
In my voice recognition based app, I sometimes receive ERROR_RECOGNIZER_BUSY. Intuitively, this calls for... retries, right?
The problem is that this error is poorly documented, so I have questions that perhaps someone more experienced in the field can answer:
- What triggers such an error? Is it really only a busy server (at Google), or could it also hint at a bug in my app?
- Do I have to explicitly close/reopen a session before a retry?
- How often should I retry? Once every second? Every 5 seconds? Other?
Your experienced insights are most welcome. Thanks.
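Whatever the underlying cause turns out to be, retries are usually spaced out rather than fired immediately. The sketch below computes an exponential backoff schedule in plain Java. It is a generic retry pattern, not something documented for SpeechRecognizer, and the class name, base delay, and cap are assumptions for illustration. On Android the returned delay could be applied with Handler.postDelayed() before calling startListening() again.

```java
final class RetryBackoffSketch {
    /**
     * Delay in milliseconds before retry number `attempt` (0-based):
     * starts at baseMs and doubles on each attempt, never exceeding maxMs.
     */
    static long delayMs(int attempt, long baseMs, long maxMs) {
        long delay = baseMs;
        for (int i = 0; i < attempt && delay < maxMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMs);
    }
}
```

With a base of 1000 ms and a cap of 5000 ms, successive retries wait 1 s, 2 s, 4 s, and then 5 s from there on.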
Source: (StackOverflow)