I need help converting speech to text with the Android Speech API. The API gives correct results on my device running Android 2.3.5, but on a device running Android 4.1.2 the results are abnormal: the recognized text is repeated multiple times. If anybody has faced this problem, can you tell me how to handle it?
Following is the code I am using:
import java.util.ArrayList;

import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.util.Log;
import android.view.Menu;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.Button;
import android.widget.TextView;

public class MainActivity extends Activity {
    protected static final int RESULT_SPEECH = 1;
    protected static final String TAG = "MY_TAG";
    private TextView spokenText;
    private TextView spokenAnswer;
    private Button spkButton;
    private Button stopButton;
    private SpeechRecognizer sR;
    private ClickListener clickListener;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        clickListener = new ClickListener();
        spokenText = (TextView) findViewById(R.id.spokenText);
        spokenAnswer = (TextView) findViewById(R.id.spokenAnswer);
        spkButton = (Button) findViewById(R.id.speakButton);
        stopButton = (Button) findViewById(R.id.stopButton);
        spkButton.setOnClickListener(clickListener);
        stopButton.setOnClickListener(clickListener);
        sR = SpeechRecognizer.createSpeechRecognizer(this);
        sR.setRecognitionListener(new listener());
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the menu; this adds items to the action bar if it is present.
        getMenuInflater().inflate(R.menu.main, menu);
        return true;
    }

    public void startListening() {
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        // The language tag belongs in EXTRA_LANGUAGE; putting it in
        // EXTRA_LANGUAGE_MODEL would overwrite the model set on the line above.
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US");
        intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1);
        intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, this.getPackageName());
        sR.startListening(intent);
    }

    public void stopListening() {
        sR.stopListening();
    }

    class ClickListener implements OnClickListener {
        @Override
        public void onClick(View v) {
            if (v == spkButton) {
                startListening();
            } else if (v == stopButton) {
                stopListening();
            }
        }
    }

    class listener implements RecognitionListener {
        @Override
        public void onRmsChanged(float rmsdB) {
            //Log.d(TAG, "onRmsChanged");
        }

        @Override
        public void onResults(Bundle results) {
            String str = "";
            //Log.d(TAG, "onResults " + results);
            ArrayList<String> data = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            // Appends every entry of the result list to a single string.
            for (int i = 0; i < data.size(); i++) {
                //Log.d(TAG, "result " + data.get(i));
                str += data.get(i);
            }
            Log.d(TAG, str);
            spokenText.setText(str);
        }

        @Override
        public void onReadyForSpeech(Bundle params) {
            //Log.d(TAG, "onReadyForSpeech");
        }

        @Override
        public void onPartialResults(Bundle partialResults) {
            //Log.d(TAG, "onPartialResults");
        }

        @Override
        public void onEvent(int eventType, Bundle params) {
            //Log.d(TAG, "onEvent");
        }

        @Override
        public void onError(int error) {
            // Map the error code to a readable label for logging.
            String mError = "";
            switch (error) {
                case SpeechRecognizer.ERROR_NETWORK_TIMEOUT:
                    mError = " network timeout";
                    break;
                case SpeechRecognizer.ERROR_NETWORK:
                    mError = " network";
                    break;
                case SpeechRecognizer.ERROR_AUDIO:
                    mError = " audio";
                    break;
                case SpeechRecognizer.ERROR_SERVER:
                    mError = " server";
                    break;
                case SpeechRecognizer.ERROR_CLIENT:
                    mError = " client";
                    break;
                case SpeechRecognizer.ERROR_SPEECH_TIMEOUT:
                    mError = " speech time out";
                    break;
                case SpeechRecognizer.ERROR_NO_MATCH:
                    mError = " no match";
                    break;
                case SpeechRecognizer.ERROR_RECOGNIZER_BUSY:
                    mError = " recogniser busy";
                    break;
                case SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS:
                    mError = " insufficient permissions";
                    break;
            }
            //Log.d(TAG, "Error: " + error + " - " + mError);
            //startListening();
        }

        @Override
        public void onEndOfSpeech() {
            //Log.d(TAG, "onEndOfSpeech");
            //startListening();
        }

        @Override
        public void onBufferReceived(byte[] buffer) {
            //Log.d(TAG, "onBufferReceived");
        }

        @Override
        public void onBeginningOfSpeech() {
        }
    }
}
This is the output I am seeing. The results are abnormal: the text should appear once rather than 3 times.
Here is a chunk of a response from one of the Google speech APIs on Android. Note the JSON array in the 'hypotheses' field:
{"status":0,"id":"a4ca9654c6cc684dc3279cd1aaa00cc7-1","hypotheses":[{"utterance":"map of the state of California","confidence":0.87869847}]}
You need to know the details of the response body of the API you are using and, if necessary, how to parse JSON arrays in the response, like the 'hypotheses' field above.
If it is an array, as I suspect it is, then you just need a little parsing of the array to get the proper response without the duplication issue.
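To illustrate the idea, here is a minimal sketch in plain Java (no Android classes, so it can run anywhere). It assumes the recognizer handed back several candidate strings in one list, which is what the ArrayList under SpeechRecognizer.RESULTS_RECOGNITION holds, with the most likely candidate first. The class name BestHypothesis, the method names, and the sample strings are made up for the example; they are not part of any Android API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BestHypothesis {

    // Returns only the top-ranked candidate, or "" if the list is null or empty.
    public static String pick(List<String> hypotheses) {
        if (hypotheses == null || hypotheses.isEmpty()) {
            return "";
        }
        return hypotheses.get(0);
    }

    // Mirrors the loop in the question: glues every candidate into one string.
    public static String concatenateAll(List<String> hypotheses) {
        StringBuilder sb = new StringBuilder();
        for (String h : hypotheses) {
            sb.append(h);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for
        // results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION).
        List<String> data = new ArrayList<>(
                Arrays.asList("hello world", "hello word", "hollow world"));

        // Joining all candidates makes the text look repeated.
        System.out.println(concatenateAll(data));
        // Taking only the first entry shows the phrase once.
        System.out.println(pick(data));
    }
}
```

In the question's onResults, the equivalent change would be to show data.get(0) (after checking that the list is not null or empty) instead of appending every entry to str. Whether a given Android version honors EXTRA_MAX_RESULTS is something I have not verified, so it seems safer not to rely on the list having a single element.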