Tags: python, google-cloud-platform, google-cloud-vertex-ai, google-cloud-speech, google-cloud-python

Calling Google Cloud Speech-to-Text API regional recognizers with the Python client library returns 400 and 404 errors


The goal: Use the Python client library to convert a speech audio file to text through a Chirp recognizer.

Steps to recreate the error: I'm creating a recognizer and transcribing audio by following the instructions and Python code at https://cloud.google.com/speech-to-text/v2/docs/transcribe-client-libraries. The code is as follows:

from google.cloud.speech_v2 import SpeechClient
from google.cloud.speech_v2.types import cloud_speech


def speech_to_text(project_id, recognizer_id, audio_file):
    # Instantiates a client
    client = SpeechClient()

    request = cloud_speech.CreateRecognizerRequest(
        parent=f"projects/{project_id}/locations/global",
        recognizer_id=recognizer_id,
        recognizer=cloud_speech.Recognizer(
            language_codes=["en-US"], model="latest_long"
        ),
    )

    # Creates a Recognizer
    operation = client.create_recognizer(request=request)
    recognizer = operation.result()

    # Reads a file as bytes
    with open(audio_file, "rb") as f:
        content = f.read()

    config = cloud_speech.RecognitionConfig(auto_decoding_config={})

    request = cloud_speech.RecognizeRequest(
        recognizer=recognizer.name, config=config, content=content
    )

    # Transcribes the audio into text
    response = client.recognize(request=request)

    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

    return response

It works fine with the multi-regional global models. However, as of now (June 2023), the Chirp model is only available in the us-central1 region.

The issue: When I use the same code for a regional recognizer, it returns a 404 error indicating that the recognizer doesn't exist in the project. When I change the recognizer's name from "projects/{project_id}/locations/global/recognizers/{recognizer_id}" to "projects/{project_id}/locations/us-central1/recognizers/{recognizer_id}", or to any other non-global location, it returns a 400 error saying that the location is expected to be global.

Question: How can I call a regional recognizer through the GCP Python client library?


Solution

  • For regional APIs, you must set the client options' API endpoint to a regional endpoint. The Speech API v1 reference describes how to set the API endpoint: https://cloud.google.com/python/docs/reference/speech/latest/google.cloud.speech_v1.services.speech.SpeechClient

    To call a regional recognizer with Speech API v2, set the API endpoint to the us-central1 endpoint using the following code:

    from google.api_core import client_options
    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech

    def speech_to_text(project_id, recognizer_id, audio_file):
        # Point the client at the regional endpoint instead of the global default
        client_options_var = client_options.ClientOptions(
            api_endpoint="us-central1-speech.googleapis.com"
        )
        client = SpeechClient(client_options=client_options_var)

        # The recognizer must already exist in us-central1
        recog_name = f"projects/{project_id}/locations/us-central1/recognizers/{recognizer_id}"

        config = cloud_speech.RecognitionConfig(
            auto_decoding_config={},
        )

        # Reads a file as bytes
        with open(audio_file, "rb") as f:
            content = f.read()

        # Send request
        request = cloud_speech.RecognizeRequest(
            recognizer=recog_name, config=config, content=content
        )

        response = client.recognize(request=request)

        return response
    

    Passing api_endpoint="us-central1-speech.googleapis.com" in ClientOptions points the client at the Speech API in us-central1. You can then call a regional recognizer located at "projects/{project_id}/locations/us-central1/recognizers/{recognizer_id}" without getting a 404 error.
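
The answer's code assumes the recognizer already exists in us-central1, whereas the question's code creates it in the global location. Below is a minimal sketch of keeping the endpoint and the resource names in the same location and creating the recognizer there. The helper names (speech_endpoint, recognizer_parent, recognizer_path, make_client, create_chirp_recognizer) are hypothetical and not part of the library. The "{location}-speech.googleapis.com" pattern is generalized from the us-central1 endpoint above, and the model name "chirp" is an assumption, so check both against the current documentation.

```python
def speech_endpoint(location: str) -> str:
    """Return the Speech-to-Text host for a location.

    Assumption: regional hosts follow "{location}-speech.googleapis.com",
    generalized from the us-central1 endpoint used in the answer.
    """
    if location == "global":
        return "speech.googleapis.com"
    return f"{location}-speech.googleapis.com"


def recognizer_parent(project_id: str, location: str) -> str:
    """Build the parent resource name used when creating a recognizer."""
    return f"projects/{project_id}/locations/{location}"


def recognizer_path(project_id: str, location: str, recognizer_id: str) -> str:
    """Build the full recognizer resource name for a location."""
    return f"{recognizer_parent(project_id, location)}/recognizers/{recognizer_id}"


def make_client(location: str):
    """Create a SpeechClient bound to the endpoint that matches the location."""
    # Imported lazily so the pure helpers above work without the library installed.
    from google.api_core.client_options import ClientOptions
    from google.cloud.speech_v2 import SpeechClient

    return SpeechClient(
        client_options=ClientOptions(api_endpoint=speech_endpoint(location))
    )


def create_chirp_recognizer(project_id, recognizer_id, location="us-central1"):
    """Create a recognizer in the given location (mirrors the question's code)."""
    from google.cloud.speech_v2.types import cloud_speech

    client = make_client(location)
    request = cloud_speech.CreateRecognizerRequest(
        parent=recognizer_parent(project_id, location),
        recognizer_id=recognizer_id,
        recognizer=cloud_speech.Recognizer(
            language_codes=["en-US"],
            model="chirp",  # assumed model identifier for Chirp
        ),
    )
    # create_recognizer returns a long-running operation; wait for it to finish.
    return client.create_recognizer(request=request).result()
```

Because the client's endpoint and the resource names are derived from the same location argument, they cannot drift apart, which is the mismatch that appears to produce the 400 and 404 errors described in the question.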