c# .net google-cloud-platform google-ai-platform

Add retry settings for google prediction service client - .NET

I want to implement retry logic in my application when using Google Cloud AI Platform's PredictionServiceClient from the NuGet package https://www.nuget.org/packages/Google.Cloud.AIPlatform.V1 . Retry scenarios include the call quota being exceeded (status 429, Too Many Requests) and the service being unavailable. I want to use the same retry and timeout properties for all calls made through the PredictionServiceClient. Earlier, in the same application, I used the Azure OpenAI NuGet package (https://www.nuget.org/packages/Azure.AI.OpenAI/), where I was able to provide the retry properties when creating the client, like below -

    OpenAIClientOptions opts = new OpenAIClientOptions();
    opts.Retry.MaxRetries = 2;
    opts.Retry.Delay = TimeSpan.FromSeconds(60);
    opts.Retry.NetworkTimeout = TimeSpan.FromSeconds(100);
    opts.Retry.Mode = (RetryMode)connDetails.AzureOpenAIRetryMode.Value;
    client = new OpenAIClient(new Uri("sampleUrl"), new AzureKeyCredential("sampleKey"), opts);

I am using the Google Cloud AI Platform's PredictionServiceClient to stream content generation like below -

    Google.Cloud.AIPlatform.V1.PredictionServiceClient.StreamGenerateContentStream response = myPredictionServiceClient.StreamGenerateContent(generateContentRequest);

What I have tried - Below is the code I wrote for retry -

        PredictionServiceSettings settings = new PredictionServiceSettings
        {
            CallSettings = CallSettings.FromRetry(RetrySettings.FromExponentialBackoff(
                                            maxAttempts: 2,
                                            initialBackoff: TimeSpan.FromSeconds(1),
                                            maxBackoff: TimeSpan.FromSeconds(10),
                                            backoffMultiplier: 2,
                                            retryFilter: RetrySettings.FilterForStatusCodes(StatusCode.Unavailable)
                                            )).WithTimeout(TimeSpan.FromSeconds(100))
        };

        bardClient = await new PredictionServiceClientBuilder
        {
            Settings = settings,
            Endpoint = connDetails.AzureOpenAIResourceUrl
        }.BuildAsync();

An error is thrown at the StreamGenerateContent method call. The Google API does not permit retries for this type of operation. Below is the error message -

  HResult=0x80131509
  Message=Retry not permitted for this operation type
  Source=Google.Api.Gax.Grpc
  StackTrace:
   at Google.Api.Gax.Grpc.CallSettingsExtensions.ValidateNoRetry(CallSettings callSettings)
   at Google.Api.Gax.Grpc.ApiServerStreamingCall.<>c__DisplayClass0_0`2.<Create>b__1(TRequest req, CallSettings cs)
   at Google.Cloud.AIPlatform.V1.PredictionServiceClientImpl.StreamGenerateContent(GenerateContentRequest request, CallSettings callSettings) in Google.Cloud.AIPlatform.V1\PredictionServiceClientImpl.cs:line 250

It originates from this method in the Google API code - https://github.dev/googleapis/gax-dotnet/blob/ee799ad91309ef3102dee60f1baa67d7ec772548/Google.Api.Gax.Grpc/CallSettingsExtensions.cs#L192

My question is: how can I implement retry logic for the streaming service calls in PredictionServiceClient? Any hint pointing me to the right resource or sample code would be helpful.

Solution

For non-streaming responses, retry settings work as expected. For the streaming scenario, refer to this.
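
Since the client rejects retry settings on streaming calls, one possible workaround is to retry at the application level. The block below is only a sketch of that idea, not the approach from the linked resource and not part of any Google library: `StreamRetry.WithRetry` is a hypothetical helper that restarts a stream with exponential backoff, and only while no item has been yielded yet, so the caller never sees duplicated chunks. It assumes the failure surfaces when the stream is read rather than when it is created.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class StreamRetry
{
    // Hypothetical helper: restarts the stream when a transient error occurs
    // before the first item arrives. maxAttempts counts total attempts.
    public static async IAsyncEnumerable<T> WithRetry<T>(
        Func<IAsyncEnumerable<T>> startStream,
        Func<Exception, bool> isTransient,
        int maxAttempts,
        TimeSpan initialBackoff,
        TimeSpan maxBackoff,
        double backoffMultiplier = 2.0,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        TimeSpan backoff = initialBackoff;
        for (int attempt = 1; ; attempt++)
        {
            bool yieldedAny = false;
            IAsyncEnumerator<T> stream = startStream().GetAsyncEnumerator(cancellationToken);
            try
            {
                while (true)
                {
                    bool hasNext;
                    try
                    {
                        hasNext = await stream.MoveNextAsync();
                    }
                    catch (Exception ex) when (!yieldedAny && attempt < maxAttempts && isTransient(ex))
                    {
                        break; // leave the read loop, back off, then start a new stream
                    }
                    if (!hasNext) yield break; // stream completed normally
                    yieldedAny = true;         // from here on, errors propagate to the caller
                    yield return stream.Current;
                }
            }
            finally
            {
                await stream.DisposeAsync();
            }
            await Task.Delay(backoff, cancellationToken);
            backoff = TimeSpan.FromTicks(
                (long)Math.Min(backoff.Ticks * backoffMultiplier, maxBackoff.Ticks));
        }
    }
}

// Hypothetical usage with the client from the question. Assumptions: the
// AsyncResponseStream<T> returned by GetResponseStream() is consumed as an
// IAsyncEnumerable<T>, and HTTP 429 surfaces as StatusCode.ResourceExhausted.
//
// await foreach (GenerateContentResponse chunk in StreamRetry.WithRetry(
//     () => myPredictionServiceClient.StreamGenerateContent(generateContentRequest).GetResponseStream(),
//     ex => ex is RpcException rpc &&
//           (rpc.StatusCode == StatusCode.Unavailable || rpc.StatusCode == StatusCode.ResourceExhausted),
//     maxAttempts: 2,
//     initialBackoff: TimeSpan.FromSeconds(1),
//     maxBackoff: TimeSpan.FromSeconds(10)))
// {
//     // handle chunk
// }
```

Retrying only before the first chunk is a deliberate choice here: once part of a generated response has been delivered, restarting the request would produce a different stream, so later failures are left for the caller to handle. A per-call timeout can still be passed through CallSettings, since only the retry part is rejected for streaming calls.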