firebase  google-cloud-vertex-ai  google-gemini  firebase-vertex-ai

Firebase: Gemini API using Vertex AI - Limit Usage by Users for Client Side API


Question about https://firebase.google.com/docs/vertex-ai: How can I control usage per user if I am using the client-side (iOS) API?

With Cloud Functions, we have full control. With client-side database calls, we have security rules. What do we have for this? For instance, how can I ensure a user calls Vertex AI at most 10 times a day? Or how do I control the token size of each call? I just don't want to wake up to some insane bill.

Thanks!


Solution

  • The Firebase Vertex API does support some degree of per-user quota limiting. If you go to the Google Cloud Console

    https://console.cloud.google.com/apis/api/firebaseml.googleapis.com/quotas?project=YOUR_PROJECT

    you can adjust the "Generate content requests per minute per project per user per minute per user per region" quota limit. This lets you limit the number of requests per minute that any individual user can make. It’s not a perfect solution: you can only apply per-minute limits (so you can’t limit a user to 10 requests per day), and you can’t give different users different quota limits (e.g., it wouldn’t support giving "paid" users X quota and "free" users Y quota). If you need something more granular than that, you’ll probably have to write some bespoke logic in the client (and enable App Check to make sure only your actual client can call the service).
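As a rough illustration of the "bespoke logic in the client" idea, here is a minimal Swift sketch of a per-day request gate. The class name, method name, and UserDefaults keys are all hypothetical, and nothing here comes from the Firebase SDK. Because the counter lives on the device, it resets on reinstall and can be tampered with, so treat it as a courtesy limit on top of the per-minute quota and App Check rather than a hard guarantee.

```swift
import Foundation

/// Hypothetical client-side gate: allows at most `maxPerDay` calls per calendar day.
/// The counter is stored in UserDefaults, so it is NOT tamper-proof.
final class DailyRequestLimiter {
    private let maxPerDay: Int
    private let defaults: UserDefaults
    private let calendar: Calendar
    private let dayKey = "vertexAI.limiter.day"      // hypothetical key names
    private let countKey = "vertexAI.limiter.count"

    init(maxPerDay: Int,
         defaults: UserDefaults = .standard,
         calendar: Calendar = .current) {
        self.maxPerDay = maxPerDay
        self.defaults = defaults
        self.calendar = calendar
    }

    /// Returns true and records the call if today's budget is not used up yet.
    func tryConsume(now: Date = Date()) -> Bool {
        let today = calendar.startOfDay(for: now).timeIntervalSince1970
        var count = defaults.integer(forKey: countKey)
        if defaults.double(forKey: dayKey) != today {
            // First call on a new day: start counting from zero again.
            count = 0
            defaults.set(today, forKey: dayKey)
            defaults.set(0, forKey: countKey)
        }
        guard count < maxPerDay else { return false }
        defaults.set(count + 1, forKey: countKey)
        return true
    }
}
```

A call site would check `limiter.tryConsume()` before issuing the Vertex AI request and show a "daily limit reached" message when it returns false.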

    If you’re concerned about runaway token size, you can use the GenerativeModel countTokens operation. That performs a pre-flight check and returns the number of tokens in the given prompt, i.e. the input size rather than the length of the response the model would generate. You can then block any prompt that is over whatever limit you’re comfortable with. To cap the response side as well, the model's generation config accepts a maxOutputTokens setting.
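The pre-flight check could be wired up roughly as follows. This is a sketch under assumptions: `sendIfWithinBudget` and `PromptBudgetError` are made-up names, and the token counter and the send step are passed in as closures so the example stays independent of any particular SDK version.

```swift
import Foundation

/// Hypothetical error type for prompts that exceed the token budget.
enum PromptBudgetError: Error, Equatable {
    case tooLarge(tokens: Int, limit: Int)
}

/// Counts the prompt's tokens first and calls `send` only if the count is within `limit`.
/// `countTokens` and `send` are injected; they are assumptions, not SDK calls.
func sendIfWithinBudget<Output>(
    prompt: String,
    limit: Int,
    countTokens: (String) async throws -> Int,
    send: (String) async throws -> Output
) async throws -> Output {
    let tokens = try await countTokens(prompt)
    guard tokens <= limit else {
        // Reject before the (billable) generation request is made.
        throw PromptBudgetError.tooLarge(tokens: tokens, limit: limit)
    }
    return try await send(prompt)
}
```

With the Firebase SDK, the `countTokens` closure would presumably wrap `model.countTokens(prompt)` and read the total from its response, and `send` would wrap `model.generateContent(prompt)`. Check the exact signatures against the SDK version you are using.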