Tags: firebase, google-cloud-platform, google-cloud-firestore, vector, embedding

Firestore bandwidth costs for downloading documents with large embedding vectors


Background

I'm developing an application that uses vector embeddings for similarity search. I'm considering storing these embeddings directly in Firestore documents, but I'm concerned about potential bandwidth costs.

From the Firestore pricing documentation, I understand there are charges for "Network bandwidth" or "Outbound data transfer" with a free quota of 10GB/month.

My specific concerns

  • If I store embedding vectors in Firestore documents (each vector could be several MB in size), will I be charged for the full size of these vectors when downloading documents?

  • For example, if I have 1,000 documents, each containing a 1MB
    embedding vector, and I need to download all of them, would that
    count as ~1GB against my outbound transfer quota?

  • The pricing document states: "Firestore calculates response size
    based on a serialized message format." Does this mean the actual
    bandwidth charge might be even higher due to serialization overhead?

Potential alternative approaches I'm considering

  • Store only embedding IDs in Firestore, with the actual vectors in Cloud Storage (roughly sketched after this list)
  • Use a specialized vector database alongside Firestore
  • Implement aggressive caching to minimize repeated downloads
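
For the first option, this is roughly the shape I have in mind (a sketch only; the bucket, collection, and field names are placeholders, using the Node Admin SDK and the Cloud Storage client):

```typescript
// Sketch of "IDs in Firestore, vectors in Cloud Storage".
// "embeddings-bucket", "items", and "vectorPath" are placeholder names.
import { initializeApp } from "firebase-admin/app";
import { getFirestore } from "firebase-admin/firestore";
import { Storage } from "@google-cloud/storage";

initializeApp();
const db = getFirestore();
const bucket = new Storage().bucket("embeddings-bucket");

async function saveItem(id: string, embedding: Float32Array): Promise<void> {
  const objectPath = `vectors/${id}.f32`;
  // Raw float32 bytes go to Cloud Storage...
  await bucket
    .file(objectPath)
    .save(Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength));
  // ...and Firestore holds only a small pointer document.
  await db.collection("items").doc(id).set({ vectorPath: objectPath });
}

async function loadVector(id: string): Promise<Float32Array> {
  const snap = await db.collection("items").doc(id).get();
  const [bytes] = await bucket.file(snap.get("vectorPath")).download();
  return new Float32Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 4);
}
```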

Technical details

  • Vector dimension: ~1000-5000 dimensions per vector (float32)
  • Expected number of vectors: 10,000+ initially, growing over time
  • Query pattern: Frequent similarity searches requiring vector retrieval
  • Client: Web application and mobile app

Has anyone implemented a similar system with Firestore? What was your experience with bandwidth costs and what architecture would you recommend?


Solution

  • If you're reading a document, you're charged a document read for that document, plus a bandwidth charge for the part of the document that you actually download. The client-side SDKs always download full documents, but the server-side SDKs can select which fields to return, so you can reduce the bandwidth charge that way (see the sketch below).
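
    For example, with the Node Admin SDK the field mask might look like this (a sketch; the "items" collection and field names are hypothetical):

    ```typescript
    // Sketch: read document metadata without downloading the embedding field.
    // You still pay a document read per result; only bandwidth is reduced.
    import { initializeApp } from "firebase-admin/app";
    import { getFirestore } from "firebase-admin/firestore";

    initializeApp();
    const db = getFirestore();

    async function listItemTitles(): Promise<void> {
      const snapshot = await db
        .collection("items")
        .select("title", "updatedAt") // field mask: the embedding is never transferred
        .get();
      snapshot.forEach((doc) => console.log(doc.id, doc.get("title")));
    }
    ```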

    Say your documents are indeed 1 MB (the maximum document size) and you download 1,000 of them in full; in the us-east4 region that'll charge you (all pricing from the Firestore pricing page; a quick sanity check of the arithmetic follows the list):

    • $0.00033 for document reads (which cost $0.033 per 100,000 documents)
    • $0.12 for bandwidth (~1 GB at the $0.12/GB outbound data transfer rate)
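
    As a quick sanity check, here is the arithmetic behind those two figures (rates as quoted above):

    ```typescript
    // Back-of-the-envelope check of the figures above (us-east4 rates).
    const docs = 1_000;
    const docSizeGB = 1 / 1_000;      // ~1 MB per document
    const readRate = 0.033 / 100_000; // $ per document read
    const egressRate = 0.12;          // $ per GB of outbound data transfer

    console.log(docs * readRate);               // 0.00033 -> $0.00033
    console.log(docs * docSizeGB * egressRate); // 0.12    -> $0.12
    ```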

    If you use Cloud Run or something similar, you can reduce the bandwidth by selecting a subset of the fields. Also consider that network transfer within the same multi-region is free, so running your code in a Google data center is typically a good way to reduce outbound bandwidth costs; a minimal sketch of that pattern follows below.
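
    Here is one way that could look, assuming an Express-style service deployed in the same multi-region as the database ("items" and "embedding" are placeholder names, and brute-force cosine similarity stands in for whatever ranking you actually use):

    ```typescript
    // Sketch: run the similarity search next to Firestore (e.g. on Cloud Run),
    // so full vectors never leave Google's network; clients get only IDs/scores.
    // Note: every candidate document still costs a document read.
    import express from "express";
    import { initializeApp } from "firebase-admin/app";
    import { getFirestore } from "firebase-admin/firestore";

    initializeApp();
    const db = getFirestore();
    const app = express();
    app.use(express.json());

    function cosine(a: number[], b: number[]): number {
      let dot = 0, na = 0, nb = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
      }
      return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    app.post("/search", async (req, res) => {
      const query: number[] = req.body.vector;
      const snapshot = await db.collection("items").get();
      const topTen = snapshot.docs
        .map((d) => ({ id: d.id, score: cosine(query, d.get("embedding")) }))
        .sort((x, y) => y.score - x.score)
        .slice(0, 10);
      res.json(topTen); // tiny payload compared to the raw vectors
    });

    app.listen(Number(process.env.PORT ?? 8080));
    ```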

    And there may indeed be some overhead (in both directions) in the transfer size, since the billed response size is based on Firestore's serialized message format rather than your raw payload.

    Given the growth rate you're talking about, also consider the storage cost, as the vector index entries probably add up too. For more on this, see the documentation on how document size is calculated, specifically the section on index entry size; a rough back-of-the-envelope estimate is sketched below.
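
    As a rough illustration of the storage side, applying the documented size rules (each number costs 8 bytes, a string costs its UTF-8 length + 1, and every document carries 32 bytes of overhead; this ignores the document name and other fields, and index entries come on top):

    ```typescript
    // Rough per-document storage estimate for an embedding stored as an
    // array of numbers, per Firestore's documented size rules. An estimate
    // only; index entry sizes depend on your index configuration.
    function embeddingFieldBytes(fieldName: string, dims: number): number {
      const nameBytes = Buffer.byteLength(fieldName, "utf8") + 1; // string size rule
      return nameBytes + dims * 8; // 8 bytes per number
    }

    const perDoc = 32 + embeddingFieldBytes("embedding", 5_000); // ~40 KB
    console.log(`${perDoc} bytes/doc; ~${(perDoc * 10_000) / 1e9} GB for 10,000 docs`);
    ```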

    If this is too costly for your use case, consider looking at other APIs whose pricing model better matches your requirements.