Tags: nlp, huggingface-transformers, llama-index, faiss

Finding the embedding dimensions of a HuggingFace model


I am trying to figure out how to use the Faiss vector store with LlamaIndex.

The instructions say that I must specify the vector dimensions in advance. Here is the code:

    import faiss

    # dimensions of text-embedding-ada-002
    d = 1536
    faiss_index = faiss.IndexFlatL2(d)

So the text-embedding-ada-002 model has 1536 dimensions.

I want to use the BAAI/bge-small-en-v1.5 model.

What are the dimensions of the embedding vectors it outputs? How do I find the output vector dimensions of other transformer models? Do I need to run the models and measure the results, or is there a simpler way?


Solution

  • The dimension for bge-small-en-v1.5 is 384. You can find it on the model page, https://huggingface.co/BAAI/bge-small-en-v1.5, which includes a table listing dimension, sequence length, and scores.

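    Also, when loading a model via transformers.AutoModel, you can get more details on the loaded model. In an interactive session, model.eval() returns the model itself, so its printed representation lists the layers with their input and output dimensions. For BERT-style models such as this one, the same number is also available as model.config.hidden_size.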
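
    If you would rather not load the weights just to read one number, here is a minimal sketch that looks the value up from the model's config alone. It assumes the config exposes a hidden_size attribute, which holds for BERT-style models like bge-small-en-v1.5 but not necessarily for every architecture, and that the model adds no projection layer on top of the transformer that changes the final embedding size. The helper name get_embedding_dim is made up for illustration.

```python
from transformers import AutoConfig


def get_embedding_dim(model_name_or_path: str) -> int:
    """Return the hidden size recorded in a model's config.

    Only the config file is read, not the model weights.
    """
    config = AutoConfig.from_pretrained(model_name_or_path)
    # Assumption: the config has a `hidden_size` attribute (true for
    # BERT-style configs; other architectures may name it differently).
    return config.hidden_size


if __name__ == "__main__":
    # Fetches config.json from the Hugging Face Hub on first use.
    print(get_embedding_dim("BAAI/bge-small-en-v1.5"))
```

    For bge-small-en-v1.5 this should agree with the 384 listed on the model page, and the returned value can be passed as d to faiss.IndexFlatL2(d). If in doubt for a particular model, encoding one sentence and checking the length of the resulting vector is a reasonable cross-check.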