python, c++, swig, faiss

faiss: How to retrieve vector by id from python


I have a faiss index and want to use some of its embeddings in my Python script. The embeddings should be selected by id. Since faiss is written in C++, its Python API is generated with SWIG.

I guess the function I need is reconstruct:

/** Reconstruct a stored vector (or an approximation if lossy coding)
     *
     * this function may not be defined for some indexes
     * @param key         id of the vector to reconstruct
     * @param recons      reconstucted vector (size d)
     */
    virtual void reconstruct(idx_t key, float* recons) const;

So I call this method from Python, for example:

vector = index.reconstruct(0)

But this results in the following error:

vector = index.reconstruct(0)
File "lib/python3.8/site-packages/faiss/__init__.py", line 406, in replacement_reconstruct
  self.reconstruct_c(key, swig_ptr(x))
File "lib/python3.8/site-packages/faiss/swigfaiss.py", line 1897, in reconstruct
  return _swigfaiss.IndexFlat_reconstruct(self, key, recons)

TypeError: in method 'IndexFlat_reconstruct', argument 2 of type 'faiss::Index::idx_t' python-BaseException

Does anyone have an idea what is wrong with my approach?


Solution

  • The only way I found is to read the values manually from the index's raw storage (index.xb).

    import faiss
    import numpy as np
    
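    d = 10
    # Three random 10-dimensional vectors; faiss expects float32.
    a = np.random.uniform(size=30).reshape(-1, d).astype(np.float32)
    index = faiss.index_factory(d, 'Flat', faiss.METRIC_L2)
    index.add(a)

    # xb is the flat index's raw storage: all vectors concatenated into one
    # float array, exposed through SWIG as a std::vector<float> wrapper.
    xb = index.xb
    print(xb.at(0) == a[0][0])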
    

    Output:

    True
    

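    You can then get any vector with a loop over its d components, which start at offset id * d in xb:

    required_vector_id = 1
    # Component i of vector k sits at position k * d + i in the flat storage.
    vector = np.array([xb.at(required_vector_id * index.d + i) for i in range(index.d)])

    print(np.all(vector == a[required_vector_id]))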
    

    Output:

    True