javascript, node.js, pinecone

Is there a method to fetch all the vectors of a namespace in Pinecone?


How can I fetch all the vectors of a namespace in Pinecone? The fetch method expects the ids of the vectors. Is there a method to get all the ids of the vectors?


Solution

  • I struggled with this a lot but finally found a solution. I was building my first hello world with Pinecone, had upserted some data, and wanted to get all vectors back from a namespace to confirm the upsert really worked.

    Run a query against Pinecone with topK set to at least the number of vectors in the namespace. A query returns the topK nearest matches, so once topK covers the whole namespace you get every vector back, no matter what the query is. Pinecone caps topK, so this only works for namespaces small enough to fit under that limit.

    For example, I have only 54 vectors in my Pinecone index, so with topK set to 100 it returns all of them, whatever text I pass in the query, even an empty string.

    Here is my code for reference. It is written as a JavaScript ES module, but the same approach should work in Python:

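    // Assumes the openai v3 and @pinecone-database/pinecone v0 clients, which match the calls below.
    import { Configuration, OpenAIApi } from "openai";
    import { PineconeClient } from "@pinecone-database/pinecone";

    // Client setup. The API key and environment variable names are assumptions; use your own.
    const openai = new OpenAIApi(new Configuration({ apiKey: process.env.OPENAI_API_KEY }));
    const pinecone = new PineconeClient();
    await pinecone.init({
        environment: process.env.PINECONE_ENVIRONMENT,
        apiKey: process.env.PINECONE_API_KEY,
    });

    const queryPineconeIndex = async (queryText, numberOfResults) => {
        // Embed the query text. The text itself does not matter once topK covers the namespace.
        const response = await openai.createEmbedding({
            model: "text-embedding-ada-002",
            input: queryText,
        });
        // Optional chaining on every step, so a missing field yields undefined instead of throwing.
        const vector = response?.data?.data?.[0]?.embedding;
        console.log("vector: ", vector);
        // [ 0.0023063174, -0.009358601, 0.01578391, ... , 0.01678391, ]

        const index = pinecone.Index(process.env.PINECONE_INDEX_NAME);
        const queryResponse = await index.query({
            queryRequest: {
                vector: vector,
                // id: "vec1",
                topK: numberOfResults,
                includeValues: true,
                includeMetadata: true,
                namespace: process.env.PINECONE_NAME_SPACE,
            },
        });

        // Print each match, then the total number of records returned.
        queryResponse.matches.forEach((eachMatch) => {
            console.log(`score ${eachMatch.score.toFixed(1)} => ${JSON.stringify(eachMatch.metadata)}\n\n`);
        });
        console.log(`${queryResponse.matches.length} records found`);
    };

    queryPineconeIndex("any text or empty string", 100);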
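
    Since the question asks for the ids specifically, here is a minimal sketch of pulling them out of the matches returned by the query above and splitting them into batches. The helper names collectIds and chunk, the sample data, and the batch size are all hypothetical and only there for illustration:

```javascript
// Hypothetical helpers: the names and the batch size are illustrative only.

// Pull the id out of every match returned by index.query().
const collectIds = (matches) => (matches ?? []).map((match) => match.id);

// Split a list into batches of at most `size` items.
const chunk = (items, size) => {
    const batches = [];
    for (let i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
};

// Stand-in for queryResponse.matches from a real query (made-up data).
const sampleMatches = [
    { id: "vec1", score: 0.9 },
    { id: "vec2", score: 0.8 },
    { id: "vec3", score: 0.7 },
];

const ids = collectIds(sampleMatches);
console.log(ids);
console.log(chunk(ids, 2));
```

    Each batch of ids could then be handed to the fetch method mentioned in the question, if you need to fetch the vectors again later by id.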
    

    If you don't know how many vectors an index holds, you can get the count from the index stats like this:

    const getIndexStats = async () => {
        // List every index in the project.
        const indexesList = await pinecone.listIndexes();
        console.log("indexesList: ", indexesList);

        // An empty filter returns stats for the whole index, including per-namespace vector counts.
        const index = pinecone.Index(process.env.PINECONE_INDEX_NAME);
        const indexStats = await index.describeIndexStats({
            describeIndexStatsRequest: {
                filter: {},
            },
        });
        console.log("indexStats: ", indexStats);
    };
    // getIndexStats()
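
    The two steps can be combined: read the vector count for the namespace from the stats and use it as topK. This is only a sketch. It assumes the stats object has the shape { namespaces: { name: { vectorCount } } }, so check that against your own indexStats log. The helper name topKForNamespace and its cap parameter are hypothetical, and the cap stands for whatever topK limit applies to your index:

```javascript
// Hypothetical helper. The stats shape used here is an assumption; compare it
// with what console.log("indexStats: ", indexStats) prints for your index.
const topKForNamespace = (indexStats, namespace, cap) => {
    // Fall back to 0 when the namespace is missing from the stats.
    const count = indexStats?.namespaces?.[namespace]?.vectorCount ?? 0;
    // Never ask for more than the topK limit passed in as `cap`.
    return Math.min(count, cap);
};

// Stand-in for a real describeIndexStats response (made-up data).
const sampleStats = {
    namespaces: { "my-namespace": { vectorCount: 54 } },
    totalVectorCount: 54,
};

console.log(topKForNamespace(sampleStats, "my-namespace", 1000));
console.log(topKForNamespace(sampleStats, "missing-namespace", 1000));
```

    If the result is 0 the namespace is empty, so skip the query. Otherwise pass the result as numberOfResults to queryPineconeIndex. If the count is larger than the cap, a single query cannot return every vector.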
    

    The complete code is in my GitHub repo: https://github.com/mInzamamMalik/vector-database-hello-world