Tags: python-3.x, vector-database, milvus

How to get the data stored in a Milvus collection?


I have a Milvus instance hosted on an external server. I don't know what data is stored in the collection and want to retrieve it. Below is the code I am using to get the data from the collection, but it is not working:

from pymilvus import MilvusClient

# 1. Set up a milvus client
client = MilvusClient(
    uri="http://localhost:19530",
    token="root:Milvus"
)
res = client.query(
    collection_name="test_collection",
    filter="_id >= 0",
    limit=5,
) 

print(res)

Error:

MilvusException: <MilvusException: (code=0, message=cannot parse expression: _id >= 0, error: there is no dynamic json field in schema, need to specified field name)>

Then I tried to describe the collection:

collection_info = client.describe_collection("test_collection")  

print(collection_info)

and the collection info is:

{'collection_name': 'test_collection',
         'auto_id': False,
         'num_shards': 1,
         'description': '',
         'fields': [{'field_id': 100,
           'name': 'id',
           'description': '',
           'type': <DataType.INT64: 5>,
           'params': {},
           'is_primary': True},
          {'field_id': 101,
           'name': 'embedding',
           'description': '',
           'type': <DataType.FLOAT_VECTOR: 101>,
           'params': {'dim': 384}}],
         'aliases': [],
         'collection_id': 450513023365271287,
         'consistency_level': 0,
         'properties': {},
         'num_partitions': 1,
         'enable_dynamic_field': False}

Can anyone help me resolve this?


Solution

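  • The query fails because the filter refers to _id, but the schema has no field with that name. The primary key is called id, and because enable_dynamic_field is False, Milvus cannot treat an unknown name as a dynamic field. Use the real field name in the filter, for example "id >= 0".

    Calling client.describe_collection("test_collection") only returns metadata: the schema, number of shards, consistency level, number of partitions and similar information about the collection. It does not return the stored entities. The snippets below assume that collection is a pymilvus Collection object (the ORM-style API), not the MilvusClient used in the question. To get the number of entities in the collection: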

    collection.num_entities
    

    To get the number of entities in a specific partition (if you have created any):

    collection.partition(partition_name=partition_name).num_entities
    

    To retrieve specific IDs from the collection:

    expr = "id in " + str(ids)  # ids is a list of primary keys, e.g. [1, 2, 3]
    collection.query(expr=expr, output_fields=["id", "embedding"])
    

    To run an ANN search on the collection:

    collection.search(
        data=search_vector,
        anns_field="embedding",
        param=query_param,
        output_fields=["id"],
        limit=50,
        consistency_level="Bounded",
    )
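
For the MilvusClient API used in the question, the same idea can be sketched as below. This is only a minimal sketch under some assumptions: the helper names primary_field_name, fetch_sample and ids_filter are hypothetical (they are not part of pymilvus), the primary key is assumed to be an integer with non-negative values, and the server is assumed to allow vector fields in the output of a query. The client is passed in as an argument, so the helpers only rely on describe_collection and query as they appear in the question.

```python
def primary_field_name(collection_info):
    """Return the primary-key field name from describe_collection() output."""
    for field in collection_info["fields"]:
        if field.get("is_primary"):
            return field["name"]
    raise ValueError("no primary field found in the schema")


def ids_filter(field_name, ids):
    """Build a filter such as 'id in [1, 2, 3]' for integer primary keys."""
    return f"{field_name} in {list(ids)}"


def fetch_sample(client, collection_name, limit=5):
    """Query a few entities, using field names read from the schema."""
    info = client.describe_collection(collection_name)
    pk = primary_field_name(info)
    # Request every field listed in the schema (here: id and embedding).
    field_names = [field["name"] for field in info["fields"]]
    return client.query(
        collection_name=collection_name,
        filter=f"{pk} >= 0",  # assumption: integer, non-negative primary keys
        output_fields=field_names,
        limit=limit,
    )


# Usage against a live server (connection values taken from the question):
#   from pymilvus import MilvusClient
#   client = MilvusClient(uri="http://localhost:19530", token="root:Milvus")
#   print(fetch_sample(client, "test_collection"))
```

Reading the primary-key name from the schema avoids guessing names such as _id, which is what caused the parse error in the question.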