Tags: python, vector-database, qdrant, qdrantclient

Filter by list's element in payload


In Qdrant, I have a payload field that contains a list. How can I filter search results down to the points whose list contains a specific element?

For example, given this set of points:

[
  { "id": 1, "Fruit": ["apple", "banana", "orange"] },
  { "id": 2, "Fruit": ["pear", "orange" ] },
  { "id": 3, "test": "empty" },
  { "id": 4,  "Fruit": ["apple", "orange", "pear"] }
]

I want to keep only the results whose list contains both "apple" AND "orange", i.e. the points with IDs 1 and 4.

How can I build such a filter?

I already had a look at the documentation at this link, without success.

The docs do not cover my case. What I tried is:

Filter(should=[FieldCondition(key='Fruit', match=MatchValue(value='Banana'), range=None, geo_bounding_box=None, geo_radius=None, values_count=None), FieldCondition(key='Fruit', match=MatchValue(value='Apple'), range=None, geo_bounding_box=None, geo_radius=None, values_count=None)], must=None, must_not=None)

But the point that I know is in the DB does not show up.

Thank you in advance.


Solution

  • I created some dashboard console queries to set up your example.

    Example setup

    Create a fruit collection.

    PUT collections/fruit
    {
      "vectors": {
        "size": 3,
        "distance": "Cosine"
      }
    }
    

    Create points in the fruit collection.

    PUT collections/fruit/points
    {
      "batch": {
        "ids": [
          1,
          2,
          3,
          4
        ],
        "vectors": [
          [
            0, 1, 2
          ],
          [3, 2, -1],
          [-1, -1, -1],
          [0, 2, 3]
        ],
        "payloads": [
          {
            "Fruit": [
              "apple",
              "banana",
              "orange"
            ]
          },
          {
            "Fruit": [
              "pear",
              "orange"
            ]
          },
          {
            "test": "empty"
          },
          {
            "Fruit": [
              "apple",
              "orange",
              "pear"
            ]
          }
        ]
      }
    }
    

    Search for points whose payload includes both "apple" and "orange" in the "Fruit" field.

    POST collections/fruit/points/search
    {
      "limit": 10,
      "vector": [
        0,
        1,
        2
      ],
      "filter": {
        "must": [
          {
            "key": "Fruit",
            "match": {
              "value": "apple"
            }
          },
          {
            "key": "Fruit",
            "match": {
              "value": "orange"
            }
          }
        ]
      },
      "with_payload": true
    }
    

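    A list of conditions under must behaves like a logical AND: a point is returned only if it satisfies every condition. For a list-valued field such as "Fruit", a match condition is satisfied when at least one element of the list matches.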

    This translates to Python as:

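    from qdrant_client.models import FieldCondition, Filter, MatchValue

    # Every condition under `must` has to hold, so both values must be in the list.
    Filter(
        must=[
            FieldCondition(key='Fruit', match=MatchValue(value='apple')),
            FieldCondition(key='Fruit', match=MatchValue(value='orange')),
        ]
    )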
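
To see why the attempt in the question comes back empty, it can help to model the filter semantics in plain Python. The sketch below is not Qdrant's implementation; `matches_value` and `filter_ids` are hypothetical helpers written for this illustration. It assumes that `must` means "all conditions hold", that `should` means "at least one condition holds", and that exact value matching is case-sensitive.

```python
# Pure-Python model of must/should semantics (illustrative only, not Qdrant code).
points = [
    {"id": 1, "Fruit": ["apple", "banana", "orange"]},
    {"id": 2, "Fruit": ["pear", "orange"]},
    {"id": 3, "test": "empty"},
    {"id": 4, "Fruit": ["apple", "orange", "pear"]},
]


def matches_value(point, key, value):
    """True if the field equals `value` or, for a list, contains it (case-sensitive)."""
    field = point.get(key)
    if isinstance(field, list):
        return value in field
    return field == value


def filter_ids(points, key, must=(), should=()):
    """Keep points matching every `must` value and, if given, at least one `should` value."""
    ids = []
    for p in points:
        all_must = all(matches_value(p, key, v) for v in must)
        any_should = not should or any(matches_value(p, key, v) for v in should)
        if all_must and any_should:
            ids.append(p["id"])
    return ids


# AND: both "apple" and "orange" must be present.
must_ids = filter_ids(points, "Fruit", must=["apple", "orange"])
# OR: either value is enough, so point 2 is included as well.
should_ids = filter_ids(points, "Fruit", should=["apple", "orange"])
# Capitalized values, as in the question's attempt, match nothing in this data.
capitalized_ids = filter_ids(points, "Fruit", should=["Banana", "Apple"])

print(must_ids, should_ids, capitalized_ids)
```

Under these assumptions, two things differ between the question's filter and the one above: `should` is an OR rather than an AND, and 'Banana' / 'Apple' do not equal the lowercase values stored in the payload.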