
PyMongo: Filter data from MongoDB by values from numpy.ndarray


I'm using pymongo to get data from the collection "example" in MongoDB.

This collection contains the following fields:

  • _id
  • nums (a string such as "1000", "13833", or "1205")
  • content

Example of document:

{
  "_id": {
    "$oid": "12345"
  },
  "nums": "1000",
  "content": [
    "systematic",
    "review",
    "meta-analysis"
  ]
}

I also have a numpy.ndarray named "areas": ['13833' '5773' '12882' '18955' '12561' '1307' '5024' '27076']

I need to filter the documents in the collection "example" by the field "nums", keeping only those whose "nums" value appears in the numpy.ndarray "areas", and then save the result to a pd.DataFrame.

I.e., I want to get all documents from the collection "example" where the field "nums" is '13833' or '5773' or '12882', and so on for every value in "areas".

I tried something like:

df = pd.DataFrame(list(collection.find({"nums":{"$in":["areas"]}})))

But it doesn't work: I get an empty DataFrame.


Solution

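  • The filter in the question passes ["areas"], a list holding the literal string "areas", so it only matches documents whose "nums" equals "areas" and returns nothing. Pass the array itself instead. You can't use the numpy array directly in a pymongo filter, so convert it to a list with its tolist() method: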

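    import numpy as np
    import pandas as pd

    areas = np.array(['13833', '5773', '12882', '18955', '12561', '1307', '5024', '27076'])
    # tolist() turns the array into a plain Python list that pymongo can encode
    docs = collection.find({'nums': {'$in': areas.tolist()}})
    df = pd.DataFrame(docs)

    print(df)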

    Output:

                            _id   nums                              content
    0  64520fe0bc45c14712c6f6f5  13833  [systematic, review, meta-analysis]
    1  64520fe0bc45c14712c6f6f6  12882  [systematic, review, meta-analysis]
    2  64520fe0bc45c14712c6f6f7  12561  [systematic, review, meta-analysis]
    3  64520fe0bc45c14712c6f6f8   5024  [systematic, review, meta-analysis]
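
To see why the original attempt came back empty, it can help to compare the two filters without a database. The sketch below is only an illustration: build_in_filter and sample_docs are hypothetical names that do not come from the question, and the list comprehension is a plain-Python stand-in for what "$in" does, not pymongo itself.

```python
import numpy as np


def build_in_filter(field, values):
    """Return a {field: {"$in": [...]}} filter holding plain Python values.

    Hypothetical helper, not part of pymongo.
    """
    # np.asarray(...).tolist() converts NumPy scalars (np.str_, np.int64, ...)
    # into native str/int objects.
    return {field: {"$in": np.asarray(values).tolist()}}


areas = np.array(['13833', '5773', '12882', '18955', '12561', '1307', '5024', '27076'])

# The attempt from the question: a list holding the literal string "areas".
wrong_filter = {"nums": {"$in": ["areas"]}}

# The intended filter: the contents of the `areas` variable.
right_filter = build_in_filter("nums", areas)

# Made-up documents, used only to imitate the "$in" match in plain Python.
sample_docs = [{"nums": "1000"}, {"nums": "13833"}, {"nums": "5024"}]

matched_wrong = [d for d in sample_docs if d["nums"] in wrong_filter["nums"]["$in"]]
matched_right = [d for d in sample_docs if d["nums"] in right_filter["nums"]["$in"]]

print(matched_wrong)  # nothing has nums == "areas"
print(matched_right)  # the documents whose nums is listed in `areas`
```

With a real collection, the dictionary returned by build_in_filter would be passed to collection.find(...) in the same way as in the answer above. Note that the values in "areas" are strings, which matches how "nums" is stored in the example document; if the array held integers instead, they would presumably need converting to strings first for the comparison to succeed.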