
MongoDB import to different collections set by a field


I have a file called data.json, extracted with mongoexport, with the following structure:

{"id":"63","name":"rcontent","table":"modules"}
{"id":"81","name":"choicegroup","table":"modules"}
{"id":"681","course":"1242","name":"Requeriments del curs","timemodified":"1388667164","table":"page"}
{"id":"682","course":"1242","name":"Guia d'estudi","timemodified":"1374183513","table":"page"}

What I need is to import this file into my local MongoDB, with a command like mongoimport or with pymongo, storing each line in the collection named by its table value.

For example, the collection modules would contain the documents

{"id":"63","name":"rcontent"} and {"id":"81","name":"choicegroup"}

I've tried with mongoimport but I haven't seen any option which allows that. Does anyone know if there is a command or a method to do that?

Thank you


Solution

  • The basic steps for this in Python are:

    1. parse the data.json file to create python objects

    2. extract the table key value pair from each document object

    3. insert the remaining doc into a pymongo collection

    Thankfully, pymongo makes this pretty straightforward, as below:

    import json
    
    from pymongo import MongoClient
    
    client = MongoClient()  # this will use default port and host
    db = client['test-db']  # select the db to use
    with open("data.json", "r") as json_f:
        for str_doc in json_f:  # iterate the file line by line
            doc = json.loads(str_doc)
            table = doc.pop("table")  # remove the 'table' key and keep its value
            db[table].insert_one(doc)  # Collection.insert() is deprecated; use insert_one()
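    The routing logic (parse each line, pop the table key, group the rest under
    that name) can also be checked without a running MongoDB by collecting the
    documents in a plain dict first; a sketch using the sample lines from the
    question:

    import json
    from collections import defaultdict

    lines = [
        '{"id":"63","name":"rcontent","table":"modules"}',
        '{"id":"81","name":"choicegroup","table":"modules"}',
        '{"id":"681","course":"1242","name":"Requeriments del curs","timemodified":"1388667164","table":"page"}',
    ]

    by_table = defaultdict(list)
    for line in lines:
        doc = json.loads(line)
        table = doc.pop("table")  # the target collection name, removed from the doc
        by_table[table].append(doc)

    print(sorted(by_table))        # ['modules', 'page']
    print(by_table["modules"][0])  # {'id': '63', 'name': 'rcontent'}

    Swapping by_table[table].append(doc) for db[table].insert_one(doc) gives the
    database version above.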