
Insert big json file into a mongo collection using mongosh


I need to insert a 181 KB JSON file into a MongoDB collection. The restriction is that it has to be done using mongosh.

json_file_path=/some/path/file.json
json_file_string=$(cat "$json_file_path")

/mongo/path/monhosh $databaseName \
    --tlsCertificateKeyFile someAuthDetails \
    --tlsCAFile someMoreAuthDetails \
    --quiet \
    --eval "db.test_collection.insertOne(${json_file_string})" 

This works for smaller files, but with the 181 KB file, which is not that big, it fails as follows:

-ksh: /mongo/path/monhosh: cannot execute [Arguments list too long]

I've increased ulimit -s from 8000 to 65000, which seems to be the maximum, with no luck. Any ideas?


Solution

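  • You have correctly identified the issue: the data is larger than what the operating system allows to be passed to a program as command-line arguments. Your command loads the whole file into a shell variable and expands it inside the --eval argument, so the entire document ends up on the mongosh command line. Raising ulimit -s does not necessarily help, because some systems also cap the size of each individual argument; Linux, for example, typically limits a single argument to 128 KiB regardless of the stack limit.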
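
To see how the file compares with the limit on your machine, something like the following sketch may help. It assumes getconf is available and reuses json_file_path from the question; the default path is only a placeholder. Note that ARG_MAX is the combined limit for all arguments plus the environment, so a per-argument cap can be hit well before it.

```shell
# json_file_path comes from the question; the fallback is a placeholder path.
json_file_path=${json_file_path:-/some/path/file.json}

# Total number of bytes the system accepts for arguments + environment.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"

# Size of the JSON file, printed only if the file exists.
if [ -f "$json_file_path" ]; then
    file_size=$(wc -c < "$json_file_path")
    echo "file size: $file_size bytes"
fi
```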

    Since you are restricted to shell commands and mongosh, build a .js script instead and pass it to mongosh with the --file option. mongosh reads the script from disk, so nothing large is passed on the command line and the file can be as big as it needs to be.

    Example contents of my_file.json with cat my_file.json:

    {
        "_id": 123,
        "first": "Name",
        "last": "something"
    }
    

    1. Select the DB you want to use:

    echo 'db = db.getSiblingDB("stacko");' > temp.js
    
    • note the use of > so that it creates a new temp.js file
    • replace 'stacko' with whichever DB you want to use.
    • since you have the DB name in a variable, wrap the echo argument in double quotes so the variable expands, and use single quotes inside for the DB name:
    echo "db = db.getSiblingDB('$databaseName');" > temp.js
    

    2. Add an insertOne statement:

    echo 'db.test_collection.insertOne(' >> temp.js
    
    • note the use of double >> so that it appends to the temp.js file
    • replace "test_collection" with your collection name

    3. Put the contents of your json file into temp.js:

    cat my_file.json >> temp.js
    

    4. Add the closing parenthesis and semicolon after that:

    echo ');' >> temp.js
    

    5. [Optional] Check the file with cat temp.js:

    db = db.getSiblingDB('stacko');
    db.test_collection.insertOne(
    {
        "_id": 123,
        "first": "Name",
        "last": "something"
    }
    );
    

    6. Execute temp.js with the --file option of mongosh:

    mongosh <connection options as above> --file=temp.js
    

    In your case, that would be:

    /mongo/path/monhosh $databaseName \
     --tlsCertificateKeyFile someAuthDetails \
     --tlsCAFile someMoreAuthDetails \
     --file temp.js
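
Steps 1 to 4 can also be collected into one shell function. This is only a sketch: build_insert_script is a hypothetical helper name, not something mongosh provides, and it assumes the database and collection names contain no quote characters.

```shell
# build_insert_script is a hypothetical helper, not part of mongosh.
# Usage: build_insert_script <database> <collection> <json file> <output .js file>
build_insert_script() {
    db_name=$1
    collection=$2
    json_file=$3
    out_file=$4

    {
        # Step 1: select the database
        printf "db = db.getSiblingDB('%s');\n" "$db_name"
        # Step 2: open the insertOne call
        printf 'db.%s.insertOne(\n' "$collection"
        # Step 3: copy the JSON document from disk, never through an argument
        cat "$json_file"
        # Step 4: close the call
        printf ');\n'
    } > "$out_file"
}
```

After calling build_insert_script "$databaseName" test_collection "$json_file_path" temp.js, run mongosh with --file temp.js as in step 6.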
    

    If you want to insert many existing documents from .json files like this, repeat steps 2, 3 and 4 for each file before finally doing step 6. Or create separate .js scripts and execute each one with mongosh --file some_file.js.
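
For the many-files case, a loop can repeat steps 2 to 4 for each file. The sketch below assumes every .json file holds exactly one document; build_multi_insert_script is a hypothetical helper name chosen for illustration.

```shell
# build_multi_insert_script is a hypothetical helper, not part of mongosh.
# Usage: build_multi_insert_script <database> <collection> <output .js file> <json file>...
build_multi_insert_script() {
    db_name=$1
    collection=$2
    out_file=$3
    shift 3

    # Step 1, once: select the database (">" creates or truncates the file)
    printf "db = db.getSiblingDB('%s');\n" "$db_name" > "$out_file"

    # Steps 2 to 4, once per JSON file (">>" appends)
    for json_file in "$@"; do
        printf 'db.%s.insertOne(\n' "$collection" >> "$out_file"
        cat "$json_file" >> "$out_file"
        printf ');\n' >> "$out_file"
    done
}
```

The resulting script is then executed once with mongosh --file, exactly as in step 6.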