mongodb, azure, azure-machine-learning-service

Best way to import MongoDB data in Azure Machine Learning


I have a MongoDB database (the Bitnami one) hosted on Azure. I want to import its data for use in my Azure Machine Learning experiment.

Currently, I am exporting the data to .csv using mongoexport and then copying and pasting it into the "Enter Data Manually" module. This is fine for small amounts of data, but I would prefer a more robust technique for larger databases.

I also thought about using the "Import Data" module with a web URL via HTTP, pointed at the HTTP port (28017) of my MongoDB instance, but I read that this is not the recommended use of MongoDB's HTTP interface.

Finally, I tried Cosmos DB in place of my Bitnami MongoDB and it worked fine, but it costs an arm and a leg when used with Sitecore (around 100€ per day) and we can't afford it, so I switched back to my Mongo.

So is there a better way to export data from Mongo to Azure ML?


Solution

  • One way is to use a Python script module (Execute Python Script) in Azure ML, something like this:

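    import pandas as pd
    import pymongo

    def azureml_main(dataframe1=None, dataframe2=None):
        # Connect to the MongoDB instance; replace 'host_IP' with your host.
        client = pymongo.MongoClient(host='host_IP')
        # list_database_names() replaces the deprecated database_names().
        df = pd.DataFrame(client.list_database_names())
        # The module expects a sequence of DataFrames as its return value.
        return df,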
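
    The snippet above only lists database names. To pull actual documents into the experiment, the same idea can be extended roughly as follows. This is a minimal sketch, not a tested Azure ML recipe: it assumes pymongo is available to the script module, the database and collection names (my_database, my_collection) are placeholders, and flatten_document is a hypothetical helper written for this example, not part of pymongo or pandas.

```python
def flatten_document(doc, prefix=""):
    """Flatten a nested MongoDB document into a single-level dict."""
    flat = {}
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            # Nested sub-documents become dotted column names, e.g. "address.city".
            flat.update(flatten_document(value, prefix=name + "."))
        elif key == "_id" and not prefix:
            # The top-level _id is usually an ObjectId; store it as text.
            flat[name] = str(value)
        else:
            flat[name] = value
    return flat


def azureml_main(dataframe1=None, dataframe2=None):
    # Imported here so flatten_document can be used without the driver installed.
    import pandas as pd
    import pymongo

    # Assumptions: replace host, database and collection names with your own.
    client = pymongo.MongoClient(host="host_IP", port=27017)
    collection = client["my_database"]["my_collection"]

    # An empty filter matches every document; limit() caps the volume while testing.
    rows = [flatten_document(doc) for doc in collection.find({}).limit(1000)]
    return pd.DataFrame(rows),
```

    Flattening first keeps each document as one row with plain columns, which tends to be easier for downstream modules to consume than nested dictionaries. Remove the limit() call once the output looks right.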