Search code examples
pythonmongodbtype-conversionpymongostring-to-datetime

PyMongo - Python List to MongoDB datatype conversion


I am getting json response in string format from requests as below:

results = requests.request("POST", url, data=json.dumps(payload), headers=header).json()['product']

Sample output: print(results) - object type = <class 'list'>

[
{
 'id': '123456',
 'product': 'XYZ',
 'exp_date': '03/01/2020',
 'amount': '30.5',
 'qty': '1'
},
{
 'id': '789012',
 'product': 'ABC',
 'exp_date': '04/15/2020',
 'amount': '22.57',
 'qty': '3'
},
{
 'id': '56789',
 'product': 'AAA',
 'exp_date': '03/29/2020',
 'amount': '',
 'qty': ' '
}
]

Need to convert all of these fields to specific datatype first and then insert into MongoDB as documents.

  1. exp_date to date/time
  2. amount to float()
  3. qty to int()

What is the efficient way to do the datatype conversion ?

Was thinking if it's possible something like below, also need to know if there's any empty, null or blank string values, then how to replace it with some default value while during datatype conversion ?

new_result = []
for i in enumerate(results):
    i[exp_date] = datetime.strptime(i[exp_date],'%m/%d%Y').replace(hour=0, minute=0, second=0, microsecond=0)                         #check for empty/null/blank values and replace with default date
    new_result.append(i[exp_date])

for i in enumerate(results):
    i[amount] = float(i[amount])   #check for empty/null/blank values and replace with 0.00
    new_result.append(i[amount])

for i in enumerate(results):
    i[qty] = int(i[qty])           #check for empty/null/blank values and replace with 0
    new_result.append(i[qty])

db.collection.insert_many(new_result)

new list output should look like: print(new_result)

[
{
 "id": "123456",
 "product": "XYZ",
 "exp_date": 2020-03-01 00:00:00,
 "amount": 30.5,
 "qty": 1
},
{
 "id": "789012",
 "product": "ABC",
 "exp_date": 2020-04-15 00:00:00,
 "amount": 22.57,
 "qty": 3
},
{
 "id": "56789",
 "product": "AAA",
 "exp_date": 2020-03-29 00:00:00,
 "amount": 0.0,
 "qty": 0
}
]

Solution

  • You could do something like this:

    import datetime
    
    input_lst = [
    {
     "id": "123456",
     "product": "XYZ",
     "exp_date": "03/01/2020",
     "amount": "30.5",
     "qty": "1"
    },
    {
     "id": "789012",
     "product": "ABC",
     "exp_date": "04/15/2020",
     "amount": "22.57",
     "qty": "3"
    },
    {
     "id": "56789",
     "product": "AAA",
     "exp_date": "03/29/2020",
     "amount": "",
     "qty": " "
    }
    ]
    
    output_lst = []
    
    
    for dct in input_lst:
        tmp_dct = dct.copy()
        # amount - float, qty - int4
        try:
            tmp_dct['amount'] = float(dct['amount'])
        except:
            pass
    
        try:
            tmp_dct['qty'] = int(dct['qty'])
        except:
            pass
    
        try:
            tmp_dct['exp_date'] = datetime.datetime.strptime(tmp_dct['exp_date'],'%m/%d/%Y').replace(hour=0, minute=0, second=0, microsecond=0)                         #check for empty/null/blank values and replace with default date
            output_lst.append(tmp_dct)
        except:
            pass
    
    
    print(output_lst)
    

    This is more efficient because you are only looping once.