I'm trying to insert a lot of data into MongoDB through MongoEngine. I start with a DataFrame that I serialize to JSON like this:
result = df.to_json(orient="index")
parsed = json.loads(result)
json_data = json.dumps(parsed, indent=4)
I then make it a little prettier using this:
json_object = json.loads(json_data)
json_formatted_str = json.dumps(json_object, indent=2)
print(json_formatted_str)
This is the result:
{
  "0": {
    "Address": " Bursvej 30 ",
    "Zip/city": "4930 Maribo",
    "Price": " 148.000kr. ",
    "Date": 1673371545635
  },
  "1": {
    "Address": " Garrdesmuttevej 20 ",
    "Zip/city": "9550 Mariager",
    "Price": " 148.000kr. ",
    "Date": 1673371545635
  },
  "2": {
    "Address": " Norrevej 21 ",
    "Zip/city": "6990 Ulfborg",
    "Price": " 150.000kr. ",
    "Date": 1673371545635
  },
But when I try to send it to Mongo:
MD = [MarketData(**data) for data in json_formatted_str]
MarketData.objects.insert(MD, load_bulk=False)
I get this error: TypeError: __main__.MarketData() argument after ** must be a mapping, not str
Is there any other way to do this? I have been trying with PyMongo for several hours but gave up. Should I go back to that? Would prefer MongoEngine to be honest.
Thanks in advance
EDIT:
Dataframe
   Address             Zip/city         Price        Date
0  Bursøvej 30, Bursø  4930 Maribo      148.000 kr.  2023-01-10 17:25:45.635483
1  Gærdesmuttevej 20   9550 Mariager    148.000 kr.  2023-01-10 17:25:45.635483
2  Nørrevej 21         6990 Ulfborg     150.000 kr.  2023-01-10 17:25:45.635483
3  Egernvænget 54      4733 Tappernøje  195.000 kr.  2023-01-10 17:25:45.635483
4  Egernvænget 56      4733 Tappernøje  195.000 kr.  2023-01-10 17:25:45.635483
And my schema
class MarketData(Document):
    #answers = DictField()
    Address = DynamicField(required=False)
    city = DynamicField(required=False)
    Price = DynamicField(required=False)
    date = DynamicField(required=False)

    def json(self):
        market_dict = {
            "username": self.username,
            "city": self.city,
            "Price": self.price
        }
        return json.dumps(market_dict)
The error comes from iterating over json_formatted_str, which is a string: each data in that loop is a single character, not a dict. You probably want to bypass converting this data to and from a string altogether and go straight to a list of dictionaries. You could also iterate over the rows of your DataFrame directly, but let's start with:
MD = [
MarketData(**data)
for data
in df.to_dict(orient="records")
]
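As a minimal sketch of the difference between iterating over a JSON string and over a list of records, the snippet below uses only the standard library, with two hand-written records shaped like the output of df.to_dict(orient="records"). It also renames the keys "Zip/city" and "Date" to "city" and "date", on the assumption that a plain MongoEngine Document will reject keyword arguments that are not declared fields on the schema. FIELD_NAMES and to_schema_keys are hypothetical names made up for this illustration; they are not part of MongoEngine or pandas.

```python
import json

# Two records shaped like df.to_dict(orient="records") would return them.
records = [
    {"Address": "Bursvej 30", "Zip/city": "4930 Maribo",
     "Price": "148.000kr.", "Date": 1673371545635},
    {"Address": "Norrevej 21", "Zip/city": "6990 Ulfborg",
     "Price": "150.000kr.", "Date": 1673371545635},
]

# Iterating over a JSON *string* yields one-character strings, not dicts.
# Unpacking such an item with ** is what raised the TypeError.
json_formatted_str = json.dumps(records, indent=2)
chars = list(json_formatted_str)[:3]

# Hypothetical mapping from DataFrame column names to schema field names
# (assumption: the schema only accepts Address, city, Price and date).
FIELD_NAMES = {"Zip/city": "city", "Date": "date"}


def to_schema_keys(record):
    """Return a copy of record with keys renamed to match the schema."""
    return {FIELD_NAMES.get(key, key): value for key, value in record.items()}


cleaned = [to_schema_keys(record) for record in records]
# Each item in cleaned is a mapping, so it can be unpacked:
# MD = [MarketData(**data) for data in cleaned]
```

With a real DataFrame, renaming the columns before calling to_dict (for example with df.rename(columns=...)) should achieve the same thing as the helper here.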