Search code examples
mongodbpymongo

MongoDB: Remove "," from string, with conditional field set, in mixed datatype field


I am fairly new with MongoDB, so apologizes for what might be a basic question.

I have a bunch of documents in a collection (call it CollA), and taking a subset of it and the particular field in question would look like:

docs = [
{Value: "1,000,000+"},
{Value: "40,000"},
{Value: "2,000,000+"},
{Value: 500000},
{Value: 400000}]

Note the mixed float and string datatypes within the field "Value".

I am trying to find a way to:

  1. Convert "Value" to a float (so I assume remove the "," and do a $ToDouble on it)
  2. In the cases where the string contains a "+", I would like to set a new field "Truncated" to "true".

The desired output would be

docs = [
{Value: 1,000,000,
Truncated: true},
{Value: 40,000},
{Value: 2,000,000,
Truncated: true}},
{Value: 500000},
{Value: 400000}]

So now they would all be "doubles", and the new field would be there in the documents where the "+" existed.

Thanks so much!


Solution

  • Query

    • if not string (number) convert to double
    • else(if string)
    • split on + (first $let)
    • save the first part, removing , and to double (second $let)
    • if 2 parts add the first part and Truncated
    • else add only the first part

    Playmongo

    aggregate(
    [{"$replaceRoot": 
       {"newRoot": 
         {"$cond": 
           [{"$not": [{"$eq": [{"$type": "$Value"}, "string"]}]},
             {"Value": {"$toDouble": "$Value"}},
             {"$let": 
               {"vars": {"parts": {"$split": ["$Value", "+"]}},
                "in": 
                 {"$let": 
                   {"vars": 
                     {"parts1": 
                       {"$toDouble": 
                         {"$replaceAll": 
                           {"input": {"$arrayElemAt": ["$$parts", 0]},
                            "find": ",",
                            "replacement": ""}}}},
                    "in": 
                     {"$cond": 
                       [{"$arrayElemAt": ["$$parts", 1]},
                         {"Value": "$$parts1", "Truncated": true},
                         {"Value": "$$parts1"}]}}}}}]}}}])