Search code examples
javascriptmongodbmongoosefind

Perform a query removing special characters to detect duplicate documents


I have a document collection that has a Business Model, it has a tin field, this tin is a brazilian type of document, and i have in the DB documents with different formats that include special characters

ex: {tin : '34.545.646/4245-24'} or {tin : '34,545,646/424524'} or {tin : '34545646/4245-24'}

i want to perform a query like that Business.findOne({tin: tin})

i want to remove all special characters to tin input. and no matter what format are in the database i can query with only the numbers.

how can i use query to see if a document exists no matter the format in db, only number matters\

how to remove special characters in query to look for a document

i want to know if a document exists no matter the format, only numbers


Solution

  • This query is going to be slow, but here's one way to do it.

    db.collection.find({
      "$expr": {
        "$eq": [
          // "filtered" number string goes here
          "34545646424524",
          { // strip everything except digits from tin
            "$reduce": {
              "input": {"$range": [0, {"$strLenCP": "$tin"}]},
              "initialValue": "",
              "in": {
                "$let": {
                  "vars": {
                    "char": {"$substrCP": ["$tin", "$$this", 1]}
                  },
                  "in": {
                    "$cond": [
                      {
                        "$regexMatch": {
                          "input": "$$char",
                          "regex": "\\d"
                        }
                      },
                      {"$concat": ["$$value", "$$char"]},
                      "$$value"
                    ]
                  }
                }
              }
            }
          }
        ]
      }
    })
    

    Try it on mongoplayground.net.