Search code examples
javascriptmysqlsqlnode.jsnode-mysql

How do I map tags to tag synonyms?


I am building a system where users associate tags with posts, not unlike SO. I am having a spot of bother implementing tag synonyms.

Here I have a table called Tags:

| TagName    |
|------------|
| Python     |
| JavaScript |
| Node       |

And I have another called TagSynonyms:

| SynonymId | SourceTagName | TargetTagName |
|-----------|---------------|---------------|
| 1         | Py            | Python        |
| 2         | Python2       | Python        |

The server is implemented using Node and the user enters some tags as a comma-delimited string:

var input = 'Py,Flask'
var tags = request.tags.split(',');

In this case, the user has entered the tag Py which, according to the TagSynonyms table, should be mapped to the tag Python. The second tag, Flask has no synonym and should remain the same.

I managed to implement this functionality using imperative code:

tags.forEach(function (tag) {
  connection.query('SELECT TargetTagName FROM TagSynonyms WHERE SourceTagName = ?', tag, function(err, rows) {
    if (rows.length !== 0) {
      console.log(rows[0].TargetTagName);
    } else {
      console.log(tag);
    }
  });
});

Here, the user input

['Py','Flask'] 

results in the following output

Python
Flask

What I want to do is, defer this logic to the database engine, as I think using a loop is a code smell. I also think the database engine will be more performant. What is an appropriate query to do this?


Solution

  • You need a UNION and a join:

      select TagName
        from Tags
        where TagName in (?,?,?,...)
    union
      select TagName
        from Tags
        join TagSynonyms
          on Tags.TagName = TagSynonyms.TargetTagName
        where TagSynonyms.SourceTagName in (?,?,?,...)
    

    Note that union can be slow since it will try to remove duplicates. If that's the case for you, use union all and remove duplicates in the application code.

    (?,?,?,...) stands for a list of input values; check your DB driver documentation for the exact syntax that you need to use to avoid SQL injection.

    Update: Here is what the implementation will look like in Node:

    var query = 
    'SELECT TagName \
    FROM Tags \
    WHERE TagName IN (?) \
    UNION \
    SELECT TagName \
    FROM Tags \
    JOIN TagSynonyms \
      ON Tags.TagName = TagSynonyms.TargetTagName \
    WHERE TagSynonyms.SourceTagName IN (?)'
    
    connection.query(query, [tags, tags], function(err, rows) {
      tags = rows.map(function(row) { 
        return row.TagName
      });
    });