Search code examples
sqlt-sqldataexplorer

Find tags containing characters that aren't [a-z] or [0-9]


I want to create a query for the Stack Exchange Data Explorer which returns any tag containing special characters. (Specifically, any character that is not a-z, 0-9, -, or +)

The search should return tags like , , , and .

I've tried a few different solutions, but each seems to have it's own set of issues.


SELECT * FROM tags WHERE tagname NOT REGEXP '^[a-zA-Z0-9]*$'

Source: Oracle SQL - REGEXP_LIKE contains characters other than a-z or A-Z

Error: REGEXP_LIKE is not a recognized built-in function name.


SELECT * FROM tags WHERE tagname NOT REGEXP '^[a-zA-Z0-9]*$'

Source: How to get all rows that contain characters others than [a-zA-Z] in MySQL

Error: Incorrect syntax near REGEXP


Solution

  • Instead of trying to phrase "contains only non-special characters" in SQL and negating that, it's easier to phrase "contains a special character":

    SELECT * FROM tags WHERE tagname LIKE '%[^-+a-z0-9]%'
    

    You'll find that this contains a lot of tags containing .. You may wish to add this to the characters to ignore as well.

    SELECT * FROM tags WHERE tagname LIKE '%[^-+a-z0-9.]%'