Search code examples
t-sqlpatindex

PATINDEX incorrect result when looking for dash character "-"


This simple example shows the issue I've run into, but I don't understand why...

I'm testing for the location of the first character that is either a lower or upper case letter, a single dash, or a period in a string parameter passed to me.

These two pattern matches appear to check the same thing, and yet run this code yourself and it will print a 0 then a 3:

PRINT PATINDEX ( '%[a-z,A-Z,-,.]%', '16-82') 
PRINT PATINDEX ( '%[-,a-z,A-Z,.]%', '16-82') 

I don't understand why it works only if the dash character is the first one we check for.

Is this a bug? Or working as designed and I missed something... I'm using SQL Server 2016, but I don't think that matters.


Solution

  • A dash within a character group may play either of the two roles:

    • It may denote the dash itself, like it does in the expression [-abc]
    • It may denote the "everything inbetween" operator, like it does in the expression [a-z].

    In your particular example, the character group [a-z,A-Z,-,.] denotes the following:

    • Everything from a to z
    • Comma ,
    • Everything from A to Z
    • Everything from , to , (i.e. just the comma again).
    • Dot .

    In fact, you probably wanted to write [-a-zA-Z.]