Search code examples
sqliteunicodeblobinstr

SQLite Instr provides different value than expected


I'm trying to pull a substring in a query using instr. The following command:

SELECT instr(ZSYNCPAYMENT, '{"t') 
FROM ZPAYMENT;

provides a result of 64.

I copied and pasted the output of SELECT ZSYNCPAYMENT FROM ZPAYMENT into a hex editor and selected the number of bytes up to and including the { symbol, which is part of my instr function above. The selected number of bytes shows as decimal length of 71. Why is my instr output showing a value of 64? The screenshot below is the output of the SELECT ZSYNCPAYMENT above.

enter image description here


Solution

  • From INSTR():

    The instr(X,Y) function finds the first occurrence of string Y within string X and returns the number of prior characters plus 1, or 0 if Y is nowhere found within X. Or, if X and Y are both BLOBs, then instr(X,Y) returns one more than the number bytes prior to the first occurrence of Y, or 0 if Y does not occur anywhere within X.

    In the hex editor you use you see the difference in bytes of the position of the string '{"t' from the start of the value of ZSYNCPAYMENT.
    This is not the same as the difference in characters when a string contains unicode characters, which I suspect is the case with the string you posted in the image.

    If you want the difference in bytes cast both ZSYNCPAYMENT and '{"t' to BLOBs:

    SELECT INSTR(
             CAST(ZSYNCPAYMENT AS BLOB), 
             CAST('{"t' AS BLOB)
           ) 
    FROM ZPAYMENT 
    

    See a simplified demo.