Search code examples
sqlsqliteinsertdistinct

strange behavior i am getting from sqlite. Same values in column but SELECT DISTINCT sees different values


I am getting very strange behavior from SQLite.

INSERT into tbltest (category,question) VALUES ('aa','t1'),('aa','t2'),('Ding 1','d1'),('Ding 1','d2');

SELECT DISTINCT category 
FROM tbltest;

I get this data returned:

category
aa
aa
Ding 1

I don't understand it. I should get only aa and Ding 1 in my category. Where does the extra aa come from? This happens as I copy the string from another program. Somehow if it

but unfortunately when i copy it from here and paste it back it behaves correctly so i created a link to the sqlite db. please run "select distinct category from tbltest" on it

sql lite db link


Solution

  • This is probably Byte order mark (UNICODE 65279), as you correctly noticed in the comment section. To get rid of this in sqlite database you need to make use of trim and replace functions.

    If you want to get rid on the import phaze, your code should look like:

    INSERT into tbltest (category,question) VALUES 
    (TRIM(REPLACE('put_your_category_value_here', '\uFEFF', '')),'t1'), 
    (TRIM(REPLACE('put_your_category_value_here', '\uFEFF', '')),'t2'),
    (TRIM(REPLACE('put_your_category_value_here', '\uFEFF', '')),'d1'),
    (TRIM(REPLACE('put_your_category_value_here', '\uFEFF', '')),'d2')
    

    Or, if you just want a correct projection, you could use that SELECT query:

    SELECT DISTINCT TRIM(REPLACE(category, '\uFEFF', ''))
    FROM tbltest;
    

    A little bit of explanation:

    '\uFEFF' stands for BOM aka. "ZERO WIDTH NO-BREAK SPACE"