Search code examples
postgresqlencodingutf-8latin1

PostgreSQL ignores dashes when ordering


I have a PostgreSQL 8.4 database that is created with the da_DK.utf8 locale.

dbname=> show lc_collate;
 lc_collate
------------
 da_DK.utf8
(1 row)

When I select something from a table where I order on a character varying column I get a strange behaviour IMO. When ordering the result PostgreSQL ignores dashes that prefixes the value, e.g.:

 select name from mytable order by name asc;

May return something like

 name
 ----------------
 Ad...
 Ae...
 Ag...
 - Ak....
 At....

The dash prefix seems to be ignored.

I can fix this issue by converting the column to latin1 when ordering:

 select name from mytable order by convert_to(name, 'latin1') asc;

The I get the expected result as:

 name
 ----------------
 - Ak....
 Ad...
 Ae...
 Ag...
 At....

Why does the dash prefix get ignored by default? Can that behavior be changed?


Solution

  • A workaround that will work in my specific case is to replace dashes with exclamation points. I happen to know that I will never get exclamation points and it will be sorted before any letters or digits.

    select name from mytable order by translate(name, '-', '!') asc
    

    It will certainly affect performance so I may look into creating a special column for sorting but I really don't like that either...