Database independent string comparison with JPA


I'm working with JPA (Hibernate as provider) and an underlying MySQL database.

I have a table containing the name of every street in Germany. Every street name has a unique number. For one task I have to look up the number that belongs to a given name. To do this, I wrote a JPQL query like the following:

"SELECT s FROM Strassencode s WHERE s.name = :name"

Unfortunately, there are street names in Germany that differ only in capitalization, such as "Am kleinen Feld" and "Am Kleinen Feld". Obviously one of them is grammatically wrong, but as street names both spellings are allowed, and each of them has a different number.

When I send my JPQL query to the database, I use

query.getSingleResult();

because every street name in the table is unique. But with the underlying MySQL database I get a NonUniqueResultException, because MySQL's default collation compares strings case-insensitively. The common way to force a case-sensitive comparison in MySQL is to write a query like

SELECT 'abc' LIKE BINARY 'ABC';

as described in chapter 11.5.1 of the MySQL Reference Manual, but there is no corresponding keyword in JPQL.

My first attempt was to annotate the field name in the class with
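To make the failure concrete, here is a minimal plain-Java sketch with no JPA or MySQL involved. The class name, the method names, and the sample list are hypothetical, and `equalsIgnoreCase` is only a rough stand-in for a case-insensitive collation (real collations may also fold accents or ignore trailing spaces). It shows why a case-insensitive match returns two rows for one street name while an exact match returns one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CollationDemo {
    // Hypothetical sample rows; the real table holds every German street name.
    static final List<String> NAMES =
            Arrays.asList("Am kleinen Feld", "Am Kleinen Feld", "Hauptstrasse");

    // Rough stand-in for "WHERE name = :name" under a case-insensitive collation.
    static List<String> matchIgnoringCase(String name) {
        return NAMES.stream()
                .filter(n -> n.equalsIgnoreCase(name))
                .collect(Collectors.toList());
    }

    // Rough stand-in for the same predicate under a case-sensitive or binary collation.
    static List<String> matchExact(String name) {
        return NAMES.stream()
                .filter(n -> n.equals(name))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Two hits here are what makes getSingleResult() throw NonUniqueResultException.
        System.out.println(matchIgnoringCase("Am kleinen Feld"));
        // Exactly one hit, which is what getSingleResult() expects.
        System.out.println(matchExact("Am kleinen Feld"));
    }
}
```

The point of the sketch is that the query itself is fine; the number of rows it returns depends entirely on the comparison rule the database applies to the column.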

@Column(nullable = false, columnDefinition = "varbinary(128)")

Annotated like this, the database automatically does a case-sensitive comparison, because one of the strings is binary. But with this solution I run into trouble when I read the names from the database to write them to a file, because letters like ä, ö, ü are then interpreted incorrectly.

I know that the field name should get a unique constraint, and adding one is one of my goals, but this is only possible if the database does a case-sensitive string comparison.

Is there a way to configure JPA to do case-sensitive string comparison without having to fine-tune the database manually?
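The garbled umlauts are what one would expect if the raw bytes of a binary column are decoded with a different charset than the one used to write them. The sketch below illustrates that effect in plain Java. It is an assumption that this is what happened here, and the two charsets (UTF-8 for writing, ISO-8859-1 for reading), the class name, and the method names are hypothetical choices for illustration only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class UmlautDecodingDemo {
    // Assumption: the name is written to the binary column as UTF-8 bytes.
    static byte[] store(String name) {
        return name.getBytes(StandardCharsets.UTF_8);
    }

    // A binary column carries no charset, so the reader has to pick one.
    static String read(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // \u00fc is the u-umlaut, escaped so the source file stays pure ASCII.
        String name = "M\u00fcllerweg";
        byte[] raw = store(name);

        String right = read(raw, StandardCharsets.UTF_8);
        String wrong = read(raw, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals(name)); // true: same charset on both sides
        System.out.println(wrong.equals(name)); // false: the umlaut became two characters
        System.out.println(wrong.length() - name.length()); // 1
    }
}
```

With a character column (varchar) the driver knows the column's charset and converts for you, which is one reason a case-sensitive collation on a text column is less error-prone than switching the column to varbinary.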


Solution

  • It seems there is no way to configure JPA to solve this problem. But I found that the collation can be set not only at the table level but also for the whole database, as described in the Reference Manual, sections 12.1.10 CREATE DATABASE Syntax and 9.1.3.2 Database Character Set and Collation:

    CREATE {DATABASE | SCHEMA} [IF NOT EXISTS] db_name
    [create_specification] ...
    
    create_specification:
        [DEFAULT] CHARACTER SET [=] charset_name
      | [DEFAULT] COLLATE [=] collation_name
    

    I only had to create the database with:

    CREATE DATABASE db_name CHARACTER SET latin1 COLLATE latin1_general_cs;
    

    With this, I could put a unique constraint on the field name and insert both "Am kleinen Feld" and "Am Kleinen Feld", and when I query for one of them, I receive only that one.

    Thanks for the help anyway.
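As a plain-Java analogy for the unique constraint described above: a sorted set rejects an element that its comparator considers equal to an existing one, much like a unique index rejects a value that the column's collation considers equal. The sketch is not MySQL or JPA code; the class and method names are hypothetical, and `String.CASE_INSENSITIVE_ORDER` only approximates a `_ci` collation.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class UniqueIndexDemo {
    // Counts how many names a "unique index" using the given comparison rule accepts.
    static int insertAll(Comparator<String> collation, String... names) {
        Set<String> index = new TreeSet<>(collation);
        int accepted = 0;
        for (String name : names) {
            // add() returns false when the comparator treats the name as a duplicate.
            if (index.add(name)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        String a = "Am kleinen Feld";
        String b = "Am Kleinen Feld";

        // Case-insensitive rule: the second spelling counts as a duplicate.
        System.out.println(insertAll(String.CASE_INSENSITIVE_ORDER, a, b)); // 1

        // Case-sensitive rule: both spellings are distinct and both are accepted.
        System.out.println(insertAll(Comparator.naturalOrder(), a, b)); // 2
    }
}
```

This mirrors the outcome reported in the answer: once the database compares case-sensitively, both spellings can coexist under a unique constraint and a lookup for one spelling returns a single row.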