Tags: c# · .net-core · entity-framework-core · firebird · firebird2.5

.Net Core 3 Entity Framework with Firebird database and charset none


I have a legacy Firebird (2.5.9) database that was created with no charset and, since it is in production, all data was inserted using WIN1252. I'm migrating to .Net Core and I'm using FirebirdSql.Data.FirebirdClient as the provider (on both .Net Framework and .Net Core).

In .Net Framework and EF 6 I just had to set the charset WIN1252 in the connection string.

In .Net Core 3.1 the connection string's charset is used only when reading, so I have to set the column type in code-first like this:

entity.Property(e => e.description).HasMaxLength(255).HasColumnType("VARCHAR(255) CHARACTER SET WIN1252");

to make it save strings with the same encoding.

The problems I'm facing now stem from the fact that all query parameters seem to use UTF8, which is the default encoding in .Net Core, and some special characters are, of course, encoded differently.
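The difference is easy to see at the byte level. Here is a minimal sketch (plain .Net, no EF involved); note that the WIN1252 byte is hardcoded, since on .Net Core `Encoding.GetEncoding(1252)` requires registering the System.Text.Encoding.CodePages provider first:

```csharp
using System;
using System.Text;

class EncodingDemo
{
    static void Main()
    {
        // UTF8 encodes 'è' (U+00E8) as two bytes: 0xC3 0xA8
        byte[] utf8 = Encoding.UTF8.GetBytes("è");
        Console.WriteLine(BitConverter.ToString(utf8)); // C3-A8

        // In WIN1252 the same character is the single byte 0xE8.
        // (Hardcoded here; Encoding.GetEncoding(1252) needs the
        // CodePagesEncodingProvider registered on .Net Core.)
        byte[] win1252 = { 0xE8 };
        Console.WriteLine(BitConverter.ToString(win1252)); // E8
    }
}
```

So a parameter sent as UTF8 bytes simply does not match data that was stored as WIN1252 bytes in a charset-NONE column.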

So, for example, assume I have a record with accented characters in the description field, like "èèè".

The following query fails to retrieve it:

string filter = "èèè";
Product p = context.Products.Where(x => x.description == filter).FirstOrDefault();

This query gets translated into:

SELECT "a"."product_id", ... , "a"."description"
FROM "products" AS "a"
WHERE "a"."description" = CAST(@__filter_0 AS VARCHAR(8191))

That cast happens in .Net Framework too, but queries like this work correctly there.

Also, if there are multiple parameters, I get the error

Implementation limit exceeded. block size exceeds implementation restriction

which is exactly what is described in this question, even though that was EF5 and .Net Framework. Basically it seems that you easily hit the 64k row size limit, because that VARCHAR(8191) cast weighs much more in terms of bytes as UTF8.
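The arithmetic behind this is my own back-of-the-envelope estimate, assuming Firebird sizes UTF8 at 4 bytes per character: a single VARCHAR(8191) cast in UTF8 accounts for 32764 bytes, so two such parameters already consume all but 8 bytes of the 64k limit, and anything else in the row pushes past it:

```csharp
using System;

class BlockSizeEstimate
{
    static void Main()
    {
        const int castLength = 8191;     // VARCHAR(8191) from the generated SQL
        const int utf8BytesPerChar = 4;  // assumed worst-case size of a UTF8 char
        const int limit = 64 * 1024;     // 64k implementation limit

        int bytesPerParam = castLength * utf8BytesPerChar;
        Console.WriteLine(bytesPerParam);          // 32764
        Console.WriteLine(limit - 2 * bytesPerParam); // 8: two parameters leave
                                                      // only 8 bytes of headroom
    }
}
```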

I'm sure that one solution would be to move to a new database with UTF8 encoding and re-encode all the data properly, but that's something I would like to avoid for the moment if possible. Also, I don't think it would solve the block size error.

Is there a way to make it work like in .Net Framework?


Solution

  • I just got an answer from Jiří Činčura in the firebird-net-provider Google Group

    The reason behind the different behavior between .Net Framework and .Net Core is, in fact, due to the default encoding of .Net Core (UTF8).

    My database had the column charsets set to none. So what happens is that even if you set the column type like this:

    .HasColumnType("VARCHAR(255) CHARACTER SET WIN1252");
    

    the provider always uses the column charset defined in the database when comparing values. This means that setting the column type in code-first only fixes string saving; when the provider needs to compare values, it checks the column configuration in the database.

    So the solution is to properly define/fix charsets on the columns/database and check/copy the data.

    Personally, instead of forcing the WIN1252 charset on all the columns, I went ahead and converted the whole database to UTF8 so that, even if I leave the column charset as none, strings are already saved in the default .Net Core encoding, and it's also more future-proof.

    The row size limit problem is somewhat unrelated; it's just something that will happen once you start using UTF8. Fortunately, you can set this option in the context configuration

    new FbDbContextOptionsBuilder(optionsBuilder).WithExplicitParameterTypes(false);
    

    to reduce those varchar casts.
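    For context, here is roughly where that call goes in my setup. This is a sketch: the context name, connection string, and namespace are illustrative assumptions, not taken from my real project; the `FbDbContextOptionsBuilder` call itself is the one from the answer above.

    ```csharp
    using Microsoft.EntityFrameworkCore;
    using FirebirdSql.EntityFrameworkCore.Firebird.Infrastructure;

    public class ShopContext : DbContext // hypothetical context
    {
        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            // illustrative connection string
            optionsBuilder.UseFirebird(
                "database=localhost:shop.fdb;user=SYSDBA;password=masterkey;charset=UTF8");

            // Stop the provider from wrapping every parameter
            // in CAST(@p AS VARCHAR(8191))
            new FbDbContextOptionsBuilder(optionsBuilder)
                .WithExplicitParameterTypes(false);
        }
    }
    ```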

    Additional Info

    To convert the database I used fbclone, since it can clone a database, pumping data from one database to another with the same structure while handling different charsets.

    Also, you may need to check your indexes: in Firebird 2.x the maximum index key size is a quarter of the page size (source), and UTF8 strings are internally stored as 4 bytes/char, so if you come from WIN1252 (1 byte/char) like me you can hit that limit easily.
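    A quick way to estimate the impact on indexes. This is my own approximation, assuming a quarter of the page size for the key, 4 bytes/char for UTF8, and ignoring the small per-key overhead Firebird adds:

    ```csharp
    using System;

    class IndexKeyEstimate
    {
        static void Main()
        {
            const int pageSize = 4096;             // common Firebird 2.x page size
            int maxKeyBytes = pageSize / 4;        // ~1024 bytes per index key

            int maxCharsWin1252 = maxKeyBytes / 1; // 1 byte/char
            int maxCharsUtf8 = maxKeyBytes / 4;    // 4 bytes/char

            Console.WriteLine(maxCharsWin1252);    // 1024
            Console.WriteLine(maxCharsUtf8);       // 256: an indexed VARCHAR(255)
                                                   // still fits, longer ones won't
        }
    }
    ```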