wcf, entity-framework, rdbms, persistence-ignorance

Entity Framework - Schema Upgrade, Multiple DBMS, and Code First


I'm looking into using Microsoft's Entity Framework in an upcoming project, which is a point release of an existing product. Our current product supports two DBMSes (Oracle and SQL Server); the schema of each is maintained in separate .sql script files.

Entity Framework (4.1) looks appealing because it allows various scenarios to be implemented automatically via code generation, reflection, etc. However, as far as I can tell, some of these benefits appear to be mutually exclusive.

For example, to support multiple DBMSes, I am inferring that I would need to use a model-first or code-first design, in which case EF would generate the schema for each DBMS according to the model (I have seen few to no posts or documentation on this, so I may be wrong). This means that our existing schema would need to be either abandoned (model-first) or mapped (code-first). Additionally, updating the schema would require manual scripts, as EF does not appear to support schema upgrades (without wiping out data).

  1. Are model-first and code-first the only viable means of supporting multiple DBMSes in EF? I realize that technically it would be impossible to guarantee that two arbitrary schemas are the same, so I am thinking this is true.
  2. Are there any potential pitfalls of code-first and mapping to multiple DBMSes? For example, Oracle does not have auto-increment columns; you have to use sequences. How is this mapped in the DbContext? Do I need to create separate maps for each DBMS?
  3. Does EF support any mechanism to upgrade an existing database schema to one that matches the EF model (schema recreation is not the same as an upgrade), or am I limited to doing this manually?
  4. I did come up with one possible way to use database first and support multiple DBMSes, however it is a maintenance nightmare. The idea was to add another layer of abstraction to the two generated data models and create converter classes for each of the EF generated models. This seems like the best way of doing it so that each DBMS could potentially have its own model, yet my code would handle the mapping. But in doing this, what am I really gaining from EF? Maybe query generation, but is that worth it?

Solution

  • Actually, model-first and database-first have the same constraints. Both approaches use an EDMX file, which contains an SSDL part (the description of the store, i.e. the database layer) tied directly to a single database provider. So if you want to support two different database providers, you must have two different SSDL parts and keep them in sync. You can use a single CSDL (the description of the conceptual layer, i.e. your model classes) and either one or two MSLs (the description of the mapping between SSDL and CSDL; a single file is possible only if tables and columns have exactly the same names in both SSDLs). As far as I know, an EDMX file can contain only one SSDL, one CSDL, and one MSL part, so I expect the designer has no support for this scenario: you will have to either modify the second SSDL manually or use two EDMXs, which means modeling each change twice.
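
    A rough sketch of how the "one CSDL, two SSDLs" setup might be selected at run time. It assumes the metadata files are embedded as resources under the hypothetical names `Model.csdl`, `Model.SqlServer.ssdl`, `Model.Oracle.ssdl` and `Model.msl` (a shared MSL only works if table and column names match in both SSDLs), and that the Oracle provider's invariant name is `Oracle.DataAccess.Client`, which depends on the provider you actually install:

    ```csharp
    using System;
    using System.Data.EntityClient;

    public static class ModelConnections
    {
        // Builds an Entity Framework connection string that pairs the shared
        // CSDL/MSL with the SSDL written for the chosen DBMS.
        // Resource names and the "SqlServer"/"Oracle" keys are illustrative.
        public static string Build(string dbms, string providerConnectionString)
        {
            var builder = new EntityConnectionStringBuilder();
            builder.ProviderConnectionString = providerConnectionString;

            switch (dbms)
            {
                case "SqlServer":
                    builder.Provider = "System.Data.SqlClient";
                    builder.Metadata =
                        "res://*/Model.csdl|res://*/Model.SqlServer.ssdl|res://*/Model.msl";
                    break;
                case "Oracle":
                    // Assumption: invariant name of the installed Oracle provider.
                    builder.Provider = "Oracle.DataAccess.Client";
                    builder.Metadata =
                        "res://*/Model.csdl|res://*/Model.Oracle.ssdl|res://*/Model.msl";
                    break;
                default:
                    throw new ArgumentException("Unknown DBMS: " + dbms, "dbms");
            }

            return builder.ToString();
        }
    }
    ```

    The resulting string can be passed to the context constructor, so the rest of the code does not need to know which store is behind it.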

    The code-first approach can make this much simpler, but the question is how well the Oracle provider handles code-first and database generation. The provider is responsible for correctly interpreting required features, such as using sequences in place of auto-increment columns.

    EF itself currently has no support for upgrading an existing database. When using EDMX, database generation is controlled by either a T4 template or a Workflow, so it can be customized, and there is already a separate feature called Entity Designer Database Generation Power Pack which allows incremental building of the database with the model-first approach. The problem is that this feature uses the VS Database tools, which I think work only with SQL Server. I have never liked these automated tools, so I still think a database upgrade should be controlled manually, with the help of a tool that produces a difference script between the current and the last deployed database versions. You should need a diff script only when deploying a new version to a production environment. In testing and development environments you can always recreate the whole database.
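
    If you go the manual route, one way to keep the per-DBMS diff scripts organized is to record which version each script upgrades from and to, and pick the chain to run at deployment time. This is only a sketch under my own assumptions: the `UpgradeScript` and `UpgradePlanner` types, the integer version numbers and the file names are all hypothetical, and nothing here is part of EF:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical record of one hand-written diff script for one DBMS.
    public sealed class UpgradeScript
    {
        public UpgradeScript(string dbms, int fromVersion, int toVersion, string fileName)
        {
            if (toVersion <= fromVersion)
                throw new ArgumentException("toVersion must be greater than fromVersion.");
            Dbms = dbms;
            FromVersion = fromVersion;
            ToVersion = toVersion;
            FileName = fileName;
        }

        public string Dbms { get; private set; }
        public int FromVersion { get; private set; }
        public int ToVersion { get; private set; }
        public string FileName { get; private set; }
    }

    public static class UpgradePlanner
    {
        // Returns the scripts to run, in order, to move the given DBMS from the
        // deployed schema version to the target version.
        public static IList<UpgradeScript> Plan(
            IEnumerable<UpgradeScript> scripts, string dbms, int deployed, int target)
        {
            var available = scripts.Where(s => s.Dbms == dbms).ToList();
            var plan = new List<UpgradeScript>();
            var version = deployed;

            while (version < target)
            {
                var next = available.FirstOrDefault(s => s.FromVersion == version);
                if (next == null || next.ToVersion > target)
                    throw new InvalidOperationException(
                        "No " + dbms + " script leads from version " + version +
                        " towards version " + target + ".");
                plan.Add(next);
                version = next.ToVersion;
            }

            return plan;
        }
    }
    ```

    The planner only decides which files to run; executing them against Oracle or SQL Server is still done with whatever tool you already use for your .sql scripts.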

    No extra abstraction should be needed when working with two EDMX models, as long as both models produce the same conceptual layer. In that case you need only a single set of POCO classes, which are mapped by convention (same class name as the entity, same properties with the same types and accessibility), so they will work with both models.
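
    A minimal sketch of what such a shared POCO setup could look like. The `Customer` entity and the `ShopEntities` container name are made up for illustration; they would have to match whatever your conceptual model (CSDL) actually defines:

    ```csharp
    using System.Data.Objects;

    // Hypothetical POCO: the class name and properties must match the
    // conceptual entity so that EF can map it by convention.
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    // One context class serves both stores; which SSDL is used is decided
    // entirely by the EF connection string passed in.
    public class ShopContext : ObjectContext
    {
        private ObjectSet<Customer> _customers;

        // "ShopEntities" is an assumed entity container name from the CSDL.
        public ShopContext(string entityConnectionString)
            : base(entityConnectionString, "ShopEntities")
        {
        }

        public ObjectSet<Customer> Customers
        {
            get { return _customers ?? (_customers = CreateObjectSet<Customer>()); }
        }
    }
    ```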

    Edit:

    Based on @Tridus's answer, I'm just adding that you can create the databases first and use the fluent API from EF 4.1 to map them. Your databases must have exactly the same schema (table names, column names, etc.) and can't use any provider-specific features (I hope sequences will not be a problem, because they are just the way Oracle handles auto-increment columns).
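
    For illustration, mapping an existing table with the EF 4.1 fluent API could look roughly like this. The `Customer` class, the `CUSTOMERS` table and its column names are assumptions, not taken from your schema, and I have not verified how any particular Oracle provider behaves with it:

    ```csharp
    using System.Data.Entity;
    using System.Data.Entity.ModelConfiguration;

    // Hypothetical POCO mapped to an existing table.
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    // Fluent mapping; table and column names are assumed to be identical
    // in the Oracle and SQL Server schemas.
    public class CustomerMap : EntityTypeConfiguration<Customer>
    {
        public CustomerMap()
        {
            ToTable("CUSTOMERS");
            HasKey(c => c.Id);
            Property(c => c.Id).HasColumnName("CUSTOMER_ID");
            Property(c => c.Name).HasColumnName("NAME").IsRequired().HasMaxLength(100);
        }
    }

    public class ShopDbContext : DbContext
    {
        static ShopDbContext()
        {
            // The databases already exist, so EF should not try to create them.
            Database.SetInitializer<ShopDbContext>(null);
        }

        // The named connection string (and its provider) selects the DBMS.
        public ShopDbContext(string nameOrConnectionString)
            : base(nameOrConnectionString)
        {
        }

        public DbSet<Customer> Customers { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)
        {
            modelBuilder.Configurations.Add(new CustomerMap());
        }
    }
    ```

    With this arrangement the same mapping class is used for both databases, which is why the schemas have to match exactly.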