I apologise if this has been asked already, I am struggling greatly with the terminology of what I am trying to find out about as it conflicts with functionality in Entity Framework.
I would like to create an application that on setup gives the user to use 1 database as a "trial"/"startup" database, i.e. non-production database. This would allow a user to trial the application but would not have backups etc. in no way would this be a "production" database. This could be SQLite for example.
When the user is then ready, they could then click "convert to production" (or similar), and give it the target of the new database machine/database. This would be considered the "production" environment. This could be something like MySQL, SQLServer or.. whatever else EF connects to these days..
Does EF support this type of migration/data transfer live? Would it need another app where you could configure the EF source and EF destination for it to then run through the process of conversion/seeding/population of the data source to another data source?
I have tried to search for things around this topic, but transferring/migration brings up subjects totally non-related, so any help would be much appreciated.
From what you describe I don't think there is anything out of the box to support that. You can map a DbContext to either database, then it would be a matter of fetching and detaching entities from the evaluation DbContext and attaching them to the production one.
For a relatively simple schema / object graph this would be fairly straight-forward to implement.
ICollection<Customer> customers = new List<Customer>();
using(var context = new AppDbContext(evalConnectionString))
{
customers = context.Customers.AsNoTracking().ToList();
}
using(var context = new AppDbContext(productionConnectionString))
{ // Assuming an empty database...
context.Customers.AddRange(customers);
}
Though for more complex models this could take some work, especially when dealing with things like existing lookups/references. Where you want to move objects that might share the same reference to another object you would need to query the destination DbContext for existing relatives and substitute them before saving the "parent" entity.
ICollection<Order> orders = new List<Order>();
using(var context = new AppDbContext(evalConnectionString))
{
orders = context.Orders
.Include(x => x.Customer)
.AsNoTracking()
.ToList();
}
using(var context = new AppDbContext(productionConnectionString))
{
var customerIds = orders.Select(x => x.Customer.CustomerId)
.Distinct().ToList();
var existingCustomers = context.Customers
.Where(x => customerIds.Contains(x.CustomerId))
.ToList();
foreach(var order in orders)
{ // Assuming all customers were loaded
var existingCustomer = existingCustomers.SingleOrDefault(x => x.CustomerId == order.Customer.CustomerId);
if(existingCustomer != null)
order.Customer = existingCustomer;
else
existingCustomers.Add(order.Customer);
context.Orders.Add(order);
}
}
This is a very simple example to outline how to handle scenarios where you may be inserting data with references that may, or may not exist in the target DbContext. If we are copying across Orders and want to deal with their respective Customers we first need to check if any tracked customer reference exists and use that reference to avoid a duplicate row being inserted or throwing an exception.
Normally loading the orders and related references from one DbContext should ensure that multiple orders referencing the same Customer entity will all share the same entity reference. However, to use detached entities that we can associate with the new DbContext via AsNoTracking()
, detached references to the same record will not be the same reference so we need to treat these with care.
For example where there are 2 orders for the same customer:
var ordersA = context.Orders.Include(x => x.Customer).ToList();
Assert.AreSame(orders[0].Customer, orders[1].Customer); // Passes
var ordersB = context.Orders.Include(x => x.Customer).AsNoTracking().ToList();
Assert.AreSame(orders[0].Customer, orders[1].Customer); // Fails
Even though in the 2nd example both are for the same customer. Each will have a Customer reference with the same ID, but 2 different references because the DbContext is not tracking the references used. One of the several "gotchas" with detached entities and efforts to boost performance etc. Using tracked references isn't ideal since those entities will still think they are associated with another DbContext. We can detach them, but that means diving through the object graph and detaching all references. (Do-able, but messy compared to just loading them detached)
Where it can also get complicated is when possibly migrating data in batches (disposing of a DbContext regularly to avoid performance pitfalls for larger data volumes) or synchronizing data over time. It is generally advisable to first check the destination DbContext for matching records and use those to avoid duplicate data being inserted. (or throwing exceptions)
So simple data models this is fairly straight forward. For more complex ones where there is more data to bring across and more relationships between that data, it's more complicated. For those systems I'd probably look at generating a database-to-database migration such as creating INSERT statements for the desired target DB from the data in the source database. There it is just a matter of inserting the data in relational order to comply with the data constraints. (Either using a tool or rolling your own script generation)