Given that document databases, such as RavenDB, are non-relational, how do you avoid duplicating data that multiple documents have in common? How do you maintain that data if it's okay to duplicate it?
With a document database you have to duplicate your data to some degree. What that degree is will depend on your system and use cases.
For example, if we have simple Blog and User aggregates, we could set them up as:
public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
}
public class Blog
{
    public string Id { get; set; }
    public string Title { get; set; }
    // Denormalized copy of the owning user's display fields.
    public BlogUser BlogUser { get; set; }
}
// Declared outside Blog: C# does not allow a property to share its name with a nested type.
public class BlogUser
{
    public string Id { get; set; }
    public string Name { get; set; }
}
In this example I have embedded a BlogUser object in the Blog document, holding the Id and Name properties of the User aggregate associated with the Blog. I have included these fields because they are the only ones the Blog is interested in; it doesn't need to know the user's username or password when the blog is being displayed.
These embedded classes are going to depend on your system's use cases, so you have to design them carefully. The general idea is to design aggregates that can be loaded from the database with a single read and that contain all the data required to display or manipulate them.
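As a rough sketch of how such a denormalized reference might be filled in when a blog is created, the snippet below copies only the display fields from the User. The `BlogUser.From` and `Example.CreateBlog` helpers are hypothetical names introduced for illustration, and the classes are repeated here, with Blog exposing a BlogUser property, only so the example stands on its own.

```csharp
using System;

public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
}

// Denormalized snapshot of the User fields a blog needs for display.
public class BlogUser
{
    public string Id { get; set; }
    public string Name { get; set; }

    // Hypothetical helper: copies only Id and Name, never the credentials.
    public static BlogUser From(User user)
    {
        if (user == null) throw new ArgumentNullException(nameof(user));
        return new BlogUser { Id = user.Id, Name = user.Name };
    }
}

public class Blog
{
    public string Id { get; set; }
    public string Title { get; set; }
    public BlogUser BlogUser { get; set; }
}

public static class Example
{
    // Hypothetical factory: builds a Blog that already carries everything
    // needed to render it, so no second read of the User is required.
    public static Blog CreateBlog(User owner, string title)
    {
        return new Blog { Title = title, BlogUser = BlogUser.From(owner) };
    }
}
```

Keeping the copy in one place like this makes it easier to see exactly which User fields are duplicated, and therefore which ones need to be propagated when the User changes.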
This then leads to the question of what happens when the User.Name gets updated.
With most document databases you would have to load all the Blog instances that belong to the updated User, update the Blog.BlogUser.Name field on each one, and save them all back to the database.
Raven is slightly different, as it supports set-based operations for updates. You can run a single update against RavenDB that updates the BlogUser.Name property of the user's blogs without having to load and update them all individually.
The code for doing the update within RavenDB (the manual way) for all the blogs would be:
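// Needs System.Linq in scope for Where/ToList.
public void UpdateBlogUser(User user)
{
    // Load every blog whose embedded BlogUser refers to this user.
    var blogs = session.Query<Blog>("blogsByUserId")
        .Where(b => b.BlogUser.Id == user.Id)
        .ToList();
    // Copy the new name onto each denormalized BlogUser.
    foreach (var blog in blogs)
        blog.BlogUser.Name = user.Name;
    session.SaveChanges();
}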
I've added in the SaveChanges just as an example. The RavenDB Client uses the Unit of Work pattern and so this should really happen somewhere outside of this method.