How to deal with different entity instances when using NHibernate?


There is a well-known problem with ORMs and object identity. As far as the ORM is concerned, entities are equal if they have the same ID. Of course, this doesn't apply to transient instances, which are considered nonexistent.

But as far as OO code is concerned, object references are considered equal if they refer to the same instance. That is, unless Equals and/or == are overridden.

That is all good, but what does it mean in practice? Here is a very simple example domain model:

using System.Collections.Generic;

namespace TryHibernate.Example
{
public abstract class Entity
{
    public int Id { get; set; }
}

public class Employee : Entity
{
    public string Name { get; set; }

    public IList<Task> Tasks { get; set; }

    public Employee()
    {
        Tasks = new List<Task>();
    }
}

public class Task : Entity
{
    public Employee Assignee { get; set; }

    public Job Job { get; set; }
}

public class Job : Entity
{
    public string Description { get; set; }
}
} // namespace

And here is example code that uses it:

using (ISessionFactory sessionFactory = Fluently.Configure()
    .Database(SQLiteConfiguration.Standard.UsingFile("temp.sqlite").ShowSql())
    //.Cache(c => c.UseSecondLevelCache().UseQueryCache().ProviderClass<HashtableCacheProvider>())
    .Mappings(m => m.AutoMappings.Add(
        AutoMap.AssemblyOf<Entity>()
            .Where(type => type.Namespace == typeof(Entity).Namespace)
            .Conventions.Add(DefaultLazy.Never())
            .Conventions.Add(DefaultCascade.None())
            //.Conventions.Add(ConventionBuilder.Class.Always(c => c.Cache.ReadWrite()))
        ).ExportTo("hbm")
    ).ExposeConfiguration(c => new SchemaExport(c).Create(true, true))
    .BuildSessionFactory())
{
    Job job = new Job() { Description = "A very important job" };
    Employee empl = new Employee() { Name = "John Smith" };
    Task task = new Task() { Job = job, Assignee = empl };
    using (ISession db = sessionFactory.OpenSession())
    using (ITransaction t = db.BeginTransaction())
    {
        db.Save(job);
        db.Save(empl);
        empl.Tasks.Add(task);
        db.Save(task);
        t.Commit();
    }
    IList<Job> jobs;
    using (ISession db = sessionFactory.OpenSession())
    {
        jobs = db.QueryOver<Job>().List();
    }
    IList<Employee> employees;
    using (ISession db = sessionFactory.OpenSession())
    {
        employees = db.QueryOver<Employee>().List();
    }
    jobs[0].Description = "A totally unimportant job";
    Console.WriteLine(employees[0].Tasks[0].Job.Description);
}

Of course, it prints “A very important job”. Enabling the second-level cache (commented out above) does not change that, although it reduces database hits in some cases. Apparently that's because NHibernate caches entity data, not object instances.

And, of course, overriding equality / hash code doesn't help here because it's not equality that causes problems. It's the very fact that I have two instances of the same thing here.
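To make that concrete, here is a minimal sketch with no NHibernate involved. `IdEntity`, `JobCopy` and `EqualityDemo` are hypothetical stand-ins for the classes above, with ID-based equality added; the two objects play the role of the same row loaded by two different sessions.

```csharp
using System;

// Hypothetical stand-in for Entity, with ID-based equality added.
public abstract class IdEntity
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(this, obj)) return true;
        var other = obj as IdEntity;
        // Assumption: Id == 0 means transient, so such instances
        // are only ever equal to themselves.
        return other != null
            && other.GetType() == GetType()
            && Id != 0
            && Id == other.Id;
    }

    public override int GetHashCode()
    {
        // Sketch only: the hash changes if Id is assigned after hashing.
        return Id.GetHashCode();
    }
}

// Hypothetical stand-in for Job.
public class JobCopy : IdEntity
{
    public string Description { get; set; }
}

public static class EqualityDemo
{
    // Simulates one row materialised twice, as two sessions would do.
    public static string Run()
    {
        var first = new JobCopy { Id = 1, Description = "A very important job" };
        var second = new JobCopy { Id = 1, Description = "A very important job" };

        first.Description = "A totally unimportant job";

        // Equal by ID, yet still two objects whose state has diverged.
        return string.Format("{0}|{1}|{2}",
            first.Equals(second),
            object.ReferenceEquals(first, second),
            second.Description);
    }
}
```

Here `first.Equals(second)` is true, but the change made through `first` is not visible through `second`, which is exactly the situation in the example above.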

That is all good, but how should I handle it? There are several options, but none of them seems too appealing to me:

  1. Introduce an intermediate service layer that would cache instances in hash tables and traverse entity graphs after loading them from the repository. I don't really like it because it's a whole lot of work, prone to errors, and sounds like I'm doing the ORM's job. I'd rather implement the whole persistence manually if I ever wanted to go this way, but I don't.

  2. Introduce a single aggregate root and pull that from the DB once instead of fetching several parts. It could work, as my application is relatively simple and I can handle working with the whole graph. As a matter of fact, I'm working with it anyway. But I don't like this because it introduces unnecessary entities. Jobs should be jobs, employees should be employees. Of course, I could name the god entity “organization” or something. Another reason I don't like it is that it can get unwieldy if the data grows in the future. For example, I may wish to archive old jobs and tasks.

  3. Use a single session for everything. Right now, I'm opening and closing sessions as I need to load / store something. I could work with a single session, and NHibernate's identity map would guarantee reference identity (as long as I don't use lazy load). This seems the best option, but an application may be running for a while (it's a WPF desktop app) and I don't like the idea of leaving the session open for too long.

  4. Manually update all instances. For example, if I want to change a job description, I call some service method which searches for job instances having the same ID and updates them all. This can get very messy because that service has to have access to basically everything, essentially becoming a kind of god service.
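For reference, the core of option 1 could look roughly like the sketch below. `IdentityMap`, `IHasId`, `Resolve`, `Note` and `IdentityMapDemo` are hypothetical names, not NHibernate API; the idea is to key instances by type and ID and always hand out the first instance seen. The part that makes it a lot of work, walking each loaded graph and swapping references such as `Task.Job`, is deliberately left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal contract; the Entity base class above would satisfy it.
public interface IHasId
{
    int Id { get; }
}

// Hypothetical application-level identity map (not part of NHibernate).
// Assumes persistent entities, i.e. IDs that are already assigned.
public class IdentityMap
{
    private readonly Dictionary<Tuple<Type, int>, object> instances =
        new Dictionary<Tuple<Type, int>, object>();

    // Returns the canonical instance for the entity's type and ID,
    // registering the given instance if none is known yet.
    public T Resolve<T>(T loaded) where T : class, IHasId
    {
        var key = Tuple.Create(loaded.GetType(), loaded.Id);
        object known;
        if (instances.TryGetValue(key, out known))
        {
            return (T)known;
        }
        instances.Add(key, loaded);
        return loaded;
    }
}

// Hypothetical entity used only for the demonstration.
public class Note : IHasId
{
    public int Id { get; set; }
    public string Text { get; set; }
}

public static class IdentityMapDemo
{
    public static bool Run()
    {
        var map = new IdentityMap();

        // Two copies of the same row, as two separate sessions would return.
        Note a = map.Resolve(new Note { Id = 7, Text = "original" });
        Note b = map.Resolve(new Note { Id = 7, Text = "original" });

        a.Text = "changed";

        // Both variables now refer to the first instance registered.
        return object.ReferenceEquals(a, b) && b.Text == "changed";
    }
}
```

Every entity coming out of a repository call would have to pass through `Resolve`, including the ones nested in collections and references, which is why this approach ends up duplicating what a session's identity map already does.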

Are there any other options I've missed? It surprises me how little information there is on this issue. The best I could find is this post, but it only covers the equality issue.


Solution

  • Alright, here is what I came up with. Basically, it's option 3 with some adjustments. My repo class looks something like this now:

    public class Repo : IDisposable
    {
        private ISessionFactory sessionFactory;
        private ISession session;
    
        public Repo(ISessionFactory sessionFactory)
        {
            this.sessionFactory = sessionFactory;
            this.session = sessionFactory.OpenSession();
            session.Disconnect();
        }
    
        public void Dispose()
        {
            try
            {
                session.Dispose();
            }
            finally
            {
                sessionFactory.Dispose();
            }
        }
    
        public void Save(Entity entity)
        {
            using (Connection connection = new Connection(session))
            using (ITransaction t = session.BeginTransaction())
            {
                session.Save(entity);
                t.Commit();
            }
        }
    
        public IList<T> GetList<T>() where T : Entity
        {
            using (Connection connection = new Connection(session))
            {
                return session.QueryOver<T>().List();
            }
        }
    
        private class Connection : IDisposable
        {
            private ISession session;
    
            internal Connection(ISession session)
            {
                this.session = session;
                session.Reconnect();
            }
    
            public void Dispose()
            {
                session.Disconnect();
            }
        }
    }
    

    And it's used like this:

    using (Repo db = new Repo(sessionFactory))
    {
        Job job = new Job() { Description = "A very important job" };
        Employee empl = new Employee() { Name = "John Smith" };
        Task task = new Task() { Job = job, Assignee = empl };
        db.Save(job);
        db.Save(empl);
        empl.Tasks.Add(task);
        db.Save(task);
        IList<Job> jobs;
        jobs = db.GetList<Job>();
        IList<Employee> employees;
        employees = db.GetList<Employee>();
        jobs[0].Description = "A totally unimportant job";
        Console.WriteLine(employees[0].Tasks[0].Job.Description);
    }
    

    Never mind the lack of an enclosing transaction. In real life it's a bit more complicated than this; I just left out the irrelevant parts. The important thing is that it works, and it doesn't keep a connection open indefinitely.

    I still don't like this, though. It looks like I'm abusing the ORM here, and the Disconnect() and Reconnect() methods look like a hack to me. But the other options don't look appealing at all.

    Callum's suggestion to load everything in a single session and then save later in another session is much better, but it doesn't suit my particular application. It will be too complicated to figure out what the changes were. I could get around that by using the Command and/or Unit of Work patterns. Using the Command pattern would also give me a nice ability to undo changes. But I'm not willing to go that far yet.
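As a rough illustration of that last idea, an undoable command for the description change might look like the sketch below. `IUndoableCommand`, `ChangeDescription` and `CommandHistory` are hypothetical names, and `JobDto` is a stand-in for the `Job` entity; replaying the recorded commands against a later session is not shown.

```csharp
using System.Collections.Generic;

// Hypothetical command contract: every change knows how to revert itself.
public interface IUndoableCommand
{
    void Execute();
    void Undo();
}

// Hypothetical stand-in for the Job entity.
public class JobDto
{
    public string Description { get; set; }
}

// Changes a job description and remembers the previous value.
public class ChangeDescription : IUndoableCommand
{
    private readonly JobDto job;
    private readonly string newValue;
    private string oldValue;

    public ChangeDescription(JobDto job, string newValue)
    {
        this.job = job;
        this.newValue = newValue;
    }

    public void Execute()
    {
        oldValue = job.Description;
        job.Description = newValue;
    }

    public void Undo()
    {
        job.Description = oldValue;
    }
}

// Keeps executed commands so they can be undone in reverse order.
public class CommandHistory
{
    private readonly Stack<IUndoableCommand> done = new Stack<IUndoableCommand>();

    public int Count
    {
        get { return done.Count; }
    }

    public void Run(IUndoableCommand command)
    {
        command.Execute();
        done.Push(command);
    }

    // Returns false when there is nothing left to undo.
    public bool UndoLast()
    {
        if (done.Count == 0) return false;
        done.Pop().Undo();
        return true;
    }
}
```

The history doubles as a record of what changed, which is the information that would be needed to save the changes in a separate session later.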