Search code examples
c#entity-framework-coremany-to-many

EF Core Many to Many self relationship not loading correctly


I have the following class:

[PrimaryKey(nameof(PartNumber), nameof(Revision), nameof(SerialNumber))]
public class TraceablePart(string partNumber, string revision, string serialNumber)
{
    [MaxLength(13)]
    public string PartNumber { get; set; } = partNumber;
        
    [MaxLength(3)]
    public string Revision { get; set; } = revision;
        
    [MaxLength(16)]
    public string SerialNumber { get; set; } = serialNumber;

    public virtual BindingList<TraceablePart> ChildParts { get; set; } = [];
    public virtual BindingList<TraceablePart> ParentParts { get; set; } = [];

    public override string ToString()
    {
        return $"{PartNumber} - {Revision} ({SerialNumber})";
    }
}

I've added data to the database, indicating that a part with key (1234, 1, 1) should have a child part (4321, 1, 1):

(1234,1,1) has one reference in the link table

But when I try to load it through the DbContext, I don't get any ChildParts

private void Load(MouseEventArgs e)
{
    Database.TraceablePart? traceablePart = context.TraceableParts.FirstOrDefault(t => t.PartNumber == PartNumber && t.Revision == Revision && t.SerialNumber == SerialNumber);
    Parts = traceablePart?.ChildParts;
}

traceabelPart.ChildParts has count 0

What am I doing wrong?


Solution

  • That is not a typical many-to-many relationship, and it will be compounded by using a composite key. Many-to-many relationships are usually used for association relationships, not parent-child relationships. (which are more commonly one-to-many) Also, when working with Entities, do not mix concerns between data domain and user domain. Entities should solely be responsible for data domain. Things like BindingList are User Interface domain and belong in a view model. Collection navigation properties should be declared as ICollection<TEntity> without a setter:

    public virtual ICollection<TraceablePart> ChildParts { get; } = [];
    

    EF will expect to proxy this for change tracking. We explicitly avoid exposing a setter as having any code "set" the collection will screw things up for EF change tracking. When it comes to UI binding/validation etc. use Select to project Entities into classes suited to managing UI concerns. When updating, fetch the entity by ID and copy applicable values across.

    To have something like a many-to-many parent-child relationship EF needs to form a joining table. A normal relationship between A's and B's would have a joining table with AId + BId as the PK with each serving as a FK to the A and B respectively in the relationship. Any A can be associated with any number of B's, while any B can be associated with any number of A's.

    The first challenge is the fact that you want a self-referencing many-to-many. This means you would have a linking table between A's only, meaning something like ParentAId + ChildAId. The second challenge is that you are using composite keys. So your joining table will need to be set up like:

    ParentPartNumber + ParentRevision + ParentSerialNumber + ChildPartNumber + ChildRevision + ChildSerialNumber

    The next tricky bit is to map out the relationships to the collections. In the A + B example, A contains a collection of B's while B contains a collection of A's. In a self-referencing relationship the two collections would be within the single A entity but when you map that relationship you need to be sure that you configure EF with exactly which side in the many-to-many joining table corresponds to which collection. By default in the A + B example, EF will resolve FK's by Type using convention. Since the Type of both relationships are "TraceablePart" you need to set up this relationship explicitly. Normally with many-to-many joining tables you can get away without declaring the joining entity, but with the composite key structure I'd say it's recommended to map it out, and it may be required:

    [PrimaryKey(nameof(ParentPartNumber), nameof(ParentRevision), nameof(ParentSerialNumber), nameof(ChildPartNumber), nameof(ChildRevision), nameof(ChildSerialNumber))]
    public class TraceablePartLink
    {
        public string ParentPartNumber { get; protected set; }
        public string ParentRevision { get; protected set; }
        public string ParentSerialNumber { get; protected set; }
    
        public string ChildPartNumber { get; protected set; }
        public string ChildRevision { get; protected set; }
        public string ChildSerialNumber { get; protected set; }
    }
    

    Then when configuring the TraceablePart entity:

    modelBuilder.Entity<TraceablePart>()
        .HasMany(e => e.Children)
        .WithMany(e => e.Parents)
        .UsingEntity<TraceablePartLink>(
            l => l.HasOne<TraceablePart>().WithMany().HasForeignKey(e => new {e.ChildPartNumber, e.ChildRevision, e.ChildSerialNumber}),
            r => r.HasOne<TraceablePart>().WithMany().HasForeignKey(e => new {e.ParentPartNumber, e.ParentRevision, e.ParentSerialNumber}));
    

    Disclaimer: This may not work exactly as written and need some adjustment. The composite key makes things a lot more complicated to work with, so I would highly recommend defining a meaningless unique primary key and leave the composite values to serve as an index.

    MS's documentation for relationships (https://learn.microsoft.com/en-us/ef/core/modeling/relationships/many-to-many) cover off pretty much any scenario. If you find you need something that doesn't fit the typical scenario, the typical best advice would be to adjust the approach to conform to the proper normalization and relational nature supported by the DB.

    When reading an entity, to fetch the parents and children you will need to use Include to eager load them. When it comes to navigating up or down the relationships you will need to re-query. I.e. navigating something like .Parent.Children etc. won't produce expected results. Lazy loading is another option to allow that kind of navigation but that comes with a very stern warning as it will almost certainly come with a significant performance cost, especially if code gets sloppy with iterating through references. It is also a massive performance / exception sink as soon as you introduce something like serialization. (I.e. to JSON)