I am trying to develop a data model for a very diverse set of interconnected objects. As the application matures, the types of objects supported will increase significantly. I want to avoid having to modify the model/schema whenever new object types are added.
As a simple example, let's say I'm starting with a model of people and buildings. A building can have multiple owners; a person can own multiple buildings; a person can live in a house and work in an office... Future versions might add cars and corporations. Cars can have owners, corporations can manufacture cars, people can work for corporations, etc. Most of the relationships will be many-to-many, some will be one-to-many, very few will be one-to-one.
While concepts like "owner", "employer", or "manufacturer" can be considered properties of a "building", "corporation", or "car" object, I don't want to redefine the data model to support a new property type.
My current idea is to model this similar to a graph, where each piece of data is its own node. The node object would be very simple:
Extending the previous example, the possible node types would be:
A relationship would be:
I have a few questions:
Is there an existing pattern or model that describes this?
What you describe sounds like a network data model, also known as an object or object-oriented data model.
Are there any drawbacks to this approach?
Your model doesn't support ternary and higher relationships. It also creates fixed access paths between nodes, which supports node-to-node navigation, but which can make many queries convoluted. I also don't see any support for subtyping.
Without composite determinants, some situations will be difficult to model or query. You don't support predicates like (Object, Language) -> Name (or (Company, Role) -> Person, etc). One way is to create special relationship types, but your model is going to be asymmetric and more complicated to query.
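To make the composite-determinant point concrete, here is a minimal Python sketch. Everything in it is hypothetical (the identifiers `edges`, `name_of`, `lookup_name` and the sample data are invented for illustration and are not from your model). It contrasts plain binary edges with a mapping keyed on the pair (Object, Language).

```python
# Hypothetical data: binary edges in the style of a simple graph model,
# written as (source, relationship, target).
edges = [
    ("obj:building", "name", "building"),
    ("obj:building", "name", "edificio"),
]

# A binary edge has no slot for the language, so asking for "the name of
# obj:building" returns every name, with nothing to tell them apart.
names_via_edges = [
    target
    for (source, relationship, target) in edges
    if source == "obj:building" and relationship == "name"
]

# With a composite determinant, the pair (object, language) determines
# exactly one name.
name_of = {
    ("obj:building", "en"): "building",
    ("obj:building", "es"): "edificio",
}


def lookup_name(obj, language):
    """Return the single name determined by the (obj, language) pair."""
    return name_of[(obj, language)]
```

Getting the same effect with binary edges alone means introducing an intermediate "naming" node that links the object, the language, and the name, which is the kind of special relationship type mentioned above.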
Are there better approaches?
The relational model of data handles n-ary relations between object types / domains, and allows for the representation of complex predicates. Because relations are n-ary, it can represent object hypergraphs, and because joins are written by the user at query time, access paths are ad hoc rather than fixed. Composite determinants are supported, and most implementations support a variety of integrity constraints.
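As a rough illustration of those points, the sketch below uses Python's built-in `sqlite3` module. The table and column names (`person`, `company`, `appointment`) and the sample rows are assumptions made up for this example. It shows a ternary relation with the composite determinant (Company, Role) -> Person, a join chosen at query time, and a key constraint enforcing the determinant.

```python
import sqlite3

# In-memory database; the schema and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person  (person_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Ternary relation. The composite key encodes (company, role) -> person.
CREATE TABLE appointment (
    company_id INTEGER NOT NULL REFERENCES company(company_id),
    role       TEXT    NOT NULL,
    person_id  INTEGER NOT NULL REFERENCES person(person_id),
    PRIMARY KEY (company_id, role)
);

INSERT INTO person  VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO company VALUES (10, 'Acme');
INSERT INTO appointment VALUES (10, 'CEO', 1), (10, 'CFO', 2);
""")


def holder(company_name, role):
    """Return the person holding `role` at `company_name`, or None."""
    row = conn.execute(
        """
        SELECT p.name
        FROM appointment AS a
        JOIN company AS c ON c.company_id = a.company_id
        JOIN person  AS p ON p.person_id  = a.person_id
        WHERE c.name = ? AND a.role = ?
        """,
        (company_name, role),
    ).fetchone()
    return row[0] if row else None


def second_holder_rejected():
    """Try to give Acme a second CEO; True if the key constraint refuses it."""
    try:
        conn.execute("INSERT INTO appointment VALUES (10, 'CEO', 2)")
    except sqlite3.IntegrityError:
        return True
    return False
```

Here `holder("Acme", "CEO")` follows one access path, but the same three tables could just as easily be joined starting from a person or a role, since no path is fixed in the schema.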
In particular, look at Object-Role Modeling (http://www.orm.net, https://www.ormfoundation.org).
I want to avoid having to modify the model/schema whenever new object types are added.
Try doing a web search for "universal schema for knowledge representation". Facts about the world aren't limited to simple atomic observations like "John Smith has a dog named Spot". We have to deal with facts like "Company A is not allowed to distribute product B in regions within 100km of point C after date D if that product contains ingredients E or F". The most powerful knowledge representation we've got so far is natural language, and as far as I know we don't yet have a simple model of its structure.
I'm currently reading Ologs: A Categorical Framework For Knowledge Representation. Perhaps this will be of interest to you too.