Search code examples
djangodatabasedjango-modelspsql

One large model or two smaller models


I'm developing a database that holds financial information.

I have two models:

  1. Company (~30 fields)
  2. CompanyFinacials (~120 fields)

CompanyFinacials includes a field with Company as a primary key to maintain the relationship.

I need to 10 add investment rounds to CompanyFinacials. Each round will have 12 fields.

So, the question is, do i build it like this:

class CompanyFinancials(models.Model):
    company = models.OneToOneField(Company, on_delete=models.CASCADE, primary_key=True)
    date_added = models.DateTimeField(auto_now=True)
    financial_start_year = models.CharField(max_length=2000, default='NONE', blank=True) ...
   ...
   ...
   ...
    
class fundingRound(models.Model):
    company_obj = models.OneToOneField(Company, on_delete=models.CASCADE, primary_key=True)
    funding_stage = models.CharField(max_length=2000, default='NONE', blank=True)
    target_size_of_raise = models.CharField(max_length=2000, default='NONE', blank=True)
    invest_amount = models.CharField(max_length=2000, default='NONE', blank=True)
    investment_vehicle = models.CharField(max_length=2000, default='NONE', blank=True)
    discount_percent = models.CharField(max_length=2000, default='NONE', blank=True)
    pre_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
    dilution = models.CharField(max_length=2000, default='NONE', blank=True)
    post_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
    equity_percent = models.CharField(max_length=2000, default='NONE', blank=True)

or like this:

class CompanyFinancials(models.Model):
    company = models.OneToOneField(Company, on_delete=models.CASCADE, primary_key=True)
    date_added = models.DateTimeField(auto_now=True)
    financial_start_year = models.CharField(max_length=2000, default='NONE', blank=True)
    ...
    ...
    ...
   round1_funding_stage = models.CharField(max_length=2000, default='NONE', blank=True)
   round1_target_size_of_raise = models.CharField(max_length=2000, default='NONE', blank=True)
   round1_invest_amount = models.CharField(max_length=2000, default='NONE', blank=True)
   round1_investment_vehicle = models.CharField(max_length=2000, default='NONE', blank=True)
   round1_discount_percent = models.CharField(max_length=2000, default='NONE', blank=True)
   round1_pre_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
   round1_dilution = models.CharField(max_length=2000, default='NONE', blank=True)
   round1_post_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
   round1_equity_percent = models.CharField(max_length=2000, default='NONE', blank=True)
   round2_funding_stage = models.CharField(max_length=2000, default='NONE', blank=True)
   round2_target_size_of_raise = models.CharField(max_length=2000, default='NONE', blank=True)
   round2_invest_amount = models.CharField(max_length=2000, default='NONE', blank=True)
   round2_investment_vehicle = models.CharField(max_length=2000, default='NONE', blank=True)
   round2_discount_percent = models.CharField(max_length=2000, default='NONE', blank=True)
   round2_pre_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
   round2_dilution = models.CharField(max_length=2000, default='NONE', blank=True)
   round2_post_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
   round2_equity_percent = models.CharField(max_length=2000, default='NONE', blank=True)
    ...
    ...
    ...
   round10_equity_percent = models.CharField(max_length=2000, default='NONE', blank=True)

I can see that the DB model will very large if I manually put all 10 rounds into it, but is there really a disadvantage? It would be easier to work with an object that contains all the data.

Should I add a separate object for fundingRound, or keep it all in the CompanyFinancials model? And why?


Solution

  • Should I add a separate object for FundingRound, or keep it all in the CompanyFinancials model? And why?

    Yes. As one of my professors said: in modeling there are usually only three constants zero, one and infinity. The fact that you enter nine fields per round, and ten round results in 90 fields. It results in a database where querying will be cumbersome. If you want to check if any round got more than $10M in investment, that will require a query with nine ORs in between.

    Even if that all would work, it is not said that ten round will be enough, eventually it is possible some Company turns up with eleven rounds, or more, so then there is no space anymore.

    It will introduce a lot of NULLs, a lot of duplicated code to validate all these elements, and eventually make it harder.

    Usually modeling is the other way around: linearizing. Splitting things into rows, not columns. This makes summing up funding rounds, checking the amount of funding rounds, guaranteeing data integrity a lot harder.

    In fact the modeling violates the first normal form of databases [wiki] that aims to linearize data. Another thing that should probably be fixed is using dedicated fields to store data, a DecimalField [Django-doc] for example for percentages.