
At what point should I use markup to limit the number of database tables?


I'm fairly new to database design and am wondering what I should store in tables and what should be stored as markup in a text field that the server processes later. My tendency is to make a table for essentially everything, but I've noticed that many applications, such as MediaWiki, store their data as markup in one large text field. Does this save time by limiting the number of database queries and the size of the tables? Or does the server-side processing take up more time?


My page structure is as follows:

Page contains:
id
page_image_id
user_id
title
path
deleted

Page has many Sections

-

Section contains:
id
page_id
position

Section has many Paragraphs
Section has many Images
Section has many Page_Links

-

Paragraph contains:
id
section_id
body_text
position

-

Image contains:
id
section_id
user_id
image_path
description
position

-

Page_Link contains:
id
page_id
section_id
description
position

I'd like to design the database with the possibility of reaching 100,000+ pages in mind. Given that, the last three tables could get very large very quickly. Would it be more efficient to store all of that information as markup in the section table and then process it with server-side code? Or am I underestimating the speed of database queries, and the design above is more efficient because the data is already there, ready to be manipulated? Is there a point at which I should stop making tables and use markup?
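For concreteness, part of the structure above could be declared roughly as follows. This is only a sketch using Python's built-in `sqlite3` module. The column types, the index names (`idx_section_page`, `idx_paragraph_section`), and the helper functions `build_db` and `load_paragraphs` are assumptions made for illustration, not part of the original design. It shows that, with an index on the foreign key and position columns, all paragraphs of a page can be read in order with a single joined query.

```python
import sqlite3

# Hypothetical DDL for three of the tables described above.
# Column types and index names are assumptions for this sketch.
SCHEMA = """
CREATE TABLE page (
    id            INTEGER PRIMARY KEY,
    page_image_id INTEGER,
    user_id       INTEGER,
    title         TEXT NOT NULL,
    path          TEXT NOT NULL,
    deleted       INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE section (
    id       INTEGER PRIMARY KEY,
    page_id  INTEGER NOT NULL REFERENCES page(id),
    position INTEGER NOT NULL
);
CREATE TABLE paragraph (
    id         INTEGER PRIMARY KEY,
    section_id INTEGER NOT NULL REFERENCES section(id),
    body_text  TEXT NOT NULL,
    position   INTEGER NOT NULL
);
CREATE INDEX idx_section_page ON section(page_id, position);
CREATE INDEX idx_paragraph_section ON paragraph(section_id, position);
"""


def build_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_paragraphs(conn, page_id):
    """Return the paragraph texts of one page, ordered by section, then position."""
    rows = conn.execute(
        """
        SELECT p.body_text
        FROM section AS s
        JOIN paragraph AS p ON p.section_id = s.id
        WHERE s.page_id = ?
        ORDER BY s.position, p.position
        """,
        (page_id,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_db()
conn.execute("INSERT INTO page (id, title, path) VALUES (1, 'Home', '/home')")
conn.executemany(
    "INSERT INTO section (id, page_id, position) VALUES (?, ?, ?)",
    [(10, 1, 2), (11, 1, 1)],
)
conn.executemany(
    "INSERT INTO paragraph (section_id, body_text, position) VALUES (?, ?, ?)",
    [(10, "third", 1), (11, "second", 2), (11, "first", 1)],
)
print(load_paragraphs(conn, 1))  # ['first', 'second', 'third']
```

The Image and Page_Link tables would follow the same pattern as `paragraph`.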


Solution

  • This whole issue boils down to: what is atomic from the data management perspective?

    In other words, do you need to query for or modify individual paragraphs or other page elements while they are in the database?

    • If yes, you are on the right track. Fine granularity at the database level will allow you to index and retrieve each individual piece of data, and to modify one piece without disturbing others.
    • If no, just store the whole page in a single field (presumably BLOB or CLOB). It doesn't matter if you process (say) individual paragraphs at the application level - if you always fetch the whole page from the database (and always write the whole page into the database), then it should be stored as an atomic piece of data in the database.
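The "no" case can be sketched as follows, again with Python's built-in `sqlite3` module. The table name `page_doc`, the helper functions, and the use of blank lines as the paragraph separator are all assumptions made for illustration; a real application would use whatever markup it already parses. The point is that the database sees one opaque value per page, so changing a single paragraph means reading and rewriting the whole field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table design: the whole page body lives in one text field.
conn.execute(
    "CREATE TABLE page_doc (id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL)"
)

SEPARATOR = "\n\n"  # assumed stand-in "markup": paragraphs separated by blank lines


def save_page(conn, page_id, title, paragraphs):
    """Write the whole page as one value."""
    conn.execute(
        "INSERT OR REPLACE INTO page_doc (id, title, body) VALUES (?, ?, ?)",
        (page_id, title, SEPARATOR.join(paragraphs)),
    )


def load_page(conn, page_id):
    """Fetch the whole page and split it at the application level."""
    row = conn.execute("SELECT body FROM page_doc WHERE id = ?", (page_id,)).fetchone()
    return row[0].split(SEPARATOR) if row else None


def replace_paragraph(conn, page_id, index, text):
    """Editing one paragraph is a read-modify-write of the entire field."""
    paragraphs = load_page(conn, page_id)
    paragraphs[index] = text
    conn.execute(
        "UPDATE page_doc SET body = ? WHERE id = ?",
        (SEPARATOR.join(paragraphs), page_id),
    )


save_page(conn, 1, "Home", ["first", "second", "third"])
replace_paragraph(conn, 1, 1, "second, revised")
print(load_page(conn, 1))  # ['first', 'second, revised', 'third']
```

If the application only ever calls something like `save_page` and `load_page`, this design loses nothing. Once operations like `replace_paragraph`, or queries across paragraphs of many pages, become common, the finer-grained tables start to pay off.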