Search code examples
mysqlsqlstock

What is the best way to structure a database for a stock exchange?


I am trying to make a stock market simulator and i want to be as real as possible.

My question is: Nasdaq has 3000+ companies and in their database of stocks, right?! but is it one entry line for every share of every symbol on the sql db like the following example?

Company Microsoft = MSFT

db `companies_shares`

    ID symbol price owner_id* company_id last_trade_datetime
    1  msft   58.99 54334     101        2019-06-15 13:09:32
    2  msft   58.99 54334     101        2019-06-15 13:09:32
    3  msft   58.91 2231      101        2019-06-15 13:32:32
    4  msft   58.91 544       101        2019-06-15 13:32:32

*owner_id = user id of the person that last bought the share.

Or is it calculate based on the shares available to trade and demand to buy and sell provided by the market maker? for exemple:

I've already tried the first example by it takes a lot space in my db and i'm concerned about the band width of all those trades, especially when millions of requests(trades) are being made every minute.

I've already tried the first example by it takes a lot space in my db and i'm concerned about the band width of all those trades, especially when millions of requests(trades) are being made every minute.

What is the best solution? Database or math?

Thanks in advance.


Solution

  • You might also want to Google many to many relationships.

    Think about it this way. One person might own many stocks. One stock might be held by many people. That is a many to many relationship and usually modelled using three tables in a physical database. This is often written as M:M

    Also, people might buy or sell a single company on multiple occasions this would likely be modelled using another table. From the person perspective there will be many transactions so we have a new type of relationship one (person) to many (transactions). This is often written as a 1:M relationship.

    As to what to store, as a general rule it is best to store the atomic pieces of data. For example for a transaction, store the customer I'd, transaction date/time, the quantity bought or sold and the price at the very least.

    You might also want to read up about normalization. Usually 3rd normal form is a good level to strive for, but a lot of this is a "it depends upon your circumstance and what you need to do". Often people will denormalize for speed of access at the expense of more storage and potentially more complicated updating....

    You also mentioned performance, more often than not big companies such as NASDAQ. will use multiple layers if IT infrastructure. Each layer will have a different role and thus different functional characteristics and performance characteristics. Often they will be multiple servers operating together in a cluster. For example they might use a NoSQL system to manage the high volume of trading. From there there might be a feed (e.g. kafka) into other systems for other purposes (e.g. fraud prevention, analytics, reporting and so on).

    You also mention data volumes. I do not know how much data you are talking about, but at one financial customer I've worked at have several peta bytes of storage (1 peta byte = 1000 TB) running on over 300 servers just for their analytics platform. They were probably on the medium to large size as far as financial institutions go.

    I hope this helps point you in the right direction.