python database dataset stock-data

Speed - CSV vs MariaDB fetching stock data (python)


I like the idea of storing my historical stock data in a database instead of in CSV files. Is there a speed penalty for fetching large data sets from MariaDB compared to reading them from CSV?


Solution

  • Quite the opposite. Whenever you fetch data from a CSV, you must parse every line in the file unless you have a stopping condition (for example, take the first entry with x = 3). This is expensive: not only do you have to read all of the lines, which makes the lookup O(n), but you will usually be typecasting each field as well. In a database, the rows were parsed and typed once, when they were inserted. If there is an index on x, or on whatever attribute you are searching by, the database can locate the matching rows in O(log(n)) time and will never look at the vast majority of entries.
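The difference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module as a stand-in for MariaDB so that it runs without a server, and the table name `prices`, the column names, the index name, and the sample rows are all made up for the example. With MariaDB you would issue the same kind of indexed `SELECT` through a MariaDB client library instead.

```python
import csv
import io
import sqlite3

# Made-up sample rows (symbol, trade_date, close); not real market data.
ROWS = [
    ("AAA", "2024-01-02", 100.0),
    ("BBB", "2024-01-02", 50.0),
    ("AAA", "2024-01-03", 101.5),
    ("BBB", "2024-01-03", 49.5),
]

# The same data as CSV text, standing in for a file on disk.
CSV_TEXT = "symbol,trade_date,close\n" + "\n".join(
    f"{symbol},{trade_date},{close}" for symbol, trade_date, close in ROWS
)


def fetch_from_csv(csv_text, symbol):
    """Scan every row and cast types on each call: O(n) per lookup."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["trade_date"], float(row["close"]))
        for row in reader
        if row["symbol"] == symbol
    ]


def build_db(rows):
    """Load the rows once; parsing and typing happen here, not at query time."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prices (symbol TEXT, trade_date TEXT, close REAL)"
    )
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
    # The index lets the engine seek to a symbol instead of scanning all rows.
    conn.execute(
        "CREATE INDEX idx_prices_symbol ON prices (symbol, trade_date)"
    )
    conn.commit()
    return conn


def fetch_from_db(conn, symbol):
    """Indexed lookup: the engine only visits rows for the requested symbol."""
    cursor = conn.execute(
        "SELECT trade_date, close FROM prices "
        "WHERE symbol = ? ORDER BY trade_date",
        (symbol,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = build_db(ROWS)
    print(fetch_from_csv(CSV_TEXT, "AAA"))
    print(fetch_from_db(conn, "AAA"))
```

Both functions return the same rows. The CSV version repeats the full scan and the `float` conversion on every call, while the database version pays the parsing cost once at load time. On a handful of rows the timing difference is invisible; it only starts to matter as the file grows.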