As the title says, I need to serve chunks of multiple large text files loaded into a database. The files can be anywhere from 100 lines to 10,000,000 lines, and I would serve them to multiple instances of a processing script in chunks of 10/50/100 lines. What is the proper way to store and serve them, assuming there could be hundreds of such files?
Which database, and which additional tools if any, is the best choice? MongoDB + GridFS? Or is there an efficient way to serve files in such small chunks if I store them as BLOBs in PostgreSQL?
If you go for a relational database, I would store the files line by line in a table. That way it is easy to fetch lines:
SELECT line FROM documents
WHERE docname = 'mydoc'
AND line_nr > 100
ORDER BY line_nr
FETCH FIRST 50 ROWS ONLY;
A b-tree index on (docname, line_nr)
would make the query very efficient.
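A minimal sketch of such a table, assuming the column names used in the query above (docname, line_nr, line); here the primary key doubles as the b-tree index:

-- One row per line of each file; column names match the query above.
CREATE TABLE documents (
    docname text   NOT NULL,
    line_nr bigint NOT NULL,
    line    text   NOT NULL,
    -- the primary key provides the b-tree index on (docname, line_nr)
    PRIMARY KEY (docname, line_nr)
);

With that index, fetching the next 50 lines of a document is a short index range scan, so the cost of serving a chunk does not grow with the size of the file.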
If you want to keep the table from getting too large, use range partitioning on docname.
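A sketch of what that could look like with declarative partitioning, assuming PostgreSQL 11 or later; the partition bounds are only examples and should be chosen to spread your documents evenly:

-- Same table as above, but split into partitions by ranges of document names.
CREATE TABLE documents (
    docname text   NOT NULL,
    line_nr bigint NOT NULL,
    line    text   NOT NULL,
    PRIMARY KEY (docname, line_nr)   -- must include the partition key
) PARTITION BY RANGE (docname);

-- Example partitions covering the whole key space.
CREATE TABLE documents_a_m PARTITION OF documents
    FOR VALUES FROM (MINVALUE) TO ('n');
CREATE TABLE documents_n_z PARTITION OF documents
    FOR VALUES FROM ('n') TO (MAXVALUE);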