
Java: Is it bad practice to store file streams in a database?


I'm reading the file streams of a certain group of files and storing them in a database as the bytea type. But when I try to read the streams back from the database and write them to files, it takes a very long time and I eventually get an out-of-memory exception. Is there an alternative that does this more efficiently, with or without a database involved?


Solution

  • Databases were designed with a key problem in mind:

    Given a bunch of data, where we don't know the kinds of reports
    that will be generated, how can we store the data in a manner that
    preserves the data's inner relationships and permits any reporting
    format we can think of?
    

    Files lack a few key characteristics of databases. Files consistently have a single structure of "characters in order". They also lack any means of integrated report building, and the reporting is often confined to simple searches, which have little context without the result being shown in the rest of the file.

    In short, if you aren't using the database's features, please don't use the database.

    Many people do store files in databases because they have one handy, and instead of writing support for filesystem storage, they cut and paste the database storage code. Let's explore the consequences:

    1. Backups and restores become problematic because the database grows in size very quickly, and the bandwidth to do the backup and restore is a function of the size of the database.
    2. Replication rebuilds in fail-safe databases take longer (I've seen some go so long that redundancy couldn't catch up to the rate of change in the primary database).
    3. Queries that (accidentally) reference the files in bulk spike the CPU, possibly starving access to the rest of the system (depends on database).
    4. Returning the results of those queries consumes bandwidth and other system resources, which keeps other queries from delivering their results (better on some databases, worse on others).
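
As a rough illustration of the filesystem alternative, here is a minimal sketch that streams each file to disk and returns only small metadata, which is what you would keep in the database row instead of the bytes. The names `FileStore`, `StoredFile`, `save`, and `writeTo` are made up for this example, and the JPA/database side is left out. Because `Files.copy` moves the data through a small internal buffer, memory use should stay flat regardless of file size, which is what avoids the out-of-memory error from loading a whole bytea value at once.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: keeps file contents on disk under a root directory.
final class FileStore {

    // Hypothetical metadata holder: this is what a database row would store.
    static final class StoredFile {
        final String relativePath;
        final long sizeBytes;

        StoredFile(String relativePath, long sizeBytes) {
            this.relativePath = relativePath;
            this.sizeBytes = sizeBytes;
        }
    }

    private final Path root;

    FileStore(Path root) throws IOException {
        // Create the storage directory if needed and keep a normalized absolute path.
        this.root = Files.createDirectories(root).toAbsolutePath().normalize();
    }

    // Streams the input to a file under the root; never holds the whole file in memory.
    StoredFile save(String name, InputStream in) throws IOException {
        Path target = root.resolve(name).normalize();
        if (!target.startsWith(root)) {
            // Reject names such as "../x" that would escape the storage directory.
            throw new IllegalArgumentException("Name escapes storage root: " + name);
        }
        long size = Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return new StoredFile(root.relativize(target).toString(), size);
    }

    // Streams a stored file to any output (another file, an HTTP response, ...).
    long writeTo(StoredFile file, OutputStream out) throws IOException {
        return Files.copy(root.resolve(file.relativePath), out);
    }
}
```

With something like this, the database only needs columns for the path and size (plus whatever you actually want to query on), so backups, replication, and bulk queries are no longer dominated by file contents.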