Search code examples
pythondatabaseflat-file

When is it appropriate to use a database , in Python


I am making a little add-on for a game, and it needs to store information on a player:

  • username
  • ip-address
  • location in game
  • a list of alternate user names that have came from that ip or alternate ip addresses that come from that user name

I read an article a while ago that said that unless I am storing a large amount of information that can not be held in ram, that I should not use a database. So I tried using the shelve module in python, but I'm not sure if that is a good idea.

When do you guys think it is a good idea to use a database, and when it better to store information in another way , also what are some other ways to store information besides databases and flat file databases.


Solution

  • Most importantly, unless you specifically need performance or high reliability, do whatever will make your code simplest/easiest to write.


    If your data is extremely structured (and you know SQL or are willing to learn) then using a database like sqlite3 might be appropriate. (You should ignore the comment about database size and RAM: there are times when databases are appropriate for even very small data sets, because of how the data is structured.)

    If the data is relatively simple and you don't need the reliability that a database (normally) has then storing it in one of the builtin datatypes while the program is running is probably fine.

    If you'd like the data stored on disk to be human readable (and editable, with a bit of effort), then a format like JSON (there is builtin json module) is nice, since the basic Python objects serialise without any effort. If the data not so simple then YAML is essentially an extended version of JSON (PyYAML is very good.). Similarly, you could use CSV files (the csv modules), although this is not nearly as good as JSON or YAML, or just a custom text format (but this is quite a lot of effort to get error handling and so on implemented neatly).

    Finally, if your data contains more advanced objects (e.g. recursive dictionaries, or complicated custom datatypes) then using one of the builtin binary serialisation techniques (pickle, shelve etc.) might be appropriate, however, YAML can handle many of these things (including recursive data structures).

    Some general points:

    • Plain text formats are nice, as they allow values to be tweaked easily and debugging/testing is easy
    • Binary formats are nice, as they mean that values can't be tweaked without a little bit of extra effort (this is not saying they can't be adjusted though), and the file size is smaller (probably not relevant)