ruby-on-rails versioning storage document dms

Document management system: what to use as storage backend (docs content repository)?


I want to build a document management system (with the interface in Ruby).
What do professional solutions (Alfresco, Liferay Social Office, others) use for storing and versioning documents?
What else can I use?

Key points:

  • storage space optimization (deltas, compression ...)
  • versioning
  • ability to index docs (can be external)
  • ability to make backups at runtime (live hot-backup)
  • locking?
  • scalability on large data volume
  • ensure data integrity (hashing?)
  • permissions
  • transactional
  • Workflow support (optional)

Bonus points:

Any books on this topic?


Solution

  • Most of the enterprise document management solutions I've seen (Cimage, Documentum, LiveLink) definitely don't care about your first point, storage space optimization. Storage is relatively cheap, especially when weighed against the processing cost of storing and retrieving. They mostly rely on filesystem-based storage, perhaps with name abstraction so that ShoppingList.doc becomes 20100909100101a.doc.rev1, with a database tracking the given name, the stored name, revisions, and various other data (MIME type, headers, properties, etc.). Because files are stored whole, without deltas or compression, indexing comes easily from any number of existing products and algorithms. Versioning is also extremely simple with this approach.

    Depending on the size and scale you're building, you could also store versioned files within a database.

    An (S)FTP or CIFS storage process would also let your software run on an app server with modest disk space while keeping the files and their history on a file server or cloud server of some sort, although this isn't much different from filesystem-based storage.
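
To make the filesystem-plus-catalog idea above concrete, here is a minimal Ruby sketch. It is only an illustration under assumptions, not how Documentum or any other product works: the `DocumentStore` class, the `Revision` struct, and the method names are hypothetical, an in-memory Hash stands in for the database table that would track given names and revisions, and SHA-256 is my own choice for the integrity check the question asks about. Whole files are written per revision, with no deltas or compression.

```ruby
require "digest"
require "fileutils"
require "securerandom"
require "tmpdir"

# Hypothetical sketch: whole files on disk under generated names, plus a
# catalog mapping each given name to its revisions. The catalog is a Hash
# here; a real system would keep it in a database table.
class DocumentStore
  Revision = Struct.new(:number, :stored_name, :mime_type, :sha256, :stored_at)

  def initialize(root)
    @root = root
    @catalog = Hash.new { |hash, name| hash[name] = [] }
    FileUtils.mkdir_p(@root)
  end

  # Writes a new revision of given_name to disk and records it in the catalog.
  def add(given_name, content, mime_type: "application/octet-stream")
    history = @catalog[given_name]
    number = history.size + 1
    now = Time.now.utc
    # Stored name: timestamp + random suffix + original extension + revision,
    # e.g. 20100909100101-ab12cd34.doc.rev1
    stored_name = format("%s-%s%s.rev%d",
                         now.strftime("%Y%m%d%H%M%S"),
                         SecureRandom.hex(4),
                         File.extname(given_name),
                         number)
    File.binwrite(File.join(@root, stored_name), content)
    revision = Revision.new(number, stored_name, mime_type,
                            Digest::SHA256.hexdigest(content), now)
    history << revision
    revision
  end

  # Returns the content of a revision (latest by default) after checking
  # that the bytes on disk still match the recorded hash.
  def read(given_name, revision: nil)
    history = @catalog.fetch(given_name) { raise KeyError, "unknown document: #{given_name}" }
    record = revision ? history.find { |r| r.number == revision } : history.last
    raise KeyError, "no revision #{revision} of #{given_name}" unless record

    content = File.binread(File.join(@root, record.stored_name))
    unless Digest::SHA256.hexdigest(content) == record.sha256
      raise IOError, "checksum mismatch for #{record.stored_name}"
    end
    content
  end

  # Lists the revision numbers known for given_name.
  def revision_numbers(given_name)
    @catalog.fetch(given_name, []).map(&:number)
  end
end

# Usage: two revisions of the same document in a throwaway directory.
Dir.mktmpdir do |dir|
  store = DocumentStore.new(dir)
  store.add("ShoppingList.doc", "milk", mime_type: "application/msword")
  store.add("ShoppingList.doc", "milk, eggs", mime_type: "application/msword")
  puts store.read("ShoppingList.doc")              # latest revision
  puts store.read("ShoppingList.doc", revision: 1) # first revision
end
```

Because each revision is an ordinary file, an external indexer or a backup tool can read the storage directory directly. Pointing `root` at an (S)FTP- or CIFS-mounted path would move the files off the app server without changing this code. Locking, permissions, and transactions are not covered by the sketch.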