Tags: java, data-storage, user-generated-content

Are there any libraries or components that handle storage and fast retrieval of user-generated content?


Consider a large and active user base where each user wants to store a profile picture and some additional images or other artifacts. Are there any libraries or frameworks that make it easy to store and query such data?

A reference implementation would be Facebook's Haystack Photo Infrastructure.

The following characteristics are important:

  • The data store should scale well: adding resources should be transparent to the application using the store (a similar question had an answer referring to LinkedIn's Voldemort).
  • Ability to add some metadata alongside the data being stored.
  • Metadata can be queried with good performance (e.g. stored in a configurable index like Lucene/Solr).
  • Quick key-based access and some intermediate caching layer.
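To make these requirements concrete, here is a minimal in-memory sketch of the kind of API I have in mind. Everything in it is hypothetical: the class name `ContentStoreSketch` and its methods are made up for illustration and are not taken from any existing library. The plain maps stand in for a scalable backing store and for a real index such as Lucene/Solr, and the cache is a simple LRU built on `LinkedHashMap`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: key-based blob store + metadata index + LRU cache. */
public class ContentStoreSketch {
    // Stand-in for the scalable backing store (assumption: some key-value store).
    private final Map<String, byte[]> blobs = new HashMap<>();
    // Metadata kept per key so old index entries can be removed on overwrite.
    private final Map<String, Map<String, String>> metadata = new HashMap<>();
    // Stand-in for a real index: maps "field=value" to the keys that match.
    private final Map<String, Set<String>> index = new HashMap<>();
    // Intermediate caching layer: access-ordered map that evicts the eldest entry.
    private final Map<String, byte[]> cache;
    private int backingReads = 0;

    public ContentStoreSketch(final int cacheSize) {
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > cacheSize;
            }
        };
    }

    /** Stores the data under the key and indexes its metadata. */
    public synchronized void put(String key, byte[] data, Map<String, String> meta) {
        blobs.put(key, data.clone());
        cache.remove(key); // drop any stale cached copy
        Map<String, String> old = metadata.put(key, new HashMap<>(meta));
        if (old != null) {
            for (Map.Entry<String, String> e : old.entrySet()) {
                Set<String> keys = index.get(e.getKey() + "=" + e.getValue());
                if (keys != null) {
                    keys.remove(key);
                }
            }
        }
        for (Map.Entry<String, String> e : meta.entrySet()) {
            index.computeIfAbsent(e.getKey() + "=" + e.getValue(), t -> new TreeSet<>()).add(key);
        }
    }

    /** Key-based access: served from the cache when possible. */
    public synchronized Optional<byte[]> get(String key) {
        byte[] cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached.clone());
        }
        byte[] stored = blobs.get(key);
        if (stored == null) {
            return Optional.empty();
        }
        backingReads++;
        cache.put(key, stored);
        return Optional.of(stored.clone());
    }

    /** Metadata query: returns the keys whose metadata has field == value. */
    public synchronized Set<String> findByMetadata(String field, String value) {
        return new TreeSet<>(index.getOrDefault(field + "=" + value, Collections.emptySet()));
    }

    /** Number of reads that had to go past the cache. */
    public synchronized int backingReads() {
        return backingReads;
    }
}
```

With this sketch, `put("u1/profile", bytes, Map.of("owner", "u1"))` followed by `findByMetadata("owner", "u1")` returns a set containing `"u1/profile"`, and a second `get("u1/profile")` is answered from the cache. What I am looking for is something that offers roughly this contract, but with real persistence and scaling behind it.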

Any recommendations for libraries or frameworks that can be easily integrated into a Java web application are welcome.

Update: thank you for the first few answers. I need to go into more detail on what type of answers I expect. Tobu's answer, although not Java-related, is very good (just voted up). It is possible to implement a solution with a combination of file system access and a DB and add some layer of caching in between, but I consider that a waste of time if someone more qualified than me has already designed, implemented and run a better solution. Something based on an underlying DB or JCR implementation is a good fit, but implementing the rest of the infrastructure is not what I want to do.


Solution

  • We've had good experiences with the media repository from Fedora Commons (http://www.fedora-commons.org/), which lets you store media assets alongside their associated metadata. We had no problems with scalability or customization, and it was not difficult to swap the underlying storage layer for a triple store (should you need that). If you need to index your data with Solr, you can use a predefined metadata field ("RELS-EXT") to store XML-based data.