mongodb email couchdb archive nosql

How to implement an IMAP server on top of a CouchDB/NoSQL data store?


To summarize my objective: I am looking for a simple, open-source way to create and maintain a backup/archive (preferably in a NoSQL database) of one or more remote IMAP email accounts per user, and to keep each user's accounts in sync. The solution should be low cost, scale out easily, and use server resources efficiently. In addition, each user must be able to connect to their central email archive simply by adding a new IMAP account to their existing email client, using an IMAP server, username, and password provided by this archive server.

More specifically:

I have been looking for a scalable open-source solution that can run (and thus be easily scaled out) in the cloud and that provides the following:

1) Lets me specify a variety of IMAP servers, together with the login information used to access those email accounts, and download/sync all the emails within each account (ideally including folders/labels).

2) For the database that stores the emails for each account, I was looking into scalable solutions such as CouchDB or MongoDB, which would presumably maintain a simple index of every email. This index would hold basic information for each email in fields such as: from, to, date/time stamp, subject line, associated folders/labels, first sync date/time, last sync date/time, read/unread status, number of attachments, attachment filenames/sizes/types, the IMAP account it belongs to, and so on.

3) For storage of the original emails, including their attachments, I was thinking each email should be downloaded as an individual file with a unique filename/message-id that is referenced from the main email index. All of these original emails could then be stored in Amazon S3 for virtually unlimited scalability.
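Points 2) and 3) can be sketched together in Python: parse each downloaded message, derive a storage key for the raw file, and build the index document that would go into CouchDB or MongoDB. This is only an illustration under assumptions. The field names, the `blob_key` layout, and both function names are hypothetical, and the actual upload to S3 and insert into the database are left out.

```python
import hashlib
from datetime import datetime, timezone
from email import message_from_bytes, policy


def blob_key(account: str, message_id: str) -> str:
    """Derive a stable object key for the raw message (hypothetical layout)."""
    digest = hashlib.sha256(message_id.encode("utf-8")).hexdigest()
    # A two-character prefix spreads objects across key prefixes.
    return f"{account}/{digest[:2]}/{digest}.eml"


def build_index_doc(raw: bytes, account: str, folder: str, seen: bool) -> dict:
    """Build the index entry for one message; the raw bytes are stored elsewhere."""
    msg = message_from_bytes(raw, policy=policy.default)

    # Fall back to a content hash when the Message-ID header is missing.
    message_id = str(msg["Message-ID"] or "").strip() or hashlib.sha256(raw).hexdigest()

    date_header = msg["Date"]
    sent = None
    if date_header is not None and date_header.datetime is not None:
        sent = date_header.datetime.isoformat()

    attachments = [
        {
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            "size": len(part.get_payload(decode=True) or b""),
        }
        for part in msg.iter_attachments()
    ]

    now = datetime.now(timezone.utc).isoformat()
    return {
        "_id": message_id,
        "account": account,
        "folders": [folder],
        "from": str(msg["From"] or ""),
        "to": str(msg["To"] or ""),
        "subject": str(msg["Subject"] or ""),
        "date": sent,
        "seen": seen,
        "first_synced": now,
        "last_synced": now,
        "attachment_count": len(attachments),
        "attachments": attachments,
        "blob_key": blob_key(account, message_id),
    }
```

Hashing the message-id avoids putting characters such as `<`, `>`, or `/` into the object key, while the index document keeps the original message-id so the two can still be matched up.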

Up to this point I believe there are existing open-source solutions that can be used or customized to achieve these goals. Most notably, it seems that "offlineIMAP" provides all of these capabilities and more, but if you are aware of a different alternative, please let me know.
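For reference, the download/sync step in point 1) can also be sketched directly with Python's standard `imaplib`, independent of offlineIMAP. The function name `fetch_new_messages` and the idea of remembering the last seen UID per folder are assumptions made for this sketch; a real sync would also need to track UIDVALIDITY, flags, and deletions.

```python
import imaplib
from typing import Iterator, Tuple


def fetch_new_messages(
    conn: imaplib.IMAP4, folder: str, last_uid: int
) -> Iterator[Tuple[int, bytes]]:
    """Yield (uid, raw_message) for every message with a UID above last_uid."""
    typ, _ = conn.select(folder, readonly=True)
    if typ != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")

    typ, data = conn.uid("SEARCH", None, f"UID {last_uid + 1}:*")
    if typ != "OK":
        raise RuntimeError("UID SEARCH failed")

    for token in (data[0] or b"").split():
        uid = int(token)
        # "n:*" always matches the newest message, even when its UID is
        # below n, so already-synced UIDs are filtered out here.
        if uid <= last_uid:
            continue
        typ, parts = conn.uid("FETCH", str(uid), "(RFC822)")
        if typ != "OK":
            continue
        for part in parts:
            # The message body arrives as the second item of a tuple.
            if isinstance(part, tuple):
                yield uid, part[1]
                break
```

In use, `conn` would be something like an `imaplib.IMAP4_SSL` object that has already logged in with the stored credentials for that account, and the highest UID yielded would be saved for the next run.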

Ok, now to the element I am unsure about...

5) What I need here is a way for any email client that natively supports IMAP to connect to my custom email database as if it were a regular IMAP email server. So I guess I need some type of connector that translates the IMAP protocol into actions performed on CouchDB (or whatever data store is used). Naturally, any standard IMAP functions such as search/copy/move/delete should work as well, while the details of an individual email are retrieved by fetching the associated file from Amazon S3. (I am just assuming this method makes the most sense given the reduction in costs of doing it this way.)
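To make the "connector" idea concrete, here is a minimal Python sketch of the backend operations such a layer would have to expose. It is not an IMAP server: the class `ArchiveMailbox`, its method names, and the in-memory dictionary standing in for CouchDB/MongoDB are all hypothetical, and `fetch_blob` is an assumed callable that would read the raw message from S3.

```python
from typing import Callable, Dict, List


class ArchiveMailbox:
    """Index-backed mailbox operations; message bodies live in a blob store."""

    def __init__(self, fetch_blob: Callable[[str], bytes]):
        self._docs: Dict[str, dict] = {}  # stand-in for the NoSQL index
        self._fetch_blob = fetch_blob     # stand-in for an S3 download

    def add(self, doc: dict) -> None:
        # Copy the folder list so the caller's document is not mutated later.
        self._docs[doc["_id"]] = dict(doc, folders=list(doc["folders"]))

    def search(self, folder: str, subject_contains: str = "") -> List[str]:
        needle = subject_contains.lower()
        return sorted(
            d["_id"]
            for d in self._docs.values()
            if folder in d["folders"]
            and not d.get("deleted")
            and needle in d["subject"].lower()
        )

    def copy(self, msg_id: str, dest: str) -> None:
        folders = self._docs[msg_id]["folders"]
        if dest not in folders:
            folders.append(dest)

    def move(self, msg_id: str, src: str, dest: str) -> None:
        if src == dest:
            return
        self.copy(msg_id, dest)
        self._docs[msg_id]["folders"].remove(src)

    def delete(self, msg_id: str) -> None:
        # Only marks the message, similar in spirit to IMAP's \Deleted flag.
        self._docs[msg_id]["deleted"] = True

    def fetch_body(self, msg_id: str) -> bytes:
        return self._fetch_blob(self._docs[msg_id]["blob_key"])
```

Search, copy, move, and delete touch only the index, so they never need to read from S3; only `fetch_body` does, which matches the cost reasoning above.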

Assuming my logic and approach are sound in terms of using CouchDB/MongoDB this way, it seems to me that this setup should let me scale out to multiple users easily, and that accessing the archives should be fairly quick.

Does anyone have any experience, suggestions or advice/scripts related to achieving these goals?

The only negative side effect I can think of with this type of email archive setup, and with using Amazon S3 to store the actual emails, is that users would not be able to search the contents (body) of their archived emails by keyword. I guess this could be solved by adding another field to the CouchDB email index that holds the actual message text extracted from each email (excluding any content quoted from previous replies/forwards).
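A rough Python sketch of that extraction step is shown below. The function name `searchable_text` is made up for this example, and the rules for detecting quoted content (lines starting with `>`, an "On ... wrote:" line, an "Original Message" separator) are simple heuristics that will miss some mail clients' quoting styles.

```python
import re
from email import message_from_bytes, policy

# Heuristic: a line such as "On Mon, Jan 1, 2024, Bob wrote:" starts a quote.
_REPLY_HEADER = re.compile(r"^On .+ wrote:\s*$")


def searchable_text(raw: bytes) -> str:
    """Return the plain-text body without quoted reply/forward content."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    if body is None:
        return ""

    kept = []
    for line in body.get_content().splitlines():
        # Everything after a reply header or forward separator is dropped.
        if _REPLY_HEADER.match(line) or line.startswith("-----Original Message-----"):
            break
        # Individual quoted lines are skipped.
        if line.lstrip().startswith(">"):
            continue
        kept.append(line)
    return "\n".join(kept).strip()
```

The returned string could be stored in the extra index field and fed to whatever full-text search the chosen database offers.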


Solution

  • Regarding 5): You may want to look at Apache James. As far as I know it supports many storage engines, and you may be able to use or adapt one of them. That way it can provide an IMAP interface to your database. Of course, it does not synchronize from other servers; you have to do that using the other methods already mentioned.