amazon-web-services, reddit, amazon-cloudsearch

Does reddit use Amazon CloudSearch?


I read in the reddit wiki that reddit moved to IndexTank, but when I reviewed the run.py file I found keys like Cloud_Search_Api_key ..., so I guessed it is using Amazon CloudSearch. If this is true, which values should be changed in run.py to make CloudSearch work? And what is subreddit_cloud_api_key?

Thanks


Solution

  • I'm pretty sure Reddit uses CloudSearch. Their GitHub FAQ claims they use IndexTank, but IndexTank has been shut down since April 2012. If you search on Reddit and highlight the "δ" symbol, it will show text like "δ converted query to cloudsearch syntax: (and (field text 'search') (field text 'terms'))".

    I'm not too familiar with Python or AWS, but it looks like CLOUDSEARCH_SEARCH_API and the other similar variables are URLs that Amazon calls Endpoints.

    The variable names in reddit/r2/run.ini contain SEARCH and DOC, mirroring Amazon's documentation. Also, cloudsearch.py makes an HTTP connection to that variable:

    search_api = g.CLOUDSEARCH_SEARCH_API
    # ...
    connection = httplib.HTTPConnection(search_api, 80)
    

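    So you would probably set CLOUDSEARCH_SEARCH_API to your CloudSearch search endpoint. Since the value is passed straight to HTTPConnection, it should be the bare hostname, without http:// or a path.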

    EDIT: Kemitche has answered this on Reddit. Unlike me, he knows what he's talking about, so take a look.
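
To illustrate how a setting like CLOUDSEARCH_SEARCH_API ends up being used, here is a minimal sketch in Python 3 (the reddit code above is Python 2, where the module is called httplib). It is not reddit's actual code: the endpoint value is a made-up placeholder, and endpoint_host and make_search_connection are hypothetical helpers. It only shows that the connection class wants a hostname, so a pasted URL has to be stripped down to its host first.

```python
from http.client import HTTPConnection
from urllib.parse import urlsplit

# Placeholder value (an assumption for illustration); use the search
# endpoint shown for your own CloudSearch domain instead.
CLOUDSEARCH_SEARCH_API = "search-example-abc123.us-east-1.cloudsearch.amazonaws.com"


def endpoint_host(value):
    """Return the bare hostname from an endpoint setting.

    Accepts either a plain hostname or a full URL such as
    "http://host/path" and returns just the host part.
    """
    value = value.strip()
    # urlsplit only fills in .hostname when the string contains "//",
    # so add it for values that are already bare hostnames.
    parts = urlsplit(value if "//" in value else "//" + value)
    return parts.hostname


def make_search_connection(search_api=CLOUDSEARCH_SEARCH_API):
    """Build (but do not open) an HTTP connection to the search endpoint."""
    # Constructing HTTPConnection does not touch the network; the socket
    # is opened only when a request is actually sent.
    return HTTPConnection(endpoint_host(search_api), 80)
```

With this, a value pasted as "http://search-example-abc123.us-east-1.cloudsearch.amazonaws.com/" and the bare hostname both produce a connection object pointing at the same host on port 80.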