The database structure of my Python application is very similar to Instagram's. I have users and posts, and users can follow each other. Accounts are either public or private.
I am indexing this data in Elasticsearch, and searching works fine so far. However, search returns all posts without checking whether the current user has access to them (e.g. a post created by a user with a private account whom the current user isn't following).
My data in Elasticsearch is indexed in a flat format across separate indexes: one for users and one for posts.
I can post-process the results that Elasticsearch returns and remove posts that the current user doesn't have access to, but this introduces an additional database query to retrieve the list of accounts that user follows, and possibly the blocklist as well (I don't want to show posts between users who have blocked each other either).
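For illustration, the post-processing approach described above might look roughly like the sketch below. The field names `author_id` and `author_is_private`, and the function name itself, are assumptions made for the example and not part of the actual schema; the two ID sets stand in for whatever the extra PostgreSQL query returns.

```python
from typing import Iterable, List, Set


def filter_visible_posts(
    hits: Iterable[dict],
    viewer_id: int,
    following_ids: Set[int],
    blocked_ids: Set[int],
) -> List[dict]:
    """Drop search hits that the viewer is not allowed to see.

    Assumes each hit carries `author_id` and `author_is_private`
    (hypothetical field names). `blocked_ids` is assumed to contain
    blocks in both directions.
    """
    visible = []
    for hit in hits:
        author_id = hit["author_id"]
        # Blocks win over everything else.
        if author_id in blocked_ids:
            continue
        # Visible if it is the viewer's own post, the author is public,
        # or the viewer follows the (private) author.
        if (
            author_id == viewer_id
            or not hit["author_is_private"]
            or author_id in following_ids
        ):
            visible.append(hit)
    return visible


sample_hits = [
    {"id": 1, "author_id": 10, "author_is_private": False},  # public author
    {"id": 2, "author_id": 11, "author_is_private": True},   # private, followed
    {"id": 3, "author_id": 12, "author_is_private": True},   # private, not followed
    {"id": 4, "author_id": 13, "author_is_private": False},  # public but blocked
]
visible_ids = [
    h["id"]
    for h in filter_visible_posts(
        sample_hits, viewer_id=1, following_ids={11}, blocked_ids={13}
    )
]
```

One side effect of filtering after the search is that a page of results can come back shorter than requested, so pagination becomes awkward.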
I can also add a list of follower IDs for each user to Elasticsearch upon indexing and then match against them, but when a user has thousands of followers these lists will be huge, and I am not sure how practical it will be to keep them in Elasticsearch.
How can I do this efficiently? My stack is a Python + Flask backend, a PostgreSQL database, and Elasticsearch as the search index.
Maybe you already found a solution...
Elasticsearch's "terms lookup" can solve this problem if you have an index holding the list of followers that you can filter on, as you said here:
I can also add a list of follower IDs for each user to Elasticsearch upon indexing and then match against them, but when a user has thousands of followers these lists will be huge, and I am not sure how practical it will be to keep them in Elasticsearch.
More details are in the docs: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/query-dsl-terms-query.html#query-dsl-terms-lookup
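As a rough sketch of what a terms lookup filter could look like from Python: the function below only builds the search body as a plain dict. The `user_follows` index, its `following` and `blocked` fields, and the `text`, `author_id` and `author_is_private` fields on posts are all assumptions for illustration. It also assumes one lookup document per user, keyed by that user's ID and listing the accounts they follow (rather than their followers), so that a single lookup by the viewer's ID is enough.

```python
def build_visible_posts_query(viewer_id, text, follows_index="user_follows"):
    """Build a search body that only matches posts the viewer may see.

    All index and field names here are hypothetical.
    """

    def lookup(path):
        # Terms lookup: Elasticsearch fetches the values stored under `path`
        # in the document `id` of `index` and uses them as the terms.
        return {
            "terms": {
                "author_id": {
                    "index": follows_index,
                    "id": str(viewer_id),
                    "path": path,
                }
            }
        }

    return {
        "query": {
            "bool": {
                "must": [{"match": {"text": text}}],
                "filter": [
                    {
                        "bool": {
                            # At least one of these must hold for a post to be visible.
                            "should": [
                                {"term": {"author_is_private": False}},
                                {"term": {"author_id": viewer_id}},
                                lookup("following"),
                            ],
                            "minimum_should_match": 1,
                        }
                    }
                ],
                # Exclude authors on the viewer's blocklist.
                "must_not": [lookup("blocked")],
            }
        }
    }


body = build_visible_posts_query(42, "sunset")
```

The resulting `body` would then be passed to whatever search call the application already uses against the posts index. With this layout the lookup documents are small per-user records that get updated on follow, unfollow and block events, and posts themselves never need reindexing when someone's follow list changes.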
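Note that a terms query is limited to 65,536 terms by default (this can be overridden with the index.max_terms_count index setting), so unless a single looked-up list grows beyond that, which is unlikely if your service doesn't have millions of users, the default limit will be fine.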