In the Coursera course "Text Retrieval and Search Engines" I learnt about some feedback algorithms used in information retrieval systems, such as Rocchio. But I still can't understand how feedback is used in practice.
Why do all feedback algorithms update the query vector instead of updating the document ranking directly?
Is document click-through feedback stored in the postings list?
Thanks
But I still can't understand how feedback is used in practice.
Since you've studied Rocchio feedback, I'll explain with reference to that particular approach, although the same reasoning applies to any other feedback method as well, e.g. relevance modeling.
The Rocchio algorithm first modifies the current query representation (by adding new terms and re-weighting the initial query terms). It then performs a second-pass retrieval with the modified query and obtains a new ranked list.
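To make the query-modification step concrete, here is a minimal sketch of a Rocchio-style update, assuming queries and documents are sparse term-to-weight dictionaries. The function name `rocchio`, the toy vectors, and the alpha/beta/gamma values (commonly quoted textbook defaults) are illustrative assumptions, not something prescribed by the course.

```python
from collections import defaultdict

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector (term -> weight).

    query, and each document in relevant / non_relevant, is a dict
    mapping a term to its weight (e.g. a TF-IDF value).
    """
    updated = defaultdict(float)
    # Keep the original query terms, scaled by alpha.
    for term, weight in query.items():
        updated[term] += alpha * weight
    # Move towards the centroid of the documents judged relevant.
    for doc in relevant:
        for term, weight in doc.items():
            updated[term] += beta * weight / len(relevant)
    # Move away from the centroid of the documents judged non-relevant.
    for doc in non_relevant:
        for term, weight in doc.items():
            updated[term] -= gamma * weight / len(non_relevant)
    # Assumption: terms with non-positive weight are dropped, a common simplification.
    return {term: weight for term, weight in updated.items() if weight > 0}

# Hypothetical toy data: one query term, two relevant docs, one non-relevant doc.
query = {"jaguar": 1.0}
relevant = [{"jaguar": 1.0, "car": 1.0}, {"jaguar": 1.0, "engine": 1.0}]
non_relevant = [{"jaguar": 1.0, "cat": 1.0}]

new_query = rocchio(query, relevant, non_relevant)
print(sorted(new_query))  # terms in the expanded query
```

The expanded query now contains "car" and "engine" in addition to "jaguar", while "cat" is excluded. This new vector is what would be submitted for the second-pass retrieval.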
Why do all feedback algorithms update the query vector instead of updating the document ranking directly?
This is because if the initial query representation is not good enough, the initial ranked list won't have high recall. Reranking can only reorder the documents that were already retrieved, so it won't help much (unless, of course, you're doing a highly precision-oriented task and all you care about is P@10). Adding terms to the query, on the other hand, often has a significant impact on retrieving more relevant documents in the top 1000.
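A small illustration of the recall argument, using a hypothetical three-document collection and simple Boolean OR matching in place of a real ranking function (all names and data here are assumptions made for the example):

```python
# Toy collection: each document is represented by its set of terms.
docs = {
    "d1": {"jaguar", "car", "engine"},
    "d2": {"jaguar", "cat", "jungle"},
    "d3": {"sports", "car", "engine"},  # relevant, but lacks the original query term
}
relevant = {"d1", "d3"}

def retrieve(query_terms):
    """Return the ids of documents sharing at least one term with the query."""
    return {doc_id for doc_id, terms in docs.items() if terms & query_terms}

def recall(retrieved):
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant)

initial = retrieve({"jaguar"})
expanded = retrieve({"jaguar", "car", "engine"})

print(recall(initial))   # 0.5
print(recall(expanded))  # 1.0
```

No reordering of the initial result set can bring in d3, because d3 was never retrieved. Only the expanded query reaches it.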
Is document click-through feedback stored in the postings list?
No. A postings list belongs to one particular term (the term at the head of the list), and besides document ids it may hold per-document statistics for that term, e.g. term positions. Whether a document was clicked or not is global information that does not pertain to a specific term. Also, in the feedback setting discussed here, user clicks are not used to modify the ranking for the current query. Rather, they could be used to build user interest profiles.
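As a rough sketch of the distinction, assuming a simple in-memory positional index and a separate, hypothetical click log keyed by user and document (neither structure is taken from a specific search engine):

```python
from collections import defaultdict

def build_index(docs):
    """Build a positional inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index

docs = {"d1": "jaguar car engine jaguar", "d2": "jaguar cat"}
index = build_index(docs)

# Per-term, per-document information lives in the postings list.
print(index["jaguar"])  # {'d1': [0, 3], 'd2': [0]}

# Click information is not tied to any term, so it is kept elsewhere,
# here in a hypothetical log keyed by (user_id, doc_id).
clicks = defaultdict(int)
clicks[("u1", "d1")] += 1
```

Keeping the click log outside the index means it can be updated on every click without touching the postings lists.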