solr, distinct, solr4, distinct-values

Retrieving distinct documents from Solr


I've had a hard time explaining and finding what I need, so please put yourself in my shoes for a moment.

My requirement comes from a relational database background. I may be using Solr to do something it wasn't designed to do, or maybe it can do what I need; I still need to confirm that. Hopefully you can assist me.

After indexing numerous documents into Solr, I need to retrieve distinct documents based on a filter. Think of it as retrieving distinct rows while also applying a WHERE condition.

For example, in a relational database, I may have the following columns:

(Country)  (City)     (Whatever)
 Egypt      Cairo      Hospitals
 Egypt      Alex       Schools
 Egypt      Mansoura   Hospitals
 Egypt      Cairo      Schools

If I perform this query: SELECT DISTINCT Country, City FROM mytable

I should get the following rows:

(Country)  (City)
 Egypt      Alex
 Egypt      Mansoura
 Egypt      Cairo

Now, after indexing the original table (SELECT * FROM mytable), how can I achieve the SAME output from Solr? How can I tell Solr that the documents I retrieve must be distinct based on some fields? I will also need to apply a not-null filter on a specific field.

I don't need statistics of any kind; I only need to get the documents.

I hope I was clear enough. Thank you for your time.


Solution

  • This would be achievable with field collapsing (result grouping) by grouping on multiple fields, but unfortunately grouping on only one field is supported right now. There is an open issue for this; check it out.
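
Since grouping works on a single field, one possible workaround is to index a combined field and group on that. The following is a minimal sketch, not a confirmed recipe: the field names `country_city` and `whatever`, the `|` separator, and both helper functions are hypothetical. It only builds the request parameters (`group`, `group.field`, `group.limit`, `group.main`, and an `fq` range filter for the not-null condition); sending them to your Solr core's select handler is left out.

```python
from urllib.parse import urlencode


def combined_key(country, city, sep="|"):
    """Build the value of a hypothetical single field to group on.

    Assumption: this value is stored in an extra field (here called
    country_city) on each document at index time.
    """
    return f"{country}{sep}{city}"


def distinct_query_params(group_field, not_null_field):
    """Parameters asking for one document per distinct group_field value."""
    return {
        "q": "*:*",
        # Range query matching any value, i.e. the field is not null.
        "fq": f"{not_null_field}:[* TO *]",
        "group": "true",
        "group.field": group_field,
        # Keep a single representative document per group.
        "group.limit": 1,
        # Return the grouped documents as a flat result list.
        "group.main": "true",
        "wt": "json",
    }


# Hypothetical field names, chosen only for illustration.
params = distinct_query_params("country_city", "whatever")
query_string = urlencode(params)
print(combined_key("Egypt", "Cairo"))
print(query_string)
```

With the sample table, documents would carry keys such as `Egypt|Cairo` in the combined field, so the two Cairo rows fall into the same group and only one of them is returned.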