Algolia query by INTERSECTION COUNT with two lists


I need to query users based on "similar" interests: if user A shares at least 2 interests with user B, it's a match. I want to know whether the following query is possible in Algolia.

Let's say I have two objects in an Algolia index, each with a list property (interests):

obj1 = {
    interests: ['A', 'B', 'C'],
}
obj2 = {
    interests: ['B', 'C', 'D'],
}

And I want to query all objects with interests having at least 2 of the following:

interests: ['A', 'B', 'E']

This should return only obj1, since it is the only object that shares 2 interests with the list.

Any ideas?


Solution

  • I'm not sure whether there is an easier way to do this, but here is what I found. Take the query list
    interests: ['A', 'B', 'E']

    To capture your requirement, the filters statement would need to be something like
    '(interests:A AND interests:B) OR (interests:A AND interests:E) OR (interests:B AND interests:E)', which, read as a boolean expression (writing A, B and C for the three conditions), has the form AB+AC+BC.

    But Algolia does not accept this query as written. According to their docs:

    For performance reasons, we do not support the following boolean combinations:
    ...

    We limit filter expressions to a conjunction (ANDs) of disjunctions (ORs). For example you can use filter1 AND (filter2 OR filter3), but not ORs of ANDs (e.g. filter1 OR (filter2 AND filter3)).

    However, AB+AC+BC can be converted to product-of-sums form. Using https://www.dcode.fr/boolean-expressions-calculator I obtained the equivalent (A+B).(A+C).(B+C), which translates to

    '(interests:A OR interests:B) AND (interests:A OR interests:E) AND (interests:B OR interests:E)'

    The query also depends on how many elements the interests array contains. For example, if interests: ['A', 'E', 'C', 'F'], the final filter query takes the form
    '(interests:A OR interests:E OR interests:C) AND (interests:A OR interests:E OR interests:F) AND (interests:A OR interests:C OR interests:F) AND (interests:E OR interests:C OR interests:F)'

    With n interests in the array, each OR group contains n-1 terms, and there is one group per way of leaving out a single interest. A record that matches only one interest fails the group that omits that interest, so only records matching at least 2 pass every group.
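
    As a sanity check on that equivalence, here is a minimal sketch that compares the AND-of-ORs form with a direct "at least 2 matches" count for every possible subset of the query values. It runs locally and does not call Algolia; countMatches, matchesAndOfOrs and formsAgree are hypothetical helper names introduced only for this illustration.

    ```javascript
    // Count how many of the query values appear in a record's interests.
    const countMatches = (recordInterests, queryValues) =>
      queryValues.filter((v) => recordInterests.includes(v)).length;

    // Evaluate the AND-of-ORs form: every group that leaves out exactly one
    // query value must contain at least one value present in the record.
    const matchesAndOfOrs = (recordInterests, queryValues) =>
      queryValues.every((_, skipped) =>
        queryValues.some((v, i) => i !== skipped && recordInterests.includes(v))
      );

    // Compare both definitions for every subset of the query values.
    const formsAgree = (queryValues) => {
      const n = queryValues.length;
      for (let mask = 0; mask < 1 << n; mask++) {
        const record = queryValues.filter((_, i) => mask & (1 << i));
        const direct = countMatches(record, queryValues) >= 2;
        if (direct !== matchesAndOfOrs(record, queryValues)) return false;
      }
      return true;
    };

    console.log(formsAgree(["A", "B", "E"]));
    console.log(formsAgree(["A", "E", "C", "F"]));
    ```

    Applied to the objects from the question, matchesAndOfOrs(['A', 'B', 'C'], ['A', 'B', 'E']) passes every group, while ['B', 'C', 'D'] fails the group that omits B.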

    TLDR: '(interests:A OR interests:B) AND (interests:A OR interests:E) AND (interests:B OR interests:E)'

    You can use combination-generating code to build the filter query. Here is a JS example based on this solution.

    // Return every k-element combination of `set`, preserving input order.
    const k_combinations = (set, k) => {
        if (k > set.length || k <= 0) {
            return [];
        }
        if (k === set.length) {
            return [set];
        }
        if (k === 1) {
            return set.map((item) => [item]);
        }
        const combs = [];
        for (let i = 0; i < set.length - k + 1; i++) {
            // Fix set[i] as the head, then combine it with every
            // (k - 1)-combination of the elements after it.
            const head = set.slice(i, i + 1);
            const tailcombs = k_combinations(set.slice(i + 1), k - 1);
            for (const tail of tailcombs) {
                combs.push(head.concat(tail));
            }
        }
        return combs;
    }

    // Build the AND-of-ORs filter: one OR group per way of leaving out
    // a single interest, so each group has array.length - 1 terms.
    const generateFilterQuery = (array) => {
        const combinationSize = array.length - 1
        const groups = k_combinations(array, combinationSize)
        return groups.map((comb) => `(${comb.map(c => `interests:${c}`).join(" OR ")})`).join(" AND ")
    }
    
    
    console.log(generateFilterQuery(["A","B","E"]))
    console.log(generateFilterQuery(["A","B","C","D"]))
    console.log(generateFilterQuery(["A","B","C","D","E"]))

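    After generating the filter query, pass it as the value of the filters parameter:

    // `index` is assumed to be an already initialized Algolia index.
    const generatedQuery = generateFilterQuery(["A", "B", "E"])

    index.search('', { filters: generatedQuery })
        .then(({ hits }) => console.log(hits))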
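
    generateFilterQuery covers only the "at least 2" case. As a rough sketch of how the same idea might extend to "at least k of n": each OR group would hold n - k + 1 values, because a record matching fewer than k values misses at least n - k + 1 of them and so fails the group made up of those misses. The names chooseGroups, buildAtLeastFilter and minMatches are hypothetical and not part of Algolia's API; the sketch also assumes the values contain no spaces or quotes that would need escaping.

    ```javascript
    // All combinations of `size` elements from `items`, in input order.
    const chooseGroups = (items, size) => {
      if (size === 0) return [[]];
      if (size > items.length) return [];
      const [first, ...rest] = items;
      return [
        ...chooseGroups(rest, size - 1).map((group) => [first, ...group]),
        ...chooseGroups(rest, size),
      ];
    };

    // Hypothetical helper: express "at least `minMatches` of `values`"
    // as an AND of OR groups, each with values.length - minMatches + 1 terms.
    const buildAtLeastFilter = (values, minMatches, attribute = "interests") => {
      if (minMatches < 1 || minMatches > values.length) {
        throw new RangeError("minMatches must be between 1 and values.length");
      }
      const groupSize = values.length - minMatches + 1;
      return chooseGroups(values, groupSize)
        .map((group) => `(${group.map((v) => `${attribute}:${v}`).join(" OR ")})`)
        .join(" AND ");
    };

    console.log(buildAtLeastFilter(["A", "B", "E"], 2));
    console.log(buildAtLeastFilter(["A", "B", "C", "D"], 3));
    ```

    With minMatches set to 2 this should produce the same string as generateFilterQuery. The number of OR groups grows combinatorially with the size of the list, so for long interest lists it is worth checking the resulting filter length before sending it.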