Apply function with two arguments on same list and output only lower/upper triangular matrix without diagonal


I want to compare strings without unnecessary comparisons. So far I have:

[[[dice_coefficient(x,y) for x in ['a','a','b'][j]] for y in ['a','a','b'][0:j]] for j in [1,2]]

where dice_coefficient is defined here

This gives the expected output, but it does not look like the right approach if those strings were the comments of an author in a column of a pandas DataFrame.


Solution

  • One thing to note is that, in your case, the first iteration of your outer loop (for j ...) is a subset of the second iteration. To compare each pair only once, you can do:

    data = ['a', 'a', 'b']
    [
        [
            dice_coefficient(x, y)
            for x in data[i:]
        ]
        for i, y in enumerate(data[:-1], start=1)
    ]
    

    If you expect many repeated values in your data, you can use functools.lru_cache on dice_coefficient to avoid recomputing the same comparisons.
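
A rough sketch of the lru_cache suggestion might look like the following. The dice_coefficient shown here is only a stand-in based on character bigrams, assumed for illustration because the linked definition is not reproduced in this post, and the sample strings are made up.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def dice_coefficient(a, b):
    """Stand-in Dice coefficient on character bigrams.

    Assumption: this is NOT the linked definition, just an illustrative one.
    """
    if a == b:
        return 1.0
    bigrams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    bigrams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not bigrams_a or not bigrams_b:
        return 0.0
    overlap = len(bigrams_a & bigrams_b)
    return 2.0 * overlap / (len(bigrams_a) + len(bigrams_b))


# Hypothetical sample data with one repeated value.
data = ['night', 'night', 'nacht']

# Same comprehension as above: each pair of positions is compared once,
# and the diagonal (an item against itself) is skipped.
upper_triangle = [
    [dice_coefficient(x, y) for x in data[i:]]
    for i, y in enumerate(data[:-1], start=1)
]

# cache_info() reports how many calls were answered from the cache.
print(upper_triangle)
print(dice_coefficient.cache_info())
```

With this data the pair ('nacht', 'night') occurs twice, so the second call is served from the cache. Note that lru_cache keys on argument order, so (x, y) and (y, x) are cached separately, and the arguments must be hashable (plain strings are).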