python pandas group-by list-comprehension apply

Apply function with two arguments on same list and output only lower/upper triangular matrix without diagonal

I want to compare strings without unnecessary comparisons. So far I have:

[[[dice_coefficient(x,y) for x in ['a','a','b'][j]] for y in ['a','a','b'][0:j]] for j in [1,2]]

where dice_coefficient is defined here

which gives the expected output, but I am not sure it would be the right approach if those strings were the comments of an author in a column of a pandas DataFrame.

Solution

Note that in your case the first iteration of the outer loop (for j ...) is a subset of the second iteration. To compare each pair only once, you can do:

data = ['a', 'a', 'b']
[
    [
        dice_coefficient(x, y)
        for x in data[i:]  # only the elements that come after y
    ]
    # start=1 makes i the index of the element right after y
    for i, y in enumerate(data[:-1], start=1)
]
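The same pairs can also be produced with itertools.combinations. This is a different technique from the nested comprehension above, but it also yields each unordered pair exactly once and keeps the indices, which may be handy for filling a triangular matrix. A minimal sketch, assuming a stand-in dice_coefficient based on character bigrams (the linked definition is not reproduced here, so this one is hypothetical) and made-up sample strings:

```python
from itertools import combinations


def dice_coefficient(a, b):
    """Stand-in for the linked function (assumption): Dice similarity on character bigrams."""
    bigrams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    bigrams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not bigrams_a and not bigrams_b:
        # Strings shorter than two characters have no bigrams; fall back to equality.
        return 1.0 if a == b else 0.0
    return 2 * len(bigrams_a & bigrams_b) / (len(bigrams_a) + len(bigrams_b))


# Hypothetical sample data, only for illustration.
data = ['night', 'nacht', 'night']

# combinations(range(n), 2) yields every index pair with i < j exactly once,
# i.e. the strict upper triangle without the diagonal.
pairs = {
    (i, j): dice_coefficient(data[i], data[j])
    for i, j in combinations(range(len(data)), 2)
}

print(pairs)
```

The dictionary keys are the (row, column) positions of the upper triangle, so the lower triangle follows by swapping each key if the function is symmetric.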

If you expect a lot of repeated values in your data, you can decorate dice_coefficient with functools.lru_cache so that repeated comparisons are served from the cache instead of being recomputed.
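A rough sketch of the caching idea, again assuming a hypothetical bigram-based stand-in for dice_coefficient since the linked definition is not shown here. Note that lru_cache keys on the arguments in the order given, so (x, y) and (y, x) are cached as separate entries:

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache; arguments must be hashable (strings are)
def dice_coefficient(a, b):
    """Stand-in for the linked function (assumption): Dice similarity on character bigrams."""
    bigrams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    bigrams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not bigrams_a and not bigrams_b:
        # Strings shorter than two characters have no bigrams; fall back to equality.
        return 1.0 if a == b else 0.0
    return 2 * len(bigrams_a & bigrams_b) / (len(bigrams_a) + len(bigrams_b))


data = ['a', 'a', 'b']
result = [
    [dice_coefficient(x, y) for x in data[i:]]
    for i, y in enumerate(data[:-1], start=1)
]

# cache_info() reports how many calls were answered from the cache.
info = dice_coefficient.cache_info()
print(result, info)
```

With this sample, the comparison of 'b' against 'a' is requested twice and computed only once. If the function is symmetric, sorting the two arguments before the call would let both orders share one cache entry.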