python list nlp text-mining

Access elements in a list of lists


I am new to text mining and am using Python. I have a list of lists: each inner list is a cluster of synonyms, and each word in a cluster is paired with a list of the numbers of the sentences in which it appears. The list I have looks like this:

syn_cluster = [[['Jack', [1]]], [['small', [1, 2]], ['modest', [1, 3]], ['little', [2]]], [['big', [1]], ['large', [2]]]]

For each cluster, I want the min and the max taken over its words' appearance lists, so the result should look like this:

[[['Jack', [1]]], [['small', 'modest', 'little'], [1, 3]], [['big', 'large'], [1, 2]]]

Solution

  • I am not sure a list of lists is the best data structure for your problem, but if you keep it, you could do:

    from itertools import chain
    
    serialized = []
    for syn_words in syn_cluster:
        words = [w[0] for w in syn_words]
        freqs = list(chain.from_iterable([f[1] for f in syn_words]))
        min_max = [min(freqs), max(freqs)]
        # or:  min_max = sorted({min(freqs), max(freqs)}) if you want [1] instead of [1, 1]
        # (sorted, because iteration order of a set is not guaranteed)
        serialized.append([words, min_max])
    
    # serialized is now:
    # [[['Jack'], [1, 1]],
    #  [['small', 'modest', 'little'], [1, 3]],
    #  [['big', 'large'], [1, 2]]]
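
If you need this in more than one place, the same loop can be wrapped in a function. The following is only a sketch: the name `summarize_clusters`, the choice to skip clusters with no sentence numbers, and collapsing the range to a single value when min and max coincide (so 'Jack' gets `[1]` rather than `[1, 1]`) are my assumptions, not something the question specifies.

```python
from itertools import chain

def summarize_clusters(clusters):
    """Collapse each cluster into [words, [min_sentence, max_sentence]].

    Hypothetical helper: the name and the edge-case handling are assumptions.
    """
    summary = []
    for cluster in clusters:
        # Each entry in a cluster is [word, [sentence numbers]].
        words = [word for word, _ in cluster]
        # Flatten every word's sentence list into one list for the cluster.
        sentences = list(chain.from_iterable(nums for _, nums in cluster))
        if not sentences:
            # Assumption: a cluster with no sentence numbers is skipped,
            # since min() and max() would raise ValueError on an empty list.
            continue
        lo, hi = min(sentences), max(sentences)
        # Keep a single value when the whole cluster sits in one sentence.
        span = [lo] if lo == hi else [lo, hi]
        summary.append([words, span])
    return summary

syn_cluster = [
    [['Jack', [1]]],
    [['small', [1, 2]], ['modest', [1, 3]], ['little', [2]]],
    [['big', [1]], ['large', [2]]],
]

result = summarize_clusters(syn_cluster)
print(result)
# [[['Jack'], [1]], [['small', 'modest', 'little'], [1, 3]], [['big', 'large'], [1, 2]]]
```

Note that this wraps every cluster's words in a list, including single-word clusters, so the first entry is `[['Jack'], [1]]` rather than the `[['Jack', [1]]]` shown in the question. A uniform shape is usually easier to process afterwards.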