Tags: python, list, named-entity-recognition

How to group items in a list using labels inside the items?


I have a list like this:

list_a = [('B-DATE', '07'),('I-DATE', '/'),('I-DATE', '08'),('I-DATE', '/'),('I-DATE', '20'),('B-LAW', 'Abc'),('I-LAW', 'def'),('I-LAW', 'ghj'),('I-LAW', 'klm')]

I need to join the list_a[x][1] values according to the list_a[x][0] labels: each group starts at a label beginning with "B" and runs up to (but not including) the next label beginning with "B":

list_b = ['07/08/20','Abcdefghjklm']

It is similar to string aggregation with GROUP BY in Oracle :)


Solution

  • One-Line Solution

    Here is a one-line answer using a list comprehension. The trick is to prepend a distinctly identifiable separator (I used '|||') to the value of each item whose label starts with 'B'. This assumes the separator never occurs in the values themselves.

    ''.join([f'|||{v}' if k.startswith("B") else v for k, v in list_a]).split('|||')[1:]
    

    Output:

    ['07/08/20', 'Abcdefghjklm']
    

    Algorithm

    1. Create a list of values where the values corresponding to each new occurrence of 'B' are preceded by '|||'.
    2. Join all the items in the list into a single string.
    3. Split the string by the separator, '|||'.
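    4. Keep all but the first element of the list returned by str.split(). The joined string starts with the separator, so the first element is an empty string.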
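
The four steps above can be sketched as a small function, which may be easier to read and debug than the one-liner. The second function is an alternative that is not part of the original answer: a plain loop that needs no separator, so it also works if a value happens to contain '|||'. The names group_with_separator and group_with_loop are hypothetical, chosen only for this illustration.

```python
list_a = [('B-DATE', '07'), ('I-DATE', '/'), ('I-DATE', '08'), ('I-DATE', '/'),
          ('I-DATE', '20'), ('B-LAW', 'Abc'), ('I-LAW', 'def'), ('I-LAW', 'ghj'),
          ('I-LAW', 'klm')]


def group_with_separator(pairs, sep='|||'):
    """Step-by-step version of the one-liner (assumes sep is not in any value)."""
    # Step 1: prefix the value of every 'B...' item with the separator.
    marked = [f'{sep}{value}' if label.startswith('B') else value
              for label, value in pairs]
    # Step 2: join everything into a single string.
    joined = ''.join(marked)
    # Step 3: split the string on the separator.
    parts = joined.split(sep)
    # Step 4: drop the first element (whatever precedes the first separator).
    return parts[1:]


def group_with_loop(pairs):
    """Alternative sketch without a separator (not from the original answer)."""
    groups = []
    for label, value in pairs:
        if label.startswith('B'):
            # A 'B...' label opens a new group.
            groups.append(value)
        elif groups:
            # Any other label extends the current group.
            groups[-1] += value
        # Items that appear before the first 'B...' label are skipped,
        # which mirrors what the separator-based version does.
    return groups


print(group_with_separator(list_a))
print(group_with_loop(list_a))
```

Both functions return the same result for list_a. They differ only when a value contains the separator text: the separator-based version would split that value, while the loop keeps it intact.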