
Assemble separate sentences connected by dictionary attributes


I have the following list of dictionaries, and I'm trying to assemble a single connected sentence from it. Each child clause's dictionary has a "ref" attribute that connects it to its father clause.

clause_list = [
    {"id": "T1", "text": "hi"},
    {"id": "T2", "text": "I'm", "ref": "T1"},
    {"id": "T3", "text": "Simone", "ref": "T2"},
]

The expected output is "hi I'm Simone", without partial sentences like "hi I'm" or "I'm Simone".

What I tried so far is the following, but no matter how I flip it, the undesired sentences always get printed.

for c in clause_list:
  for child in clause_list:
    try:
      if c["id"] == child["ref"] and "ref" not in c.keys():
        print(c["text"], child["text"])
      elif c["id"] == child["ref"] and "ref" in c.keys():
        for father in clause_list:
          if father["id"] == c["ref"] and "ref" not in father.keys():
            print(father["text"], c["text"], child["text"])
    except KeyError:
      pass

Solution

  • It is probably best to use a class rather than a dict, and converting the list of dicts to a list of objects is easy. The following works: convert the list of dicts to a list of Clause objects. Each Clause has a search_ref method that prints the accumulated text if no other clause references it, or otherwise appends the referencing clause's text and continues from there. If two clauses reference the same one, I don't know exactly what you want.

    clause_list = [
        {"id": "T1", "text": "hi"},
        {"id": "T2", "text": "I'm", "ref": "T1"},
        {"id": "T3", "text": "Simone", "ref": "T2"},
    ]
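    class Clause:
        def __init__(self, id, text, ref=None):
            self.id = id
            self.text = text
            self.ref = ref  # id of the father clause, or None for a root

        def search_ref(self, clauses, text=''):
            # Append this clause's text to what has been accumulated so far.
            partial_text = (text + ' ' + self.text).strip()
            for clause in clauses:
                if clause.ref == self.id:
                    # Follow the first clause that references this one.
                    return clause.search_ref(clauses, partial_text)
            # No clause references this one: the sentence is complete.
            print(partial_text)

    clauses = [Clause(id=c['id'], text=c['text'], ref=c.get('ref')) for c in clause_list]

    # Start only from root clauses (no "ref") so partial sentences are never printed.
    for c in clauses:
        if c.ref is None:
            c.search_ref(clauses)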
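
For the case where two clauses reference the same father, one possible reading is that each branch should produce its own full sentence. The following is a minimal sketch under that assumption, working on the dicts directly. The function name `assemble_sentences` is hypothetical, and the sketch assumes ids are unique and the "ref" links contain no cycles; clauses whose "ref" points to a missing id are silently skipped.

```python
from collections import defaultdict


def assemble_sentences(clauses):
    """Return one full sentence per root-to-leaf chain of clauses."""
    # Index children by the id they reference (assumes ids are unique).
    children = defaultdict(list)
    for clause in clauses:
        if "ref" in clause:
            children[clause["ref"]].append(clause)

    sentences = []

    def walk(clause, words):
        # Build a new list so sibling branches do not share state.
        words = words + [clause["text"]]
        kids = children.get(clause["id"], [])
        if not kids:
            # Leaf: nothing references this clause, so the chain is complete.
            sentences.append(" ".join(words))
        for kid in kids:
            walk(kid, words)

    # Start only from roots (no "ref"), so partial sentences are never emitted.
    for clause in clauses:
        if "ref" not in clause:
            walk(clause, [])
    return sentences


clause_list = [
    {"id": "T1", "text": "hi"},
    {"id": "T2", "text": "I'm", "ref": "T1"},
    {"id": "T3", "text": "Simone", "ref": "T2"},
]
print(assemble_sentences(clause_list))  # ["hi I'm Simone"]
```

Returning a list instead of printing makes the result easier to reuse. If only the first branch is wanted, as in the class-based version, the loop over `kids` could stop after the first child.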