python list dictionary tree

Build a nested dictionary or list from unindented data using Python


Hi, I have the data below, but it is not indented. Only the "Total" keyword tells me where a node ends, so that is all I have to build the tree structure. Input:

Current Assets  
Cash  
Checking 583961  
Savings 224600  
Petty Cash 89840  
Total Cash 898402  
Accounts Receivable 3593607  
Work in Process 589791  
Other Current Assets  
Prepaid Rent 164593  
Prepaid Liability Insurance 109728  
Total Other Current Assets 274321  
Total Current Assets 5356121  

I am looking for the output below:

{
    "Current Assets": {
        "Cash": {
            "Checking": 583961,
            "Savings": 224600,
            "Petty Cash": 89840,
            "Total Cash": 898402
        },
        "Accounts Receivable": 3593607,
        "Work in Process": 589791,
        "Other Current Assets": {
            "Prepaid Rent": 164593,
            "Prepaid Liability Insurance": 109728,
            "Total Other Current Assets": 274321
        },
        "Total Current Assets": 5356121
    }
}

I tried recursion and a node-based approach, but nothing worked. It would be great if someone could help me achieve this using Python.

Rules:

For example, "Work in Process" is not a sub-item of "Accounts Receivable"; it is an item of "Current Assets" only. Because "Work in Process" ends with a number, it has no children.

In the input data, "Cash" has no numeric value at the end, so entries like it have one or more children.

The "Cash" group ends at the "Total Cash" line, which carries a numeric value.

Neither "Work in Process" nor "Accounts Receivable" has children, because both end with a numeric value.


Solution

  • You can do this with a recursive function or just use a stack to keep track of the nesting. The basic rule is:

    • No number: increase nesting
    • Starts with "Total": decrease nesting.
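
The two rules above leave a third case implicit: a line that ends in a number but does not start with "Total" is a plain leaf at the current nesting level. A minimal sketch of that three-way classification follows. The helper name `classify` and the kind labels `open`, `leaf` and `close` are hypothetical, chosen only for illustration:

```python
import re

# Assumption: a value is a run of digits at the very end of the line.
LINE_RE = re.compile(r"^(.*?)\s+(\d+)$")


def classify(line):
    """Return (kind, label, value); kind is 'open', 'leaf' or 'close'."""
    text = line.strip()
    match = LINE_RE.match(text)
    if match is None:
        # No trailing number: the line opens a new nested group.
        return ("open", text, None)
    label, value = match.group(1), int(match.group(2))
    # A "Total ..." line stores its value and then closes the current group.
    kind = "close" if label.startswith("Total") else "leaf"
    return (kind, label, value)


for line in ["Cash  ", "Checking 583961  ", "Total Cash 898402  "]:
    print(classify(line))
```
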

    With a stack, it might look like:

    import re
    
    s = '''Current Assets
    Cash
    Checking 583961
    Savings 224600
    Petty Cash 89840
    Total Cash 898402
    Accounts Receivable 3593607
    Work in Process 589791
    Other Current Assets
    Prepaid Rent 164593
    Prepaid Liability Insurance 109728
    Total Other Current Assets 274321
    Total Current Assets 5356121'''
    
    
    def nest(items):
        res = {}
        stack = [res]  # stack[-1] is the dict currently being filled
        for item in items:
            item = item.strip()  # drop indentation and trailing spaces
            if not item:  # skip blank lines
                continue
            match = re.match(r'^(.*?)\s+(\d+)$', item)
            if match is None:  # no number: open a nested dict
                cur = {}
                stack[-1][item] = cur
                stack.append(cur)
            else:
                label, nums = match.groups()
                stack[-1][label] = int(nums)
                if label.startswith("Total"):  # end of subdict
                    stack.pop()
        return res
    
    
    nest(s.split('\n'))
    

    This will return:

    {
      'Current Assets': {
          'Cash': {
              'Checking': 583961,
              'Savings': 224600,
              'Petty Cash': 89840,
              'Total Cash': 898402
          },
          'Accounts Receivable': 3593607,
          'Work in Process': 589791,
          'Other Current Assets': {
              'Prepaid Rent': 164593,
              'Prepaid Liability Insurance': 109728,
              'Total Other Current Assets': 274321
          },
          'Total Current Assets': 5356121
      }
    }
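
The answer mentions a recursive function as an alternative to the stack but does not show one. A minimal sketch of how that might look is below, assuming every group that opens is later closed by a line starting with "Total". The name `nest_recursive` and the short `sample` input are hypothetical, used only for illustration. All recursive calls share a single iterator, so each call resumes exactly where the nested call stopped:

```python
import re


def nest_recursive(lines):
    """Build the nested dict by recursing into each group header."""
    it = iter(lines)  # one iterator shared by every recursion level

    def build():
        node = {}
        for line in it:
            text = line.strip()
            if not text:  # ignore blank lines
                continue
            match = re.match(r"^(.*?)\s+(\d+)$", text)
            if match is None:
                # No trailing number: recurse to collect the children.
                node[text] = build()
            else:
                label = match.group(1)
                node[label] = int(match.group(2))
                if label.startswith("Total"):
                    # The total line closes this group.
                    return node
        return node  # input exhausted (top level)

    return build()


sample = """Cash
Checking 583961
Total Cash 898402
Accounts Receivable 3593607"""

result = nest_recursive(sample.split("\n"))
print(result)
```

The recursion depth equals the nesting depth of the data, so for very deeply nested input the stack-based version may be the safer choice.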