Search code examples
pythonstringstring-matching

Pattern matching for finding siblings


Consider this scenario were I have some sequence number in form of list of String, where a dot represent a level of tree.

I want to find the count of siblings of every sequence number. I tried and below is my code


list = ["33.1", "33.5", "33.15", "33.1.1", "33.1.2", "33.1.11", "34.5", "33.2.9"] 

for item in list:
    numberOfSiblings = 0
    splitText = item.rsplit(".", 1)
    
    j = 0
    while( j < len(list)):
        if( list[j].startswith(splitText[0]) and len(item) == len(list[j]) and item != list[j] ):
            numberOfSiblings +=1
        j+=1
    
    print(numberOfSiblings)

Here according to items in list..starting at zero index.. "33.1" will have sibling "33.5" and "33.15". (Note : "34.5" will not be counted as it does not start with "33") , so count will be 2. Similarly count of siblings for "33.5" and "33.15" will be 2 also.

Moving at index 3, element "33.1.1" will have siblings "33.1.2" and "33.1.11" only and so on for rest of the items accordingly.

Output should be like:

element => No.Of.Siblings

33.1 => 2  
33.5 => 2  
33.15 => 2  
33.1.1 => 2  
33.1.2 => 2  
33.1.11 => 2  
34.5 => 0  
33.2.9 => 0 

Here problem is with the length check in while block. Please help to implement this scenario..Thanks in advance


Solution

  • Easiest way to do this is split each 2 strings into separate parts, and compare all parts piece by piece except the last one element (which can differ)

    l = ["33.1", "33.5", "33.15", "33.1.1", "33.1.2", "33.1.11", "34.5", "33.2.9"] 
    
    for item in l:
        numberOfSiblings = 0
        splitText = item.split(".")
        for otherItem in l:
            otherSplitText = otherItem.split(".")
            if item != otherItem and len(splitText)==len(otherSplitText) and all(a == b for a, b in zip(splitText[:-1], otherSplitText[:-1])):
                numberOfSiblings += 1
        print(f"{item} {numberOfSiblings}")
    

    EDIT: Added O(n) solution based on Counter

    from collections import Counter
    
    l = ["33.1", "33.5", "33.15", "33.1.1", "33.1.2", "33.1.11", "34.5", "33.2.9"] 
    c = Counter()
    for item in l:
        head, tail = item.rsplit(".", 1)
        c[head] += 1
    
    for item in l:
        head, tail = item.rsplit(".", 1)
        print(f"{item} {c[head] - 1}")