Search code examples
pythontime-complexityspace-complexity

Substring permutation complexity


Just had a quick question about a Python implementation for the Leetcode question about substring permutations, as I'm a major newbie with algorithms:

Given two strings s1 and s2, write a function to return true if s2 contains the permutation of s1. In other words, one of the first string's permutations is the substring of the second string.

Example 1:

Input: s1 = "ab" s2 = "eidbaooo"
Output: True
Explanation: s2 contains one permutation of s1 ("ba").

Example 2:

Input:s1= "ab" s2 = "eidboaoo"
Output: False

With one of the solutions given, I believe the following time complexity is O(n), but would the time complexity be O(1)? Additionally, would anyone have a resource for a Python implementation of a brute force approach? Thank you!

from collections import Counter

class Solution:
    def checkInclusion(self, s1: str, s2: str) -> bool:
        s1_count, s2_count = Counter(s1), Counter(s2[:len(s1)])
        
        for i in range(len(s2)-len(s1)):
            if s1_count == s2_count: 
                return True
            
            if s2_count[s2[i]] == 1: 
                del s2_count[s2[i]]
            else: 
                s2_count[s2[i]] -= 1
            
            if s2[i+len(s1)] in s2_count: 
                s2_count[s2[i+len(s1)]] += 1
            else: 
                s2_count[s2[i+len(s1)]] = 1
                
        return s1_count == s2_count
        
sol = Solution()
print(sol.checkInclusion('ab', 'eidbaooo'))


    

Solution

  • The time complexity of your own solution seems to be O(n) because the time is propotional to the length of input. O(1) means that the size of input doesn't matter which is not the case here. Creating and searching every possible permutations as in answer from inputvector is the brute force approach you are asking for and it is very slow when len(s1) is over 10. This has the very fast growing time complexity O(n!).

    I couldn't help trying to beat your algorithm. Turns out I couldn't. Here is my attompt. It's a little faster on short strings up to around 6 characters in s1. It has O(n²) time compexity.

    def checkInclusion2(s1,s2):
        i = -1
        max_i = len(s2)-len(s1)+1
        while i < max_i:
            i += 1
            c = s2[i]
            if not c in s1:
                continue
            temp = list(s1)
            temp.remove(c)
            for j, c in enumerate(s2[i+1:i+len(s1)]):
                if c in temp:
                    temp.remove(c)
                else:
                    if c not in s1:
                        i += j
                    break
            else:
                return True
        return False
    

    So, then I tried to optimise yours and found out its faster to compare defaultdicts than Counters. And saving character from s2 in a temporary variable is faster than fetching it several times. This is twice as fast but that usually doesn't count in terms of time complexity. It's still O(n).

    from collections import defaultdict
    
    def checkInclusion3(s1, s2):
        s1_count, s2_count = defaultdict(int), defaultdict(int)
        for i in range(len(s1)):
            s1_count[s1[i]] += 1
            s2_count[s2[i]] += 1
        for i in range(len(s2)-len(s1)):
            if s1_count == s2_count: 
                return True
            old_c = s2[i]
            if s2_count[old_c] == 1:
                del s2_count[old_c]
            else:
                s2_count[old_c] -= 1
            s2_count[s2[i+len(s1)]] += 1
        return s1_count == s2_count