This is a leetcode question : Given a string s, find the length of the longest substring without repeating characters.
xample 1:
Input: s = "abcabcbb" Output: 3 Explanation: The answer is "abc", with the length of 3. Example 2:
Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.
My code failed when it encountered this string : s = "dvdf"
I know what the problem is but I just can't code it to make it work. I know that my program will not star over from "v" after the char gets reset and will start from "d" but i dont know hot to make it do that.
This is my code :
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
substring_length = {}
i = 0
count = 0
char = ""
while i != len(s) :
if s[i] not in char:
char = char + s[i]
count += 1
i += 1
if i == len(s) :
substring_length[char] = count
elif s[i] in char:
substring_length[char] = count
char = ""
count = 0
if len(substring_length) == 0:
substring_length[s] = len(s)
return max(substring_length.values())
I know there are other solutions to my problem, but I'm sure mine would work if I can just find how to solve this issue.
You effectively answered your own question. Your method requires you to double back each time you hit a duplicate. But you are not decrementing the i
to do that before resetting the count. You need to scan back into the string and start again from just after where the run started.
elif s[i] in char:
substring_length[char] = count
char = ""
i -= (count - 1)
count = 0
Some comments:
elif
line could just be else
since it is the exact negation of the if
condition.(Edited to discuss efficiency - not enough space or formatting capability in the comment section)
But isn't my program still O(n). so it would be as fast as the sliding window approach no?
That depends on what we are counting! The in
operator could be implemented in constant time with reasonable memory and the knowledge that we aren't going to store duplicate characters. So we'll presume we already did that, ignore the real cost of that operator as implemented now, and just focus on how many times we need to call it. This gives a reasonable approximation to the complexity of the algorithm.
(I'm removing your redundant check in the elif
clause for the purposes of the below, and calling this Alg.1
)
Your algorithm calls in
for each character as it works its way forward through the string, but also repeats those checks when it needs to backtrack. The backtracking adds a variable amount of operations depending on the string in question.
Example: abcabcbb
requires your algoritm to check the presence of a character in the string 25 times; and pwwkew
requires 12 checks.
There was briefly another algorithm posted as an answer which seems to have disappeared. It naively checks all substrings that start at each position until it finds a duplicate. So it walks the strings a bit less efficiently. Let's call that Alg.2
.
By contrast, the below algorithm performs the in
operation once for each character in the input, and once more for each character between the start of the current longest substring and the repeated character.
def lengthOfLongestSubstring(self, s: str) -> int:
currentSubstring = ""
longestSubstring = s[0]
for i in s:
if len(longestSubstring) < len(currentSubstring):
longestSubstring = currentSubstring
while i in currentSubstring:
currentSubstring = currentSubstring[1:]
currentSubstring += i
if len(longestSubstring) < len(currentSubstring):
longestSubstring = currentSubstring
return len(longestSubstring), longestSubstring
If we generate many (I did 10000) random strings of 27 characters (first guarantee of a duplicate) and 1000 characters (stress any inefficiencies), then count the calls to the in
operator for each algorithm we get an average operation count of:
Alg.# Ops for 27 chars Ops for 1000 chars Ops for 'a..za' Ops for abab..aba Ops for aabb..zzaa
Alg.1 145 7032 53 77 132
Alg.2 165 7053 378 78 132
Alg.3 48 1994 28 52 105
Also included the number of operations for scenarios designed to be bad for each algorithm.