Given a string s and a dictionary of valid words d, determine the largest number of valid words the string can be split up into.
I tried solving this problem with the code below but it is not giving me the answer I am looking for.
def word_split_dp(s):
n = len(s)
ans = [0]*n
# base case
ans[0] = 0
for i in range(1, len(s)):
ans[i] = float('-inf')
for j in range(1, i-1):
ans[i]= max(ans[i], ans[i-j] + wordcheck(s,i,j))
print(ans)
return ans[n-2]
def wordcheck(s,i,j):
for word in dict:
start = i-j+1
if(s[start:i] == word):
return 1
else:
return float('-inf')
print(word_split_dp(s))
There are a few issues in your attempt. Starting from the top:
Assuming that ans[i]
represents the maximum number of partitions of the substring s[0:i]
into valid words, you'll need to make this list one entry longer, so that there is an ans[n]
, which will eventually contain the answer, i.e. the maximum number of partitions in s[0:n]
which is s
. So change:
ans = [0]*n
to
ans = [0]*(n+1)
Because of the reasoning given in the first bullet point, the final return
should be return ans[n]
instead of return ans[n-2]
.
Again because of the reasoning given in the first billet point, the outer loop should make one more iteration, so change:
for i in range(1, len(s)):
to
for i in range(1, len(s)+1):
Assuming that j
represents the size of the substring to test, the inner loop should allow i-j
to eventually reach back to 0, so j
should be able to reach the value i
. That means the inner loop should make two more iterations. So change:
for j in range(1, i-1):
to
for j in range(1, i+1):
In wordcheck
, the slice from s
should start at i-j
, as j
represents the size of the slice. So change:
start = i-j+1
to
start = i-j
In wordcheck
, the loop will always exit in its first iteration. It cannot loop more as it always executes a return
. You should not exit the loop when there is no success. The return float('-inf')
should only be made when all words have been tried. So remove:
else:
return float('-inf')
and instead add at after the loop:
return float('-inf')
Those are the issues. The code with all those corrections:
def word_split_dp(s):
n = len(s)
ans = [0]*(n+1) #correction
# base case
ans[0] = 0
for i in range(1, len(s) + 1): # correction
ans[i] = float('-inf')
for j in range(1, i + 1): # correction
ans[i] = max(ans[i], ans[i-j] + wordcheck(s,i,j))
print(ans)
return ans[n] # correction
def wordcheck(s,i,j):
words=("war","month","on","the","heat","eat","he","arm","at")
for word in words:
start = i-j # correction
if s[start:i] == word:
return 1
# don't interrupt loop otherwise
return float('-inf')
s="warmontheheat"
print(word_split_dp(s)) # -inf
s="warontheheat"
print(word_split_dp(s)) # 5