I know that this question has already been asked more than once.My doubt is not finding the solution of this problem, but the correct complexity. I read an answer here:
Minimum window width in string x that contains all characters in string y
In this solution,the complexity proposed is O(n) using hash table.Now till mapping the characters of string T in hash table is fine,but when it comes to finding the minimum and maximum element from the hash table,in almost every answer,I read a linked list is used and they write that the complexity in updating a linked list is O(1).This is where the doubt is.
Please consider the following example:
S:abcbae
T:abc
Initially t["a"],t["b"] and t["c"] will be -1,after 3 passes the values will be 0,1,2 respectively(in hash table) and the double linked list will be containing the elements [(a,0)-(b,1)-(c,2)].
Now the fourth character in string S is b,so before appending new value of index for b in DLL,I need to remove it's previous instance for which I will have to traverse the DLL.
My question is then how come is the solution for updating DLL is O(1),isn't it O(n),which would lead to a total complexity of O(n^2)?
[O(n) for traversing string S,then O(n) for updating DLL]
Please correct me if I am wrong.
The data structure described is a doubly-linked list, visualised as follows, using the example in the answer:
HEAD <=> ('c', 0) <=> ('b', 3) <=> ('a', 5) <=> TAIL
coupled with an array, which is visualised as follows:
'a' | 'b' | 'c'
{5, &list[2]} | {3, &list[1]} | {0, &list[0]}
(notice the duplication of information)
At any point during the described algorithm, the following operations are necessary:
The time-complexity of #1 is O(1) (trivial), and of #2 also O(1) because of our choice of doubly-linked list.
The time-complexity of #3 is also O(1), because of our choice of array. It would still be O(1) for a hash-table-backed map.
The time-complexity of #4 is O(1) because our array holds a pointer to each node in the list, and each character only ever has one node in the list, therefore removal is also trivial.
It's important to note (and understand) here that at any iteration of the proposed algorithm, the minimum window cannot be made larger than what it already was in the last step (why? prove it), and therefore we only need at most one node per character in the string Y (or T in your question).
Hope this helps.