If we arrange all the distinct sub-strings of a string lexicographically and we need the i'th substring
1.) Is it possible to find it using it's suffix array and LCP array?
2.) If Yes, how do we do it ? can it be done in O(Nlog^N) while creating the suffix array using Manber & Myers which has a time complexity of O(Nlog^2N) or while creating it's LCP array using kasai's algorithm which has a time complexity of O(N)?
Yes it can be done using Suffix array and LCP array.
Assuming you know how to calculate Suffix array and LCP array.
Let p[]
denote suffix array lcp[]
denote LCP array.
create a array which store the number of distinct sub string till i'th
rank suffix. This can be calculated using this formula. For more details see Here
Let cum[]
denote cumulative array which can be calculated as follow:
cum[0] = n - p[0];
for i = 1 to n do:
cum[i] = cum[i-1] + (n - p[i] - lcp[i])
Now for finding i'th
sub string just find the lower bound of i
in cumulative array cum[]
that will give you rank of suffix from where your sub string should start and print all character till length of
i - cum[pos-1] + lcp[pos] // i lies between cum[pos-1] and cum[pos] so for finding
// length of sub string starting from cum[pos-1] we should
// subtract cum[pos-1] from i and add lcp[pos] as it is
// common string between current rank suffix and
// previous rank suffix.
where pos
is value return by lower bound.
Whole above process can be summarized as follow:
string ithSubstring(int i){
pos = lower_bound(cum , cum + n , i);
return S.substr(arr[pos] , i - cum[pos-1] + lcp[pos]);// considering S as original character string
}
For full implementation of Suffix array , LCP and above logic you can see Here