Z-Function. String algorithms. Optimize for large strings

The problem:

Given a string s. For each i from 1 to |s|, find the number of occurrences of its prefix of length i in the string.

Input:
The first line of input contains an integer q (1≤q≤10^5) — the number of datasets in the test.

Each dataset consists of a string s. The length of the string s is from 1 to 10^6 characters. The string consists exclusively of lowercase Latin alphabet letters.

The sum of the lengths of strings s across all q datasets in the test does not exceed 10^6.

Output:
For each dataset, output |s| integers c1, c2, ..., c|s|, where c[i] is the number of occurrences of the prefix of length i in the string s.

Example

Input:
5
abacaba
eeeee
abcdef
ababababa
kekkekkek
Output:
4 2 2 1 1 1 1 
5 4 3 2 1 
1 1 1 1 1 1 
5 4 4 3 3 2 2 1 1 
6 3 3 2 2 2 1 1 1
The task must be solved exclusively using the Z-function, and the total time for a string of length 10^6 characters should not exceed 2 seconds.

My solution looks like this:

    #include <iostream>
    #include <string>
    #include <vector>
    
    std::vector<int> ZFunc(const std::string& s) {
        const int sz = s.size();
        std::vector<int> z(sz, 0);
    
        for (int i = 1, l = 0, r = 0; i != sz; ++i) {
            if (r >= i)
                z[i] = std::min(z[i - l], r - i + 1);
    
            while (z[i] + i < sz && s[i + z[i]] == s[z[i]])
                z[i]++;
    
            if (z[i] > r - i + 1) {
                l = i;
                r = i + z[i] - 1;
            }
        }
    
        return z;
    }
    
    int main() {
        int n;
        std::cin >> n;
    
        std::vector<std::vector<int>> res(n);
    
        for (int k = 0; k != n; ++k) {
            std::string s;
            std::cin >> s;
    
            res[k].resize(s.size(), 1);
            std::vector<int> z = ZFunc(s);
    
            for (int i = 1; i != z.size(); ++i) {
                while (z[i]--)
                    res[k][z[i]]++;
            }
        }
    
        for (const auto& ivec : res) {
            for (int i : ivec)
                std::cout << i << " ";
            std::cout << std::endl;
        }
    
        return 0;
    }

Its only issue is that I am exceeding the allowed time on tests, and I do not know how to optimize this algorithm. I suspect that the problem lies in strings of the form aaaaaa...aaabcd, where the number of identical characters is quite large, causing the algorithm to have a complexity of O(n^2).

Algorithm Description:

The essence of the proposed algorithm is as follows: we find the Z-function for each string. The i-th value of the Z-function represents the length of the matching prefix and substring at position i. More about the Z-function can be found here.

For each string, a prefix of length i occurs at least once, so initially, res[k].resize(s.size(), 1); is set to 1 for all prefixes of the string.

Since the task is to specify, for all i from 1 to |s|, the number of occurrences of a prefix of length i in the string s, when we look at a non-zero value of the Z-function at position i, it means that the prefix of length z[i] matches the substring of length z[i], starting at position i. This means we have found at least one more occurrence for all substrings of the prefix of length i.

For example:

s = abacaba
z = 0010301

When we reach i = 2, z[i] = 1, we can confidently increment the result res[0]++, as s[2]=s[0]='a'. We have found the first occurrence of the prefix of length 1. Next, z[4]=3, which means we have found a match for the prefix 'aba', including 'a', 'ab', and 'aba'. Therefore, for prefixes of length 3, 2, and 1, we add +1.

This is done in the following lines:

while (z[i]--)
    res[k][z[i]]++;

Solution

Here's right way to solve this problem. Instead of my while loop, I could use cumulative sums for frequencies of prefix length i: answer