Search code examples
arraysdata-structuressegment-tree

Segment Tree to compute frequencies


Is there any way to use a Segment Tree structure to compute the frequency of a given value in an array?

Suppose there is an array A of size N, and every element A[i] of the array contains the value 0, 1 or 2. I want to perform the following operations:

  • Compute the amount of zeroes in any range [a,b] of the array
  • Increment (mod 3) every element in in any range [a,b] of the array

Example: If A = [0,1,0,2,0]:

  • Query[2,4] must return 1 , since there is one 0 in the range [2,4]
  • Increment[2,4] updates A to [0,2,1,0,0]

This looks really similar to the Range Sum Query problem, which can be solved using Segment Trees (in this case using Lazy Propagation because of range updates), but i had no success adapting my seg tree code to this problem, because if i store the values in the tree like in a normal RSQ, any parent node which contains the value "3" (for example) wouldn't mean nothing, since with this information i can't extract how much zeroes are present in this range.

Thanks in advance!

--

EDIT:

Segment Trees are binary tree structures that store intervals related to an array in its nodes. The leaf nodes store the actual array cells, and each parent node stores a function f(node->left, node->right) of its children. Segment Trees are commonly used to perform Range Sum Queries, in which we want to compute the sum of all elements in a range [a,b] of the array. In this case, the function computed by the parent nodes is the sum of the value in its children nodes. We want to use segtrees to solve the Range Sum Query problem because it allows to solve it in O(log n) (we only need to descend the tree until we find the nodes that are completely covered by our range query), much better than the naive O(n) algorithm.


Solution

  • Since actual array values are stored in the leaves (level L), let the nodes at level L - 1 store how many zeros they contain (which will be a value in the range [0, 2]). Other than that, everything is the same, the rest of the nodes will compute f(node->left, node->right) as node->left + node->right and the count of zeros will be propagated to the root.

    After incrementing a range, if that range contained no zeros than nothing needs to be done. If however that range had zeros, then all those zeros will now be ones and the function value of current node (call it F) now becomes just zero. That change in the value now needs to be propagated upwards to the root, each time subtracting F from the function values.