Search code examples
algorithmlanguage-agnosticmodulo

Algorithm to determine whether two periodic sequences of intervals have a nonempty intersection


I am looking for an algorithm that, given natural parameters l1, n1, m1, l2, n2, m2, and size, answers "true" if and only if there exists natural numbers k1, k2, r1, r2 such that:

l1 + k1*m1 + r1 = l2 + k2*m2 + r2

with the constraints k1 <= n1, k2 <= n2, r1 < size, r2 < size and r1 <> r2.

The obvious solution is linear in min(n1, n2). I am looking for something a little more efficient.

Context

I am trying to implement in a static analyzer a check for C99 rule 6.5.16.1:3

If the value being stored in an object is read from another object that overlaps in any way the storage of the first object, then the overlap shall be exact [...] otherwise, the behavior is undefined.

When the analyzer encounters an assignment *p1 = *p2; where p1 and p2 may point into the same block, it must check that the zones pointed by p1 and p2 do not overlap in a way forbidden by the above rule. The parameter "size" above corresponds to the size of the type pointed by p1 and p2. This size is statically known. The offsets that p1 are known to point to inside the block are represented as a set l1 + k1*m1 with l1 and m1 fixed, known natural integers and k1 varying between 0 and n1 (n1 is also fixed and known). Similarly, the offsets p2 is known to point to are of the form l2 + k2*m2 for some known l2, m2 and n2. The equation l1 + k1*m1 + r1 = l2 + k2*m2 + r2 corresponds to the existence of some overlapping offsets. The condition r1 <> r2 corresponds to the case where the overlap is not exact, when the analyzer must warn.


Solution

  • It seems you are looking for the solution to a linear system of congruences. The Chinese remainder theorem should be applicable. It won't apply your bounds checking, but if it finds a solution you can then check bounds yourself trivially.

    EDIT: Forget the CRT.

    Assuming size <= m1 and size <= m2, model the low (inclusive) and high (exclusive) edges of your memory regions as linear relations:

    addr1low = l1 + k1*m1
    addr1high = addr1low + size = l1 + k1*m1 + size
    addr2low = l2 + k2*m2
    addr2high = addr2low + size = l2 + k2*m2 + size
    

    You want to know if there exist k1, k2 in range such that addr1low < addr2low < addr1high or addr1low < addr2high < addr1high. Note the exclusive inequalities; this way we avoid exactly overlapping ranges.

    Suppose m1 = m2 = m. Consider:

    addr1low < addr2low
    l1 + k1*m < l2 + k2*m
    (k1 - k2) * m < l2 - l1
    k1 - k2 < (l2 - l1) / m
    
    addr2low < addr1high
    l2 + k2*m < l1 + k1*m + size
    l2 - l1 < (k1 - k2) * m + size
    (l2 - l1 - size) < (k1 - k2) * m
    (l2 - l1 - size) / m < k1 - k2
    

    The progression above is reversible. Assuming k1, k2 may be 0, -n2 <= k1 - k2 <= n1. If there is an integer in range between (l2 - l1 - size) / m and (l2 - l1) / m, then the system holds and there is an overlap. That is, if ceil(max((l2 - l1 - size) / m, -n2)) <= floor(min((l2 - l1) / m, n1)). The other case (addr1low < addr2high < addr1high) proceeds similarly:

    addr1low < addr2high
    l1 + k1*m < l2 + k2*m + size
    // ..
    (l1 - l2 - size) / m < k2 - k1
    
    addr2high < addr1high
    addr2low + size < addr1low + size
    addr2low < addr1low
    // ..
    k2 - k1 < (l1 - l2) / m
    

    Now the test becomes ceil(max((l1 - l2 - size) / m, -n1)) <= floor(min((l1 - l2) / m, n2)).

    Now consider m1 <> m1, and without loss of generality take m1 < m2.

    Treating the variables as continuous, solve the intersections:

    addr1low < addr2low
    l1 + k*m1 < l2 + k*m2
    (l1 - l2) < k * (m2 - m1)
    (l1 - l2) / (m2 - m1) < k
    
    addr2low < addr1high
    l2 + k*m2 < l1 + k*m1 + size
    l2 - l1 - size < k * (m1 - m2)
    (l2 - l1 - size) / (m1 - m2) > k  // m1 - m2 < 0
    

    Again, the steps are reversible, so any integer k < min(n1, n2) that satisfies the bounds will make the system hold. That is, it holds if ceil(max((l1 - l2) / (m2 - m1), 0)) <= floor(min((l2 - l1 - size) / (m1 - m2), n1, n2)). The other case:

    addr1low < addr2high
    l1 + k*m1 < l2 + k*m2 + size
    l1 - l2 - size < k * (m2 - m1)
    (l1 - l2 - size) / (m2 - m1) < k
    
    addr2high < addr1high
    addr2low + size < addr1low + size
    addr2low < addr1low
    l2 + k*m2 < l1 + k*m1
    l2 - l1 < k * (m1 - m2)
    (l2 - l1) / (m1 - m2) > k   // m1 - m2 < 0
    

    Here the test becomes ceil(max((l1 - l2 - size) / (m2 - m1), 0)) <= floor(min((l2 - l1) / (m1 - m2), n1, n2)).

    The final pseudo code might look something like this:

    intersectos?(l1, n1, m1, l2, n2, m2, size) {
      if (m1 == m2) {
        return ceil(max((l2 - l1 - size) / m, -n2)) <= floor(min((l2 - l1) / m, n1)) ||
               ceil(max((l1 - l2 - size) / m, -n1)) <= floor(min((l1 - l2) / m, n2));
      }
    
      if (m1 > m2) {
        swap the arguments
      }
    
      return ceil(max((l1 - l2) / (m2 - m1), 0)) <= floor(min((l2 - l1 - size) / (m1 - m2), n1, n2)) ||
             ceil(max((l1 - l2 - size) / (m2 - m1), 0)) <= floor(min((l2 - l1) / (m1 - m2), n1, n2));
    }