multidimensional-array nested-lists membership apl dyalog

Finding matching rows


Given two matrices A and B with the same number of columns, I would like to know whether any rows of A also appear in B. In Dyalog APL I can use the Split function (↓) like this:

(↓A) ∊ ↓B

Is there a way to calculate the same result without the split function?


Solution

  • What you've found is a design flaw in Membership (∊): it treats the right argument as a set of scalars rather than as a collection of major cells, which precluded extending it according to leading axis theory. Index of (⍳), however, was extended in this way, so we can use the fact that it returns an index one beyond the end of the lookup array when a major cell isn't found:

          ⎕← A ← 4 2⍴2 7 1 8 2 8 1 8
    2 7
    1 8
    2 8
    1 8
          ⎕← B ← 5 2⍴1 6 1 8 0 3 3 9 8 9
    1 6
    1 8
    0 3
    3 9
    8 9
          (↓A) ∊ ↓B
    0 1 0 1
          Membership ← {(≢⍵) ≥ ⍵⍳⍺}
          A Membership B
    0 1 0 1
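
    To see why the comparison works, it may help to look at the intermediate values. The following is a short session sketch, assuming the default index origin (⎕IO←1) and the A and B defined above:

    ```apl
          B⍳A          ⍝ position in B of each row of A; 6 (that is, 1+≢B) means "not found"
    6 2 6 2
          ≢B           ⍝ number of rows (major cells) in B
    5
          (≢B)≥B⍳A     ⍝ a row of A is a member exactly when its index lies within B
    0 1 0 1
    ```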
    


    This can also be written tacitly: Membership ← ⊢∘≢ ≥ ⍳⍨
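
    Both definitions rely on the default index origin. Under ⎕IO←0, Index of reports a missing cell as ≢⍵ rather than 1+≢⍵, so the ≥ comparison would mark every row as a member. A possible origin-independent variant is sketched below with the same A and B; the name MembershipIO is hypothetical and not part of the original answer:

    ```apl
          MembershipIO ← {(⎕IO+≢⍵) > ⍵⍳⍺}   ⍝ the not-found index is always ⎕IO+≢⍵
          ⎕IO←0
          A Membership B      ⍝ the original definition misreports under ⎕IO←0
    1 1 1 1
          A MembershipIO B
    0 1 0 1
          ⎕IO←1               ⍝ restore the default index origin
          A MembershipIO B
    0 1 0 1
    ```

    If the code only ever runs with ⎕IO←1, the shorter original definition is sufficient.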

    Either way, note that avoiding the detour via nested arrays leads to a significant speed gain; in the comparison below, the flat version takes about 95% less time:

          A←?1000 4⍴10
          B←?1000 4⍴10
          ]runtime -compare "(↓A) ∊ ↓B" "A Membership B"
                                                                              
      (↓A) ∊ ↓B      → 1.6E¯4 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ 
      A Membership B → 8.9E¯6 | -95% ⎕⎕