Search code examples
phparrayssearchpartial-matches

How to match rows in array to an array of masks?


I have array like this:

array('1224*', '543*', '321*' ...) which contains about 17,00 "masks" or prefixes.

I have a second array:

array('123456789', '123456788', '987654321' ....) which contain about 250,000 numbers.

Now, how can I efficiently match every number from the second array using the array of masks/prefixes?

[EDIT]

The first array contains only prefixes and every entry has only one * at the end.


Solution

  • Well, here's a solution:

    Prelimary steps:

    1. Sort array 1, cutting off the *'s.

    Searching:

    1. For each number in array 2 do
      1. Find the first and last entry in array 1 of which the first character matches that of number (binary search).
      2. Do the same for the second character, this time searching not the whole array but between first and last (binary search).
      3. Repeat 2 for the nth character until a string is found.

    This should be O(k*n*log(n)) where n is the average number length (in digits) and k the number of numbers.


    Basically this is a 1 dimensional Radix tree, for optimal performance you should implement it, but it can be quite hard.