Search code examples
phpalgorithmip-addresstext-search

Fast file search algorithm for IP addresses


Question

What is the fastest way to find if an IP address exists in a file that contains IP addresses sorted as:

219.93.88.62
219.94.181.87
219.94.193.96
220.1.72.201
220.110.162.50
220.126.52.187
220.126.52.247

Constraints

  • No database (e.g., MySQL, PostgreSQL, Oracle, etc.)
  • Infrequent pre-processing is allowed (see possibilities section)
  • Would be nice not to have to load the file each query (131Kb)
  • Uses under 5 megabytes of disk space
  • No extra PHP modules

File Details

  • One IP address per line
  • 9500+ lines

Possible Solutions

  • Create a directory hierarchy (radix tree?) then use is_dir() (sadly, this uses 87 megabytes)

Solution

  • Scanning the file line by line to find an IP seems like a pain if you have 9,000 non-matches to check before you get to 232.0.17.1

    Is your file constrained to a single file? e.g. lets say this list is banned IPs and you just want to see if one is "in" the list.

    What if you made a DIR to contain multiple files:

    BannedIPs
      +- 0.ips
      +- 1.ips
      +- 37.ips
      +- 123.ips
      +- 253.ips
      +- 254.ips
    

    Each file only contains IP addresses that start with that number.

    If you were lucky enough to have even distribution... you'd have 256 files, but each would only have ~37 entries.

    Thus when you want to test: 232.0.17.1 you look in the 232.ips file and scan for it.