I have a list of ~1,000,000 IP address strings. I want to get the set of those IP addresses that fall within three CIDRs (each CIDR is a string like this: "1.0.0.0/25"). What is the fastest way to do this?
A) Convert the three CIDRs into sets containing every IP address in each CIDR. For each IP address in my list, check whether it is in the wanted IP address set.
B) Convert each CIDR into a min and max IP address. Convert each IP address into a tuple of ints and check whether min <= ip <= max.
If you're on Python 3.3 or higher, a decent solution is to use the ipaddress module. Convert your CIDRs to network objects with ipaddress.ip_network up front, then convert your addresses to address objects (with ipaddress.ip_address if they might be IPv4 or IPv6, or with ipaddress.IPv4Address/ipaddress.IPv6Address directly if they are of known type, which skips a layer of wrapping).
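As a minimal sketch of that up-front conversion (the variable names and sample values below are made up for illustration, and the inputs are assumed to be IPv4 only):

```python
import ipaddress

# Hypothetical sample inputs; the real CIDRs and addresses come from your data.
cidr_strings = ["1.0.0.0/25", "10.0.0.0/8", "192.168.1.0/24"]
address_strings = ["1.0.0.5", "1.0.0.200", "10.20.30.40"]

# Parse each CIDR once, up front, rather than inside the per-address loop.
networks = [ipaddress.ip_network(cidr) for cidr in cidr_strings]

# Every input is assumed to be IPv4, so the concrete class is used directly.
addresses = [ipaddress.IPv4Address(s) for s in address_strings]
```

Note that ipaddress.ip_network raises ValueError if the CIDR string has host bits set (e.g. "1.0.0.1/25") unless you pass strict=False.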
You can test for membership relatively cheaply with the in operator, e.g. if you stored your networks in a sequence (e.g. a list or tuple) you could do:
    import ipaddress

    for address in map(ipaddress.ip_address, stream_of_string_addresses):
        if any(address in network for network in networks):
            ...  # got a match; handle address here
There are more efficient solutions (particularly if you're talking about many networks, not just three), but this is straightforward, relatively memory efficient, and leaves you with a useful object (not just the raw address string) for further processing.
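One possible faster variant, sketched here under the assumption that every address is IPv4, precomputes each network's integer base address and netmask and then compares plain ints. The function name matching_addresses is made up for this example, and it is not benchmarked here, so measure it against your own data before relying on it.

```python
import ipaddress

def matching_addresses(address_strings, cidr_strings):
    """Return the set of address strings that fall inside any of the CIDRs.

    Assumes IPv4 only; matching_addresses is an illustrative name.
    """
    # Precompute (network address as int, netmask as int) once per CIDR.
    masks = []
    for cidr in cidr_strings:
        net = ipaddress.IPv4Network(cidr)
        masks.append((int(net.network_address), int(net.netmask)))

    matches = set()
    for s in address_strings:
        ip = int(ipaddress.IPv4Address(s))
        # An address is in a network when its masked bits equal the base address.
        if any((ip & mask) == base for base, mask in masks):
            matches.add(s)
    return matches
```

With only three networks the any() loop stays short; the trade-off is that you end up with the raw strings rather than address objects for further processing.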