Search code examples
postgresqlipv6ipv4

Efficient way to see if an IP address is in a list of subnets


I basically have a 2 column table with an IP subnet as the key/index and a description as the value. For example:

10.20.30.0/30  "Subnet 1"

I need to write a REST service that will return the description of the subnet containing the given IP address, or list of subnets if the IP address matches more than one.

At first, I thought I would simply use a backend database (Postgres) and expand all the subnets because I wasn't dealing with large amounts of data. So the above example would be expanded to:

10.20.30.0  "Subnet 1"
10.20.30.1  "Subnet 1"
10.20.30.2  "Subnet 1"
10.20.30.3  "Subnet 1"

This is inefficient in terms of storage, especially as subnets get large. However, it is quick and easy to do, and existing databases have really efficient ways to lookup IP addresses as indexes, so this was my first idea. I am mostly concerned with lookup efficiency since I will have about 1 million entries in my DB.

However, I found out I also have requirements for IPv6, and the subnets involved are HUGE. This means I no longer have the option of expanding all the subnets.

Before I go ahead and start writing my custom REST API I wanted to know if POSTGRES queries for checking IP addresses using subnets as an index are efficient. The check is not trivial and I don't know how it would be done internally to maintain efficiency.

Does anyone know how POSTGRES checks for IP addresses in tables indexed by subnet?

EDIT: Here is an example of IP address lookup in POSTGRES that I am talking about (using online https://extendsclass.com/postgresql-online.html)

drop table ipdesc;
create table ipdesc (addr inet, category varchar(20));
insert into ipdesc (addr, category) values ('10.10.10.0/24', 'tens');
insert into ipdesc (addr, category) values ('20.20.20.0/24', 'twenties');
insert into ipdesc (addr, category) values ('50.50.50.0/24', 'fifties');
insert into ipdesc (addr, category) values ('50.50.50.0/30', 'sub-fifty');
select * from ipdesc where inet '50.50.50.1' << addr;

Results:

addr           category
----           --------
50.50.50.0/24  fifties
50.50.50.0/30  sub-fifty

Thanks!


Solution

  • Since CIDRs are contiguous, you could add two columns: the minimum and maximum IP addresses in range, and create indexes on those. I wrote a gist here if you want to see the whole thing and play with it.

    Basically:

    create table mod_ipdesc
    as select addr, category,
      inet(host(network(addr))) as amin,
      inet(host(broadcast(addr))) as amax
    from ipdesc;
    
    create index mod_ipdesc_addr on mod_ipdesc(amin, amax);
    

    Then:

    select *
    from test a
    left outer join mod_ipdesc b
    on (a.ip between b.amin and b.amax);
    

    On your table definition, plus a test table with a few ip values (as inet), we get:

    ip addr category amin amax
    168.192.1.10 null null null null
    10.10.10.20 10.10.10.0/24 tens 10.10.10.0 10.10.10.255
    50.50.50.1 50.50.50.0/24 fifties 50.50.50.0 50.50.50.255
    50.50.50.1 50.50.50.0/30 sub-fifty 50.50.50.0 50.50.50.3
    50.50.50.10 50.50.50.0/24 fifties 50.50.50.0 50.50.50.255

    Update: Query plan comparison

    One thing I couldn't tell initially with such small tables was whether the index was used or not (on the small tables, there was no use of the index, but it could simply be a case where Postgresql determined that a scan was faster). [YES IT IS] (for larger tables).

    The second question was: is Postgres having some magic that lets it use an index on inet column for << type queries? [NO] (at least as far as I can tell, using Postgres 9.6).

    I created another gist with tables of 1000 entries each.

    You can see that the amin, amax index is used (second plan below), but an index directly on addr is not (first plan below: a full table scan):

    explain
    select * from test a
    left outer join ipdesc b
    on (b.addr >> a.ip);
    
    | QUERY PLAN                                                              |
    | :---------------------------------------------------------------------- |
    | Nested Loop Left Join  (cost=0.00..20442.10 rows=6800 width=68)         |
    |   Join Filter: (b.addr >> a.ip)                                         |
    |   ->  Seq Scan on test a  (cost=0.00..23.60 rows=1360 width=32)         |
    |   ->  Materialize  (cost=0.00..21.00 rows=1000 width=36)                |
    |         ->  Seq Scan on ipdesc b  (cost=0.00..16.00 rows=1000 width=36) |
    
    explain
    select * from test a
    left outer join mod_ipdesc b
    on (a.ip between b.amin and b.amax);
    
    | QUERY PLAN                                                                                   |
    | :------------------------------------------------------------------------------------------- |
    | Nested Loop Left Join  (cost=0.28..8001.60 rows=151111 width=132)                            |
    |   ->  Seq Scan on test a  (cost=0.00..23.60 rows=1360 width=32)                              |
    |   ->  Index Scan using mod_ipdesc_addr on mod_ipdesc b  (cost=0.28..4.76 rows=111 width=100) |
    |         Index Cond: ((a.ip >= amin) AND (a.ip <= amax))                                      |