I have something like a netflow table and would like to group it in such a way that it groups it by (src_ip, src_port, dst_ip, dst_port) where the values could be swapped between the src and dst fields.
src_ip | src_port | dst_ip | dst_port | bytes_sent |
---|---|---|---|---|
192.168.1.1 | 123 | 192.168.10.5 | 321 | 111 |
192.168.10.5 | 321 | 192.168.1.1 | 123 | 222 |
10.0.0.5 | 50 | 172.0.0.5 | 55 | 500 |
172.0.0.5 | 55 | 10.0.0.5 | 50 | 300 |
192.168.1.1 | 123 | 192.168.10.5 | 321 | 1000 |
192.168.1.1 | 123 | 192.168.10.5 | 20 | 999 |
I would like to have the following result from this table:
src_ip | src_port | dst_ip | dst_port | bytes_sent | bytes_recv |
---|---|---|---|---|---|
192.168.1.1 | 123 | 192.168.10.5 | 321 | 1111 | 222 |
10.0.0.5 | 50 | 172.0.0.5 | 55 | 500 | 300 |
192.168.1.1 | 123 | 192.168.10.5 | 20 | 999 | 0 |
Basically, trying to capture the traffic going both ways in a single row. So something like grouping by (src_ip, src_port) and (dst_ip, dst_port) where those values could be inverted. What would be the best way to achieve this?
In order to decide which IP, port and direction you must have a rule who you consider sender and who receiver in your aggregated result. Let's make the lesser IP the source and the greater IP the destination. Then it's just about the same CASE
expression again and again to decide which original column to put in which result column. Once this is done, aggregate your data.
with
data as
(
select
case when src_ip < dst_ip then src_ip else dst_ip end as source_ip,
case when src_ip < dst_ip then dst_ip else src_ip end as dest_ip,
case when src_ip < dst_ip then src_port else dst_port end as source_port,
case when src_ip < dst_ip then dst_port else src_port end as dest_port,
case when src_ip < dst_ip then bytes_sent else 0 end as sent,
case when src_ip < dst_ip then 0 else bytes_sent end as received
from mytable
)
select
source_ip, source_port, dest_ip, dest_port,
sum(sent) as bytes_sent,
sum(received) as bytes_received
from data
group by source_ip, source_port, dest_ip, dest_port
order by source_ip, source_port, dest_ip, dest_port;