Search code examples
pythonsqlpymssql

Extract MIN MAX of SQL table in python


I have a db with the following info:

+----+------+-----+-----+------+------+-----------+    
| id | time | lat | lon | dep  | dest |  reg      |    
+----+------+-----+-----+------+------+-----------+    
| a  |    1 |  10 |  20 | home | work | alpha     |    
| a  |    6 |  11 |  21 | home | work | alpha     |    
| a  |   11 |  12 |  22 | home | work | alpha     |    
| b  |    2 |  70 |  80 | home | cine | beta      |    
| b  |    8 |  70 |  85 | home | cine | beta      |    
| b  |   13 |  70 |  90 | home | cine | beta      |    
+----+------+-----+-----+------+------+-----------+    

Is it possible to extract the following info:

+----+------+------+----------+----------+----------+----------+------+------+--------+    
| id | tmin | tmax | lat_tmin | lon_tmin | lat_tmax | lon_tmax | dep  | dest |   reg  |    
+----+------+------+----------+----------+----------+----------+------+------+--------+    
| a  |    1 |   11 |       10 |       20 |       12 |       22 | home | work | alpha  |    
| b  |    2 |   13 |       70 |       80 |       70 |       90 | home | cine | beta   |    
+----+------+------+----------+----------+----------+----------+------+------+--------+

what if dep & dest were varying - how would tou select them?

Thks


Solution

  • You could use a window function for that:

    SELECT     t0.id,
               t0.time as     tmin, t1.time as     tmax,
               t0.lat  as lat_tmin, t1.lat  as lat_tmax,
               t0.lon  as lon_tmin, t1.lon  as lon_tmax,
               t0.dep,
               t0.dest,
               t0.reg
    FROM       ( SELECT *, 
                        row_number() over (partition by id order by time asc) as rn
                 FROM   t) AS t0
    INNER JOIN ( SELECT *, 
                        row_number() over (partition by id order by time desc) as rn
                 FROM   t) AS t1
            ON t0.id = t1.id
    WHERE      t0.rn = 1 
           AND t1.rn = 1
    

    This will return the data from the first and last row for each id, when sorted by id, time.

    The values for dep, dest and reg are taken from the first row (per id) only.

    If you want to also have separate rows for when -- for the same id -- you have different values of dep or dest, then just add those in the partition by clauses. All depends on which output you expect in that case:

    SELECT     t0.id,
               t0.time as     tmin, t1.time as     tmax,
               t0.lat  as lat_tmin, t1.lat  as lat_tmax,
               t0.lon  as lon_tmin, t1.lon  as lon_tmax,
               t0.dep,
               t0.dest,
               t0.reg           
    FROM       ( SELECT *, 
                        row_number() over (partition by id, dep, dest, reg 
                                           order by time asc) as rn
                 FROM   t) AS t0
    INNER JOIN ( SELECT *, 
                        row_number() over (partition by id, dep, dest, reg 
                                           order by time desc) as rn
                 FROM   t) AS t1
            ON t0.id = t1.id
    WHERE      t0.rn = 1 
           AND t1.rn = 1
    

    Please consider using a different name for the column time because of this remark in the documentation:

    The following keywords could be reserved in future releases of SQL Server as new features are implemented. Consider avoiding the use of these words as identifiers.

    ... TIME ...