How to select observations based on Index specified in pydatatable?


I have a datatable Frame defined as follows:

from datatable import dt, f

DT = dt.Frame(
     A=[1, 3, 2, 1, 4, 2, 1], 
     B=['A','B','C','A','D','B','A'],
     C=['myamulla','skumar','cary','myamulla','api','skumar','myamulla'])
Out[14]: 
   |  A  B   C       
-- + --  --  --------
 0 |  1  A   myamulla
 1 |  3  B   skumar  
 2 |  2  C   cary    
 3 |  1  A   myamulla
 4 |  4  D   api     
 5 |  2  B   skumar  
 6 |  1  A   myamulla

[7 rows x 3 columns]

Now I select the observation that has api in column C:

DT[f.C=="api",:]
Out[12]: 
   |  A  B   C  
-- + --  --  ---
 0 |  4  D   api

Now I would like to find the row index of this observation, so that I can select all observations from that index onwards.

For example, the observation above is at row 4 in DT, so I can select the observations from row 4 onwards with:

DT[4:,:]
Out[15]: 
   |  A  B   C       
-- + --  --  --------
 0 |  4  D   api     
 1 |  2  B   skumar  
 2 |  1  A   myamulla

But if DT has millions of observations, I can't work out the required row index by eye. How can I find it programmatically?


Solution

  • One way around it is to create a temporary index column:

    from datatable import dt, f, update
    # update() modifies DT in place, adding a column of row numbers 0..nrows-1
    DT[:, update(index = range(DT.nrows))]
    
    In [8]: DT
    Out[8]: 
       |     A  B      C         index
       | int32  str32  str32     int32
    -- + -----  -----  --------  -----
     0 |     1  A      myamulla      0
     1 |     3  B      skumar        1
     2 |     2  C      cary          2
     3 |     1  A      myamulla      3
     4 |     4  D      api           4
     5 |     2  B      skumar        5
     6 |     1  A      myamulla      6
    [7 rows x 4 columns]
    

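    Now look up the index of the matching row and use it as the start of a slice in i (the row selector). The :-1 in j drops the temporary index column from the result: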

    In [11]: DT[DT[f.C=='api', 'index'][0,0]:, :-1]
    Out[11]: 
       |     A  B      C       
       | int32  str32  str32   
    -- + -----  -----  --------
     0 |     4  D      api     
     1 |     2  B      skumar  
     2 |     1  A      myamulla
    [3 rows x 3 columns]