Tags: python, python-3.x, pandas, multi-index

How do I create a multi-indexed DataFrame from columns holding letters and a range of values, and check whether it is sorted properly?


I created a DataFrame with a letters column (D, E, F) and a numbers column that runs from 0 to 9 for each letter:

 import pandas as pd
 df = pd.DataFrame({'letters': list('DDDDDDDDDDEEEEEEEEEEFFFFFFFFFF'), 'numbers': [0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9]})

  Output:

       letters  numbers
  0          D        0
  1          D        1
  2          D        2
  ...
  9          D        9
  10         E        0
  ...
  18         E        8
  19         E        9
  20         F        0
  ...
  28         F        8
  29         F        9

Then I created a MultiIndex on this DataFrame with the code below. (I would like to know whether this is the correct way, or whether there are other ways to create a MultiIndex.)

  latestone = df.set_index(['letters', 'numbers'], drop=False)

  Output:


                  letters  numbers
  letters numbers                 
  D       0             D        0
          1             D        1
          ...
          9             D        9
  E       0             E        0
          1             E        1
          ...
          9             E        9
  F       0             F        0
          1             F        1
          ...
          9             F        9

Based on this (or on a better way to create the MultiIndex, if there is one), I would like to know whether the index values are sorted lexicographically, ideally as a True or False result.

I would also like to get the rows with numbers [2, 5, 7] for letter E.


Solution

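  • One thing you would like to know is whether the values are sorted. set_index does not sort anything: the rows stay exactly as they were arranged in the DataFrame. In your example that arrangement already happens to be lexicographic (D, E, F, each with 0 to 9), and latestone.index.is_monotonic_increasing gives you the True/False answer.

    To get the rows [2, 5, 7] for letter E, you can use:

    latestone.loc[[('E', 2), ('E', 5), ('E', 7)]]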
                    letters  numbers
    letters numbers                 
    E       2             E        2
            5             E        5
            7             E        7
    

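    or, passing the list of numbers inside a single tuple (the trailing ", :" makes it explicit that the tuple selects rows rather than a row and a column):

    latestone.loc[('E', [2, 5, 7]), :]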
                    letters  numbers
    letters numbers                 
    E       2             E        2
            5             E        5
            7             E        7
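
Putting the pieces together, here is a minimal sketch that rebuilds the frame from the question, shows one alternative way to get the same index, checks the ordering, and selects the rows for E. The names product_index, shuffled and rows_e are only illustrative and are not from the question. MultiIndex.from_product is one of several constructors and fits here only because every letter is paired with every number.

```python
import pandas as pd

# Rebuild the frame from the question: letters D, E, F, each paired with 0..9.
df = pd.DataFrame({
    "letters": [letter for letter in "DEF" for _ in range(10)],
    "numbers": list(range(10)) * 3,
})

# Same call as in the question; drop=False keeps the two columns as well.
latestone = df.set_index(["letters", "numbers"], drop=False)

# Alternative construction: build the index directly as a Cartesian product.
product_index = pd.MultiIndex.from_product(
    [list("DEF"), range(10)], names=["letters", "numbers"]
)
same_index = latestone.index.equals(product_index)

# True when the index tuples are in non-decreasing (lexicographic) order.
is_sorted = latestone.index.is_monotonic_increasing

# Reversing the rows breaks that order; sort_index() restores it.
shuffled = latestone.iloc[::-1]
shuffled_sorted = shuffled.index.is_monotonic_increasing
restored_sorted = shuffled.sort_index().index.is_monotonic_increasing

# Rows 2, 5 and 7 of letter E; the trailing ", :" marks the tuple as a row key.
rows_e = latestone.loc[("E", [2, 5, 7]), :]

print(same_index, is_sorted, shuffled_sorted, restored_sorted)
print(rows_e)
```

If the index turns out not to be sorted, calling sort_index() before slicing is usually worthwhile, since label slices on a MultiIndex generally expect a sorted index.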