python pandas multi-index slicers

How to write an indexer expression to slice a MultiIndex and select values in Python when the chosen indexers are consecutive integers?


I use a MultiIndexed DataFrame in Python to process measured data with time information.

I use 'h' as the name of the first index level, indicating the hour at which the data appeared, and 'min' as the second level, indicating the minute. When I want the mean of the data from 10:03 to 10:15, or over an even longer period, I cannot find a proper expression to slice the data. My attempt always fails with SyntaxError: invalid syntax.

The DataFrame, named 'means', looks like this:

        L = 0.96m  L = 1.46m
h  min
10 3    -0.116562  -0.110844
   4    -0.113849  -0.134462
   5    -0.140548  -0.132054
   6    -0.139505  -0.134903
   7    -0.124237  -0.116645
   8    -0.119559  -0.120527
   9    -0.136731  -0.159849
   10   -0.124228  -0.118011
   11   -0.137301  -0.124688
   12   -0.166075  -0.137226
   13   -0.124688  -0.126409
   14   -0.129269  -0.126247
   15   -0.104269  -0.126129
   16   -0.132237  -0.135247
   17   -0.124815  -0.148978
   18   -0.110742  -0.116591
   19   -0.124419  -0.124731
   20   -0.117151  -0.135806
   21   -0.135688  -0.124796
   22   -0.130656  -0.121968
   23   -0.142452  -0.141645
   24   -0.112304  -0.121370
   25   -0.115796  -0.134624
   26   -0.126860  -0.122817
   27   -0.120161  -0.115043
   28   -0.117656  -0.107355
   29   -0.127645  -0.138753
   30   -0.135054  -0.120380
   31   -0.142022  -0.110409
   32   -0.132978  -0.115677
...

The code I use now is:

means.loc(axis=0)[10,[3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15]]

It works.

        L = 0.96m  L = 1.46m
h  min
10 3    -0.116562  -0.110844
   4    -0.113849  -0.134462
   5    -0.140548  -0.132054
   6    -0.139505  -0.134903
   7    -0.124237  -0.116645
   8    -0.119559  -0.120527
   9    -0.136731  -0.159849
   10   -0.124228  -0.118011
   11   -0.137301  -0.124688
   12   -0.166075  -0.137226
   13   -0.124688  -0.126409
   14   -0.129269  -0.126247
   15   -0.104269  -0.126129

But when I use the following code for convenience:

means.loc(axis=0)[10,[3:14]]

It fails with SyntaxError: invalid syntax. Is there another convenient way to slice a range of values in pandas, instead of listing all the needed indexers? In a case like getting the values from 10:03 to 10:59, it would be tedious to list every needed indexer in the 'min' level.


Solution

  • To fix your immediate error, try:

    means.loc(axis=0)[10, slice(3, 14)]
    

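    The colon syntax 3:14 is only valid directly inside square brackets, not inside a list literal, which is why [3:14] raises a SyntaxError; slice(3, 14) builds the same slice object explicitly. Label slices in .loc include both endpoints, so slice(3, 14) selects minutes 3 through 14; use slice(3, 15) to cover 10:03 to 10:15.

    On a more general level, it might be easier to use a single-level DatetimeIndex instead of a MultiIndex, as that would allow you to use pandas' datetime indexing and slicing features.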
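
    As a rough sketch of both suggestions, the snippet below builds a small stand-in for 'means' (random values, not the asker's data), selects 10:03 to 10:15 with slice() and with pd.IndexSlice, and then repeats the selection on a single-level DatetimeIndex. The date 2020-01-01 and the variable names are made up for illustration. Slicing a MultiIndex level this way assumes the index is sorted; calling sort_index() first should take care of that if it is not.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `means` frame: hour 10, minutes 3..32.
minutes = np.arange(3, 33)
index = pd.MultiIndex.from_product([[10], minutes], names=["h", "min"])
rng = np.random.default_rng(0)
means = pd.DataFrame(
    rng.normal(-0.125, 0.01, size=(len(minutes), 2)),
    index=index,
    columns=["L = 0.96m", "L = 1.46m"],
)

# Option 1: an explicit slice object. Label slices in .loc include
# both endpoints, so this selects minutes 3 through 15.
by_slice = means.loc(axis=0)[10, slice(3, 15)]

# Option 2: pd.IndexSlice allows the familiar colon syntax.
idx = pd.IndexSlice
by_index_slice = means.loc[idx[10, 3:15], :]

# Mean of each column over the 10:03 to 10:15 window.
window_mean = by_slice.mean()

# Option 3: collapse the two levels into one DatetimeIndex.
# The calendar date is arbitrary and only needed to form timestamps.
hours = means.index.get_level_values("h")
mins = means.index.get_level_values("min")
stamps = (
    pd.Timestamp("2020-01-01")
    + pd.to_timedelta(hours, unit="h")
    + pd.to_timedelta(mins, unit="min")
)
by_time = means.copy()
by_time.index = stamps

# String slicing on a DatetimeIndex is also inclusive at both ends.
by_window = by_time.loc["2020-01-01 10:03":"2020-01-01 10:15"]

print(window_mean)
print(by_window.mean())
```

    With the DatetimeIndex version, a longer window such as 10:03 to 10:59 only needs the two endpoint strings changed.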