Accessing pd.Series Multi-index with get() does not find element


Below is a minimal example of my code.

I have a grouped pd.Series with a MultiIndex. I can access single elements using grouped["stallion", "london"], but grouped.get(["stallion", "london"]) returns None (or -1 when that is passed as the default value).

import pandas as pd

a = pd.DataFrame({"breed": ["stallion", "stallion", "stallion", "stallion", "pony", "pony", "pony"],
                  "stable": ["hogwarts", "hogwarts", "london", None, "hogwarts", "london", "london"],
                  "weight": [800, 900, 982, 400, 230, 300, 500]})

grouped = a.groupby(["breed", "stable"], dropna=False)["weight"].mean()
# Series.append was removed in pandas 2.0; pd.concat produces the same result
grouped = pd.concat([grouped, a.groupby("breed", dropna=False)["weight"].mean()])
grouped = pd.concat([grouped, pd.Series(a["weight"].mean(), index=["all_breeds"])])

print(grouped)
print()
print(grouped["stallion", "london"])
print(grouped.get(["stallion", "london"]))
print(grouped.get(["stallion", "london"], -1))

print(f'The same? {grouped["stallion", "london"] == grouped.get(["stallion", "london"]) == grouped.get(["stallion", "london"], -1)}')

Expected behavior: I expect all three expressions to give the same result: grouped["stallion", "london"] == grouped.get(["stallion", "london"]) == grouped.get(["stallion", "london"], -1)

I am using get() because I want the most specific entry that exists, falling back to a coarser one otherwise:

grouped.get(["stallion", "london"], grouped.get("stallion", grouped["all_breeds"]))

Solution

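  • You have to pass a tuple to get(), because your index is a flat object Index whose entries are tuples and strings (it is not a MultiIndex). A list is read as several separate labels, not as one key.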

    >>> grouped.index
    Index([    ('pony', 'hogwarts'),       ('pony', 'london'),
           ('stallion', 'hogwarts'),   ('stallion', 'london'),
                  ('stallion', nan),                   'pony',
                         'stallion',             'all_breeds'],
          dtype='object')
    
    >>> grouped.get(["stallion", "london"])
    None
    
    >>> grouped.get(("stallion", "london"))
    982.0
    
    ###
    
    >>> grouped.get(["stallion", "london"], -1)
    -1
    
    >>> grouped.get(("stallion", "london"), -1)
    982.0
    

    Note: grouped["stallion", "london"] is equivalent to grouped[("stallion", "london")]; the tuple is just implicit.

    Final output:

    >>> grouped.get(("stallion", "london"), grouped.get("stallion", grouped["all_breeds"]))
    982.0