Search code examples
pythonexcelpandasdata-analysis

Problems with MultiIndex of imported Excel-File in pandas


I imported a sheet of an Excel-File and parsed it in a panda DataFrame.

path = 'bla.xls'
x = pd.ExcelFile(path)
sheets = x.sheet_names
table = x.parse(sheets[36], header=2)

which seemed to work and DataFrame-object has the expected MultiIndex-index:

In[180]: table.index
Out[180]: MultiIndex(levels=[[u'Gesamt', u'Studiengang Hochschulbenennung'], [u'Bekleidungstechnik', u'Betriebswirtschaft', u'Biomedical Engineering', u'Ernährungs- und Hygienetechnik', u'Facility Management', u'Kommunikations- und Softwaretechnik', u'Lebensmittel, Ernährung, Hygiene', u'Maschinenbau', u'Pharmatechnik', u'Systems Engineering', u'Textil- und Bekleidungsmanagement', u'Wirtschaftsinformatik', u'Wirtschaftsingenieurwesen', u'Wirtschaftsingenieurwesen - Produktionsmanagement insbes. Fahrzeugwirtschaft'], [u'% innerhalb von Studiengang Hochschulbenennung', u'Anzahl']],
       labels=[[1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, -1], [1, -1, 3, -1, 6, -1, 7, -1, 5, -1, 9, -1, 0, -1, 10, -1, 8, -1, 11, -1, 2, -1, 12, -1, 13, -1, 4, -1, -1, -1], [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]])

Unfortunately, I cannot select a row:

In [181]: table.ix[('Studiengang Hochschulbenennung', 'Betriebswirtschaft')]
Out[182]: KeyError: 'Key length (2) was greater than MultiIndex lexsort depth (0)'

And, in fact:

In [182]: table.index.lexsort_depth
Out[182]: 0

whereas

In [183]:len(table.index.levels)
Out[183]: 3

... and that's incoherent, is it?

How can I fix this and increase table.index.lexsort_depth to 3?

Edit: There is a connection to another question where a similar Problem is answered very well.


Solution

  • You'll need to sort your dataframe before selecting row:

    table.sort(inplace=True)
    table.ix[('Studiengang Hochschulbenennung', 'Betriebswirtschaft')]
    

    After sorting,

    In [21]: df.index.lexsort_depth 
    Out[21]: 3