I have an excel sheet that looks like this and I would like to read it as a multi-indexed Pandas Dataframe:
desired result:
M1 M2
site conc LQ conc LQ
C1 A 1 0.5 10 0.5
B 2 0.6 5 0.6
C <LQ 1.0 14 1.0
C2 A <LQ 1.1 12 1.1
C 3 0.5 14 0.5
D 1 0.5 11 0.5
The following code used to produce the desired result:
df = pd.read_excel("test.xlsx",
sheet_name="test",
header=[1,2],
index_col=[0,1],
usecols='A:B,D:G',
parse_dates = False,
)
But now (pandas version 0.25.1) it produces the following error:
ValueError: cannot specify usecols when specifying a multi-index header
Any ideas for a simple work-around?
To reproduce the excel table:
;;start;M1;;M2;;end;M1;;M2;
cam;site;bla;conc;LQ;conc;LQ;bla;conc;LQ;conc;LQ
C1;A;bla;1;0.5;10;0.5;bla;1;0.5;10;0.5
C1;B;bla;2;0.6;5;0.6;bla;2;0.6;5;0.6
C1;C;bla;<LQ;1;14;1;bla;<LQ;1;14;1
C2;A;bla;<LQ;1.1;12;1.1;bla;<LQ;1.1;12;1.1
C2;C;bla;3;0.5;14;0.5;bla;3;0.5;14;0.5
C2;D;bla;1;0.5;11;0.5;bla;1;0.5;11;0.5
Hello I had the same problem, you can't use usecols
when headers
is a multiple index
I solved the problem by doing this:
df = pd.read_excel("test.xlsx",
sheet_name="test",
header=2,
index_col=[0,1],
usecols='A:B,D:G',
parse_dates=False)
index = pd.read_excel("test.xlsx",
sheet_name="test",
header=None,
skiprows=1,
index_col=[0,1],
nrows=2,
usecols='A:B,D:G',
parse_dates=False)
index = index.fillna(method='ffill', axis=1)
df.columns = pd.MultiIndex.from_arrays(index.values)
df